Template Discovery & HuggingFace Integration
Syaala Platform provides seamless integration with HuggingFace Hub, giving you access to 500,000+ AI models with intelligent recommendations and auto-configuration.
Phase 18 Feature: Template-based Inference-as-a-Service (IaaS) architecture with personalized recommendations based on your use case and budget.
Overview
The Template Discovery system provides:
- Personalized Recommendations: AI-powered suggestions based on your use case and budget
- HuggingFace Integration: Search and deploy from 500K+ models on HuggingFace Hub
- Auto-Configuration: Automatic runtime, GPU, and Docker setup
- Cost Estimation: Real-time pricing for GPU types and deployment configurations
- Community Sharing: Publish and discover templates from the Syaala community
Personalized Recommendations
Get AI-powered template recommendations tailored to your specific needs.
Via Dashboard
1. Navigate to Dashboard → Templates
2. Complete the quick survey (optional):
   - Use Case: Text generation, image processing, etc.
   - Monthly Budget: Low ($100), Medium ($500), High ($5000+)
3. View personalized recommendations with:
   - Auto-configured GPU type
   - Estimated monthly cost
   - One-click deployment
Via CLI
# Get recommendations based on use case
syaala models recommend --use-case text-generation --budget medium
# Interactive mode with prompts
syaala models recommend --interactive
Example Output:
✓ Fetching personalized recommendations...
Recommended Templates for text-generation (Medium Budget: $500/month)
┌────────────────────────────────────────────────────────────────────┐
│ 1. Llama 3.3 70B Instruct │
│ Runtime: vllm │
│ GPU: NVIDIA-A40 (1x) │
│ Cost: ~$456/month (95% of budget) │
│ Tags: llm, instruction-following, chat │
│ │
│ ► syaala deployments create --template tpl_llama33_70b │
└────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────┐
│ 2. Mistral 7B Instruct v0.3 │
│ Runtime: vllm │
│ GPU: NVIDIA-RTX-4090 (1x) │
│ Cost: ~$234/month (47% of budget) │
│ Tags: llm, efficient, general-purpose │
│ │
│ ► syaala deployments create --template tpl_mistral_7b │
└────────────────────────────────────────────────────────────────────┘
HuggingFace Model Discovery
Search and explore models from HuggingFace Hub directly through Syaala.
Search Models
Via CLI
# Search by keyword
syaala models discover "llama"
# Filter by task
syaala models discover "image classification" --task image-classification
# Sort by downloads or likes
syaala models discover "stable diffusion" --sort downloads
Example Output:
✓ Found 234 models matching "llama"
┌────────────────────────────────────────────────────────────────────┐
│ meta-llama/Llama-3.3-70B-Instruct │
│ Downloads: 2.4M Likes: 15.2K Task: text-generation │
│ │
│ Suggested GPU: NVIDIA-A40 │
│ Estimated Cost: ~$456/month │
│ │
│ ► syaala templates create-from-hf \ │
│ meta-llama/Llama-3.3-70B-Instruct \ │
│ --name "Llama 3.3 70B" --category llm │
└────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────┐
│ meta-llama/Llama-3.1-8B-Instruct │
│ Downloads: 5.7M Likes: 23.1K Task: text-generation │
│ │
│ Suggested GPU: NVIDIA-RTX-4090 │
│ Estimated Cost: ~$234/month │
│ │
│ ► syaala templates create-from-hf \ │
│ meta-llama/Llama-3.1-8B-Instruct \ │
│ --name "Llama 3.1 8B" --category llm │
└────────────────────────────────────────────────────────────────────┘
Via API
curl "https://platform.syaala.com/api/huggingface/search?query=llama&task=text-generation" \
  -H "Authorization: Bearer sk_live_..."
Response:
{
"models": [
{
"id": "meta-llama/Llama-3.3-70B-Instruct",
"downloads": 2400000,
"likes": 15200,
"task": "text-generation",
"license": "llama3",
"suggestedGpu": "NVIDIA-A40",
"estimatedCost": 456,
"autoConfigured": {
"runtime": "vllm",
"gpuType": "NVIDIA-A40",
"gpuCount": 1,
"dockerImage": "runpod/pytorch:2.1.0-py3.10-cuda12.1.0-devel"
}
}
],
"total": 234
}
Creating Templates from HuggingFace
Convert any HuggingFace model into a deployable Syaala template.
Quick Start
Search for Model
syaala models discover "mistral instruct"
Create Template
syaala templates create-from-hf \
mistralai/Mistral-7B-Instruct-v0.3 \
--name "Mistral 7B Instruct" \
--category llm \
--tags "instruct,chat,efficient" \
--visibility public
Deploy Template
syaala deployments create --template tpl_mistral_7b_xyz
Command Reference
syaala templates create-from-hf <model-id> [options]
Options:
| Flag | Required | Description |
|---|---|---|
| --name <name> | Yes | Display name for the template |
| --category <category> | Yes | Template category: llm, vision, multimodal, audio, embedding |
| --description <desc> | No | Custom description (defaults to HuggingFace description) |
| --tags <tags> | No | Comma-separated tags (max 10) |
| --visibility <vis> | No | public or private (default: public) |
Example:
syaala templates create-from-hf \
meta-llama/Llama-3.3-70B-Instruct \
--name "Llama 3.3 70B Instruct" \
--category llm \
--tags "chat,instruction-following,large-context" \
--description "Meta's latest 70B parameter model with extended context" \
--visibility public
Output:
✓ Template created successfully!
Template Details:
ID: tpl_abc123xyz
Name: Llama 3.3 70B Instruct
HuggingFace Model: meta-llama/Llama-3.3-70B-Instruct
Category: llm
Visibility: public
Auto-Configuration:
Runtime: vllm
GPU Type: NVIDIA-A40
GPU Count: 1
Docker Image: runpod/pytorch:2.1.0-py3.10-cuda12.1.0-devel
Estimated Cost: ~$456/month
Deploy this template:
syaala deployments create --template tpl_abc123xyz
Auto-Configuration Details
Syaala automatically configures optimal settings for HuggingFace models:
Runtime Selection
- vLLM: For decoder-only LLMs (Llama, Mistral, GPT-style)
- Triton: For encoder models and multi-modal inference
- FastAPI: For custom models or special requirements
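The mapping above can be sketched as a simple lookup. Note that the architecture names and the `selectRuntime` helper below are illustrative assumptions, not the platform's actual selection logic:

```typescript
// Illustrative sketch of runtime selection. The architecture keywords and
// this mapping are assumptions for demonstration, not Syaala's real code.
type Runtime = "vllm" | "triton" | "fastapi";

function selectRuntime(architecture: string): Runtime {
  const arch = architecture.toLowerCase();
  // Decoder-only LLM families are served with vLLM.
  const decoderOnly = ["llama", "mistral", "gpt", "qwen"];
  if (decoderOnly.some((name) => arch.includes(name))) {
    return "vllm";
  }
  // Encoder and multi-modal architectures go to Triton.
  const encoderStyle = ["bert", "clip", "vit", "whisper"];
  if (encoderStyle.some((name) => arch.includes(name))) {
    return "triton";
  }
  // Everything else falls back to a custom FastAPI server.
  return "fastapi";
}

console.log(selectRuntime("LlamaForCausalLM")); // vllm
console.log(selectRuntime("BertForSequenceClassification")); // triton
```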
GPU Selection
Based on model size and task:
| Model Size | Recommended GPU | Monthly Cost |
|---|---|---|
| < 7B params | NVIDIA-RTX-4090 | ~$234 |
| 7B - 13B | NVIDIA-L40S | ~$312 |
| 13B - 70B | NVIDIA-A40 | ~$456 |
| 70B+ | NVIDIA-A100 (80GB) | ~$912 |
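The sizing table translates directly into a lookup by parameter count. The `suggestGpu` helper below is a sketch that mirrors the table's thresholds and costs; it is not Syaala's actual sizing code:

```typescript
// GPU sizing by parameter count, mirroring the table above.
// Thresholds and monthly costs come from the table; the function itself
// is an illustrative sketch.
interface GpuChoice {
  gpuType: string;
  monthlyCost: number; // approximate USD/month
}

function suggestGpu(paramsBillions: number): GpuChoice {
  if (paramsBillions < 7) return { gpuType: "NVIDIA-RTX-4090", monthlyCost: 234 };
  if (paramsBillions <= 13) return { gpuType: "NVIDIA-L40S", monthlyCost: 312 };
  if (paramsBillions <= 70) return { gpuType: "NVIDIA-A40", monthlyCost: 456 };
  return { gpuType: "NVIDIA-A100", monthlyCost: 912 }; // 80GB variant
}

console.log(suggestGpu(8).gpuType); // NVIDIA-L40S
console.log(suggestGpu(70).gpuType); // NVIDIA-A40 (matches the Llama 3.3 70B example)
```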
Docker Image
Pre-configured images with:
- PyTorch 2.1+
- CUDA 12.1+
- Runtime-specific dependencies (vLLM, Triton, etc.)
- Model-specific optimizations
Community Template Sharing
Share your templates with the Syaala community.
Publishing Templates
# Create template (defaults to public visibility)
syaala templates create-from-hf \
mistralai/Mixtral-8x7B-Instruct-v0.1 \
--name "Mixtral 8x7B Instruct" \
--category llm \
--visibility public
Discovering Community Templates
# List all public templates
syaala templates list --visibility public
# Filter by category
syaala templates list --category llm
# Search by tags
syaala templates list --tags chat,instruction-following
Template Engagement Tracking
Syaala tracks template usage to surface popular configurations:
- Deployment Count: How many times the template has been deployed
- Success Rate: Percentage of successful deployments
- Average Cost: Real-world cost data from deployments
- User Ratings: Community feedback and ratings
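As a rough illustration, metrics like these could be aggregated from per-deployment records. The record shape and field names below are hypothetical, not the platform's schema:

```typescript
// Hypothetical aggregation of engagement metrics from deployment records.
// The DeploymentRecord shape is an assumption for illustration only.
interface DeploymentRecord {
  succeeded: boolean;
  monthlyCost: number;
}

function summarize(records: DeploymentRecord[]) {
  const deploymentCount = records.length;
  const successes = records.filter((r) => r.succeeded).length;
  const successRate = deploymentCount ? successes / deploymentCount : 0;
  const averageCost = deploymentCount
    ? records.reduce((sum, r) => sum + r.monthlyCost, 0) / deploymentCount
    : 0;
  return { deploymentCount, successRate, averageCost };
}

const stats = summarize([
  { succeeded: true, monthlyCost: 456 },
  { succeeded: true, monthlyCost: 420 },
  { succeeded: false, monthlyCost: 0 },
]);
console.log(stats); // deploymentCount: 3, successRate ≈ 0.667, averageCost: 292
```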
API Integration
Integrate template discovery into your applications.
Get Recommendations
const response = await fetch(
"https://platform.syaala.com/api/templates/recommendations?useCase=text-generation&maxBudget=500",
{
headers: {
Authorization: `Bearer ${apiKey}`,
},
},
);
const { recommendations } = await response.json();
Search HuggingFace
const response = await fetch(
"https://platform.syaala.com/api/huggingface/search?query=llama&task=text-generation",
{
headers: {
Authorization: `Bearer ${apiKey}`,
},
},
);
const { models, total } = await response.json();
Create Template from HuggingFace
const response = await fetch(
"https://platform.syaala.com/api/templates/save-from-hf",
{
method: "POST",
headers: {
Authorization: `Bearer ${apiKey}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
modelId: "meta-llama/Llama-3.3-70B-Instruct",
name: "Llama 3.3 70B Instruct",
category: "llm",
tags: ["chat", "instruction-following"],
visibility: "public",
}),
},
);
const { template } = await response.json();
Best Practices
Model Selection
- Start Small: Begin with 7B models (RTX 4090) before scaling to 70B+ models
- Check Licenses: Verify commercial use permissions for production deployments
- Test Throughput: Use recommended GPU types for optimal performance
- Monitor Costs: Track actual usage vs. estimates for budget optimization
Template Creation
- Meaningful Names: Use descriptive names that indicate model size and purpose
- Accurate Tags: Add relevant tags for discoverability (max 10)
- Public First: Share successful templates publicly to help the community
- Update Descriptions: Customize descriptions with use case specifics
Deployment
- Start with Recommendations: Use AI-recommended configurations first
- Validate Endpoints: Test inference endpoints before production traffic
- Set Budget Limits: Configure spending alerts for cost control
- Monitor Performance: Track latency, throughput, and error rates
Troubleshooting
Model Not Found
Error: HuggingFace model 'invalid/model-id' not found
Solution: Verify model ID exists on HuggingFace Hub:
- Visit https://huggingface.co/{model-id}
- Check for typos in organization/model name
- Ensure model is public or you have access
GPU Unavailable
Error: Requested GPU type 'NVIDIA-H100' is not available
Solution: Use an alternative GPU or request access:
- Check available GPUs: syaala gpu-types list
- Use the recommended GPU from discovery results
- Contact support for enterprise GPU access
Budget Exceeded
Warning: Estimated cost ($912/month) exceeds budget ($500/month)
Solution: Adjust configuration or budget:
- Use smaller model variant (70B → 7B)
- Reduce GPU count or type (A100 → A40)
- Increase budget limit in settings
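The warning above follows from a simple monthly-cost estimate (roughly hourly rate × GPU count × ~730 hours per month). The sketch below uses an assumed hourly rate; actual platform pricing may differ:

```typescript
// Back-of-the-envelope budget check. The hourly rate used in the example
// is an assumption, not a published Syaala price.
function estimateMonthlyCost(hourlyRate: number, gpuCount: number): number {
  const HOURS_PER_MONTH = 730; // average hours in a month
  return Math.round(hourlyRate * gpuCount * HOURS_PER_MONTH);
}

function checkBudget(estimated: number, budget: number): string {
  return estimated > budget
    ? `Warning: Estimated cost ($${estimated}/month) exceeds budget ($${budget}/month)`
    : "Within budget";
}

// ~$1.25/hr for an A100-class GPU (assumed rate) comes to roughly $913/month.
const estimated = estimateMonthlyCost(1.25, 1);
console.log(checkBudget(estimated, 500));
```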
Related Documentation
- CLI Commands Reference - Complete CLI documentation
- API Reference - REST API endpoints
- Deployment Guide - Creating and managing deployments
- Cost Optimization - Reducing GPU costs
Support
Need help with template discovery?
- Discord Community - Ask questions and share templates
- GitHub Discussions - Feature requests
- Support Email - Enterprise assistance