Clarifai/runners-examples
Language: Python
Stars: 1
Forks: 1
Open issues: 33
Created: 2025-03-28T14:53:32Z
Pushed: 2026-05-20T09:22:09Z
Default branch: main
Fork: no
Archived: no
README:
Clarifai Model Deployment
Clarifai provides an easy-to-use platform to serve AI/ML models in production.
This guide walks you through deploying models on Clarifai, from scaffolding a project to running predictions in production.
Quick Start
After installing the SDK, three commands take you from zero to a deployed model:
```bash
pip install -U clarifai

# 1. Authenticate
clarifai login

# 2. Scaffold a model (auto-selects GPU based on model size)
clarifai model init --toolkit vllm --model-name Qwen/Qwen3-0.6B

# 3. Deploy to production (auto-creates all infrastructure)
clarifai model deploy ./Qwen3-0.6B
```
The CLI handles compute cluster creation, nodepool setup, image building, and deployment automatically. When it finishes you'll see:
```
── Ready ──────────────────────────────────────────────
  Model deployed successfully!

  Model:       https://clarifai.com/you/main/models/qwen3-0-6b
  Deployment:  deploy-qwen3-0-6b-abc123
  Instance:    g5.xlarge

── Next Steps ─────────────────────────────────────────
  Predict:   clarifai model predict you/main/models/qwen3-0-6b "Hello"
  Logs:      clarifai model logs --deployment "deploy-qwen3-0-6b-abc123"
  Undeploy:  clarifai model undeploy --deployment "deploy-qwen3-0-6b-abc123"
```
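When follow-up steps are scripted, the deployment ID has to be read back from this summary. The Python sketch below assumes the CLI prints the `Model:`, `Deployment:`, and `Instance:` lines in the format shown above; that format is not a documented contract, and `parse_deploy_summary` is a hypothetical helper, not part of the Clarifai SDK.

```python
import re

# Sample text copied from the summary shown above; real output may differ.
SAMPLE_SUMMARY = """\
Model deployed successfully!
Model: https://clarifai.com/you/main/models/qwen3-0-6b
Deployment: deploy-qwen3-0-6b-abc123
Instance: g5.xlarge
"""


def parse_deploy_summary(text: str) -> dict:
    """Collect the 'Key: value' fields of interest from the deploy summary."""
    fields = {}
    for key in ("Model", "Deployment", "Instance"):
        # Match 'Key:' at the start of a line and capture the first token after it.
        match = re.search(rf"^\s*{key}:\s*(\S+)", text, flags=re.MULTILINE)
        if match:
            fields[key.lower()] = match.group(1)
    return fields


summary = parse_deploy_summary(SAMPLE_SUMMARY)

# Rebuild the "Logs" command listed under Next Steps from the parsed ID.
logs_cmd = ["clarifai", "model", "logs", "--deployment", summary["deployment"]]
print(" ".join(logs_cmd))
```

The resulting list can be passed to `subprocess.run(logs_cmd, check=True)` once the CLI is installed and authenticated.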
Which path is right for you?
| I want to... | Do this |
|--------------|---------|
| Deploy a HuggingFace LLM (Qwen, Llama, Gemma, etc.) | `clarifai model init --toolkit vllm --model-name` then `clarifai model deploy` |
| Deploy an MCP tool server | `clarifai model init --toolkit mcp` then write your tools |
| Wrap an existing API or local server | `clarifai model init --toolkit openai` or `--toolkit ollama` |
| Build a fully custom model (vision, embeddings, etc.) | `clarifai model init` then edit `model.py`; see [Writing a Custom Model](#writing-a-custom-model) |
Contents
- [Quick Start](#quick-start)
- [Installation & Authentication](#installation--authentication)
- [Model Development Workflow](#model-development-workflow)
  - [Step 1: Scaffold a Model](#step-1-scaffold-a-model-clarifai-model-init)
  - [Step 2: Test Locally](#step-2-test-locally-clarifai-model-serve)
  - [Step 3: Deploy to Production](#step-3-deploy-to-production-clarifai-model-deploy)
  - [Step 4: Run Predictions](#step-4-run-predictions-clarifai-model-predict)
  - [Step 5: Manage Your Deployment](#step-5-manage-your-deployment)
- [Writing a Custom Model](#writing-a-custom-model)
- [Model Folder Structure](#model-folder-structure)
- [config.yaml](#configyaml)
- [model.py](#modelpy)
- [requirements.txt](#requirementstxt)
- [Model Types](#model-types)
  - [Python Model (ModelClass)](#python-model-modelclass)
  - [LLM / OpenAI-Compatible Model (OpenAIModelClass)](#llm--openai-compatible-model-openaimodelclass)
  - [MCP Tool Server (MCPModelClass)](#mcp-tool-server-mcpmodelclass)
- [Using the Python SDK Client](#using-the-python-sdk-client)
- [Available Examples](#available-examples)
- [Supported Data Types](#supported-data-types)
- [Advanced Topics](#advanced-topics)
---
Installation & Authentication
Install the SDK
```bash
pip install -U clarifai
```
Authenticate
```bash
# Interactive (prompts for your PAT)
clarifai login

# Non-interactive
clarifai login --pat $MY_PAT

# Org account
clarifai login --pat $PAT --user-id my-org

# Dev environment
clarifai login https://api-dev.clarifai.com
```
The CLI saves your credentials locally. Verify with:
```bash
clarifai whoami
```
You can manage multiple environments (prod, staging, org accounts) using named contexts:
```bash
clarifai config ls     # List all contexts
clarifai config use    # Switch active context
```
> Note: You can generate a Personal Access Token (PAT) in your Clarifai account under Personal Settings → Security.
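In CI, the non-interactive login is usually driven from an environment variable. The sketch below only builds the argument list for the `clarifai login --pat` form shown above. The variable name `CLARIFAI_PAT` and the helper `login_command` are assumptions made for illustration, so adjust them to whatever your pipeline uses.

```python
from typing import Mapping, Optional


def login_command(env: Mapping[str, str],
                  pat_var: str = "CLARIFAI_PAT",  # assumed variable name
                  user_id: Optional[str] = None) -> list:
    """Build the non-interactive `clarifai login` argv from an environment mapping."""
    pat = env.get(pat_var)
    if not pat:
        # Fail early instead of letting the CLI fall back to an interactive prompt.
        raise RuntimeError(f"{pat_var} is not set; cannot log in non-interactively")
    cmd = ["clarifai", "login", "--pat", pat]
    if user_id:
        cmd += ["--user-id", user_id]  # org account, as in the example above
    return cmd


# Demonstration with a fake environment; the token is masked before printing.
cmd = login_command({"CLARIFAI_PAT": "example-token"}, user_id="my-org")
print(cmd[:3] + ["***"] + cmd[4:])
```

In a real job you would pass `os.environ` as `env`, run the result with `subprocess.run(cmd, check=True)`, and follow it with `clarifai whoami` to confirm the login.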
---
Model Development Workflow
```
init      →  serve (optional)   →  deploy      →  predict    →  manage
scaffold     test locally          push to        run           status / logs /
project      before deploying      production     inference     undeploy
```
Step 1: Scaffold a Model (clarifai model init)
Generate a ready-to-deploy model project with a single command:
```bash
# Deploy a HuggingFace LLM with vLLM
clarifai model init --toolkit vllm --model-name Qwen/Qwen3-0.6B

# Deploy with SGLang
clarifai model init --toolkit sglang --model-name Qwen/Qwen2-7B

# Deploy a HuggingFace model directly
clarifai model init --toolkit huggingface --model-name google/gemma-2b

# Wrap a local Ollama model
clarifai model init --toolkit ollama --model-name llama3.1

# Create an MCP tool server
clarifai model init --toolkit mcp my-mcp-server

# Wrap an OpenAI-compatible API
clarifai model init --toolkit openai my-wrapper

# Blank Python model (full control)
clarifai model init my-model
```
Available toolkits:
| Toolkit | Use Case |
|---------|----------|
| `vllm` | High-throughput LLM serving with vLLM |
| `sglang` | Fast LLM serving with SGLang |
| `huggingface` | HuggingFace Transformers (direct inference) |
| `ollama` | Wrap a local Ollama model |
| `lmstudio` | Wrap a local LM Studio model |
| `mcp` | MCP tool server (FastMCP) |
| `openai` | OpenAI-compatible API wrapper |
| `python` | Blank Python model (default) |
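The examples above use two argument shapes: some toolkits take `--model-name`, while others take a positional project name. The sketch below captures that distinction for scripted scaffolding. It mirrors only the invocations shown in this section; `init_command` is a hypothetical helper, and `lmstudio` is deliberately rejected because no example invocation is given for it here.

```python
# Toolkits that the examples above invoke with --model-name.
MODEL_NAME_TOOLKITS = {"vllm", "sglang", "huggingface", "ollama"}
# Toolkits that the examples above invoke with a positional project name.
PROJECT_NAME_TOOLKITS = {"mcp", "openai"}


def init_command(toolkit: str, name: str) -> list:
    """Build a `clarifai model init` argv matching the documented examples."""
    base = ["clarifai", "model", "init"]
    if toolkit == "python":
        # Blank Python model: the example passes only the project name.
        return base + [name]
    if toolkit in MODEL_NAME_TOOLKITS:
        return base + ["--toolkit", toolkit, "--model-name", name]
    if toolkit in PROJECT_NAME_TOOLKITS:
        return base + ["--toolkit", toolkit, name]
    raise ValueError(f"no example invocation is shown for toolkit {toolkit!r}")


print(" ".join(init_command("vllm", "Qwen/Qwen3-0.6B")))
print(" ".join(init_command("mcp", "my-mcp-server")))
```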
What it creates:
```
my-model/
├── config.yaml          # Simplified config (auto-filled at deploy time)
├── requirements.txt     # Dependencies
└── 1/
    └── model.py         # Model implementation
```
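A quick sanity check of this layout can catch a misplaced `model.py` before a deploy is attempted. The sketch below looks only for the three files listed above; `missing_files` is a hypothetical helper, and the CLI may perform its own, stricter validation.

```python
import tempfile
from pathlib import Path

# Paths relative to the model folder, taken from the tree above.
EXPECTED_FILES = ("config.yaml", "requirements.txt", "1/model.py")


def missing_files(model_dir) -> list:
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Demonstration on a throwaway folder that deliberately lacks requirements.txt.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "my-model"
    (root / "1").mkdir(parents=True)
    (root / "config.yaml").write_text("")
    (root / "1" / "model.py").write_text("")
    print(missing_files(root))
```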
Smart GPU selection: For HuggingFace models, the CLI queries the model's metadata (parameter count, quantization, architecture) and auto-selects the smallest GPU instance that fits:
```
$ clarifai model init --toolkit vllm --model-name Qwen/Qwen3-4B
  Instance: g5.xlarge (Estimated 15.9 GiB VRAM, fits g5.xlarge (22 GiB))
```
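The "smallest instance that fits" idea can be sketched in a few lines of Python. This is not the CLI's actual algorithm. The formula (parameters × bytes per parameter × a flat overhead factor) is a common rule of thumb assumed here for illustration, and it will not reproduce the 15.9 GiB figure above. Only the `g5.xlarge` / 22 GiB row comes from the output shown; the other instance rows are made-up placeholders.

```python
from typing import Optional

GIB = 1024 ** 3

# (name, usable VRAM in GiB). Only g5.xlarge / 22 GiB is taken from the CLI
# output above; the other two rows are hypothetical placeholders.
INSTANCES = [
    ("hypothetical-small", 8.0),
    ("g5.xlarge", 22.0),
    ("hypothetical-large", 80.0),
]


def estimate_vram_gib(num_params: float,
                      bytes_per_param: float = 2.0,  # assumes 16-bit weights
                      overhead: float = 1.5) -> float:
    """Rough VRAM estimate: weight size scaled by a flat overhead factor."""
    return num_params * bytes_per_param * overhead / GIB


def smallest_fit(required_gib: float) -> Optional[str]:
    """Return the smallest listed instance whose VRAM covers the requirement."""
    for name, vram in sorted(INSTANCES, key=lambda item: item[1]):
        if required_gib <= vram:
            return name
    return None  # nothing in the table is large enough


print(smallest_fit(15.9))                     # using the CLI's own estimate
print(smallest_fit(estimate_vram_gib(4e9)))   # using the rough formula
```

Sorting by VRAM before scanning keeps the selection correct even if the table is not listed in size order.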
---
Step 2: Test Locally (clarifai model serve)
Before deploying, test your model locally:
```bash
# Run in current Python environment (fastest)
clarifai model serve ./my-model

# Auto-create virtualenv and install deps
clarifai model serve ./my-model --mode env

# Build and run inside Docker (recommended for production parity)
clarifai model serve ./my-model --mode container

# Standalone gRPC server (no login required, offline development)
clarifai model serve ./my-model --grpc --port 9000
```
**What happens (default…