CompactifAI/CompactifAI
Description: CompactifAI official repository
Stars: 2
Forks: 0
Open issues: 0
Created: 2025-09-25T12:13:03Z
Pushed: 2025-09-25T13:49:20Z
Default branch: main
Fork: no
Archived: no
README:
👋 Welcome developer!
CompactifAI API empowers organizations and individuals with seamless access to ultra-efficient and scalable AI models that slash compute and energy costs, accelerate deployment, and fuel innovation, all without compromising performance or reliability.
This repo offers practical, scalable examples to help you build intelligent applications with our state-of-the-art language models.
🎯 What’s Inside?
- Plug-and-play integrations with popular frameworks
- Real-world examples and benchmarks
- API-ready utilities for deployment
⭐ CompactifAI API models
> Use Model ID to call the model in the API
| Model | Model ID |
|--|--|
| DeepSeek R1 0528 Slim by CompactifAI | cai-deepseek-r1-0528-slim |
| DeepSeek R1 0528 | deepseek-r1-0528 |
| Llama 4 Scout Slim by CompactifAI | cai-llama-4-scout-slim |
| Llama 4 Scout | llama-4-scout |
| Llama 3.3 70B Slim by CompactifAI | cai-llama-3-3-70b-slim |
| Llama 3.3 70B | llama-3-3-70b |
| Llama 3.1 8B Slim by CompactifAI | cai-llama-3-1-8b-slim |
| Llama 3.1 8B Slim Reasoning by CompactifAI | cai-llama-3-1-8b-slim-r |
| Llama 3.1 8B | llama-3-1-8b |
| Mistral Small 3.1 Slim by CompactifAI | cai-mistral-small-3-1-slim |
| Mistral Small 3.1 | mistral-small-3-1 |
| OpenAI GPT OSS 20B | gpt-oss-20b |
| OpenAI GPT OSS 120B | gpt-oss-120b |
🛠️ CompactifAI API features
- Completions API: Generate text completions based on provided prompts
- Chat Completions API: Generate conversational responses using chat-based interaction
- Models API: List and get information about available compressed models
- Function calling: tool use is enabled on all models except the Mistral models.
- Multimodality: CompactifAI API’s chat completion endpoint supports multi-modality, empowering you to seamlessly process and generate across text and images.
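As a rough illustration of the function-calling and multimodality features listed above, the sketch below builds chat request bodies in the OpenAI style. The field layout (`tools`, `image_url` content parts, base64 data URLs) is an assumption drawn from the API's stated OpenAI compatibility, not from CompactifAI's own reference. The `get_weather` tool and the helper names are hypothetical. Which models accept image input is not stated here, so check the official documentation before choosing one.

```python
import base64

# Hypothetical tool definition in the OpenAI "function" tool format (assumed schema).
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message that mixes text with an inline base64 image (assumed format)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


def chat_payload(model: str, messages: list, tools: list = None) -> dict:
    """Assemble a chat completion body; the tools key is only added when tools are given."""
    payload = {"model": model, "messages": messages}
    if tools:
        payload["tools"] = tools
    return payload
```

A tool-enabled body could then be built with `chat_payload("cai-llama-3-1-8b-slim", [{"role": "user", "content": "Weather in Madrid?"}], tools=[WEATHER_TOOL])` and posted to the chat completions endpoint.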
🧩 API Compatibility
The CompactifAI API is designed to be compatible with the OpenAI standard, allowing for straightforward migration and integration with existing systems. Our endpoints follow similar patterns and accept compatible parameters.
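If the endpoints mirror the OpenAI layout, requests to the Models and Completions APIs might look like the sketch below. Only the `/v1/chat/completions` URL appears in this README. The `/models` and `/completions` paths and the `prompt` field are assumptions based on the OpenAI convention, and the helper names are hypothetical. The sketch uses only the standard library and prepares requests without sending them until `send` is called.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.compactif.ai/v1"  # prefix of the chat URL used in this README


def build_request(path: str, api_key: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated request: GET without a body, POST with one."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="GET" if body is None else "POST",
    )


def list_models_request(api_key: str) -> urllib.request.Request:
    # ASSUMPTION: the Models API lives at /models, as in the OpenAI API.
    return build_request("/models", api_key)


def completion_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    # ASSUMPTION: the Completions API lives at /completions and takes a "prompt" field.
    return build_request("/completions", api_key, {"model": model, "prompt": prompt})


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `send(list_models_request("YOUR_API_KEY"))` would then return the decoded model list, provided the assumed path is correct.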
🚀 Quickstart
This guide will help you make your first request to the CompactifAI API in minutes.
Sign up
Before you begin, you’ll need a CompactifAI API key.
You can sign up here or request access through our Discord community.
> Please see our authentication guide for more information.
First API call
This is a Python example:

    import requests

    api_key = "YOUR_API_KEY"  # replace with your CompactifAI API key
    url = "https://api.compactif.ai/v1/chat/completions"

    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }

    data = {
        "model": "cai-llama-3-1-8b-slim",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello, how are you?"},
        ],
    }

    # Send the chat completion request and print the parsed JSON response
    response = requests.post(url, headers=headers, json=data)
    print(response.json())

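The example above prints the whole JSON response. To pull out only the assistant's text, a small helper along these lines may be useful. It assumes the response follows the OpenAI chat completion shape (`choices[0].message.content`) and that failures carry an `error` object with a `message` field. Both are assumptions based on the stated OpenAI compatibility, and `extract_reply` is a hypothetical name.

```python
def extract_reply(response_json: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion response (assumed shape)."""
    # ASSUMPTION: error responses look like {"error": {"message": "..."}}.
    if "error" in response_json:
        error = response_json["error"]
        detail = error.get("message", error) if isinstance(error, dict) else error
        raise RuntimeError(f"API returned an error: {detail}")

    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError("Response contained no choices")

    # ASSUMPTION: the first choice holds the reply under message.content.
    return choices[0]["message"]["content"]
```

With the request from the example, this would be used as `print(extract_reply(response.json()))`.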
> Find all the information and Quickstart at our official documentation and in the code examples of this repository.
📁 What's in this repository
| Relative path | Description |
|--|--|
| src/intro | Introductory scripts to start using the CompactifAI API |
| src/benchmarks | Scripts to evaluate different models |
| src/agents | Example agents using the CompactifAI API and popular agent frameworks |
| src/notebooks | Example notebooks using the CompactifAI API |
| docs | Official documentation and relevant support documents |
👾 Join Our Community
Be part of the conversation and get help directly from the team and other users.
❓Support
If you encounter any issues or have questions about using our API, please check our FAQ or contact our support team.
🔒 Privacy
At CompactifAI, your data privacy is our top priority.
- Zero Data Retention (ZDR) by default: we never store your prompts or completions. Once the request is processed, the content is not retained on our systems.
- Minimal metadata only: the only information we retain is usage metadata required for billing, specifically:
  - Which models are used
  - Token counts (input and output)
- No training from your data: your prompts and completions are never used to train our models.
This approach ensures full privacy of your data: what you send to the API remains ephemeral, while only essential billing data persists.
About Multiverse Computing