CompactifAI/CompactifAI

Published: Sep 25, 2025

Description: CompactifAI official repository

Stars: 2

Forks: 0

Open issues: 0

Created: 2025-09-25T12:13:03Z

Pushed: 2025-09-25T13:49:20Z

Default branch: main

Fork: no

Archived: no

README:

👋 Welcome developer!

The CompactifAI API gives organizations and individuals access to efficient, scalable AI models that cut compute and energy costs and speed up deployment, without compromising performance or reliability.

This repo offers practical, scalable examples to help you build intelligent applications with our state-of-the-art language models.

🎯 What’s Inside?

  • Plug-and-play integrations with popular frameworks
  • Real-world examples and benchmarks
  • API-ready utilities for deployment

⭐ CompactifAI API models

> Use the Model ID to call a model through the API.

| Model | Model ID |
|--|--|
| DeepSeek R1 0528 Slim by CompactifAI | cai-deepseek-r1-0528-slim |
| DeepSeek R1 0528 | deepseek-r1-0528 |
| Llama 4 Scout Slim by CompactifAI | cai-llama-4-scout-slim |
| Llama 4 Scout | llama-4-scout |
| Llama 3.3 70B Slim by CompactifAI | cai-llama-3-3-70b-slim |
| Llama 3.3 70B | llama-3-3-70b |
| Llama 3.1 8B Slim by CompactifAI | cai-llama-3-1-8b-slim |
| Llama 3.1 8B Slim Reasoning by CompactifAI | cai-llama-3-1-8b-slim-r |
| Llama 3.1 8B | llama-3-1-8b |
| Mistral Small 3.1 Slim by CompactifAI | cai-mistral-small-3-1-slim |
| Mistral Small 3.1 | mistral-small-3-1 |
| OpenAI GPT OSS 20B | gpt-oss-20b |
| OpenAI GPT OSS 120B | gpt-oss-120b |
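If you want to compare a compressed model against its original, it can help to keep the IDs from the table in one place. The sketch below is only an illustration: the IDs are copied from the table, but the pairing itself (in particular, treating `cai-llama-3-1-8b-slim-r` as a variant of `llama-3-1-8b`) and the helper names are assumptions, not part of the API.

```python
# Model IDs copied from the model table; the slim -> original pairing is
# inferred from the model names and is an assumption.
SLIM_TO_BASE = {
    "cai-deepseek-r1-0528-slim": "deepseek-r1-0528",
    "cai-llama-4-scout-slim": "llama-4-scout",
    "cai-llama-3-3-70b-slim": "llama-3-3-70b",
    "cai-llama-3-1-8b-slim": "llama-3-1-8b",
    "cai-llama-3-1-8b-slim-r": "llama-3-1-8b",
    "cai-mistral-small-3-1-slim": "mistral-small-3-1",
}

# Models listed in the table without a compressed counterpart.
UNPAIRED = ["gpt-oss-20b", "gpt-oss-120b"]


def is_compressed(model_id: str) -> bool:
    """True for IDs carrying the `cai-` prefix used by the Slim models in the table."""
    return model_id.startswith("cai-")


def comparison_pairs() -> list[tuple[str, str]]:
    """Return (slim_id, base_id) pairs, e.g. for a side-by-side benchmark run."""
    return sorted(SLIM_TO_BASE.items())


if __name__ == "__main__":
    for slim_id, base_id in comparison_pairs():
        print(f"{slim_id:30s} vs {base_id}")
```

Each pair can then be sent the same prompt so that latency and output quality are compared on equal terms.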

🛠️ CompactifAI API features

  • Completions API: Generate text completions based on provided prompts
  • Chat Completions API: Generate conversational responses using chat-based interaction
  • Models API: List and get information about available compressed models
  • Function calling: tool use is enabled on all models except the Mistral models
  • Multimodality: the Chat Completions endpoint supports multimodal input, so you can process and generate across text and images
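The features listed above map onto a handful of request bodies. The following is a minimal sketch of what those bodies might look like, assuming the API accepts OpenAI-style payloads. Only the `/chat/completions` URL appears in this README; the `/completions` and `/models` paths, the `get_weather` tool, and all helper names are hypothetical.

```python
BASE_URL = "https://api.compactif.ai/v1"  # Quickstart URL without /chat/completions

# Assumed OpenAI-style paths; only chat/completions is confirmed by this README.
ENDPOINTS = {
    "chat": f"{BASE_URL}/chat/completions",
    "completions": f"{BASE_URL}/completions",
    "models": f"{BASE_URL}/models",
}


def completion_payload(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Body for a plain text completion (assumed OpenAI-style fields)."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def tool_chat_payload(model: str, user_text: str) -> dict:
    """Chat body that offers the model one hypothetical tool, `get_weather`."""
    if "mistral" in model.lower():
        # The feature list says function tools are not enabled for Mistral models.
        raise ValueError(f"Function tools are not available for {model}")
    tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    messages = [{"role": "user", "content": user_text}]
    return {"model": model, "messages": messages, "tools": [tool]}


def image_chat_payload(model: str, question: str, image_url: str) -> dict:
    """Chat body mixing text and an image reference in one user message."""
    content = [
        {"type": "text", "text": question},
        {"type": "image_url", "image_url": {"url": image_url}},
    ]
    return {"model": model, "messages": [{"role": "user", "content": content}]}
```

Each payload would be sent with the same headers as any other request. Which models accept image input is not stated here, so check the official documentation before relying on `image_chat_payload`.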

🧩 API Compatibility

The CompactifAI API is designed to be compatible with the OpenAI standard, allowing for straightforward migration and integration with existing systems. Our endpoints follow similar patterns and accept compatible parameters.
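Because the endpoints follow OpenAI conventions, one plausible migration path is to point the official `openai` Python SDK at CompactifAI by overriding its base URL. This is a sketch under that assumption, not a documented integration: the base URL is derived from the Quickstart endpoint, and `client_kwargs` and `ask` are hypothetical helper names.

```python
BASE_URL = "https://api.compactif.ai/v1"  # Quickstart URL without /chat/completions


def client_kwargs(api_key: str) -> dict:
    """Constructor arguments that point an OpenAI-style client at CompactifAI."""
    if not api_key:
        raise ValueError("An API key is required.")
    return {"api_key": api_key, "base_url": BASE_URL}


def ask(api_key: str, prompt: str, model: str = "cai-llama-3-1-8b-slim") -> str:
    """Send one user message through the OpenAI SDK and return the reply text."""
    # Imported here so the helper above works without the SDK installed
    # (pip install openai).
    from openai import OpenAI

    client = OpenAI(**client_kwargs(api_key))
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content
```

With this approach, existing code that already uses the SDK would only need a different `api_key`, `base_url`, and model ID, provided the parameters it relies on are among those the API accepts.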

🚀 Quickstart

This guide will help you make your first request to the CompactifAI API in minutes.

Sign up

Before you begin, you’ll need a CompactifAI API key.

You can sign up here or request access through our Discord community.

> Please see our authentication guide for more information.

First API call

This is a Python example:

```python
import requests

api_key = "YOUR_API_KEY"
url = "https://api.compactif.ai/v1/chat/completions"

# Authenticate with a Bearer token.
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}

# The request body names a model ID and carries the conversation so far.
data = {
    "model": "cai-llama-3-1-8b-slim",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"},
    ],
}

response = requests.post(url, headers=headers, json=data)
print(response.json())
```

> The full Quickstart and further details are in our official documentation and in this repository's code examples.

📁 What's in this repository

| Relative path | Description |
|--|--|
| src/intro | Introductory scripts to start using the CompactifAI API |
| src/benchmarks | Scripts to evaluate different models |
| src/agents | Example agents built with the CompactifAI API and popular agent frameworks |
| src/notebooks | Example notebooks using the CompactifAI API |
| docs | Official documentation and relevant support documents |

👾 Join Our Community

Be part of the conversation and get help directly from the team and other users.

👉 Join our Discord

❓ Support

If you encounter any issues or have questions about using our API, please check our FAQ or contact our support team.

🔒 Privacy

At CompactifAI, your data privacy is our top priority.

  • Zero Data Retention (ZDR) by default: we never store your prompts or completions. Once the request is processed, the content is not retained on our systems.
  • Minimal metadata only: the only information we retain is usage metadata required for billing, specifically:
      • Which models are used
      • Token counts (input and output)
  • No training from your data: your prompts and completions are never used to train our models.

This approach ensures full privacy of your data: what you send to the API remains ephemeral, while only essential billing data persists.

About Multiverse Computing

Check our website
