nebius/serverless-ai-cookbook
Python
Captured source
source ↗nebius/serverless-ai-cookbook
Language: Python
License: Apache-2.0
Stars: 6
Forks: 4
Open issues: 2
Created: 2026-04-16T10:10:51Z
Pushed: 2026-06-04T05:16:52Z
Default branch: main
Fork: no
Archived: no
README:
Serverless AI Cookbook
Run GPU workloads on Nebius Serverless — no infrastructure management, per-second billing, GPU in minutes.
This repo contains runnable code samples for Serverless AI Jobs (batch workloads that auto-terminate) and Endpoints (persistent HTTP-accessible services). Examples cover model training, fine-tuning, inference serving, AI agents, and scientific simulations.
Quickstart (30 seconds)
Spin up a GPU job and verify your setup:
nebius ai job create \
--name my-first-job \
--image nvidia/cuda:13.1.1-runtime-ubuntu24.04 \
--container-command bash \
--args "-c nvidia-smi" \
--platform gpu-l40s-a \
--preset 1gpu-8vcpu-32gb \
--timeout 15m
# Get the job ID and stream logs
export JOB_ID=$(nebius ai job get-by-name --name my-first-job \
--format jsonpath='{.metadata.id}')
nebius ai logs "$JOB_ID"→ Full walkthrough: [first-job.md](./quickstarts/first-job.md)
Prerequisites
0. Create a Nebius account and set up a project 1. Install the Nebius CLI 2. Configure your CLI profile
Example catalog
Pick the section that matches your goal — each links to runnable examples:
- 🚀 [Quickstarts](#-quickstarts) — lowest-friction first runs.
- 🏋️ [Training](#️-training) — model training and fine-tuning workloads.
- ⚡ [Inference](#-inference) — endpoint serving and batch inference workloads.
- 🤖 [Agents](#-agents) — AI gateway and agent deployments.
- 🧬 [Life Science](#-life-science) — domain-specific simulation and analysis workloads.
- 🦾 [Robotics](#-robotics) — simulation, dataset generation, and robotics workflows.
🚀 Quickstarts
Lowest-friction first runs.
- [
first-job](./quickstarts/first-job.md) — runnvidia-smiin a Serverless AI job - [
first-endpoint](./quickstarts/first-endpoint.md) — deploy a quicknginxendpoint
🏋️ Training
Model training and fine-tuning workloads.
- [
axolotl-finetuning](./training/axolotl-finetuning/README.md) — get started fine-tuning with Axolotl - [
image-classifier-finetuning](./training/image-classifier-finetuning/README.md) — fine-tune an image classifier on a HuggingFace dataset in a serverless GPU job - [
train-and-serve](./training/train-and-serve/README.md) — fine-tune TinyLlama in a Job and serve it with a vLLM Endpoint
⚡ Inference
Endpoint serving and batch inference workloads.
- [
vllm-endpoint](./inference/vllm-endpoint/README.md) — serve Qwen with an OpenAI-compatible vLLM endpoint - [
nim-endpoint](./inference/nim-endpoint/README.md) — deploy an NVIDIA NIM as an endpoint, including the large-image Container Registry workaround
🤖 Agents
AI gateway and agent deployments.
- [
openclaw](./agents/openclaw/README.md) — deploy OpenClaw AI gateway on a CPU endpoint, connected to TokenFactory
🧬 Life Science
Domain-specific simulation and analysis workloads.
- [
openmm-simulation](./life-science/openmm-simulation/README.md) — run GPU-backed molecular dynamics simulations with OpenMM - [
parabricks-deepvariant](./life-science/parabricks-deepvariant/README.md) — run NVIDIA Parabricks DeepVariant genomics workflows with Nebius AI Jobs
🦾 Robotics
Robotics and physical-AI experiment loops.
- [
lerobot-finetune-job](./robotics/lerobot-finetune-job/README.md) — fine-tune a LeRobot ACT or Diffusion policy on a robotics dataset in a serverless GPU job - [
smolva-ft-norma-core](./robotics/smolva-ft-norma-core/README.md) — fine-tune SmolVLA for SO-101 with bundled trajectories
---
Awesome Community Projects
External examples and writeups from the community running serverless workloads on Nebius. Got something to add? Open a PR.
Robotics
- 🤖 Positronic + Nebius serverless workflows — Convert datasets, train ACT/SmolVLA, and serve checkpoints as endpoints — all serverless on Nebius. — *by vertix* · 💻 code
- 🦾 norma-core SmolVLA — Nebius fine-tune recipe — Upstream recipe the [
robotics/smolva-ft-norma-core](./robotics/smolva-ft-norma-core/) example mirrors. — *by norma-core* · 💻 code
MLOps / Pipelines
- 🎬 Video transcription pipeline with Prefect + Nebius — Prefect flows orchestrating S3 + ffmpeg (CPU job) + Whisper (GPU job) on Nebius. — *by Darko Mesaros* · 💻 code · 📝 post
---
Repository structure
serverless-cookbook/ ├── quickstarts/ # Lowest-friction first runs ├── training/ # Model training and fine-tuning ├── inference/ # Endpoint serving and batch inference ├── agents/ # AI gateway and agent deployments ├── life-science/ # Domain-specific simulations ├── robotics/ # Robotics and physical-AI workflows ├── CONTRIBUTING.md └── DEVELOPER_GUIDE.md
Resources
- Nebius Console
- Serverless AI docs
- CLI reference
- [Contributing guide](./CONTRIBUTING.md)
- [Developer guide](./DEVELOPER_GUIDE.md)
Acknowledgements
This repository is based on mnrozhkov/serverless-cookbook. Thanks to the original contributors: Mikhail Rozhkov, Gleb Berjoskin, Aleksandr Dzhumurat, and Re Alvarez Parmar.
See [CONTRIBUTORS.md](./CONTRIBUTORS.md) for the full list.
License
Copyright 2026 Nebius B.V.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the…
Excerpt shown — open the source for the full document.
Notability
notability 1.0/10Low star count, trivial new repo