RepoNebiusNebiuspublished Apr 16, 2026seen 5d

nebius/serverless-ai-cookbook

Python

Open original ↗

Captured source

source ↗
published Apr 16, 2026seen 5dcaptured 14hhttp 200method plain

nebius/serverless-ai-cookbook

Language: Python

License: Apache-2.0

Stars: 6

Forks: 4

Open issues: 2

Created: 2026-04-16T10:10:51Z

Pushed: 2026-06-04T05:16:52Z

Default branch: main

Fork: no

Archived: no

README:

Serverless AI Cookbook

Run GPU workloads on Nebius Serverless — no infrastructure management, per-second billing, GPU in minutes.

This repo contains runnable code samples for Serverless AI Jobs (batch workloads that auto-terminate) and Endpoints (persistent HTTP-accessible services). Examples cover model training, fine-tuning, inference serving, AI agents, and scientific simulations.

Quickstart (30 seconds)

Spin up a GPU job and verify your setup:

nebius ai job create \
--name my-first-job \
--image nvidia/cuda:13.1.1-runtime-ubuntu24.04 \
--container-command bash \
--args "-c nvidia-smi" \
--platform gpu-l40s-a \
--preset 1gpu-8vcpu-32gb \
--timeout 15m

# Get the job ID and stream logs
export JOB_ID=$(nebius ai job get-by-name --name my-first-job \
--format jsonpath='{.metadata.id}')
nebius ai logs "$JOB_ID"

→ Full walkthrough: [first-job.md](./quickstarts/first-job.md)

Prerequisites

0. Create a Nebius account and set up a project 1. Install the Nebius CLI 2. Configure your CLI profile

Example catalog

Pick the section that matches your goal — each links to runnable examples:

  • 🚀 [Quickstarts](#-quickstarts) — lowest-friction first runs.
  • 🏋️ [Training](#️-training) — model training and fine-tuning workloads.
  • ⚡ [Inference](#-inference) — endpoint serving and batch inference workloads.
  • 🤖 [Agents](#-agents) — AI gateway and agent deployments.
  • 🧬 [Life Science](#-life-science) — domain-specific simulation and analysis workloads.
  • 🦾 [Robotics](#-robotics) — simulation, dataset generation, and robotics workflows.

🚀 Quickstarts

Lowest-friction first runs.

  • [first-job](./quickstarts/first-job.md) — run nvidia-smi in a Serverless AI job
  • [first-endpoint](./quickstarts/first-endpoint.md) — deploy a quick nginx endpoint

🏋️ Training

Model training and fine-tuning workloads.

  • [axolotl-finetuning](./training/axolotl-finetuning/README.md) — get started fine-tuning with Axolotl
  • [image-classifier-finetuning](./training/image-classifier-finetuning/README.md) — fine-tune an image classifier on a HuggingFace dataset in a serverless GPU job
  • [train-and-serve](./training/train-and-serve/README.md) — fine-tune TinyLlama in a Job and serve it with a vLLM Endpoint

⚡ Inference

Endpoint serving and batch inference workloads.

  • [vllm-endpoint](./inference/vllm-endpoint/README.md) — serve Qwen with an OpenAI-compatible vLLM endpoint
  • [nim-endpoint](./inference/nim-endpoint/README.md) — deploy an NVIDIA NIM as an endpoint, including the large-image Container Registry workaround

🤖 Agents

AI gateway and agent deployments.

  • [openclaw](./agents/openclaw/README.md) — deploy OpenClaw AI gateway on a CPU endpoint, connected to TokenFactory

🧬 Life Science

Domain-specific simulation and analysis workloads.

  • [openmm-simulation](./life-science/openmm-simulation/README.md) — run GPU-backed molecular dynamics simulations with OpenMM
  • [parabricks-deepvariant](./life-science/parabricks-deepvariant/README.md) — run NVIDIA Parabricks DeepVariant genomics workflows with Nebius AI Jobs

🦾 Robotics

Robotics and physical-AI experiment loops.

  • [lerobot-finetune-job](./robotics/lerobot-finetune-job/README.md) — fine-tune a LeRobot ACT or Diffusion policy on a robotics dataset in a serverless GPU job
  • [smolva-ft-norma-core](./robotics/smolva-ft-norma-core/README.md) — fine-tune SmolVLA for SO-101 with bundled trajectories

---

Awesome Community Projects

External examples and writeups from the community running serverless workloads on Nebius. Got something to add? Open a PR.

Robotics

  • 🤖 Positronic + Nebius serverless workflows — Convert datasets, train ACT/SmolVLA, and serve checkpoints as endpoints — all serverless on Nebius. — *by vertix* · 💻 code
  • 🦾 norma-core SmolVLA — Nebius fine-tune recipe — Upstream recipe the [robotics/smolva-ft-norma-core](./robotics/smolva-ft-norma-core/) example mirrors. — *by norma-core* · 💻 code

MLOps / Pipelines

  • 🎬 Video transcription pipeline with Prefect + Nebius — Prefect flows orchestrating S3 + ffmpeg (CPU job) + Whisper (GPU job) on Nebius. — *by Darko Mesaros* · 💻 code · 📝 post

---

Repository structure

serverless-cookbook/
├── quickstarts/ # Lowest-friction first runs
├── training/ # Model training and fine-tuning
├── inference/ # Endpoint serving and batch inference
├── agents/ # AI gateway and agent deployments
├── life-science/ # Domain-specific simulations
├── robotics/ # Robotics and physical-AI workflows
├── CONTRIBUTING.md
└── DEVELOPER_GUIDE.md

Resources

Acknowledgements

This repository is based on mnrozhkov/serverless-cookbook. Thanks to the original contributors: Mikhail Rozhkov, Gleb Berjoskin, Aleksandr Dzhumurat, and Re Alvarez Parmar.

See [CONTRIBUTORS.md](./CONTRIBUTORS.md) for the full list.

License

Copyright 2026 Nebius B.V.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the…

Excerpt shown — open the source for the full document.

Notability

notability 1.0/10

Low star count, trivial new repo