RepoBasetenBasetenpublished Jun 2, 2025seen 5d

basetenlabs/SGLang-Workshop

Jupyter Notebook

Open original ↗

Captured source

source ↗
published Jun 2, 2025seen 5dcaptured 15hhttp 200method plain

basetenlabs/SGLang-Workshop

Description: Workshop materials for AI Engineer World's Fair

Language: Jupyter Notebook

License: MIT

Stars: 17

Forks: 15

Open issues: 0

Created: 2025-06-02T01:37:20Z

Pushed: 2025-06-03T16:56:19Z

Default branch: main

Fork: no

Archived: no

README:

SGLang Workshop

*AI Engineer World's Fair, June 3, 2025* | *Yineng Zhang and Philip Kiely*

Welcome to the SGLang workshop at AI Engineer World's Fair! We are very excited to have you here and to spend some time this morning talking about model performance optimization with SGLang.

Developer setup

In this hands-on workshop, you'll have the ability to deploy models with SGLang yourself. To follow along, please complete the following steps:

1. Fork and clone this repository. 2. Create a Baseten account. 1. Baseten will provide compute credits for this workshop. 2. If you're stuck in a "waiting room" for more than a couple of minutes, please flag Philip. 3. Install Truss with pip install --upgrade truss

Required: Get access to Llama 3.1 8B

These workshop examples are based on Llama 3.1 8B.

1. Accept the terms and conditions for Llama 3.1 8B. 2. Create an access token with READ permissions on Hugging Face. 3. Add it as a secret on your Baseten account with the name hf_access_token.

Recommended: Set up your Baseten API key

Create a file ~/.trussrc and paste in the following (using your actual API key):

[baseten]
remote_provider = baseten
api_key = abcdefgh.1234567890ABCDEFGHIJKL1234567890
remote_url = https://app.baseten.co

Add your API key to your environment variables in your profile of choice:

export BASETEN_API_KEY=abcdefgh.1234567890ABCDEFGHIJKL1234567890

You are now ready to complete the workshop.

Deploying models

Each folder has an example SGLang configuration along with instructions for deployment.

Calling models

Use call.ipynb to call individual deployments. Models deployed with SGLang are compatible with the OpenAI SDK -- just pass your model ID and API key.

Useful links

Notability

notability 2.0/10

Routine workshop repo with low stars