basetenlabs/Workshop-TRT-LLM
Language: Python
Stars: 25
Forks: 15
Open issues: 0
Created: 2024-06-18T18:31:33Z
Pushed: 2024-06-26T04:11:03Z
Default branch: main
Fork: no
Archived: no
README:
AI Engineer World's Fair TensorRT-LLM Workshop
Welcome to *From model weights to API endpoint with TensorRT-LLM* presented at The AI Engineer World's Fair!
We're your hosts, Pankaj Gupta and Philip Kiely from Baseten, and we're thrilled to have you here today.
This workshop has three live coding components, which correspond to numbered folders:
1. Building a TensorRT engine manually with TensorRT-LLM
2. Building an engine automatically on deployment with Truss
3. Benchmarking deployed models
Specific instructions for each component are in the respective folders' READMEs.
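To give a rough feel for the benchmarking component before you open its folder, here is a minimal sketch of a throughput measurement. It is not the workshop's benchmarking code. It assumes "TPS" means output tokens per second, and the names `measure_tps`, `fake_generate`, and the `clock` parameter are hypothetical, made up for this illustration. `fake_generate` stands in for a real request to a deployed model.

```python
import time
from typing import Callable, List


def measure_tps(
    generate: Callable[[str], List[str]],
    prompt: str,
    runs: int = 3,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return average output tokens per second across `runs` calls.

    `generate` is any callable that takes a prompt and returns the
    list of generated tokens (an assumption made for this sketch).
    """
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = clock()
        tokens = generate(prompt)
        total_seconds += clock() - start
        total_tokens += len(tokens)
    # Guard against a zero elapsed time on very coarse clocks.
    return total_tokens / total_seconds if total_seconds > 0 else 0.0


def fake_generate(prompt: str) -> List[str]:
    """Hypothetical stand-in for a call to a deployed model endpoint."""
    time.sleep(0.01)  # simulate network and inference latency
    return prompt.split() * 4  # pretend the model returned these tokens


if __name__ == "__main__":
    tps = measure_tps(fake_generate, "from model weights to API endpoint")
    print(f"{tps:.1f} tokens/sec")
```

To measure a real deployment, you would swap `fake_generate` for a function that sends the prompt to your endpoint and returns the generated tokens. The `clock` parameter is there only so the timing logic can be checked with a fake clock.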
Let's get some TPS!
— Pankaj and Philip