UpstageAI/vllm
forked from vllm-project/vllm
Captured source
Description: A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
License: Apache-2.0
Stars: 1
Forks: 0
Open issues: 19
Created: 2025-12-29T06:50:24Z
Pushed: 2026-03-26T13:43:46Z
Default branch: v0.12.0-solar-open
Fork: yes
Parent repository: vllm-project/vllm
Archived: no
README:
Easy, fast, and cheap LLM serving for everyone
---
Join us at the PyTorch Conference, October 22-23, and Ray Summit, November 3-5, in San Francisco for our latest updates on vLLM and to meet the vLLM team! Register now for the largest vLLM community events of the year!
---
*Latest News* 🔥
- [2025/11] We hosted vLLM Bangkok Meetup. We explored vLLM and LMCache inference and low-resource language adaptation with speakers from Embedded LLM, AMD, and Red Hat. Please find the meetup slides here.
- [2025/11] We hosted the first vLLM Europe Meetup in Zurich, focused on quantization, distributed inference, and reinforcement learning at scale, with speakers from Mistral, IBM, and Red Hat. Please find the meetup slides here and the recording here.
- [2025/11] We hosted vLLM Beijing Meetup focusing on distributed inference and diverse accelerator support with vLLM! Please find the meetup slides here.
- [2025/10] We hosted vLLM Shanghai Meetup focused on hands-on vLLM inference optimization! Please find the meetup slides here.
- [2025/09] We hosted vLLM Toronto Meetup focused on tackling inference at scale and speculative decoding with speakers from NVIDIA and Red Hat! Please find the meetup slides here.
- [2025/08] We hosted vLLM Shenzhen Meetup focusing on the ecosystem around vLLM! Please find the meetup slides here.
- [2025/08] We hosted vLLM Singapore Meetup. We shared V1 updates, disaggregated serving and MLLM speedups with speakers from Embedded LLM, AMD, WekaIO, and A*STAR. Please find the meetup slides here.
- [2025/08] We hosted vLLM Shanghai Meetup focusing on building, developing, and integrating with vLLM! Please find the meetup slides here.
- [2025/05] vLLM is now a hosted project under PyTorch Foundation! Please find the announcement here.
- [2025/01] We are excited to announce the alpha release of vLLM V1: A major architectural upgrade with 1.7x speedup! Clean code, optimized execution loop, zero-overhead prefix caching, enhanced multimodal support, and more. Please check out our blog post here.
Previous News
- [2025/08] We hosted vLLM Korea Meetup with Red Hat and Rebellions! We shared the latest advancements in vLLM along with project spotlights from the vLLM Korea community. Please find the meetup slides here.
- [2025/08] We hosted vLLM Beijing Meetup focusing on large-scale LLM deployment! Please find the meetup slides here and the recording here.
- [2025/05] We hosted NYC vLLM Meetup! Please find the meetup slides here.
- [2025/04] We hosted Asia Developer Day! Please find the meetup slides from the vLLM team here.
- [2025/03] We hosted vLLM x Ollama Inference Night! Please find the meetup slides from the vLLM team here.
- [2025/03] We hosted the first vLLM China Meetup! Please find the meetup slides from the vLLM team here.
- [2025/03] We hosted the East Coast vLLM Meetup! Please find the meetup slides here.
- [2025/02] We hosted the ninth vLLM meetup with Meta! Please find the meetup slides from the vLLM team here and from AMD here. The slides from Meta will not be posted.
- [2025/01] We hosted the eighth vLLM meetup with Google Cloud! Please find the meetup slides from the vLLM team here and from the Google Cloud team here.
- [2024/12] vLLM joins [pytorch…
Excerpt shown — open the source for the full document.
Notability
notability 1.0/10: Routine fork, no notable traction