Fork | Nous Research | published Nov 6, 2025 | seen 5d

NousResearch/vllm

forked from vllm-project/vllm


Description: A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0

Stars: 5

Forks: 2

Open issues: 0

Created: 2025-11-06T13:28:08Z

Pushed: 2025-11-06T11:04:19Z

Default branch: main

Fork: yes

Parent repository: vllm-project/vllm

Archived: no

README:

Easy, fast, and cheap LLM serving for everyone

| Documentation | Blog | Paper | Twitter/X | User Forum | Developer Slack |

---

Join us at the PyTorch Conference, October 22-23 and Ray Summit, November 3-5 in San Francisco for our latest updates on vLLM and to meet the vLLM team! Register now for the largest vLLM community events of the year!

---

*Latest News* 🔥

  • [2025/11] We hosted vLLM Beijing Meetup focusing on distributed inference and diverse accelerator support with vLLM! Please find the meetup slides here.
  • [2025/10] We hosted vLLM Shanghai Meetup focused on hands-on vLLM inference optimization! Please find the meetup slides here.
  • [2025/09] We hosted vLLM Toronto Meetup focused on tackling inference at scale and speculative decoding with speakers from NVIDIA and Red Hat! Please find the meetup slides here.
  • [2025/08] We hosted vLLM Shenzhen Meetup focusing on the ecosystem around vLLM! Please find the meetup slides here.
  • [2025/08] We hosted vLLM Singapore Meetup. We shared V1 updates, disaggregated serving and MLLM speedups with speakers from Embedded LLM, AMD, WekaIO, and A*STAR. Please find the meetup slides here.
  • [2025/08] We hosted vLLM Shanghai Meetup focusing on building, developing, and integrating with vLLM! Please find the meetup slides here.
  • [2025/05] vLLM is now a hosted project under PyTorch Foundation! Please find the announcement here.
  • [2025/01] We are excited to announce the alpha release of vLLM V1: A major architectural upgrade with 1.7x speedup! Clean code, optimized execution loop, zero-overhead prefix caching, enhanced multimodal support, and more. Please check out our blog post here.

Previous News

  • [2025/08] We hosted vLLM Korea Meetup with Red Hat and Rebellions! We shared the latest advancements in vLLM along with project spotlights from the vLLM Korea community. Please find the meetup slides here.
  • [2025/08] We hosted vLLM Beijing Meetup focusing on large-scale LLM deployment! Please find the meetup slides here and the recording here.
  • [2025/05] We hosted NYC vLLM Meetup! Please find the meetup slides here.
  • [2025/04] We hosted Asia Developer Day! Please find the meetup slides from the vLLM team here.
  • [2025/03] We hosted vLLM x Ollama Inference Night! Please find the meetup slides from the vLLM team here.
  • [2025/03] We hosted the first vLLM China Meetup! Please find the meetup slides from the vLLM team here.
  • [2025/03] We hosted the East Coast vLLM Meetup! Please find the meetup slides here.
  • [2025/02] We hosted the ninth vLLM meetup with Meta! Please find the meetup slides from the vLLM team here and AMD here. The slides from Meta will not be posted.
  • [2025/01] We hosted the eighth vLLM meetup with Google Cloud! Please find the meetup slides from the vLLM team here, and the Google Cloud team here.
  • [2024/12] vLLM joins the PyTorch ecosystem! Easy, Fast, and Cheap LLM Serving for Everyone!
  • [2024/11] We hosted the seventh vLLM meetup with Snowflake! Please find the meetup slides from the vLLM team here, and the Snowflake team here.
  • [2024/10] We have just created a developer slack (slack.vllm.ai) focusing on coordinating contributions and discussing features. Please feel free to join us there!
  • [2024/10] Ray Summit 2024 held a special track for vLLM! Please find the opening talk slides from the vLLM team…


Notability

notability 3.0/10

Routine fork by NousResearch, low traction.