Test Post

April 30, 2026 · 1 min read
vLLM Team

This is a test post.


