Llumnix Documentation

Scheduler

Llumnix’s scheduler handles dynamic request scheduling for distributed LLM inference.

Contents

Policy Framework
Instant and Accurate Load
Cache-aware Scheduling
- Introduction
- Design and implementation
Predictor-Enhanced Scheduling
SLO-aware Scheduling
Adaptive PD Scheduling
Rescheduler

By AlibabaPAI

© Copyright 2026, AlibabaPAI Team.