Llumlet

Llumlet is the engine-side agent that bridges the local engine and the global scheduler.

Contents

  • Llumlet and Llumlet Proxy
    • Architecture Diagram
    • Component Responsibilities
    • Lifecycle
  • Real-time Instance Status Tracking
    • Architecture
    • Data Sources and Collection Architecture
    • Tracked status
    • How to Use
  • Request Migration
    • Workflow
    • Request selection for migration
    • KV transfer
    • Token forwarding

By AlibabaPAI

© Copyright 2026, AlibabaPAI Team.