
Llumnix


Gateway

The gateway component handles request routing and dispatching in Llumnix.

Contents

  • Gateway Architecture
    • Overview
    • Gateway Components
    • Gateway Request Lifecycle
  • PDD Forwarding Protocol
    • Introduction
    • Design
    • Protocol Implementation Details
    • Usage
  • Batch Inference
    • Architecture
    • Service Deployment Example
    • Deployment Parameters
    • Usage Guide
    • Batch Task Statuses
    • API Reference
  • Traffic Splitting
    • Introduction
    • Design and implementation
    • Configuration
    • Deployment example


By AlibabaPAI

© Copyright 2026, AlibabaPAI Team.