
LLM Streaming Simulation

Simulate real-time AI response streaming and measure render performance. Adjust speed, jitter, and chunking to test how SvelteMarkdown handles token-by-token updates from LLMs like ChatGPT, Claude, and Gemini.

How LLM Streaming Works

  • LLMs stream tokens via Server-Sent Events. Each token appends to the markdown source.
  • SvelteMarkdown re-parses and re-renders on every source update, keeping output in sync.
  • Render times stay under 16ms (one frame budget) for typical LLM speeds of 30-80 tokens/sec.
  • Track token costs across providers with ModelPricing.ai.
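The append-and-re-render loop described above can be sketched in TypeScript. This is a minimal simulation, not SvelteMarkdown's actual API: the token array stands in for SSE `data:` events, and `appendToken` mirrors what an `onmessage` handler would do before handing the source to the renderer.

```typescript
// Simulated token stream: in a real app these arrive as Server-Sent Events.
const tokens: string[] = ["# Hello", " world", "\n\nStreaming", " **markdown**."];

// Accumulated markdown source; SvelteMarkdown re-parses this on each update.
let source = "";

// Append one token and return the full source, as an SSE handler would
// before passing it to the reactive `source` prop.
function appendToken(token: string): string {
  source += token;
  return source;
}

for (const t of tokens) appendToken(t);
// `source` now holds the complete markdown document.
```

Because the whole source is re-parsed on each append, render cost grows with document length, which is what the metrics below are measuring.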

Stream Controls

Configuration

Speed: 30 chunks/sec
Jitter: 50%
Chunk mode
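One plausible way to implement these controls, sketched in TypeScript (the function names and the `char`/`word` chunk modes are assumptions, not the demo's actual code): speed sets a base inter-chunk delay of `1000 / chunksPerSec` ms, jitter perturbs each delay by up to ±50%, and chunk mode decides how the source is split.

```typescript
// Base inter-chunk delay in ms for a given speed (chunks per second),
// perturbed by +/- jitter (0..1) using a caller-supplied random value in [0, 1).
function chunkDelay(chunksPerSec: number, jitter: number, rand: number): number {
  const base = 1000 / chunksPerSec;
  // rand maps to a multiplier in [1 - jitter, 1 + jitter)
  return base * (1 + jitter * (2 * rand - 1));
}

// Split the source according to the chunk mode (hypothetical modes).
type ChunkMode = "char" | "word";
function chunk(source: string, mode: ChunkMode): string[] {
  return mode === "char"
    ? Array.from(source)
    : source.split(/(?<=\s)/); // keep trailing whitespace attached to each word
}
```

Passing the random value in (rather than calling `Math.random()` inside) keeps the delay function deterministic and easy to test.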

Live Metrics

Last render: <1ms
Average render: <1ms
Peak render: <1ms
Dropped frames: 0
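The bookkeeping behind these metrics can be sketched as follows (a hypothetical `RenderMetrics` class, not the demo's source): each render duration is recorded, and any render that exceeds the 60 fps frame budget of about 16.67 ms counts as a dropped frame.

```typescript
// One frame at 60 fps; renders longer than this drop a frame.
const FRAME_BUDGET_MS = 1000 / 60;

class RenderMetrics {
  private times: number[] = [];
  droppedFrames = 0;

  // Record one render's duration in milliseconds.
  record(ms: number): void {
    this.times.push(ms);
    if (ms > FRAME_BUDGET_MS) this.droppedFrames++;
  }

  get last(): number { return this.times[this.times.length - 1] ?? 0; }
  get peak(): number { return Math.max(0, ...this.times); }
  get average(): number {
    return this.times.length
      ? this.times.reduce((a, b) => a + b, 0) / this.times.length
      : 0;
  }
}
```

In a browser the durations would come from `performance.now()` deltas around each re-render.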

Markdown Source


Edit or paste your own markdown below. This content will be streamed token-by-token when you click Start.
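A minimal sketch of what Start does with the pasted content, assuming a simple fixed-size chunker (the generator below is illustrative, not the demo's implementation): it yields the growing source one step at a time, and the UI would await a delay between yields, re-rendering on each one.

```typescript
// Yield the cumulative source after each appended chunk; every yield
// corresponds to one SvelteMarkdown re-render in the live demo.
function* streamSource(source: string, chunkSize: number): Generator<string> {
  let out = "";
  for (let i = 0; i < source.length; i += chunkSize) {
    out += source.slice(i, i + chunkSize);
    yield out;
  }
}
```

Driving the UI from a generator keeps the chunking logic separate from timing, so the same splitter works for any speed or jitter setting.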

Rendered Output

Click "Start" to begin streaming the AI response.