FIG-001/ STREAMING

LLM streaming.

Stream markdown the way real models deliver it: as { value, offset } patches that can arrive out of order. Adjust speed, jitter, and granularity to stress-test the delivery patterns of ChatGPT, Claude, and Gemini.
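One way to picture out-of-order { value, offset } patches is a buffer that writes each patch at its offset, whatever order the patches arrive in. The sketch below is illustrative only: the `Patch` interface, the `applyPatch` helper, and the reading of `offset` as a character index into the full text are assumptions, not SvelteMarkdown's documented API.

```typescript
// Hypothetical patch shape, mirroring the { value, offset } pairs described above.
interface Patch {
  value: string;
  offset: number; // assumed: character index where `value` starts in the full text
}

// Write a patch into the buffer at its offset.
// Regions whose patches have not arrived yet are padded with spaces.
function applyPatch(buffer: string, patch: Patch): string {
  let padded = buffer;
  while (padded.length < patch.offset) padded += " ";
  const end = patch.offset + patch.value.length;
  return padded.slice(0, patch.offset) + patch.value + padded.slice(end);
}

// The second half of the text arrives before the first half.
const arrivals: Patch[] = [
  { value: "world", offset: 8 },
  { value: "# Hello ", offset: 0 },
];

let source = "";
for (const patch of arrivals) {
  source = applyPatch(source, patch);
}
// `source` now holds the full markdown string, ready to hand to a renderer.
```

Padding with spaces keeps later offsets valid while earlier patches are still in flight; a real implementation might instead hold back rendering until the gap is filled.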

  • LLMs stream tokens over server-sent events (SSE). SvelteMarkdown re-parses and re-renders on each update, so the output stays in sync even when patches arrive out of order.
  • Render times stay within a single frame budget (about 16 ms at 60 fps) for typical LLM speeds of 30–80 tokens/sec.
  • Track token costs across providers with ModelPricing.ai.
  • Building a chat UI? Pair with @humanspeak/svelte-virtual-chat for a virtualized chat viewport purpose-built for LLM conversations.
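The frame-budget figure above can be checked with a little arithmetic. The sketch below assumes a 60 Hz display, and the sample render times are made-up illustrative numbers, not measurements from SvelteMarkdown. `tokenIntervalMs` and `summarize` are hypothetical helpers.

```typescript
// Assumed 60 Hz display: one frame lasts 1000 / 60, roughly 16.7 ms.
const FRAME_BUDGET_MS = 1000 / 60;

// Milliseconds between tokens at a given streaming rate.
function tokenIntervalMs(tokensPerSec: number): number {
  return 1000 / tokensPerSec;
}

interface RenderStats {
  avg: number;        // mean render time in ms
  peak: number;       // slowest render in ms
  overBudget: number; // number of renders that exceeded one frame
}

// Summarize per-update render times (in ms) against the frame budget.
function summarize(samplesMs: number[]): RenderStats {
  let total = 0;
  let peak = 0;
  let overBudget = 0;
  for (const ms of samplesMs) {
    total += ms;
    if (ms > peak) peak = ms;
    if (ms > FRAME_BUDGET_MS) overBudget++;
  }
  const avg = samplesMs.length > 0 ? total / samplesMs.length : 0;
  return { avg, peak, overBudget };
}

const slowest = tokenIntervalMs(30); // about 33.3 ms between tokens
const fastest = tokenIntervalMs(80); // 12.5 ms between tokens
const stats = summarize([2, 4, 18, 4]); // illustrative samples only
```

At 80 tokens/sec a token arrives every 12.5 ms, which is shorter than one frame, so more than one update can land within a single frame at the upper end of that range.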
