Streaming tokens without the flicker
Token-by-token rendering looks smooth in the demo. In production it jitters. Here's how we smoothed it out.
Streaming responses solve the perceived-latency problem and create a new one: every token re-flows the layout. Citations slide. Code blocks reformat. The whole bubble nudges left when the punctuation lands. Death by a thousand reflows.
After a few rounds of profiling, we ended up with a small set of tricks that look obvious in hindsight.
Pin the bubble width
The bot bubble has max-width: 90% and min-width: 32ch. As tokens arrive the text fills out instead of pushing the bubble wider, so neighbors do not jiggle. Citations and feedback rows stay welded to the bottom edge.
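A minimal sketch of that sizing, written as a style object (the only values the post specifies are the max-width and min-width; the class structure and footer trick are our assumptions):

```typescript
// Illustrative style objects for the bot bubble. Names are hypothetical;
// only maxWidth/minWidth come from the post. The fixed min-width means
// early tokens fill the bubble instead of widening it.
const botBubbleStyle = {
  maxWidth: "90%",   // never wider than the column
  minWidth: "32ch",  // wide enough that text fills out, not pushes out
  display: "flex",
  flexDirection: "column" as const,
};

// One way to keep citations and feedback rows welded to the bottom edge
// of a flex-column bubble (an assumption, not from the post):
const bubbleFooterStyle = {
  marginTop: "auto", // pushes the footer to the bottom of the flex column
};
```

Because the bubble never changes width mid-stream, line wrapping is the only reflow left, and it stays inside the bubble.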
Render layout shells, not finals
When the model signals it is about to produce a code block, we render an empty <pre> placeholder immediately. The placeholder reserves vertical space at a sane default height. As the code streams in it fills the placeholder rather than expanding it. Same trick for tables and source lists.
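One way to sketch the shell idea is as a reducer from stream events to render shells. The event names and the 120px default height here are hypothetical; the point is that a block-start signal creates an empty placeholder at a reserved height, and later tokens fill it in place:

```typescript
// Hypothetical stream events: the model signals a block *before* its
// content arrives, which is what lets us reserve space immediately.
type StreamEvent =
  | { kind: "code_block_start" }
  | { kind: "code_token"; text: string }
  | { kind: "text_token"; text: string };

type Shell = { tag: "pre" | "p"; minHeight: number; content: string };

const CODE_SHELL_HEIGHT = 120; // illustrative default height for the <pre> placeholder

// Fold one event into the shell list. code_block_start appends an empty
// <pre> shell; subsequent code tokens fill it rather than growing it
// (the reserved minHeight stays put until the content exceeds it).
function reduce(shells: Shell[], ev: StreamEvent): Shell[] {
  const last = shells[shells.length - 1];
  switch (ev.kind) {
    case "code_block_start":
      return [...shells, { tag: "pre", minHeight: CODE_SHELL_HEIGHT, content: "" }];
    case "code_token":
      if (last?.tag === "pre") {
        return [...shells.slice(0, -1), { ...last, content: last.content + ev.text }];
      }
      return shells; // stray code token with no open block: drop it
    case "text_token":
      if (last?.tag === "p") {
        return [...shells.slice(0, -1), { ...last, content: last.content + ev.text }];
      }
      return [...shells, { tag: "p", minHeight: 0, content: ev.text }];
  }
}
```

Tables and source lists get the same treatment: a shell with a reserved height first, content second.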
Decouple token cadence from frame cadence
Tokens arrive in bursts of two or three, then nothing for 80ms, then another burst. If you naively call setState on every token you trigger a render per token and the cursor stutters. We coalesce tokens into a single requestAnimationFrame flush, so React renders at most once per frame regardless of how bursty the upstream connection is.
Mind the autoscroll
Autoscrolling on every render fights the user when they try to scroll up to read a citation. Our rule: only autoscroll if the user was at the bottom before the new tokens arrived. A 32-pixel slack zone counts as “at the bottom” so users who scrolled up by mistake do not feel trapped.
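The bottom check reduces to a pure function of the scroll metrics. The 32-pixel slack matches the rule above; the helper name and usage shape are ours:

```typescript
const SLACK_PX = 32; // within this distance of the bottom counts as "at the bottom"

// Call this *before* appending new tokens, with the scroll container's metrics.
function wasAtBottom(
  scrollTop: number,
  clientHeight: number,
  scrollHeight: number,
): boolean {
  return scrollHeight - (scrollTop + clientHeight) <= SLACK_PX;
}

// Usage sketch: capture the flag before the render, scroll after it.
// const stick = wasAtBottom(el.scrollTop, el.clientHeight, el.scrollHeight);
// renderNewTokens();
// if (stick) el.scrollTop = el.scrollHeight;
```

The order matters: measuring after the render would always report "not at bottom" once the new content lands, so the flag has to be captured first.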
The boring win
None of these are clever. The win is in the boringness of the result: the bubble inflates smoothly, the cursor pulses cleanly, and nothing under the bot’s reply jumps around. That is the bar.