stream: speed up async iteration over WHATWG byte streams #64291

Open
mcollina wants to merge 2 commits into nodejs:main from mcollina:webstream-byte-iteration-perf


@mcollina mcollina commented Jul 4, 2026

Member

for await / reader.read() loops over byte streams were ~4x slower than over default streams:

  • replace the ReflectGet(view.constructor.prototype, ...)-based ArrayBufferView getters with primordial getters (~3.5x faster at the call level, and no longer spoofable via a user-defined .constructor)
  • extend the buffered fast paths in ReadableStreamDefaultReader.read() and the async iterator to byte controllers, skipping the per-chunk read request + PromiseWithResolvers when data is queued
  • consolidate the reader predicate cascade in the byte controller's enqueue into a single pass; reuse the async iterator's read request object
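
The difference between the two getter styles in the first bullet can be illustrated with a minimal sketch. It is not the code from this PR: `reflectiveGetByteLength` and `typedArrayGetByteLength` are hypothetical names, the reflective helper only approximates the lookup described above, and the example covers typed arrays only (a `DataView` would need its own captured getters).

```javascript
'use strict';

// %TypedArray%.prototype is the shared parent of Uint8Array.prototype and friends.
const TypedArrayPrototype = Object.getPrototypeOf(Uint8Array.prototype);

// Capture the original accessor once, up front. This is roughly what a
// "primordial" getter is: a reference taken before user code can tamper with it.
const originalByteLengthGetter =
  Object.getOwnPropertyDescriptor(TypedArrayPrototype, 'byteLength').get;

// Hypothetical helper: always calls the captured getter on the view.
function typedArrayGetByteLength(view) {
  return Reflect.apply(originalByteLengthGetter, view, []);
}

// Hypothetical approximation of the reflective lookup: it trusts whatever
// `view.constructor.prototype` happens to be at call time.
function reflectiveGetByteLength(view) {
  return Reflect.get(view.constructor.prototype, 'byteLength', view);
}

const view = new Uint8Array(8);

// User code shadows `.constructor` on the instance with a fake prototype.
Object.defineProperty(view, 'constructor', {
  value: { prototype: { get byteLength() { return 9999; } } },
});

const spoofed = reflectiveGetByteLength(view);
const trusted = typedArrayGetByteLength(view);
console.log({ spoofed, trusted });
```

With the shadowed `.constructor`, the reflective lookup reports 9999 while the captured getter still reports the real length of 8, which is the spoofing problem the bullet refers to.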

benchmark/webstreams (--runs 10): readable-async-iterator type=bytes +16.3% (***), readable-read byob +9.1% (***), all other rows neutral. WPT streams/compression/encoding results unchanged.

@nodejs-github-bot

Collaborator

Review requested:

  • @nodejs/performance

@nodejs-github-bot nodejs-github-bot added the needs-ci and web streams labels Jul 4, 2026
mcollina added 2 commits July 4, 2026 23:29
for await / reader.read() loops over byte streams were ~4x slower than
over default streams. Three per-chunk costs, none required by the spec:

- ArrayBufferViewGetBuffer/ByteLength/ByteOffset went through
  ReflectGet(view.constructor.prototype, ...), a reflective get that is
  ~3.5x slower than the original prototype getters from primordials and
  spoofable through a user-defined .constructor to boot.

- The buffered fast paths in ReadableStreamDefaultReader.read() and the
  async iterator only covered default controllers, so byte streams with
  queued data still allocated a read request and PromiseWithResolvers
  per chunk. Byte-queue dequeue is fully synchronous (it is the
  queue-filled arm of the byte controller's pull steps), so both fast
  paths now resolve directly from the byte queue.

- readableByteStreamControllerEnqueue re-ran the reader brand check and
  re-loaded the read request list four times per chunk across
  HasDefaultReader / ProcessReadRequestsUsingQueue / GetNumReadRequests
  / FulfillReadRequest; it now does a single pass. The async iterator
  also reuses its read request object across reads (at most one is ever
  in flight).
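
The buffered fast path described in the second item above can be sketched with a toy reader. This is an illustration under stated assumptions, not the Node.js implementation: `ToyByteReader`, `withResolvers`, and the `allocatedRequests` counter are hypothetical, and close, error, and BYOB handling are omitted.

```javascript
'use strict';

// Hypothetical stand-in for PromiseWithResolvers, written by hand so the
// sketch does not depend on a particular runtime version.
function withResolvers() {
  let resolve;
  let reject;
  const promise = new Promise((res, rej) => { resolve = res; reject = rej; });
  return { promise, resolve, reject };
}

// Toy model of a reader over a byte queue.
class ToyByteReader {
  constructor() {
    this.queue = [];            // chunks that arrived before anyone asked
    this.readRequests = [];     // reads waiting for a chunk
    this.allocatedRequests = 0; // how often the slow path had to allocate
  }

  enqueue(chunk) {
    const pending = this.readRequests.shift();
    if (pending !== undefined) {
      // A read is parked: hand the chunk straight to it.
      pending.resolve({ value: chunk, done: false });
    } else {
      this.queue.push(chunk);
    }
  }

  read() {
    if (this.queue.length > 0) {
      // Fast path: data is already queued, so dequeue synchronously and
      // skip the read request and its deferred promise entirely.
      return Promise.resolve({ value: this.queue.shift(), done: false });
    }
    // Slow path: nothing buffered, so park a read request.
    const request = withResolvers();
    this.allocatedRequests++;
    this.readRequests.push(request);
    return request.promise;
  }
}

async function demo() {
  const reader = new ToyByteReader();
  reader.enqueue(new Uint8Array([1, 2]));
  reader.enqueue(new Uint8Array([3]));
  const a = await reader.read();  // fast path
  const b = await reader.read();  // fast path
  const parked = reader.read();   // slow path: queue is empty
  reader.enqueue(new Uint8Array([4]));
  const c = await parked;
  return {
    lengths: [a.value.byteLength, b.value.byteLength, c.value.byteLength],
    allocatedRequests: reader.allocatedRequests,
  };
}
```

Calling `demo()` performs three reads but allocates a read request only for the one issued against an empty queue; the two buffered reads resolve directly from the queue.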

benchmark/webstreams interleaved same-day A/B, --runs 10:
readable-async-iterator bytes +16.3% (***), readable-read byob +9.1%
(***), all other rows neutral. Profiler harness: parked byte iteration
+14%, buffered byte iteration +37%, buffered byte read loop +18%,
default-stream rows at parity. WPT streams/compression/encoding
subtests identical to baseline.

Signed-off-by: Matteo Collina <hello@matteocollina.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina mcollina force-pushed the webstream-byte-iteration-perf branch from 2782884 to fa8ad29 Compare July 4, 2026 21:29

3 participants