# stream-json

> Micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. Parse JSON files far exceeding available memory using a SAX-inspired streaming token API. One dependency: `stream-chain`.

## Install

`npm i stream-json`

## Quick start

```js
import chain from 'stream-chain';
import {parser} from 'stream-json';
import {streamArray} from 'stream-json/streamers/stream-array.js';
import fs from 'node:fs';

const pipeline = chain([
  fs.createReadStream('data.json'),
  parser(),
  streamArray(),
  ({value}) => console.log(value)
]);
```

## API

### Parser

`parser(options)` — streaming JSON parser producing `{name, value}` tokens.

- Returns a function for use in `chain()`. Call `parser.asStream(options)` for a Node Duplex stream or `parser.asWebStream(options)` for a Web `{readable, writable}` pair.
- Options: `packKeys`, `packStrings`, `packNumbers` (default: true), `streamKeys`, `streamStrings`, `streamNumbers` (default: true), `jsonStreaming` (default: false).
- `packValues`/`streamValues` — shortcut to set all three at once.

```js
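import fs from 'node:fs';
import {parser} from 'stream-json';

// Node substrate: a Duplex stream that emits tokens.
const pipeline = fs.createReadStream('data.json').pipe(parser.asStream());

// Web Streams substrate (normally a separate module, hence the alias here):
import {parser as webParser} from 'stream-json/web/parser.js';
const {readable, writable} = webParser.asWebStream();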
```

Every component that comes in Node and Web stream forms has two entry points: `stream-json/X.js` (Node and Web shapes) and `stream-json/web/X.js` (Web-only, browser-safe).
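
To make the `pack*` and `stream*` options concrete, the sketch below hand-writes the tokens a string value would plausibly produce in streamed form and collapses them into the packed form. It is plain JavaScript with no library calls: `collapseStringChunks` is a hypothetical helper, and the exact token sequence is an assumption based on the token names listed under Token protocol.

```js
// Hand-written tokens for the JSON string "hello", assuming streamStrings: true
// and packStrings: false. Chunk boundaries are arbitrary in practice.
const streamed = [
  {name: 'startString'},
  {name: 'stringChunk', value: 'hel'},
  {name: 'stringChunk', value: 'lo'},
  {name: 'endString'}
];

// Hypothetical helper: collapse streamed string tokens into packed
// `stringValue` tokens, roughly what packStrings provides.
function collapseStringChunks(tokens) {
  const out = [];
  let buffer = null; // null means "not inside a string"
  for (const token of tokens) {
    switch (token.name) {
      case 'startString':
        buffer = '';
        break;
      case 'stringChunk':
        buffer += token.value;
        break;
      case 'endString':
        out.push({name: 'stringValue', value: buffer});
        buffer = null;
        break;
      default:
        out.push(token); // everything else passes through untouched
    }
  }
  return out;
}

const packed = collapseStringChunks(streamed);
console.log(packed); // [{name: 'stringValue', value: 'hello'}]
```

Packed tokens are convenient for short values, while streamed chunks let a consumer handle a very long string without holding all of it in memory.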

### Main module

The default export is `parserStream`, an alias for `parser.asStream`; calling it returns a parser as a Duplex stream:

```js
import fs from 'node:fs';
import parserStream from 'stream-json';
const stream = parserStream();
fs.createReadStream('data.json').pipe(stream);
```

For the SAX-style event API on Node (`stream.on('startObject', ...)`), wrap with `emit()`:

```js
import parserStream from 'stream-json';
import emit from 'stream-json/utils/emit.js';
const stream = emit(parserStream());
stream.on('startObject', () => { /* ... */ });
```

For the SAX-style event API on Web, use the `EventTarget`-based variants from `stream-json/web/emitter.js` or `stream-json/web/utils/emit.js`; subscribe with `addEventListener(name, ev => ev.detail)`. For hot paths on either substrate, prefer `for await (const tok of readable) handlers[tok.name]?.(tok.value)` — zero per-token allocation.
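
The handler-table loop mentioned above can be sketched as follows. The token source is a hand-written async generator standing in for the parser's readable, so `fakeTokens` and `collect` are hypothetical names and the token values are illustrative.

```js
// Stand-in for a token-producing readable. In a real pipeline this would be
// the parser's Node Readable or Web ReadableStream.
async function* fakeTokens() {
  yield {name: 'startObject'};
  yield {name: 'keyValue', value: 'id'};
  yield {name: 'numberValue', value: '42'};
  yield {name: 'endObject'};
}

async function collect(readable) {
  const seen = [];
  // One handler per token name of interest; names without a handler
  // (numberValue here) are skipped by the optional call `?.`.
  const handlers = {
    startObject: () => seen.push('{'),
    keyValue: key => seen.push(`key:${key}`),
    endObject: () => seen.push('}')
  };
  for await (const tok of readable) handlers[tok.name]?.(tok.value);
  return seen;
}

collect(fakeTokens()).then(seen => console.log(seen)); // ['{', 'key:id', '}']
```

The lookup table keeps the loop body to a single expression, and an error thrown by a handler rejects the promise returned by `collect` directly.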

### Assembler

`Assembler` — class that reconstructs JS objects from tokens. Receives a per-value callback via the `onDone` option.

```js
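import parserStream from 'stream-json';
import Assembler from 'stream-json/assembler.js';
const stream = parserStream(); // connectTo() takes a stream instance, not the factory
const asm = Assembler.connectTo(stream, {onDone: a => console.log(a.current)});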
```

- `asm.tapChain` — function for use in `chain()`.
- `asm.onDone(fn)` — set/clear the callback after construction.
- Options: `reviver`, `numberAsString`, `onDone`.

### Disassembler

`disassembler(options)` — JS objects → token stream (generator). Has `asStream` (Node Duplex) and `asWebStream` (Web pair). Web-only entry: `stream-json/web/disassembler.js`.

```js
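import chain from 'stream-chain';
import {disassembler} from 'stream-json/disassembler.js';
import {stringer} from 'stream-json/stringer.js';
chain([objectSource, disassembler(), stringer(), destination]); // objectSource, destination: your own streams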
```

### Stringer

`Stringer` — Transform stream converting tokens back to JSON text. Has `asStream` (Node Duplex) and `asWebStream` (Web pair). Web-only entry: `stream-json/web/stringer.js`.

```js
import {stringer} from 'stream-json/stringer.js';
chain([parser(), pick({filter: 'data'}), stringer(), destination]);
```

### Emitter

`Emitter` — sink that re-emits tokens as named events. Node version is a Writable (EventEmitter); Web version is an EventTarget with `.writable` `WritableStream` attached.

```js
// Node
import emitter from 'stream-json/emitter.js';
const e = emitter();
e.on('startObject', () => { /* ... */ });

// Web (normally a separate module, hence the distinct names here)
import webEmitter from 'stream-json/web/emitter.js';
const target = webEmitter();
target.addEventListener('startObject', () => { /* ... */ });
target.addEventListener('keyValue', ev => console.log(ev.detail));
// pipe a token-producing readable into target.writable
```

`Assembler.connectTo(stream, options)` (and `FlexAssembler.connectTo`) is substrate-aware — accepts either a Node Readable or a Web ReadableStream. For hot paths, prefer `for await (const tok of readable) asm.consume(tok)` over `connectTo` — no async-closure overhead, errors propagate directly.

## Filters

All filters accept `{filter, pathSeparator, once, streamKeys}` options. `filter` can be a string, RegExp, or `(stack, chunk) => boolean`.

- **`pick(options)`** — passes only matching subobjects, discards the rest.
- **`replace(options)`** — replaces matching subobjects. Extra option: `replacement` (function, value, or array of tokens).
- **`ignore(options)`** — removes matching subobjects completely.
- **`filter(options)`** — keeps matching subobjects preserving surrounding structure.

Each ships in both substrates with `asStream`, `asWebStream`, `withParser`, `withParserAsStream`, and `withParserAsWebStream` attached. Web-only entries: `stream-json/web/filters/<name>.js`.

```js
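import chain from 'stream-chain';
import {parser} from 'stream-json';
import {pick} from 'stream-json/filters/pick.js';
import {ignore} from 'stream-json/filters/ignore.js';
import {streamValues} from 'stream-json/streamers/stream-values.js';

chain([
  parser(),
  pick({filter: 'data'}),
  ignore({filter: /\b_meta\b/i}),
  streamValues(),
  // handleValue is a placeholder; calling `process(value)` would hit Node's global `process` object
  ({value}) => handleValue(value)
]);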
```
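
The three `filter` forms can be read as predicates over the current path. The sketch below is a plain-JavaScript approximation, assuming the `stack` of keys and indices is joined with `pathSeparator` (assumed to default to `.`) and that a string filter matches the path itself or anything nested under it. `toPredicate` is a hypothetical helper, not a library export.

```js
// Hypothetical helper: normalize a string, RegExp, or function filter into a
// predicate over the stack of keys/indices leading to the current token.
function toPredicate(filter, pathSeparator = '.') {
  if (typeof filter === 'function') return filter;
  if (filter instanceof RegExp) {
    return stack => filter.test(stack.join(pathSeparator));
  }
  // String form: match the exact path or anything nested under it.
  return stack => {
    const path = stack.join(pathSeparator);
    return path === filter || path.startsWith(filter + pathSeparator);
  };
}

const byString = toPredicate('data');
console.log(byString(['data']));            // true
console.log(byString(['data', 0, 'name'])); // true
console.log(byString(['database']));        // false

const byRegExp = toPredicate(/\b_meta\b/i);
console.log(byRegExp(['data', 3, '_META'])); // true

// Function form: receives the stack and the current token.
const byFunction = toPredicate((stack, chunk) => stack.length === 1 && chunk.name === 'startArray');
console.log(byFunction(['items'], {name: 'startArray'})); // true
```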

## Streamers

Assemble complete JS objects from a token stream. All produce `{key, value}` objects, generic in the assembled value type (`streamArray<T>()`, `streamValues<T>()`, `streamObject<T>()`; `value` defaults to `unknown`).

- **`streamValues(options)`** — streams successive JSON values. Use with `jsonStreaming` or after `pick`.
- **`streamArray(options)`** — streams elements of a single top-level array.
- **`streamObject(options)`** — streams properties of a single top-level object.

All support `objectFilter` for early rejection of objects during assembly. Each ships in both substrates with `asStream`, `asWebStream`, `withParser`, `withParserAsStream`, and `withParserAsWebStream` attached. Web-only entries: `stream-json/web/streamers/<name>.js`.

```js
import chain from 'stream-chain';
import {parser} from 'stream-json';
import {streamArray} from 'stream-json/streamers/stream-array.js';
chain([parser(), streamArray(), ({key, value}) => console.log(key, value)]);
```
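
As a rough picture of the `{key, value}` shape, the sketch below emulates what `streamArray` and `streamObject` would yield, but over an already-parsed value instead of a token stream. The generator names are hypothetical, and the key conventions (array index, property name) are assumptions drawn from the descriptions above.

```js
// Emulates streamArray: one {key, value} per element, key = array index.
function* emulateStreamArray(array) {
  for (let i = 0; i < array.length; ++i) yield {key: i, value: array[i]};
}

// Emulates streamObject: one {key, value} per property, key = property name.
function* emulateStreamObject(object) {
  for (const [key, value] of Object.entries(object)) yield {key, value};
}

const fromArray = [...emulateStreamArray([{id: 1}, {id: 2}])];
const fromObject = [...emulateStreamObject({a: 1, b: [2, 3]})];

console.log(fromArray);  // [{key: 0, value: {id: 1}}, {key: 1, value: {id: 2}}]
console.log(fromObject); // [{key: 'a', value: 1}, {key: 'b', value: [2, 3]}]
```

The real streamers assemble each `value` incrementally from tokens, so the whole document is never held in memory at once.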

## Utilities

- **`emit(stream)`** — attach token events to a Node Readable. Web variant (`stream-json/web/utils/emit.js`) takes a `ReadableStream` and returns an auto-piped `EventTarget`. Zero-allocation alternative: `for await (const tok of readable) handlers[tok.name]?.(tok.value)`.
- **`withParser(fn, options)`** — create `gen(parser(options), fn(options))` pipeline. Most components export `.withParser()` and `.withParserAsStream()`.
- **`FlexAssembler`** — Assembler with custom containers (Map, Set, etc.) at specific paths. Rules: `{filter, create, add, finalize?}`. Separate `objectRules` and `arrayRules`.
- **`Batch`** — Transform stream batching items into arrays. Option: `batchSize` (default: 1000). Both `asStream` and `asWebStream` attach `_batchSize` to the returned pair/stream.
- **`Verifier`** — Writable stream validating JSON, reports exact error position. Has `asStream` and `asWebStream`.
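
The batching behavior described for `Batch` can be approximated in a few lines. This is a plain-JavaScript sketch, not the library's implementation: `batchItems` is a hypothetical generator, and flushing the shorter final batch at end of input is an assumption.

```js
// Hypothetical generator: group items into arrays of at most batchSize,
// emitting the shorter final batch when the input ends.
function* batchItems(items, batchSize = 1000) {
  let batch = [];
  for (const item of items) {
    batch.push(item);
    if (batch.length >= batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length) yield batch;
}

const batches = [...batchItems([1, 2, 3, 4, 5], 2)];
console.log(batches); // [[1, 2], [3, 4], [5]]
```

Batching is mostly useful right before a sink that benefits from bulk writes, such as a database insert.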

### withParser shortcut

```js
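import fs from 'node:fs';
import {streamArray} from 'stream-json/streamers/stream-array.js';
// withParser() builds a function for chain(); .pipe() needs the stream form.
const pipeline = streamArray.withParserAsStream();
fs.createReadStream('data.json').pipe(pipeline);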
```

## JSONL support

- **`jsonl/parser(options)`** — JSONL parser producing `{key, value}` objects. Options: `reviver`, `errorIndicator`. Has `asStream` and `asWebStream`. Web entry: `stream-json/web/jsonl/parser.js`. Named export `checkedParse(input, reviver?, errorIndicator?)` exposes the per-line parser for standalone use.
- **`jsonl/stringer(options)`** — objects → JSONL text. Options: `replacer`, `space`, `separator`. Node entry is itself a `Transform`; `jsonlStringer.asWebStream` returns a Web `TransformStream<T, string>`. Web entry (`stream-json/web/jsonl/stringer.js`) returns the `TransformStream` directly.

```js
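import fs from 'node:fs';
import chain from 'stream-chain';
import {parser} from 'stream-json/jsonl/parser.js';
import {stringer} from 'stream-json/jsonl/stringer.js';

// transform and destination are placeholders for your own function and Writable.
chain([fs.createReadStream('data.jsonl'), parser(), ({value}) => transform(value), stringer(), destination]);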
```
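
For intuition about the data shapes involved, here is a non-streaming sketch of the same round trip in plain JavaScript. `parseJsonl` and `stringifyJsonl` are hypothetical helpers. Using a running line counter as `key`, skipping blank lines, and defaulting `separator` to a newline are assumptions made for the illustration.

```js
// Hypothetical helper: JSONL text -> {key, value} items, one per non-blank line.
function* parseJsonl(text, reviver) {
  let key = 0;
  for (const line of text.split('\n')) {
    if (!line.trim()) continue; // assumption: blank lines are skipped
    yield {key: key++, value: JSON.parse(line, reviver)};
  }
}

// Hypothetical helper: values -> JSONL text, mirroring the stringer options.
function stringifyJsonl(values, {replacer, space, separator = '\n'} = {}) {
  return values.map(v => JSON.stringify(v, replacer, space)).join(separator);
}

const input = '{"id":1}\n{"id":2}\n';
const items = [...parseJsonl(input)];
console.log(items); // [{key: 0, value: {id: 1}}, {key: 1, value: {id: 2}}]

const output = stringifyJsonl(items.map(({value}) => ({...value, seen: true})));
console.log(output);
// {"id":1,"seen":true}
// {"id":2,"seen":true}
```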

## JSONC support

- **`jsonc/parser(options)`** — JSONC parser (JSON with Comments). Fork of the standard parser with `//` and `/* */` comments, trailing commas, and whitespace tokens.
  - Extra options: `streamWhitespace` (default: true), `streamComments` (default: true).
  - All standard parser options are supported.
- **`jsonc/stringer(options)`** — JSONC stringer. Passes `whitespace` and `comment` tokens through verbatim.
- **`jsonc/verifier(options)`** — JSONC validator. Fork of `Verifier` accepting comments and trailing commas. Reports exact error position.

```js
import fs from 'node:fs';
import chain from 'stream-chain';
import {parser as jsoncParser} from 'stream-json/jsonc/parser.js';
import {stringer as jsoncStringer} from 'stream-json/jsonc/stringer.js';

// destination is a placeholder for your own Writable.
chain([fs.createReadStream('settings.jsonc'), jsoncParser(), jsoncStringer(), destination]);
```

All existing filters, streamers, and utilities work with JSONC parser output — they ignore unknown tokens.

JSONC also ships in both substrates: each has `asStream` (Node Duplex) and `asWebStream` (Web pair). Web entries: `stream-json/web/jsonc/{parser,stringer,verifier}.js`.
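
The claim that downstream components ignore JSONC-specific tokens can be pictured with a hand-written token list. The sequence below for `{ "a": 1 } // note` is illustrative only: the `whitespace` and `comment` token names come from the stringer description above, while their exact values and placement are assumptions.

```js
// Hand-written tokens for `{ "a": 1 } // note` (illustrative, not parser output).
const jsoncTokens = [
  {name: 'startObject'},
  {name: 'whitespace', value: ' '},
  {name: 'keyValue', value: 'a'},
  {name: 'whitespace', value: ' '},
  {name: 'numberValue', value: '1'},
  {name: 'whitespace', value: ' '},
  {name: 'endObject'},
  {name: 'whitespace', value: ' '},
  {name: 'comment', value: '// note'}
];

// A consumer that only understands standard tokens can drop the extras.
const JSONC_ONLY = new Set(['whitespace', 'comment']);
const standardOnly = jsoncTokens.filter(token => !JSONC_ONLY.has(token.name));

console.log(standardOnly.map(token => token.name));
// ['startObject', 'keyValue', 'numberValue', 'endObject']
```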

## Common patterns

### Stream a huge JSON array

```js
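import fs from 'node:fs';
import chain from 'stream-chain';
import {parser} from 'stream-json';
import {streamArray} from 'stream-json/streamers/stream-array.js';

chain([
  fs.createReadStream('huge-array.json'),
  parser(),
  streamArray(),
  ({value}) => processItem(value) // processItem: your own per-item handler
]);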
```

### Pick and filter nested data

```js
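import fs from 'node:fs';
import chain from 'stream-chain';
import {parser} from 'stream-json';
import {pick} from 'stream-json/filters/pick.js';
import {streamArray} from 'stream-json/streamers/stream-array.js';

chain([
  fs.createReadStream('data.json'),
  parser(),
  pick({filter: 'results'}),
  streamArray(),
  ({value}) => value.active ? value : null // returning null drops the item
]);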
```

### Edit JSON and write back

```js
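import fs from 'node:fs';
import chain from 'stream-chain';
import {parser} from 'stream-json';
import {ignore} from 'stream-json/filters/ignore.js';
import {stringer} from 'stream-json/stringer.js';

chain([
  fs.createReadStream('input.json'),
  parser(),
  ignore({filter: /\bsecret\b/}),
  stringer(),
  fs.createWriteStream('output.json')
]);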
```

## Token protocol

The parser emits `{name, value}` tokens: `startObject`, `endObject`, `startArray`, `endArray`, `startKey`, `endKey`, `keyValue`, `startString`, `endString`, `stringChunk`, `stringValue`, `startNumber`, `endNumber`, `numberChunk`, `numberValue`, `nullValue`, `trueValue`, `falseValue`.

These names are the closed `TokenName` type; `Token` is a discriminated union over `name` (narrowing on `token.name` tightens `token.value`). Both are exported from `stream-json/parser.js`.
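
To show how the token names map back to values, the sketch below rebuilds an object from a hand-written list of packed tokens. It is a simplified stand-in for what `Assembler` does, not its implementation: `assemble` is a hypothetical function, and the assumption that `numberValue` carries the number as text is inferred from the `numberAsString` option.

```js
// Hand-written packed tokens for {"tags":["a",true],"n":1.5}.
const tokens = [
  {name: 'startObject'},
  {name: 'keyValue', value: 'tags'},
  {name: 'startArray'},
  {name: 'stringValue', value: 'a'},
  {name: 'trueValue', value: true},
  {name: 'endArray'},
  {name: 'keyValue', value: 'n'},
  {name: 'numberValue', value: '1.5'}, // assumed: numbers arrive as text
  {name: 'endObject'}
];

// Hypothetical mini-assembler: rebuilds a value from packed tokens.
function assemble(tokens) {
  const stack = []; // open containers, innermost last
  let key = null;   // pending object key
  let result;
  const add = value => {
    if (!stack.length) { result = value; return; }
    const top = stack[stack.length - 1];
    if (Array.isArray(top)) top.push(value);
    else { top[key] = value; key = null; }
  };
  for (const {name, value} of tokens) {
    switch (name) {
      case 'startObject': { const o = {}; add(o); stack.push(o); break; }
      case 'startArray': { const a = []; add(a); stack.push(a); break; }
      case 'endObject':
      case 'endArray': stack.pop(); break;
      case 'keyValue': key = value; break;
      case 'stringValue': add(value); break;
      case 'numberValue': add(Number(value)); break;
      case 'nullValue': add(null); break;
      case 'trueValue': add(true); break;
      case 'falseValue': add(false); break;
      // start*/end*/chunk tokens of streamed values are ignored in this sketch
    }
  }
  return result;
}

console.log(JSON.stringify(assemble(tokens))); // {"tags":["a",true],"n":1.5}
```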

## Links

- Docs: https://github.com/uhop/stream-json/wiki
- npm: https://www.npmjs.com/package/stream-json
- Full LLM reference: https://github.com/uhop/stream-json/blob/master/llms-full.txt
