# stream-csv-as-json

> Micro-library of stream components for building custom CSV processing pipelines with a minimal memory footprint, on Node.js or Web Streams. Parse CSV files far exceeding available memory using a SAX-inspired streaming token API. Companion to `stream-json`; uses the same token protocol. ESM-only, Node 22+. Runtime deps: `stream-chain`, `stream-json`.

## Install

`npm i stream-csv-as-json`

## Quick start (Node.js)

```js
import fs from 'node:fs';
import chain from 'stream-chain';
import {parser} from 'stream-csv-as-json';
import asObjects from 'stream-csv-as-json/as-objects.js';

const pipeline = chain([fs.createReadStream('data.csv'), parser(), asObjects()]);

pipeline.on('data', token => console.log(token));
```

## Quick start (Web Streams)

```js
import {chain} from 'stream-chain/web';
import parser from 'stream-csv-as-json/web/parser.js';
import asObjects from 'stream-csv-as-json/web/as-objects.js';

const pipeline = chain([response.body.pipeThrough(new TextDecoderStream()), parser(), asObjects()]);

for await (const token of pipeline.readable) console.log(token);
```

## Entry points

ESM-only. Three substrate flavors per component:

- `stream-csv-as-json` (`.`) and per-component subpaths (`/parser.js`, `/as-objects.js`, `/stringer.js`) — Node-flavored; the factory carries both `.asStream` (Node `Duplex`) and `.asWebStream` (Web `{readable, writable}`).
- `stream-csv-as-json/web` and `stream-csv-as-json/web/*.js` — browser-safe (no `node:*`); factories carry only `.asWebStream`.
- `stream-csv-as-json/core/*.js` — substrate-free factory, no stream adapters attached.

## API

### parser

`parser(options)` — factory returning a flushable function that produces `{name, value}` tokens. Accepts CRLF, LF, and bare CR row terminators; strips a leading UTF-8 BOM; throws on malformed quoted values.

- `parser.asStream(options)` — Node `Duplex`. `parser.asWebStream(options)` — Web `{readable, writable}`.
- Options: `packStrings`/`packValues` (default: true), `streamStrings`/`streamValues` (default: true), `separator` (default: `','`).

```js
import fs from 'node:fs';
import parser from 'stream-csv-as-json/parser.js';
fs.createReadStream('data.csv')
  .pipe(parser.asStream())
  .on('data', token => console.log(token.name, token.value));
```
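The two option pairs decide which tokens a single field produces. Below is a minimal model of that behavior, assuming the parser follows the `stream-json` convention of emitting the streamed tokens first and the packed value after them. `fieldTokens` is a hypothetical helper written for illustration, not part of the library, and the real parser may split a long field into several `stringChunk` tokens.

```javascript
// Hypothetical model, not library code: lists the tokens one CSV field
// would produce under the given parser options. It only covers the
// combinations where at least one of the two forms is enabled.
function fieldTokens(value, {packStrings = true, streamStrings = true} = {}) {
  const tokens = [];
  if (streamStrings) {
    // Streamed form: the value arrives piecewise between start/end markers.
    tokens.push({name: 'startString'});
    tokens.push({name: 'stringChunk', value});
    tokens.push({name: 'endString'});
  }
  if (packStrings) {
    // Packed form: the whole value in a single token.
    tokens.push({name: 'stringValue', value});
  }
  return tokens;
}

// With the defaults both forms are present.
console.log(fieldTokens('abc').map(token => token.name));
// With streaming turned off only the packed token remains.
console.log(fieldTokens('abc', {streamStrings: false}).map(token => token.name));
```

Under this model, a consumer that only reads `stringValue` can disable `streamStrings` to cut the token count, while a consumer that must handle very large fields would keep the streamed form and disable `packStrings`.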

### Main module

The default export creates a parser Duplex stream with `emit()` applied (Node-only event sugar). The Web entry's default returns `parser.asWebStream`.

```js
import make from 'stream-csv-as-json';
const stream = make();
stream.on('startArray', () => { /* row start */ });
stream.on('stringValue', val => { /* field value */ });
```
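Conceptually, the event sugar re-emits each token as an event named after `token.name`, with `token.value` as the payload. The sketch below imitates that routing without any stream library so the idea can be run in isolation. `dispatchToken` and the hand-written token list are illustrative assumptions, not library code.

```javascript
// Hypothetical stand-in for the event sugar: route a token to the handler
// registered under its name, passing the token's value.
function dispatchToken(handlers, token) {
  const handler = handlers[token.name];
  if (typeof handler === 'function') handler(token.value);
}

// Hand-written tokens for the single row `a,b` (packed values only).
const rowTokens = [
  {name: 'startArray'},
  {name: 'stringValue', value: 'a'},
  {name: 'stringValue', value: 'b'},
  {name: 'endArray'}
];

const fields = [];
const handlers = {
  startArray: () => { fields.length = 0; },       // a new row begins
  stringValue: value => { fields.push(value); }   // one field of the row
};

// Tokens without a registered handler (endArray here) are simply skipped.
for (const token of rowTokens) dispatchToken(handlers, token);
console.log(fields);
```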

### asObjects

`asObjects(options)` — factory converting array tokens to object tokens using the first row as field names. The header collector auto-detects the upstream parser's mode (stream tokens vs packed `stringValue`).

- `asObjects.asStream(options)` / `asObjects.asWebStream(options)`.
- `asObjects.withParser(options)` — pipeline with the CSV parser. `asObjects.withParserAsStream(options)` / `asObjects.withParserAsWebStream(options)`.
- Options: `packKeys` (default: true), `streamKeys` (default: true), `fieldPrefix` (default: `'field'`).
- `useStringValues` / `useValues` — deprecated no-ops kept for backward compatibility.

```js
import chain from 'stream-chain';
import parser from 'stream-csv-as-json/parser.js';
import asObjects from 'stream-csv-as-json/as-objects.js';

chain([parser(), asObjects(), token => console.log(token)]);
```

### stringer

`stringer(options)` — factory converting a CSV token stream back to CSV text.

- `stringer.asStream(options)` / `stringer.asWebStream(options)`.
- Options: `useStringValues`/`useValues` (default: false), `separator` (default: `','`), `rowTerminator` (default: `'\r\n'` per RFC 4180; override with `'\n'` for Unix-style output).

```js
import stringer from 'stream-csv-as-json/stringer.js';
// fs, chain, and parser are imported as in the Quick start (Node.js).
chain([parser(), stringer(), fs.createWriteStream('output.csv')]);
```
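To make the `separator` and `rowTerminator` options concrete, here is a minimal sketch of RFC 4180 style serialization over rows that are already collected in memory. It is not the stringer's implementation, and this document does not specify the stringer's exact quoting policy. `quoteField` and `rowsToCsv` are hypothetical names used only for this example.

```javascript
// Sketch of RFC 4180 field quoting; not the library's implementation.
// A field is quoted when it contains the separator, a double quote, or a
// line break; embedded double quotes are escaped by doubling them.
function quoteField(value, separator = ',') {
  const needsQuotes =
    value.includes(separator) ||
    value.includes('"') ||
    value.includes('\r') ||
    value.includes('\n');
  return needsQuotes ? '"' + value.replaceAll('"', '""') + '"' : value;
}

// Joins rows (arrays of strings) into CSV text, ending every row with
// the row terminator.
function rowsToCsv(rows, {separator = ',', rowTerminator = '\r\n'} = {}) {
  return rows
    .map(row => row.map(field => quoteField(field, separator)).join(separator) + rowTerminator)
    .join('');
}

const csv = rowsToCsv([['id', 'note'], ['1', 'say "hi", ok']]);
// JSON.stringify makes the \r\n terminators visible in the console.
console.log(JSON.stringify(csv));
```

Passing `{separator: '\t', rowTerminator: '\n'}` to the same sketch yields tab-separated output with Unix line endings.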

## File components (Node)

Node-only file-edge stages compose stream-chain's async block reader/writer with the core parser/stringer. Drive them with `pipe` + `drain` from `stream-chain/utils` — the writer closes its file handle on flush.

- `parseFile(options)` (`stream-csv-as-json/file/parser.js`) — turns a file path into a token stream (`gen(asyncBlockReader, parser)`). Adds `readBlockSize` (default 64 KB).
- `stringerToFile(path, options)` (`stream-csv-as-json/file/stringer.js`) — writes a token stream to a file (`gen(stringer, asyncBlockWriter)`). Adds `writeBlockSize` (default 1 MB).

```js
import pipe from 'stream-chain/utils/pipe.js';
import drain from 'stream-chain/utils/drain.js';
import parseFile from 'stream-csv-as-json/file/parser.js';
import stringerToFile from 'stream-csv-as-json/file/stringer.js';

await drain(pipe(parseFile(), stringerToFile('out.csv', {useValues: true}))('in.csv'));
```

## Token protocol

The parser emits `{name, value}` tokens: `startArray`, `endArray`, `startString`, `endString`, `stringChunk`, `stringValue`. After `asObjects`: `startObject`, `endObject`, `startKey`, `endKey`, `keyValue`. Typed as discriminated unions (`parser.Token`, `asObjects.AsObjectsToken`).
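As an illustration of the protocol, the sketch below folds a parser token sequence back into rows. The token list is hand-written for the one-line CSV `x,y` and assumes the default options, where each field appears both in streamed form and as a packed `stringValue`. `collectRows` is a hypothetical helper, not a library export, and it expects a well-formed sequence.

```javascript
// Hypothetical helper: folds parser tokens into rows (arrays of strings).
// It relies on packed stringValue tokens and ignores the streamed form.
function collectRows(tokens) {
  const rows = [];
  let row = null;
  for (const token of tokens) {
    switch (token.name) {
      case 'startArray':
        row = [];
        break;
      case 'stringValue':
        row.push(token.value);
        break;
      case 'endArray':
        rows.push(row);
        row = null;
        break;
      // startString, stringChunk, endString are skipped on purpose:
      // the stringValue token carries the same text in one piece.
    }
  }
  return rows;
}

// Assumed token sequence for the input `x,y` with default parser options.
const tokens = [
  {name: 'startArray'},
  {name: 'startString'},
  {name: 'stringChunk', value: 'x'},
  {name: 'endString'},
  {name: 'stringValue', value: 'x'},
  {name: 'startString'},
  {name: 'stringChunk', value: 'y'},
  {name: 'endString'},
  {name: 'stringValue', value: 'y'},
  {name: 'endArray'}
];

console.log(collectRows(tokens));
```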

## Common patterns

### Stream a huge CSV as objects

```js
import fs from 'node:fs';
import chain from 'stream-chain';
import {parser} from 'stream-csv-as-json';
import asObjects from 'stream-csv-as-json/as-objects.js';

chain([
  fs.createReadStream('huge.csv'),
  parser(),
  asObjects(),
  // processRow is a placeholder for application code.
  token => { if (token.name === 'endObject') processRow(); }
]);
```
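The handler above only detects row boundaries. If the application needs the row contents, one option is to rebuild a plain object from the `keyValue` and `stringValue` tokens between `startObject` and `endObject`. The sketch below assumes packed keys and values are enabled (the documented defaults). `makeRowAssembler` is a hypothetical helper, and the token list is hand-written rather than produced by the library.

```javascript
// Hypothetical helper: returns a token handler that rebuilds one plain
// object per row and hands it to onRow. It uses the packed keyValue and
// stringValue tokens and ignores the streamed forms.
function makeRowAssembler(onRow) {
  let row = null;
  let key = null;
  return token => {
    switch (token.name) {
      case 'startObject':
        row = {};
        break;
      case 'keyValue':
        key = token.value;
        break;
      case 'stringValue':
        if (row !== null && key !== null) row[key] = token.value;
        key = null;
        break;
      case 'endObject':
        onRow(row);
        row = null;
        break;
    }
  };
}

// Hand-written tokens for one data row under the header `id,name`.
const objectTokens = [
  {name: 'startObject'},
  {name: 'keyValue', value: 'id'},
  {name: 'stringValue', value: '1'},
  {name: 'keyValue', value: 'name'},
  {name: 'stringValue', value: 'Ann'},
  {name: 'endObject'}
];

const rows = [];
const assemble = makeRowAssembler(row => rows.push(row));
for (const token of objectTokens) assemble(token);
console.log(rows);
```

Only one row object is alive at a time, so memory use stays bounded by the widest row rather than by the file size.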

### Compressed CSV processing

```js
import fs from 'node:fs';
import zlib from 'node:zlib';
import chain from 'stream-chain';
import {parser} from 'stream-csv-as-json';
import asObjects from 'stream-csv-as-json/as-objects.js';

chain([fs.createReadStream('data.csv.gz'), zlib.createGunzip(), parser(), asObjects()]);
```

### CSV round-trip

```js
import fs from 'node:fs';
import chain from 'stream-chain';
import {parser} from 'stream-csv-as-json';
import stringer from 'stream-csv-as-json/stringer.js';

chain([fs.createReadStream('input.csv'), parser(), stringer(), fs.createWriteStream('output.csv')]);
```

## Links

- Docs: https://github.com/uhop/stream-csv-as-json/wiki
- npm: https://www.npmjs.com/package/stream-csv-as-json
- Full LLM reference: https://github.com/uhop/stream-csv-as-json/blob/master/llms-full.txt
