Building Streaming APIs with Node.js
Prerequisites
- Node.js 18+
- Understanding of streams (or at least a willingness to pretend)
- Basic Express or Hono knowledge
- A server you don’t mind crashing a few times
- At least one mass-produced energy drink
What We’re Building
APIs that stream data efficiently. Whether it’s large file downloads, real-time logs, or processing millions of records, the goal is the same: stop loading everything into memory like some kind of animal.
If you’ve ever watched your Node process balloon to 4GB of RAM because you thought JSON.parse(entireDatabase) was a reasonable thing to do, this one’s for you.

The Approach
- Understand stream types
- Stream database results
- Transform data on the fly
- Handle backpressure
- Build SSE endpoints
Simple enough on paper. Let’s see how it actually goes.
Step 1: Stream Basics
The building blocks. Node gives us Readable, Writable, and Transform streams out of the box, and they’re genuinely lovely to work with once you stop fighting them.
import { Readable, Writable, Transform } from 'node:stream';
const readable = Readable.from([1, 2, 3, 4, 5]);
readable.on('data', (chunk) => console.log(chunk));
readable.on('end', () => console.log('Done'));
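That works, but the modern approach with async iterators is cleaner and doesn’t require you to wire up event handlers like it’s 2014:
// A stream can only be consumed once, so create a fresh one for this loop
const fresh = Readable.from([1, 2, 3, 4, 5]);
for await (const chunk of fresh) {
  console.log(chunk);
}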
Much better. Async iterators let you treat streams like any other iterable, which means your brain has one less paradigm to juggle.
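Writable streams get a mention above but no example, so here is a minimal sketch of one at the end of a pipeline. The names createCollector and collect are made up for illustration; only Readable, Writable and pipeline come from Node.

```typescript
import { Readable, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper: a Writable that stores every chunk it receives.
function createCollector<T>(into: T[]): Writable {
  return new Writable({
    objectMode: true,
    write(chunk: T, _encoding, callback) {
      into.push(chunk);
      callback(); // tell Node this chunk is handled and the next may arrive
    },
  });
}

// Drains any iterable through a Readable into the collector above.
async function collect<T>(source: Iterable<T>): Promise<T[]> {
  const collected: T[] = [];
  await pipeline(Readable.from(source), createCollector(collected));
  return collected;
}
```

Calling collect([1, 2, 3, 4, 5]) resolves once the Writable has finished, which is the same "wait until it’s all through" behaviour you get from the for await loop.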
Step 2: Stream Database Results
This is where streaming starts earning its keep. Instead of pulling every user out of the database in one enormous query and watching your server weep, we use cursor-based pagination inside a readable stream.
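// Using Prisma with cursor-based pagination
import { Readable } from 'node:stream';
import { PrismaClient } from '@prisma/client';

// Assumes a generated Prisma client with a `user` model whose `id` is a string
const prisma = new PrismaClient();

function streamUsers(): Readable {
  let cursor: string | undefined;
  const batchSize = 100;
  return new Readable({
    objectMode: true,
    async read() {
      try {
        const users = await prisma.user.findMany({
          take: batchSize,
          skip: cursor ? 1 : 0, // skip the cursor row itself on later pages
          cursor: cursor ? { id: cursor } : undefined,
          orderBy: { id: 'asc' },
        });
        if (users.length === 0) {
          this.push(null); // no more rows: end the stream
          return;
        }
        cursor = users[users.length - 1].id;
        for (const user of users) {
          this.push(user);
        }
      } catch (err) {
        // An async read() must not reject silently; surface the failure on the stream
        this.destroy(err as Error);
      }
    },
  });
}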
The key detail here is objectMode: true. Without it, Node expects strings or buffers, and your beautifully structured user objects will make it very unhappy. We pull 100 users at a time, push them into the stream, and move the cursor forward. When there’s nothing left, we push null to signal we’re done. Elegant.
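The same cursor loop can also be written as an async generator and handed to Readable.from, which takes care of the push/null bookkeeping. This is a sketch under assumptions: fetchPage is a hypothetical stand-in for the database call, and the in-memory rows array exists only so the example runs without a database.

```typescript
import { Readable } from 'node:stream';

interface User {
  id: string;
  name: string;
}

// Hypothetical page fetcher: returns up to `take` rows whose id sorts after `after`.
type FetchPage = (after: string | undefined, take: number) => Promise<User[]>;

async function* paginate(fetchPage: FetchPage, batchSize = 100): AsyncGenerator<User> {
  let cursor: string | undefined;
  while (true) {
    const page = await fetchPage(cursor, batchSize);
    if (page.length === 0) return;
    yield* page;
    cursor = page[page.length - 1].id;
    if (page.length < batchSize) return; // short page: nothing left to fetch
  }
}

// In-memory stand-in for the database, with zero-padded ids so string order matches numeric order.
const rows: User[] = Array.from({ length: 250 }, (_, i) => ({
  id: String(i).padStart(4, '0'),
  name: `user-${i}`,
}));

const fakeFetchPage: FetchPage = async (after, take) =>
  rows.filter((row) => after === undefined || row.id > after).slice(0, take);

function streamUsersFrom(fetchPage: FetchPage): Readable {
  // Readable.from only pulls the next item when the consumer wants more
  return Readable.from(paginate(fetchPage));
}
```

If fetchPage rejects, the generator throws and Readable.from destroys the stream with that error, so a failed query shows up where the consumer can see it.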
Step 3: Transform Streams
Transform streams sit in the middle of a pipeline and, well, transform things. Revolutionary naming, I know. But they’re incredibly useful for reshaping data without buffering the entire dataset.
import { Transform } from 'node:stream';

// Serialises each object to one line of newline-delimited JSON
const toJSON = new Transform({
  objectMode: true,
  transform(chunk, encoding, callback) {
    try {
      const json = JSON.stringify(chunk) + '\n';
      callback(null, json);
    } catch (err) {
      callback(err as Error);
    }
  },
});

// Adds a processing timestamp; place it before toJSON, while chunks are still objects
const addTimestamp = new Transform({
  objectMode: true,
  transform(chunk, encoding, callback) {
    callback(null, {
      ...chunk,
      processedAt: new Date().toISOString(),
    });
  },
});
These are composable. You can chain as many transforms as you like, and data flows through each one in sequence. Think of it like a conveyor belt in a factory, except the factory isn’t producing anything physical and nobody’s wearing a hard hat.
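To make the chaining concrete, here is a rough sketch that runs objects through a timestamp transform and then a JSON transform. The factory names (createTimestamper, createNdjsonSerializer) and toNdjson are illustrative; they wrap the same ideas in functions because a stream instance can only live in one pipeline.

```typescript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Hypothetical factory: a fresh timestamping transform per pipeline.
function createTimestamper(): Transform {
  return new Transform({
    objectMode: true,
    transform(chunk: Record<string, unknown>, _encoding, callback) {
      callback(null, { ...chunk, processedAt: new Date().toISOString() });
    },
  });
}

// Hypothetical factory: objects in, newline-delimited JSON text out.
function createNdjsonSerializer(): Transform {
  return new Transform({
    writableObjectMode: true, // accepts objects
    readableObjectMode: false, // emits text as bytes
    transform(chunk: unknown, _encoding, callback) {
      let line: string;
      try {
        line = JSON.stringify(chunk) + '\n';
      } catch (err) {
        callback(err as Error);
        return;
      }
      callback(null, line);
    },
  });
}

async function toNdjson(records: Iterable<object>): Promise<string> {
  let output = '';
  const sink = new Writable({
    write(chunk: Buffer, _encoding, callback) {
      output += chunk.toString();
      callback();
    },
  });
  // Order matters: timestamp while chunks are still objects, then serialise
  await pipeline(Readable.from(records), createTimestamper(), createNdjsonSerializer(), sink);
  return output;
}
```

Swapping the two transforms would spread a string into an object, which is a quick way to get very strange output.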
Step 4: Streaming HTTP Responses
Now we wire it all together and actually serve streaming data over HTTP. Here’s how it looks in both Express and Hono:
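// Express
import { pipeline } from 'node:stream/promises';

app.get('/users/export', async (req, res) => {
  res.setHeader('Content-Type', 'application/x-ndjson');
  // Node adds chunked encoding by itself when no Content-Length is set; shown for clarity
  res.setHeader('Transfer-Encoding', 'chunked');
  const userStream = streamUsers();
  // createJSONTransform() is assumed to return a fresh copy of the toJSON
  // Transform from Step 3; a stream instance cannot be reused across requests
  const jsonTransform = createJSONTransform();
  try {
    await pipeline(userStream, jsonTransform, res);
  } catch (err) {
    // pipeline has already destroyed every stream, including the response
    console.error('Export failed:', err);
  }
});

// Hono
import { stream } from 'hono/streaming';

app.get('/users/export', (c) => {
  c.header('Content-Type', 'application/x-ndjson');
  return stream(c, async (out) => {
    const userStream = streamUsers();
    // Stop querying the database if the client goes away
    out.onAbort(() => userStream.destroy());
    for await (const user of userStream) {
      await out.write(JSON.stringify(user) + '\n');
    }
  });
});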
The pipeline function from node:stream/promises is your best friend here. It handles piping streams together and, crucially, it cleans up properly if something goes wrong. If you’re manually calling .pipe() in 2026, we need to have a conversation.
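The cleanup claim is easy to check for yourself. The sketch below builds a pipeline whose middle stage fails on the first chunk, then reports what state the other streams were left in. The function name runFailingPipeline and the 'boom' error are invented for the demonstration.

```typescript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

interface PipelineOutcome {
  error: string;
  sourceDestroyed: boolean;
  sinkDestroyed: boolean;
}

async function runFailingPipeline(): Promise<PipelineOutcome> {
  const source = Readable.from(['a', 'b', 'c']);
  const explode = new Transform({
    transform(_chunk, _encoding, callback) {
      callback(new Error('boom')); // fail on the first chunk
    },
  });
  const sink = new Writable({
    write(_chunk, _encoding, callback) {
      callback();
    },
  });

  try {
    await pipeline(source, explode, sink);
    return { error: '', sourceDestroyed: source.destroyed, sinkDestroyed: sink.destroyed };
  } catch (err) {
    // By the time the promise rejects, pipeline has torn the streams down
    return {
      error: (err as Error).message,
      sourceDestroyed: source.destroyed,
      sinkDestroyed: sink.destroyed,
    };
  }
}
```

With a bare .pipe() chain, the same failure would leave the source and sink open unless you destroyed each one by hand.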
Step 5: Handle Backpressure
Backpressure is what happens when your producer is faster than your consumer. Without handling it, you’ll buffer data in memory until your process falls over. The whole point of streaming is to avoid that, so pay attention to this bit.
import { Readable } from 'node:stream';

function createProducerStream(): Readable {
  let count = 0;
  const max = 1000000;
  return new Readable({
    objectMode: true,
    read(size) {
      let pushed = true;
      // Keep pushing until push() reports a full buffer or we run out of items
      while (pushed && count < max) {
        const id = count++;
        pushed = this.push({ id, data: `item-${id}` });
      }
      if (count >= max) {
        this.push(null);
      }
    },
  });
}
The secret sauce is that this.push() returns false when the internal buffer is full. When that happens, we stop pushing and wait for read() to be called again. Node handles the scheduling. We just need to respect the signal.
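The same signal exists on the consuming side: write() returns false when a Writable’s buffer is full, and 'drain' fires when it has room again. Here is a small sketch of a producer that respects it. The slow sink, its highWaterMark of 4 and the 1 ms delay are arbitrary choices for the demo.

```typescript
import { once } from 'node:events';
import { Writable } from 'node:stream';

// Hypothetical slow consumer: takes about a millisecond per item.
function createSlowSink(received: number[]): Writable {
  return new Writable({
    objectMode: true,
    highWaterMark: 4, // write() starts returning false once 4 objects are buffered
    write(chunk: number, _encoding, callback) {
      received.push(chunk);
      setTimeout(() => callback(), 1);
    },
  });
}

// Writes `total` numbers, pausing whenever the sink asks us to. Returns the pause count.
async function produce(sink: Writable, total: number): Promise<number> {
  let pauses = 0;
  for (let i = 0; i < total; i++) {
    if (!sink.write(i)) {
      pauses++;
      await once(sink, 'drain'); // wait until the buffer has emptied
    }
  }
  sink.end();
  await once(sink, 'finish');
  return pauses;
}
```

pipeline does this waiting for you, which is most of the reason to prefer it; the manual version is mainly useful when the producer isn’t a stream at all.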
The async generator approach is arguably even cleaner:
async function* generateData() {
  for (let i = 0; i < 1000000; i++) {
    yield { id: i, data: `item-${i}` };
    if (i % 1000 === 0) {
      await new Promise((resolve) => setImmediate(resolve));
    }
  }
}
const stream = Readable.from(generateData());
That setImmediate call every 1000 iterations is doing important work. It yields control back to the event loop so your server can still handle other requests instead of being entirely consumed by your million-item export. Be kind to your event loop. It’s the only one you’ve got.

Step 6: Server-Sent Events (SSE)
SSE is the underappreciated workhorse of the real-time web. It’s simpler than WebSockets, works over plain HTTP, and handles reconnection automatically. For most “push data to the client” scenarios, it’s all you need.
// Express
app.get('/events', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  const sendEvent = (event: string, data: unknown) => {
    res.write(`event: ${event}\n`);
    res.write(`data: ${JSON.stringify(data)}\n\n`);
  };
  const interval = setInterval(() => {
    sendEvent('ping', { time: Date.now() });
  }, 1000);
  const subscription = eventBus.subscribe((event) => {
    sendEvent(event.type, event.data);
  });
  req.on('close', () => {
    clearInterval(interval);
    subscription.unsubscribe();
  });
});
That cleanup in req.on('close') is non-negotiable. Without it, you’ll leak intervals and subscriptions every time a client disconnects, and your server will slowly accumulate ghosts.
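The wire format is just text fields separated by newlines, with a blank line ending each message. A small formatter keeps that in one place. This is a sketch: formatSseMessage and the SseMessage shape are illustrative names, and it assumes event and id values contain no newlines.

```typescript
interface SseMessage {
  data: unknown;
  event?: string;
  id?: string;
  retry?: number; // reconnection delay hint for the client, in milliseconds
}

function formatSseMessage({ data, event, id, retry }: SseMessage): string {
  const lines: string[] = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  if (id !== undefined) lines.push(`id: ${id}`);
  if (retry !== undefined) lines.push(`retry: ${retry}`);
  // Strings are sent as-is; everything else is JSON-encoded
  const payload = typeof data === 'string' ? data : JSON.stringify(data ?? null);
  // A payload with line breaks must be sent as several data: lines
  for (const line of payload.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join('\n') + '\n\n';
}
```

In the handler above this would replace the two res.write calls with a single res.write(formatSseMessage({ event, data })). Sending an id is worth considering too, since the browser returns the last one it saw in a Last-Event-ID header when it reconnects.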

On the client side, it’s beautifully simple:
// Client
const events = new EventSource('/events');
events.addEventListener('ping', (e) => {
  console.log('Ping:', JSON.parse(e.data));
});
events.addEventListener('notification', (e) => {
  showNotification(JSON.parse(e.data));
});
That’s it. The browser handles reconnection for you. No libraries, no dependencies, no faffing about.
Step 7: File Upload Streaming
Streaming uploads are where you really feel the power. Instead of buffering an entire file in memory before writing it to disk, you pipe the request body straight to the filesystem whilst computing a hash along the way.
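import { createWriteStream } from 'node:fs';
import { basename } from 'node:path';
import { pipeline } from 'node:stream/promises';
import { createHash } from 'node:crypto';

app.post('/upload', async (req, res) => {
  const header = req.headers['x-filename'];
  if (typeof header !== 'string' || header.length === 0) {
    res.status(400).json({ error: 'Missing x-filename header' });
    return;
  }
  // Never trust a client-supplied path: keep only the final path segment
  const filename = basename(header);
  const fileStream = createWriteStream(`./uploads/${filename}`);
  const hash = createHash('sha256');
  try {
    await pipeline(req, async function* (source) {
      for await (const chunk of source) {
        hash.update(chunk); // feed the hash as bytes pass through
        yield chunk;
      }
    }, fileStream);
    res.json({ filename, hash: hash.digest('hex') });
  } catch (err) {
    console.error('Upload failed:', err);
    if (!res.headersSent) res.status(500).json({ error: 'Upload failed' });
  }
});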
This approach means you can handle multi-gigabyte uploads without your memory usage budging. The async generator in the middle of the pipeline is a neat trick for tapping into the data flow without disrupting it.
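If you want to try the tap without an HTTP server, the idea can be pulled out into a function that works on any byte stream. Note the swap: this sketch uses a Transform rather than the async generator, which does the same job with a different wrapper. storeWithHash is a made-up name.

```typescript
import { createHash } from 'node:crypto';
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

interface StoreResult {
  bytes: number;
  sha256: string;
}

// Copies source to destination, hashing and counting the bytes on the way through.
async function storeWithHash(source: Readable, destination: Writable): Promise<StoreResult> {
  const hash = createHash('sha256');
  let bytes = 0;
  const tap = new Transform({
    transform(chunk: Buffer, _encoding, callback) {
      hash.update(chunk);
      bytes += chunk.length;
      callback(null, chunk); // pass the chunk along unchanged
    },
  });
  await pipeline(source, tap, destination);
  // digest() is only safe once every chunk has passed through
  return { bytes, sha256: hash.digest('hex') };
}
```

Because the destination is just a Writable, the same function works with createWriteStream in production and an in-memory sink in tests.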
Step 8: Processing Large Files
The classic scenario. You’ve got a CSV with a million rows and someone needs it processed by end of day. readline combined with a read stream makes this almost trivially easy.
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';

async function processLargeCSV(path: string): Promise<void> {
  const fileStream = createReadStream(path);
  const rl = createInterface({
    input: fileStream,
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  let lineNumber = 0;
  for await (const line of rl) {
    lineNumber++;
    if (lineNumber === 1) continue; // skip the header row
    // Naive split: fine for simple data, breaks on quoted fields containing commas
    const [id, name, email] = line.split(',');
    await processUser({ id, name, email });
    if (lineNumber % 10000 === 0) {
      console.log(`Processed ${lineNumber} lines`);
    }
  }
}
Memory stays flat regardless of file size. You could process a 50GB CSV on a machine with 512MB of RAM and it wouldn’t even blink. That’s the whole promise of streams, and here it actually delivers.
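One caveat with the code above: line.split(',') falls over as soon as a quoted field contains a comma. Here is a minimal line parser that copes with quotes and doubled quotes. It’s a sketch, not a full CSV implementation: parseCsvLine is a made-up name, and it assumes no field spans multiple lines (readline would have split those already).

```typescript
function parseCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const char = line[i];
    if (inQuotes) {
      if (char === '"' && line[i + 1] === '"') {
        current += '"'; // a doubled quote inside a quoted field is a literal quote
        i++;
      } else if (char === '"') {
        inQuotes = false; // closing quote
      } else {
        current += char;
      }
    } else if (char === '"') {
      inQuotes = true; // opening quote
    } else if (char === ',') {
      fields.push(current); // field separator outside quotes
      current = '';
    } else {
      current += char;
    }
  }

  fields.push(current); // the last field has no trailing comma
  return fields;
}
```

For anything beyond tidy internal exports, a dedicated streaming CSV parser is probably the safer choice.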

The Result
- Memory-efficient data processing
- Handle files larger than available RAM
- Real-time data streaming
- Proper backpressure handling
- Composable stream pipelines
What I’d Do Differently
Add proper error handling to streams from the start. Stream errors are easy to miss and can leave connections hanging indefinitely. Always handle the error event. I’ve been bitten by this more times than I care to admit, and every single time I think “right, this is the last time I forget.” It never is.
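One way to stop a stalled stream from hanging forever is to give the whole pipeline a deadline. pipeline accepts an AbortSignal in its options, so a sketch like the following is enough. The pipeWithTimeout name and the overall-deadline policy are my own assumptions; an idle timeout that resets on each chunk would need a little more work.

```typescript
import { Readable, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Rejects, and destroys both streams, if the transfer is not finished within timeoutMs.
async function pipeWithTimeout(
  source: Readable,
  destination: Writable,
  timeoutMs: number,
): Promise<void> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    await pipeline(source, destination, { signal: controller.signal });
  } finally {
    clearTimeout(timer); // don't leave the timer running after success or failure
  }
}
```

Wrap the call in try/catch at the call site, log the error, and the connection gets closed instead of sitting there until someone restarts the server.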
Streams feel like overkill until you try to export a million records and crash your server. Then they feel like the only sane option.