Building a Job Queue with BullMQ
Prerequisites
- Node.js 18+
- Redis 6+ (or a very patient hamster on a wheel)
- TypeScript knowledge
- A healthy distrust of anything that says “it just works”
- At least one mass email disaster in your past that you don’t like to talk about
What We’re Building
A production-ready job queue using BullMQ. Delayed jobs, retries, rate limiting, monitoring. Basically everything your API needs so it can stop pretending to be a background worker.
If you’ve ever had a user staring at a loading spinner for 45 seconds while your server generates a PDF, this one’s for you. Emails, image processing, report generation, anything that has no business blocking an HTTP response. Shove it in a queue. Let something else deal with it.

The Approach
- Set up BullMQ
- Create typed jobs
- Build workers
- Handle failures (because they will happen)
- Add monitoring
Nothing revolutionary. That’s the point.
Step 1: Install and Configure
npm install bullmq ioredis
Two dependencies. That’s it. BullMQ handles the queue logic; ioredis gives us a proper Redis client that won’t fall over when things get busy.
// lib/queue.ts
import { Queue, Worker, Job, QueueEvents } from 'bullmq';
import IORedis from 'ioredis';
const connection = new IORedis({
  host: process.env.REDIS_HOST ?? 'localhost',
  port: parseInt(process.env.REDIS_PORT ?? '6379', 10),
  maxRetriesPerRequest: null,
});
export { Queue, Worker, Job, QueueEvents, connection };
That maxRetriesPerRequest: null is important. BullMQ needs it to handle blocking commands properly. Leave it out and you’ll get cryptic errors that will have you questioning your career choices.
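A mistyped REDIS_PORT tends to surface as an equally cryptic connection error, so it can help to build the options in one small, checked function. This is a minimal sketch, assuming the same REDIS_HOST and REDIS_PORT variables as above. buildRedisOptions and the RedisOptions interface are hypothetical names for illustration, not part of BullMQ or ioredis.

```typescript
// Hypothetical helper: turns environment variables into ioredis-style options.
interface RedisOptions {
  host: string;
  port: number;
  maxRetriesPerRequest: null; // the setting BullMQ needs for blocking commands
}

function buildRedisOptions(env: Record<string, string | undefined>): RedisOptions {
  const port = Number.parseInt(env.REDIS_PORT ?? '6379', 10);
  // Fail loudly at startup rather than at the first blocked command.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid REDIS_PORT: ${env.REDIS_PORT}`);
  }
  return {
    host: env.REDIS_HOST ?? 'localhost',
    port,
    maxRetriesPerRequest: null,
  };
}
```

You would then pass the result straight to the client, for example new IORedis(buildRedisOptions(process.env)).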

Step 2: Define Job Types
This is where TypeScript earns its keep. We define every job type up front so the compiler can catch our mistakes before Redis does.
// jobs/types.ts
export interface EmailJob {
  type: 'email';
  to: string;
  subject: string;
  template: string;
  data: Record<string, unknown>;
}
export interface ImageJob {
  type: 'image:resize';
  imageUrl: string;
  sizes: number[];
  outputPath: string;
}
export interface ReportJob {
  type: 'report:generate';
  reportId: string;
  userId: string;
  format: 'pdf' | 'csv';
}
export type JobData = EmailJob | ImageJob | ReportJob;
export type JobResult<T extends JobData['type']> =
  T extends 'email' ? { messageId: string } :
  T extends 'image:resize' ? { paths: string[] } :
  T extends 'report:generate' ? { downloadUrl: string } :
  never;
Discriminated unions are doing the heavy lifting here. Each job has a type field that tells TypeScript exactly what data to expect. Add a new job type later and the compiler will scream at every exhaustive check you forgot to update. Beautiful.
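The compiler only screams where you ask it to, which in practice means an exhaustiveness check. Here is a small sketch of that pattern using trimmed copies of the job types so it stands alone. describeJob and assertNever are illustrative names, not something from BullMQ.

```typescript
// Trimmed copies of the job types above, so the snippet stands alone.
type EmailJob = { type: 'email'; to: string; subject: string };
type ImageJob = { type: 'image:resize'; imageUrl: string; sizes: number[] };
type ReportJob = { type: 'report:generate'; reportId: string; format: 'pdf' | 'csv' };
type JobData = EmailJob | ImageJob | ReportJob;

// Only callable with `never`, so a missed case becomes a compile error.
function assertNever(value: never): never {
  throw new Error(`Unhandled job: ${JSON.stringify(value)}`);
}

function describeJob(job: JobData): string {
  switch (job.type) {
    case 'email':
      return `email to ${job.to}`;
    case 'image:resize':
      return `resize ${job.imageUrl} to ${job.sizes.length} sizes`;
    case 'report:generate':
      return `${job.format} report ${job.reportId}`;
    default:
      // If a new member is added to JobData, this line stops compiling.
      return assertNever(job);
  }
}
```

Inside each case TypeScript has narrowed job to the matching interface, which is why job.to and job.sizes are available without casts.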
Step 3: Create Queues
One queue per job category. You could throw everything into a single queue, but then your urgent password reset email is stuck behind 400 image resizes. Don’t do that.
// queues/index.ts
import { Queue, Job, connection } from '../lib/queue';
import type { JobData, EmailJob, ImageJob, ReportJob } from '../jobs/types';
export const emailQueue = new Queue<EmailJob>('email', { connection });
export const imageQueue = new Queue<ImageJob>('image', { connection });
export const reportQueue = new Queue<ReportJob>('report', { connection });
// Returns the created job so callers can read job.id.
export async function addJob<T extends JobData>(
  job: T,
  options?: { delay?: number; priority?: number; attempts?: number }
): Promise<Job> {
  const queue = getQueue(job.type);
  return queue.add(job.type, job, {
    attempts: options?.attempts ?? 3,
    backoff: {
      type: 'exponential',
      delay: 1000,
    },
    delay: options?.delay,
    priority: options?.priority,
  });
}
// Maps a job's type field to the queue that handles it.
function getQueue(type: JobData['type']): Queue {
  if (type === 'email') return emailQueue;
  if (type === 'image:resize') return imageQueue;
  if (type === 'report:generate') return reportQueue;
  throw new Error(`Unknown job type: ${type}`);
}
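Exponential backoff on retries by default. With three attempts you get two retries: the first after 1 second, the second after 2. Raise attempts and the delay keeps doubling (4 seconds, 8, and so on). That gives whatever downstream service you’re calling a chance to recover before you hammer it again.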
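To see the schedule for a given attempts value, the arithmetic fits in a few lines. This sketch assumes BullMQ's built-in exponential strategy computes 2^(attemptsMade - 1) * delay, which is my reading of its behaviour rather than a guarantee. retryDelays is a hypothetical helper for reasoning about the numbers, not a BullMQ function.

```typescript
// Delay before each retry under the assumed formula 2^(attemptsMade - 1) * delay.
function retryDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  // `attempts` includes the first run, so there are attempts - 1 retries.
  for (let failed = 1; failed < attempts; failed++) {
    delays.push(Math.round(2 ** (failed - 1) * baseDelayMs));
  }
  return delays;
}

console.log(retryDelays(3, 1000)); // [ 1000, 2000 ]
console.log(retryDelays(5, 1000)); // [ 1000, 2000, 4000, 8000 ]
```

The useful takeaway is that attempts counts the first run too, so the number of retries is always one less than the number you configure.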
Step 4: Build Workers
Workers are where the actual work happens. Each one picks up jobs from its queue and processes them. Think of it like a very obedient employee who never takes a lunch break.
// workers/email.worker.ts
import { Worker, Job, connection } from '../lib/queue';
import type { EmailJob } from '../jobs/types';
// sendEmail is your own mail-sending helper (not shown here).
const emailWorker = new Worker<EmailJob>(
  'email',
  async (job: Job<EmailJob>) => {
    const { to, subject, template, data } = job.data;
    console.log(`Processing email job ${job.id}: ${subject} to ${to}`);
    const messageId = await sendEmail({ to, subject, template, data });
    return { messageId };
  },
  {
    connection,
    concurrency: 5,
    limiter: {
      max: 100,
      duration: 60000,
    },
  }
);
emailWorker.on('completed', (job) => {
  console.log(`Email job ${job.id} completed`);
});
emailWorker.on('failed', (job, error) => {
  console.error(`Email job ${job?.id} failed:`, error);
});
export { emailWorker };
Concurrency of 5 means five emails processing simultaneously. The rate limiter caps us at 100 per minute so your email provider doesn’t blacklist you. Learn from my mistakes on this one.
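Concurrency and the limiter interact: whichever is tighter sets your real throughput. The sketch below is back-of-the-envelope arithmetic, not anything BullMQ exposes. Every name in it is hypothetical, and avgJobMs is an assumed average processing time you would have to measure yourself.

```typescript
// Rough throughput estimate for one worker. All names here are hypothetical.
interface WorkerSettings {
  concurrency: number;       // jobs processed in parallel
  limiterMax: number;        // max jobs per limiter window
  limiterDurationMs: number; // length of the limiter window
  avgJobMs: number;          // assumed average processing time per job
}

function jobsPerMinute(s: WorkerSettings): number {
  const byConcurrency = (s.concurrency * 60_000) / s.avgJobMs;
  const byLimiter = (s.limiterMax * 60_000) / s.limiterDurationMs;
  // The tighter of the two constraints wins.
  return Math.min(byConcurrency, byLimiter);
}

function minutesToDrain(backlog: number, s: WorkerSettings): number {
  return Math.ceil(backlog / jobsPerMinute(s));
}
```

With the email worker's settings and an assumed 500 ms per send, concurrency alone would allow 600 emails a minute, so the limiter is the bottleneck and a backlog of 1,000 emails takes about 10 minutes to clear.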

The image worker is a bit more interesting because we can report progress as it resizes each dimension:
// workers/image.worker.ts
import { Worker, Job, connection } from '../lib/queue';
import type { ImageJob } from '../jobs/types';
import sharp from 'sharp';
// downloadImage is your own helper that fetches the image into a Buffer (not shown here).
const imageWorker = new Worker<ImageJob>(
  'image',
  async (job: Job<ImageJob>) => {
    const { imageUrl, sizes, outputPath } = job.data;
    const image = sharp(await downloadImage(imageUrl));
    const paths: string[] = [];
    // Use the loop index for progress; indexOf would misreport duplicate sizes.
    for (const [index, size] of sizes.entries()) {
      const resizedPath = `${outputPath}/${size}.webp`;
      await image
        .clone()
        .resize(size)
        .webp({ quality: 80 })
        .toFile(resizedPath);
      paths.push(resizedPath);
      await job.updateProgress(((index + 1) / sizes.length) * 100);
    }
    return { paths };
  },
  {
    connection,
    concurrency: 2,
  }
);
export { imageWorker };
Low concurrency here because image processing is CPU hungry. Two at a time keeps your server from catching fire.
Step 5: Handle Retries and Failures
This is where most queue implementations fall apart. The happy path is easy. It’s the “what happens when everything goes wrong at 3am” path that matters.
// workers/report.worker.ts
import { Worker, Job, connection } from '../lib/queue';
import type { ReportJob } from '../jobs/types';
// generateReport, uploadReport, notifyUser and TemporaryError are your own code (not shown here).
const reportWorker = new Worker<ReportJob>(
  'report',
  async (job: Job<ReportJob>) => {
    const { reportId, userId, format } = job.data;
    if (job.attemptsMade > 0) {
      console.log(`Retry attempt ${job.attemptsMade} for report ${reportId}`);
    }
    try {
      const report = await generateReport(reportId, format);
      const url = await uploadReport(report);
      await notifyUser(userId, {
        type: 'report:ready',
        downloadUrl: url,
      });
      return { downloadUrl: url };
    } catch (error) {
      // Assumes attemptsMade counts attempts finished before this one,
      // so the current attempt is attemptsMade + 1. Check this against your BullMQ version.
      const maxAttempts = job.opts.attempts ?? 1;
      const retriesLeft = job.attemptsMade + 1 < maxAttempts;
      if (error instanceof TemporaryError && retriesLeft) {
        throw error;
      }
      await notifyUser(userId, {
        type: 'report:failed',
        message: 'Report generation failed',
      });
      throw error;
    }
  },
  {
    connection,
    concurrency: 3,
  }
);
export { reportWorker };
The key bit: temporary errors get retried, but we still notify the user if we’ve exhausted all attempts. Nobody wants to refresh a status page for an hour only to discover their report silently died.
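The decision inside that catch block is easy to get off by one, so it can be worth pulling into a pure function you can test without Redis. This is a sketch under one loud assumption: attemptsMade counts attempts that finished before the current one, which is how I understand recent BullMQ versions to behave. onFailure and FailureAction are hypothetical names.

```typescript
// Decide what the catch block should do with a failed attempt.
type FailureAction = 'retry' | 'notify-and-fail';

function onFailure(
  isTemporary: boolean,
  attemptsMade: number, // assumed: attempts finished before the current one
  maxAttempts: number,
): FailureAction {
  const isLastAttempt = attemptsMade + 1 >= maxAttempts;
  // Only temporary errors with retries remaining are worth another go.
  return isTemporary && !isLastAttempt ? 'retry' : 'notify-and-fail';
}
```

One thing this does not solve: a permanent error that is rethrown will still be retried by BullMQ until attempts run out, which could notify the user more than once. BullMQ exports an UnrecoverableError class intended for that case, so throwing it instead is worth a look.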
Step 6: Schedule Jobs
BullMQ handles delayed and repeatable jobs natively. No need for a separate cron service.
// Delayed job - send in 1 hour
await addJob(
  { type: 'email', to: 'user@example.com', subject: 'Reminder', template: 'reminder', data: {} },
  { delay: 60 * 60 * 1000 }
);
// Repeatable job
await emailQueue.add(
  'daily-digest',
  { type: 'email', to: 'user@example.com', subject: 'Daily Digest', template: 'digest', data: {} },
  {
    repeat: {
      pattern: '0 9 * * *',
      tz: 'Europe/London',
    },
  }
);
// Priority job
await addJob(
  { type: 'report:generate', reportId: 'urgent', userId: '123', format: 'pdf' },
  { priority: 1 }
);
Lower priority number means higher priority. Priority 1 jumps the queue. The daily digest uses a cron pattern, and yes, you should set the timezone unless you want your “9am digest” arriving at midnight because your server thinks it lives in UTC.
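The delay option takes milliseconds, which is fine for “in an hour” but awkward for “at 9am on Friday”. A tiny helper covers that. delayUntil is a hypothetical name of my own, not something BullMQ provides.

```typescript
// Milliseconds to wait until `target`, clamped so past dates run immediately.
function delayUntil(target: Date, now: Date = new Date()): number {
  return Math.max(0, target.getTime() - now.getTime());
}

const now = new Date('2030-01-01T08:00:00Z');
console.log(delayUntil(new Date('2030-01-01T09:00:00Z'), now)); // 3600000
console.log(delayUntil(new Date('2029-12-31T09:00:00Z'), now)); // 0
```

You would pass the result as the delay option, for example addJob(job, { delay: delayUntil(sendAt) }). Using an ISO timestamp with an explicit offset keeps the server's local timezone out of the calculation.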
Step 7: Monitor with Events
You need to know what your queues are doing. Not just when things break, but before things break.
// monitoring/queue-events.ts
import { QueueEvents, connection } from '../lib/queue';
// metrics and alerting stand in for whatever clients your stack uses (not shown here).
const emailEvents = new QueueEvents('email', { connection });
emailEvents.on('completed', ({ jobId, returnvalue }) => {
  console.log(`Job ${jobId} completed with result:`, returnvalue);
  metrics.increment('jobs.completed', { queue: 'email' });
});
emailEvents.on('failed', ({ jobId, failedReason }) => {
  console.error(`Job ${jobId} failed:`, failedReason);
  metrics.increment('jobs.failed', { queue: 'email' });
  alerting.notify(`Email job ${jobId} failed: ${failedReason}`);
});
emailEvents.on('waiting', ({ jobId }) => {
  console.log(`Job ${jobId} is waiting`);
});
emailEvents.on('active', ({ jobId }) => {
  console.log(`Job ${jobId} is now active`);
});
Pipe these metrics into Grafana, Datadog, whatever you fancy. The important thing is that when your queue depth starts climbing, you know about it before your users do.
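Events tell you what happened; queue depth tells you what is about to. A simple threshold check is enough to start with. In this sketch the counts object is assumed to come from something like queue.getJobCounts('waiting', 'failed'), which as far as I know resolves to an object keyed by state. breachedStates and the limits are hypothetical.

```typescript
// Compare job counts against thresholds and report which states are over.
type Counts = Record<string, number>;

function breachedStates(counts: Counts, limits: Counts): string[] {
  return Object.entries(limits)
    // A state missing from counts is treated as zero jobs.
    .filter(([state, limit]) => (counts[state] ?? 0) > limit)
    .map(([state]) => state);
}

console.log(breachedStates({ waiting: 1200, failed: 3 }, { waiting: 1000, failed: 10 }));
// [ 'waiting' ]
```

Run it on a timer and alert on any non-empty result. The thresholds themselves are guesses until you have a week or two of real numbers to base them on.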
Step 8: API Integration
Wire it into your routes and suddenly your API responses are instant. The user gets a job ID, polls for status, and your server isn’t tied up doing the actual work.
// routes/reports.ts
import { addJob, reportQueue } from '../queues';
// Assumes an Express-style app, auth middleware that sets req.user, and a generateId helper.
app.post('/reports', async (req, res) => {
  const job = await addJob({
    type: 'report:generate',
    reportId: generateId(),
    userId: req.user.id,
    format: req.body.format,
  });
  res.json({
    message: 'Report generation started',
    jobId: job.id,
    statusUrl: `/reports/status/${job.id}`,
  });
});
app.get('/reports/status/:jobId', async (req, res) => {
  const job = await reportQueue.getJob(req.params.jobId);
  if (!job) {
    return res.status(404).json({ error: 'Job not found' });
  }
  const state = await job.getState();
  const progress = job.progress;
  res.json({ state, progress, result: job.returnvalue });
});
Your POST returns immediately with a status URL. The client polls that URL until the job completes. Simple, stateless, and your API response times stay in the milliseconds where they belong.
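Polling clients are easier to write when the response says plainly whether to keep polling. Here is one possible shape for the status body, sketched as a pure function. toStatusBody, StatusBody and the 2 second hint are my own assumptions, and the state strings are assumed to match what job.getState() returns.

```typescript
// Shape a job's state into a polling-friendly response body.
interface StatusBody {
  state: string;
  done: boolean;          // true once the client can stop polling
  progress: number;
  result?: unknown;       // only present for completed jobs
  retryAfterMs?: number;  // only present while the job is unfinished
}

function toStatusBody(state: string, progress: unknown, result: unknown): StatusBody {
  const done = state === 'completed' || state === 'failed';
  const body: StatusBody = {
    state,
    done,
    // Progress may not be a plain number, so fall back to 0 in that case.
    progress: typeof progress === 'number' ? progress : 0,
  };
  if (state === 'completed') body.result = result;
  if (!done) body.retryAfterMs = 2000; // hint for the client's next poll
  return body;
}
```

In the status route this would replace the hand-built object, along the lines of res.json(toStatusBody(state, job.progress, job.returnvalue)).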

The Result
- Reliable background job processing with automatic retries
- Exponential backoff so you’re not hammering failing services
- Rate limiting to keep your email provider happy
- Job scheduling and priorities out of the box
- Real-time monitoring so you can sleep at night
What I’d Do Differently
Set up dead letter queues from the start. Failed jobs need somewhere to go for later inspection and manual retry. I’ve lost count of how many times I’ve had to dig through logs to figure out what a failed job was even trying to do. A dead letter queue with the original payload and the error? That’s the kind of thing that saves you at 2am.
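As far as I know BullMQ has no dead letter queue built in, so this is something you assemble yourself. A minimal sketch of the record-building half follows. FailedJobLike lists only the fields used (a BullMQ Job has them), toDeadLetter and DeadLetter are hypothetical names, and it assumes attemptsMade already includes the attempt that just failed by the time your failed listener runs.

```typescript
// Minimal shape of the fields read from a failed job.
interface FailedJobLike<T> {
  id?: string;
  name: string;
  data: T;
  attemptsMade: number;
  opts: { attempts?: number };
}

interface DeadLetter<T> {
  originalJobId: string | undefined;
  originalName: string;
  payload: T;      // the original job data, untouched
  error: string;
  failedAt: string;
}

// Returns a record once retries are exhausted, or null while retries remain.
function toDeadLetter<T>(
  job: FailedJobLike<T>,
  error: Error,
  now: Date = new Date(),
): DeadLetter<T> | null {
  const maxAttempts = job.opts.attempts ?? 1;
  if (job.attemptsMade < maxAttempts) return null;
  return {
    originalJobId: job.id,
    originalName: job.name,
    payload: job.data,
    error: error.message,
    failedAt: now.toISOString(),
  };
}
```

The other half is calling this from each worker's failed listener and, when it returns a record, adding that record to a separate queue with no worker attached (a 'dead-letter' queue, say) so it sits there until someone inspects or replays it.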
BullMQ makes background processing boring, and boring is exactly what you want from infrastructure. Boring means reliable. Reliable means you get to sleep.