Webhooks: Building Reliable Event Receivers in Node.js

Brandon Perfetti

Technical PM + Software Engineer

Topics: Web Development · Backend · Developer Experience
Tech: Node.js · Express · HTTP

Most webhook tutorials stop right before the part that matters.

They show a route handler, parse JSON, maybe print the payload, and call it done. That is enough to prove a request can hit your server. It is nowhere close to enough to prove the receiver is reliable.

Real webhook systems fail in ways that are boring, expensive, and surprisingly easy to trigger:

  • providers retry because your endpoint answered too slowly,
  • duplicate deliveries create duplicate side effects,
  • signature verification is skipped until an incident forces it,
  • local success collapses in production because downstream work happens inline,
  • and nobody knows whether a failed event was dropped, deferred, or already processed.

That is why the real job is not "accept a POST request." The real job is to build a receiver that behaves predictably under retries, bad payloads, queue pressure, and operational noise.

This article walks through the production-ready shape of a Node.js webhook receiver: verifying signatures, preserving raw bodies, responding fast, pushing work to a queue, making handlers idempotent, and testing the whole thing locally without fooling yourself.

By the end, you should have a clear model for building webhook endpoints that stay trustworthy once real providers start hammering them.

What a Reliable Webhook Receiver Actually Has to Do

A webhook endpoint has a deceptively small surface area. Usually it is one HTTP route. That makes it easy to underestimate the system around it.

A reliable receiver usually needs to do five things well:

  1. authenticate the sender,
  2. validate the payload shape enough to route it safely,
  3. acknowledge receipt quickly,
  4. process the event asynchronously,
  5. and ensure repeated delivery does not cause repeated side effects.

That last point is where many teams get burned. Webhooks are often at-least-once delivery systems. That means duplicates are not edge cases. They are part of the contract.

In plain English: if your receiver is not designed for retries and duplicates, it is not production-ready yet.

First Principle: Respond Fast, Do Work Later

One of the most common mistakes is doing too much work before returning a response.

A provider usually cares about whether your endpoint accepted the event, not whether every downstream effect finished before the socket closed. If you verify authenticity, store enough state to recover the event, and queue the real work, you buy reliability immediately.

A thin Express route often looks like this:

import express from "express";
import { verifyWebhookSignature } from "./lib/signature.js";
import { enqueueWebhookJob } from "./lib/queue.js";

const app = express();

app.post("/webhooks/provider", express.raw({ type: "application/json" }), async (req, res) => {
  const signature = req.get("x-provider-signature");

  if (!verifyWebhookSignature({
    rawBody: req.body,
    signature,
    secret: process.env.WEBHOOK_SECRET
  })) {
    return res.status(401).send("Invalid signature");
  }

  // req.body is a raw Buffer here; parse it only after the signature checks out.
  let payload;
  try {
    payload = JSON.parse(req.body.toString("utf8"));
  } catch {
    return res.status(400).send("Malformed JSON");
  }

  try {
    await enqueueWebhookJob({
      provider: "provider-name",
      eventId: payload.id,
      eventType: payload.type,
      payload
    });
  } catch (err) {
    // A 5xx tells the provider to retry later instead of silently dropping the event.
    return res.status(500).send("Failed to accept event");
  }

  return res.status(202).send("Accepted");
});

The key idea is not the exact library. It is the sequence:

  • preserve the raw request body,
  • verify the signature,
  • extract enough event metadata,
  • hand work off,
  • respond quickly.

That shape keeps the endpoint from turning into a fragile mini-application.

Signature Verification Is Not Optional

If a provider signs webhook requests, verify them. Every time.

A lot of teams postpone this because local development feels easier without it. That is understandable in the first hour and dangerous after that.

Signature verification usually depends on the exact raw body bytes plus a shared secret. If middleware mutates the body before verification, the signature check may fail even when the provider is legitimate.

That is why express.json() is often the wrong default for webhook routes. For signed payloads, express.raw() is usually safer because you need the original bytes.

A typical HMAC verification helper looks like this:

import crypto from "crypto";

export function verifyWebhookSignature({ rawBody, signature, secret }) {
  if (!signature || !secret) return false;

  const expected = crypto
    .createHmac("sha256", secret)
    .update(rawBody)
    .digest("hex");

  const expectedBuffer = Buffer.from(expected, "utf8");
  const providedBuffer = Buffer.from(signature, "utf8");

  if (expectedBuffer.length !== providedBuffer.length) return false;
  return crypto.timingSafeEqual(expectedBuffer, providedBuffer);
}

In a real integration, use the provider’s documented signing format rather than guessing. Some include timestamps, prefixes, or multiple signatures in a header. The implementation detail varies. The discipline does not.
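As one concrete illustration, here is what a timestamped scheme can look like. The header format below is hypothetical, loosely modeled on the common `t=<timestamp>,v1=<hex>` style; a real integration must follow the provider's documented format exactly:

```javascript
import crypto from "crypto";

// Hypothetical header format: "t=<unix seconds>,v1=<hex hmac>".
// Real providers document their own scheme; this is only a sketch.
export function verifyTimestampedSignature({ header, rawBody, secret, toleranceSeconds = 300, nowMs = Date.now() }) {
  if (!header || !secret) return false;

  const parts = Object.fromEntries(header.split(",").map((kv) => kv.split("=")));
  const timestamp = Number(parts.t);
  if (!timestamp || !parts.v1) return false;

  // Reject stale deliveries to narrow the replay window.
  if (Math.abs(nowMs / 1000 - timestamp) > toleranceSeconds) return false;

  // The signed message is the timestamp joined to the raw body bytes.
  const expected = crypto
    .createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`)
    .digest("hex");

  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(parts.v1, "utf8");
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

Including the timestamp in the signed message is what makes the tolerance check meaningful: an attacker cannot replay an old body with a fresh timestamp without breaking the signature.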

In plain English: never treat a webhook as trustworthy just because it reached your route.

Preserve Raw Bodies on Purpose

This deserves its own section because it causes so many quiet failures.

Developers are used to globally enabling JSON parsing middleware. That is fine for ordinary API endpoints. It is risky for signed webhooks.

If the signature depends on the exact bytes sent by the provider, even a harmless-looking body transformation can break verification.

A clean pattern is to isolate webhook routes and attach the body parser they need instead of sharing the same parser stack as the rest of the API.

app.use("/api", express.json());
app.post("/webhooks/provider", express.raw({ type: "application/json" }), webhookHandler);

That small decision avoids a lot of brittle debugging later.

Idempotency Is the Difference Between Stable and Dangerous

A webhook provider retrying a delivery is not a bug. Your system applying the same side effect twice is the bug.

That means your receiver needs an idempotency strategy.

The usual approach is simple:

  • store a stable event identifier from the provider,
  • check whether you have already processed it,
  • skip or short-circuit duplicate work if you have.

A lightweight pattern might look like this:

export async function processWebhookEvent(event) {
  // Fast path: skip events we have already recorded.
  const alreadyHandled = await db.webhookEvents.findUnique({
    where: { providerEventId: event.eventId }
  });

  if (alreadyHandled) {
    return { skipped: true };
  }

  // Record the event and apply its side effects in one transaction so a
  // crash between the two cannot leave them out of sync. providerEventId
  // should carry a unique constraint so that concurrent duplicates collide
  // on the insert instead of both succeeding.
  await db.$transaction(async (tx) => {
    await tx.webhookEvents.create({
      data: {
        providerEventId: event.eventId,
        eventType: event.eventType,
        receivedAt: new Date()
      }
    });

    await applyBusinessSideEffects(tx, event);
  });

  return { skipped: false };
}

The important part is that the deduplication record and the side effect live in a coordinated flow. If those two operations are disconnected, duplicates can still leak through during races.
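One way to close that race is to lean on a unique constraint and treat the insert itself as the dedup check: attempt the insert first, and interpret a unique-violation error as "already handled". Here is a database-free sketch of that shape, where an in-memory Set stands in for the unique index:

```javascript
// The Set plays the role of a unique index on providerEventId. In
// production this would be a real database constraint, and the error
// below would be the driver's unique-violation error
// (e.g. Prisma's P2002, Postgres's 23505).
const seenEventIds = new Set();

async function insertEventRecordOnce(eventId) {
  if (seenEventIds.has(eventId)) {
    const err = new Error("duplicate event");
    err.code = "UNIQUE_VIOLATION";
    throw err;
  }
  seenEventIds.add(eventId);
}

export async function processOnce(event, applySideEffects) {
  try {
    // Insert-first: under a race, exactly one delivery wins this insert.
    await insertEventRecordOnce(event.eventId);
  } catch (err) {
    if (err.code === "UNIQUE_VIOLATION") return { skipped: true };
    throw err;
  }

  await applySideEffects(event);
  return { skipped: false };
}
```

The insert-first ordering is the point: even if two deliveries of the same event arrive at the same instant, only one of them can create the record, so only one applies the side effects.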

In plain English: if a webhook can charge a card, create a record, send an email, or change a subscription state, idempotency is part of the feature, not an enhancement.

Queue the Real Work

Inline processing feels simpler until one provider retry storm teaches you otherwise.

Webhook handlers are at their best when they act like intake valves. They validate, normalize, store, enqueue, and return. The heavy work happens in a worker.

That gives you several benefits:

  • the endpoint stays fast,
  • transient downstream failures do not force the whole provider request to hang,
  • retries become controllable,
  • and processing can be observed independently from ingestion.

A simple BullMQ-style enqueue step might look like this:

import { Queue } from "bullmq";

export const webhookQueue = new Queue("webhooks", {
  connection: {
    host: process.env.REDIS_HOST,
    port: Number(process.env.REDIS_PORT)
  }
});

export async function enqueueWebhookJob(event) {
  await webhookQueue.add("process-webhook", event, {
    // Reusing the provider event ID means a duplicate delivery that
    // arrives while this job still exists is ignored by BullMQ.
    jobId: event.eventId,
    // Keep the most recent completed/failed jobs around for inspection.
    removeOnComplete: 1000,
    removeOnFail: 5000,
    attempts: 5,
    backoff: {
      type: "exponential",
      delay: 1000
    }
  });
}

Using the provider event ID as the job ID is a nice extra guardrail: BullMQ ignores a second add with the same jobId while the first job still exists, so duplicate deliveries that arrive close together collapse into one logical unit of work. It is a guardrail, not a replacement for the database-level idempotency check, since the job may already have been removed by the time a late duplicate arrives.

Normalize Events Before They Spread

Once you support more than one webhook type, raw payload handling gets messy fast.

A clean strategy is to map provider payloads into an internal event shape early.

For example:

export function normalizeProviderEvent(payload) {
  return {
    eventId: payload.id,
    eventType: payload.type,
    occurredAt: payload.created_at,
    accountId: payload.account?.id,
    data: payload.data
  };
}

That helps because the rest of your application can operate on a stable internal contract instead of vendor-specific nesting.

It also makes testing easier. You can write processing logic against your normalized event model rather than crafting giant provider payload fixtures every time.
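That testing benefit is concrete. A unit test for the normalizer needs only one small object, not a full provider fixture (the field values below are made up for illustration):

```javascript
// Same normalizer as above, repeated here so the example is self-contained.
function normalizeProviderEvent(payload) {
  return {
    eventId: payload.id,
    eventType: payload.type,
    occurredAt: payload.created_at,
    accountId: payload.account?.id,
    data: payload.data
  };
}

// A minimal, made-up provider payload is all a test needs.
const normalized = normalizeProviderEvent({
  id: "evt_123",
  type: "invoice.paid",
  created_at: "2024-01-01T00:00:00Z",
  account: { id: "acct_9" },
  data: { amount: 4200 }
});
// normalized.eventId === "evt_123"; normalized.accountId === "acct_9"
```

The optional chaining on `payload.account?.id` also documents which fields are allowed to be absent, which is useful when different event types carry different shapes.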

Handle Retries Deliberately

There are really two retry systems in a webhook architecture:

  • provider retries if your endpoint does not acknowledge correctly,
  • your own worker retries if downstream work fails after the event is accepted.

Those should not be treated as the same thing.

Provider retries are external pressure. Your best defense is fast acknowledgment plus durable intake.

Worker retries are internal policy. That is where you choose:

  • how many attempts make sense,
  • which failures are transient,
  • when an event should move to a dead-letter queue,
  • and what should page a human.

In practice, a useful failure policy often looks like this:

  • validation failures: reject or quarantine immediately,
  • temporary database/network failures: retry with backoff,
  • repeated permanent failures: move to manual review.

In plain English: not every failed event should be retried forever, and not every failed event should be dropped immediately.
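That policy can live in one small function the worker consults before deciding whether to rethrow (which triggers a queue retry) or route the event elsewhere. The error names and codes below are illustrative, not from any particular library:

```javascript
// Map a processing error to a policy decision. "retry" means rethrow and
// let the queue's backoff handle it; the other outcomes bypass retries.
export function classifyFailure(err, attemptsMade, maxAttempts = 5) {
  // Bad payloads never get better on retry: quarantine immediately.
  if (err.name === "ValidationError") return "quarantine";

  // Transient network/database trouble is worth retrying with backoff.
  const transientCodes = new Set(["ECONNREFUSED", "ECONNRESET", "ETIMEDOUT"]);
  if (transientCodes.has(err.code) && attemptsMade < maxAttempts) return "retry";

  // Everything else, or anything out of attempts, goes to a human.
  return "manual-review";
}
```

Centralizing the decision also makes the policy testable on its own, without running a worker or a queue.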

Observability Is Part of Reliability

If webhook processing matters, you need to be able to answer basic operational questions quickly:

  • did we receive the event?
  • was the signature valid?
  • did we enqueue it?
  • did the worker process it?
  • if not, where did it stop?

That means logging should be structured around event identity.

At minimum, include fields like:

  • provider name,
  • provider event ID,
  • event type,
  • delivery timestamp,
  • processing status,
  • retry count.

A basic structured log helper might emit:

logger.info({
  provider: "stripe",
  eventId: event.eventId,
  eventType: event.eventType,
  attempt: job.attemptsMade,
  status: "queued"
}, "Webhook event accepted");
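The logger in that snippet follows the pino-style `(fields, message)` call signature, but nothing about the pattern requires a library. A dependency-free sketch, assuming one JSON object per line is what your log pipeline expects:

```javascript
// Minimal structured logger: every entry is one JSON object on one line,
// so log pipelines can filter and join on fields like eventId. A real
// application would likely use pino or similar; this shows the shape.
export function createLogger(baseFields = {}) {
  const emit = (level, fields, message) => {
    console.log(JSON.stringify({
      level,
      time: new Date().toISOString(),
      ...baseFields,
      ...fields,
      msg: message
    }));
  };

  return {
    info: (fields, message) => emit("info", fields, message),
    error: (fields, message) => emit("error", fields, message)
  };
}
```

The base fields are a good home for values that never change per event, like the provider name or service name, so each call site only supplies event identity and status.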

Once those fields are consistent, tracing a bad event becomes much less painful.

Without this, every webhook incident becomes archaeology.

Local Testing Without Lying to Yourself

Testing webhooks locally is awkward because the provider needs to reach your machine.

That is where a tunnel like ngrok is useful. It gives the external provider a public URL that forwards to your local server.

The workflow is usually:

  1. run your local app,
  2. expose it through ngrok,
  3. register the temporary HTTPS URL in the provider dashboard,
  4. trigger test events,
  5. inspect ingestion logs and queue behavior.

A lot of developers stop after they see one request arrive. Go further.

Test these cases on purpose:

  • valid signature,
  • invalid signature,
  • duplicate event delivery,
  • queue unavailable,
  • downstream processing failure,
  • malformed payload,
  • slow handler under retry conditions.

If the system only works for the first happy-path delivery, it is not actually tested.
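A small helper makes several of those cases easy to generate on demand, whether you send the result with curl, fetch, or a test runner. This sketch signs the body the same way the verification helper above expects (a plain HMAC-SHA256 over the raw bytes); the header name matches the earlier route example:

```javascript
import crypto from "crypto";

// Build a synthetic webhook delivery. tamper=true flips the last hex
// character so the request exercises the 401 (invalid signature) path.
export function buildTestDelivery({ payload, secret, tamper = false }) {
  const rawBody = JSON.stringify(payload);
  let signature = crypto
    .createHmac("sha256", secret)
    .update(rawBody)
    .digest("hex");

  if (tamper) {
    const last = signature.at(-1) === "0" ? "1" : "0";
    signature = signature.slice(0, -1) + last;
  }

  return {
    rawBody,
    headers: {
      "content-type": "application/json",
      "x-provider-signature": signature
    }
  };
}
```

Sending the same untampered delivery twice exercises the duplicate path, and sending the tampered variant exercises rejection, all without waiting on the provider's test-event button.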

A Production-Ready Flow in Plain Terms

A healthy Node.js webhook system often looks like this:

  1. provider sends event,
  2. route receives raw body,
  3. signature is verified,
  4. payload is parsed and normalized,
  5. event identity is recorded,
  6. job is queued,
  7. endpoint returns 202 quickly,
  8. worker processes the event,
  9. idempotency record prevents duplicate side effects,
  10. logs and metrics track success or failure.

That flow is not glamorous, but it is dependable.

And that is really the whole point. Webhooks are infrastructure disguised as application code. The teams that treat them casually usually pay for it later.

What People Usually Get Wrong

The recurring mistakes are predictable:

1. Doing business logic inline in the request handler

That makes provider retries, latency spikes, and downstream failures much harder to manage.

2. Verifying the signature against an already-parsed body

Then legitimate requests fail verification in confusing ways.

3. Ignoring idempotency

Duplicates eventually create duplicate state changes.

4. Treating retries as an afterthought

Then a temporary outage becomes an event-loss problem.

5. Logging too little context

When something fails, nobody can reconstruct the event journey.

None of these are unusual mistakes. They are what happen when a webhook receiver is treated like a normal controller instead of a reliability boundary.

Final Takeaway

Building a webhook receiver in Node.js is not mainly about wiring a POST route. It is about building a dependable intake system for external events.

If you preserve raw bodies, verify signatures correctly, acknowledge quickly, queue real work, enforce idempotency, and make the pipeline observable, the receiver becomes trustworthy.

If you skip those pieces, the endpoint may look fine in development while quietly carrying production risk.

The practical model is simple:

  • trust nothing by default,
  • respond quickly,
  • process asynchronously,
  • deduplicate aggressively,
  • and make failures visible.

After reading this, you should be able to design a Node.js webhook receiver that does not just accept events, but handles them in a way that survives retries, incidents, and real production pressure.