What I Learned After Fixing Dozens of Broken Webhook Implementations

Thomas Richter · December 3, 2024 · 5 min read

I have lost count of how many broken webhook setups I have debugged over the years. The story is always the same. Someone sets up an endpoint, parses the incoming data, updates a database, and moves on. Three weeks later their customers start complaining about missing tracking updates. Nobody can figure out why.

The truth is that webhooks look deceptively simple on the surface. But there are a handful of things that will absolutely bite you if you do not get them right from the start. I want to walk through the big ones - especially if you are working with shipping tracking events through our Tracking API or building something similar on your own.

Signature verification is not optional

This is the first thing I check when someone tells me their webhook integration is acting up. Are they verifying signatures? Half the time the answer is no.

Without signature verification, your endpoint will accept data from literally anyone who can guess the URL. That is a security problem, obviously. But it is also a debugging nightmare because you end up with garbage data mixed in with legitimate events and no way to tell them apart.

The way it works is straightforward. The webhook provider includes a signature header with each request - usually computed using HMAC-SHA256. Your endpoint recomputes the signature using the raw request body and a shared secret. If they match, the request is legitimate. If not, reject it.

One thing that trips people up constantly - you need the raw request body for this, not the parsed version. If your web framework automatically parses JSON before you can access the original bytes, anything you re-serialize from the parsed object can differ in whitespace or key order just enough to produce a different hash. I have seen teams burn days on this exact problem. Days.
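Here is a minimal sketch of that check in Python. It assumes the provider sends a hex-encoded HMAC-SHA256 of the raw body; the secret and the payload are made-up values, and your provider may use a different encoding or a prefixed header format, so check its documentation.

```python
import hashlib
import hmac


def is_valid_signature(raw_body: bytes, signature_header: str, secret: bytes) -> bool:
    """Return True if signature_header matches the HMAC-SHA256 of raw_body."""
    # Hash the exact bytes that arrived, never a re-serialized JSON object.
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak timing hints.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


# Hypothetical values, for illustration only.
secret = b"example-shared-secret"
raw_body = b'{"event_id": "evt_123", "status": "in_transit"}'
signature = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()

print(is_valid_signature(raw_body, signature, secret))
print(is_valid_signature(raw_body + b" ", signature, secret))
```

The second call adds a single trailing space to the body and fails verification, which is exactly what happens when a framework hands you a re-serialized payload.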

Webhooks arrive more than once - plan for it

This is the big one. Webhook delivery is at-least-once. Not exactly-once. Network timeouts, server restarts, retry queues - there are plenty of reasons you might receive the same event two times. Or five times.

Every tracking event from Uniship includes a unique event identifier. You need to store these and check them before processing. The approach is simple - before you do anything with an incoming event, look up its identifier in your database. If it already exists, skip it. If it does not exist, process the event and record the identifier in the same transaction.

That last part matters a lot. The deduplication check and the actual processing need to happen inside a single database transaction, and the identifier column needs a unique constraint so the database itself rejects the second insert. Otherwise you get a race condition where two identical events arrive within milliseconds of each other, both pass the check, and both get processed. I have seen this cause duplicate customer notifications and even double refunds. Not fun.
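A rough sketch of the pattern, using SQLite from the Python standard library so it runs anywhere. The table names and the event fields (event_id, tracking_number, status) are my own placeholders, not the actual payload schema. The primary key on event_id is what rejects the duplicate, and the transaction is what keeps the marker and the update together.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE shipments (tracking_number TEXT PRIMARY KEY, status TEXT)"
    )
    conn.commit()
    return conn


def handle_event(conn: sqlite3.Connection, event: dict) -> bool:
    """Process an event at most once. Returns False for a duplicate."""
    try:
        # "with conn" commits on success and rolls back if an exception escapes.
        with conn:
            # The PRIMARY KEY makes this insert fail for an already-seen event.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event["event_id"],),
            )
            # The real work happens in the same transaction as the marker.
            conn.execute(
                "INSERT OR REPLACE INTO shipments (tracking_number, status) "
                "VALUES (?, ?)",
                (event["tracking_number"], event["status"]),
            )
    except sqlite3.IntegrityError:
        return False
    return True


conn = make_db()
event = {"event_id": "evt_1", "tracking_number": "TRK1", "status": "in_transit"}
print(handle_event(conn, event))
print(handle_event(conn, event))
```

The first call processes the event and the second is reported as a duplicate. Inserting the identifier first, instead of doing a SELECT and then an INSERT, lets the database arbitrate between two concurrent deliveries.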

Respond quickly and process later

Here is a mistake I see constantly. People do all their processing - database writes, external API calls, sending emails - inside the webhook handler itself. If that takes more than a few seconds, the webhook system assumes delivery failed and retries. Now you have duplicate events arriving because your handler was too slow. It is a vicious cycle.

The fix is dead simple. Receive the webhook, verify the signature, push the raw event onto a queue, and return a success response immediately. Then process the event asynchronously from the queue. Your response time drops to almost nothing. Retries basically disappear.

Think of it this way. The webhook handler's only job is to say "yes, I got it." Everything else happens separately.
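To make the split concrete, here is a small framework-free sketch. The function names are hypothetical, and the in-process queue is only there to keep the example runnable. In production you would likely want a durable queue, because an in-memory one loses events on restart.

```python
import queue
import threading

events = queue.Queue()
processed = []


def receive_webhook(raw_body: bytes, signature_ok: bool) -> int:
    """Return an HTTP status code. No slow work happens here."""
    if not signature_ok:
        return 401
    events.put(raw_body)  # hand off the untouched payload
    return 200


def worker() -> None:
    """Drain the queue in the background. None is the shutdown signal."""
    while True:
        item = events.get()
        if item is None:
            break
        # Placeholder for database writes, external API calls, and emails.
        processed.append(item)


thread = threading.Thread(target=worker)
thread.start()
status = receive_webhook(b'{"event_id": "evt_1"}', signature_ok=True)
events.put(None)  # stop the worker once the queue is drained
thread.join()
print(status, processed)
```

The handler's response time no longer depends on how long the downstream work takes, which is the whole point.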

Tracking status mapping is messier than you expect

Every carrier has their own way of describing what is happening with a package. One carrier says "transit" while another says "in_delivery" and a third uses something entirely different. If you are consuming raw carrier webhooks, you will need to build a mapping layer to normalize all of this into something consistent.
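If you do end up writing that mapping layer yourself, it can start as a plain lookup table. Everything in this sketch is illustrative: the carrier names are invented, and which internal status each raw value maps to is an assumption you would confirm against each carrier's documentation.

```python
# Keys are (carrier, raw status) pairs; values are your own internal statuses.
CARRIER_STATUS_MAP = {
    ("carrier_a", "transit"): "in_transit",
    ("carrier_b", "in_delivery"): "in_transit",
}


def normalize_status(carrier: str, raw_status: str) -> str:
    """Map a carrier-specific status to an internal one, or "unknown"."""
    key = (carrier, raw_status.strip().lower())
    # Returning "unknown" instead of raising keeps one odd value from
    # blocking the event. Log these so the table can be extended.
    return CARRIER_STATUS_MAP.get(key, "unknown")


print(normalize_status("carrier_a", "Transit"))
print(normalize_status("carrier_c", "something_else"))
```

Keying on the carrier as well as the raw string matters because two carriers can use the same word for different states.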

Our Tracking API handles this normalization for you. We map everything to a consistent set of statuses - pending, in transit, out for delivery, delivered, failed attempt, exception, and returned. But even with normalized statuses, there is a gotcha that catches people off guard.

Status transitions are not always linear. A package can go from in transit to exception and then back to in transit. Do not build your logic assuming a happy path. Always accept the latest status update based on the event timestamp rather than the order events arrive in. Events can arrive out of sequence and you need to handle that gracefully.
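One way to sketch the "latest timestamp wins" rule is below. The in-memory dictionary stands in for your database, and the function and field names are my own. Treating an equal timestamp as stale is a choice I made for the example, not a rule from the API.

```python
from datetime import datetime, timezone

# tracking_number -> (event_time, status); a stand-in for a database table.
latest = {}


def apply_status(tracking_number: str, status: str, event_time: datetime) -> bool:
    """Store the status only if the event is newer than what we already hold."""
    current = latest.get(tracking_number)
    if current is not None and event_time <= current[0]:
        return False  # stale or repeated event: keep the newer status
    latest[tracking_number] = (event_time, status)
    return True


t1 = datetime(2024, 12, 3, 10, 0, tzinfo=timezone.utc)
t2 = datetime(2024, 12, 3, 11, 0, tzinfo=timezone.utc)

# The later event arrives first, then the older one shows up late.
print(apply_status("TRK1", "exception", t2))
print(apply_status("TRK1", "in_transit", t1))
print(latest["TRK1"][1])
```

The late arrival is ignored and the stored status stays at "exception". Nothing here assumes a fixed order of statuses, so a package moving from exception back to in transit works as long as the timestamps say so.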

Build retries into your own outbound calls

Sometimes after receiving a webhook you need to call another API - maybe to fetch full shipment details from our Shipment API, maybe to update an external system. Build retries with exponential backoff for these calls.

The idea is simple. If the first attempt fails, wait a short time and try again. If that fails, wait longer. Keep increasing the delay up to a reasonable maximum. But do cap the backoff. Waiting seventeen minutes between retries rarely helps. If a service has been down that long, you need alerting rather than longer waits.
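As a sketch, capped exponential backoff fits in a few lines. The defaults here (1 second base, 60 second cap, 4 retries) are arbitrary numbers I picked for illustration, and the sleep parameter exists only so the example can run without actually waiting.

```python
import time


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * 2 ** attempt)


def call_with_retries(func, retries=4, base=1.0, cap=60.0, sleep=time.sleep):
    """Call func; on failure retry up to `retries` times with capped backoff."""
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the error so alerting can fire
            sleep(backoff_delay(attempt, base, cap))


calls = []
waits = []


def flaky():
    """Hypothetical outbound call that fails twice and then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("service unavailable")
    return "ok"


# Record the delays instead of sleeping so the example finishes instantly.
print(call_with_retries(flaky, sleep=waits.append), waits)
```

In real code you would catch only the errors that are worth retrying, such as timeouts and 5xx responses, instead of every exception.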

Monitor what matters

I always tell teams to track four things with their webhook implementations. First, the receive rate - a sudden drop means something is broken upstream. Second, the duplicate event rate - if this spikes, something is wrong with delivery or your deduplication is failing. Third, processing latency from receipt to database update. And fourth, the error rate broken down by event type because some status transitions might have bugs you have not encountered yet.

We expose webhook delivery logs in the Uniship dashboard so you can always compare what we sent versus what your endpoint acknowledged. That visibility alone has saved people hours of debugging.

Wrapping up

Most of these patterns are not specific to shipping. They apply to any webhook integration. But tracking webhooks are high-stakes. A missed delivery notification means a frustrated customer calling your support team.

If you would rather skip building all of this plumbing, our Tracking API handles normalization, retries, and delivery guarantees out of the box. You just bring the endpoint.

But if you are building it yourself, get idempotency and signature verification right first. Everything else is optimization. Those two are correctness. I learned that the hard way about six years ago in Berlin when I spent an entire weekend tracking down phantom duplicate notifications for a logistics client. Get the fundamentals right and everything else follows.
