Software today is becoming increasingly interconnected.
Instead of building everything from scratch, a growing share of engineering work is ‘connecting’ prefabricated parts. We rely more and more on third-party services: businesses that expose an API (a software interface) that other systems connect to for capabilities such as payments and email.
This accelerates software development: we can focus more on the core of our value propositions instead of building secondary (but still necessary) capabilities such as email infrastructure and server deployments. These secondary components can be delegated to external APIs.
However, it does mean that our systems need to listen for real-time events from the external services we rely on.
How should we listen for these events?
First let’s look at Polling, the traditional way for systems to offer ‘real-time’ data sharing capabilities.
Imagine that you want to find out when your favorite bookstore has a newly published book in stock.
Our friendly neighbourhood bookstore.
The naive approach is to call the bookstore each day to see if the book is finally in stock.
The client sends an HTTP request to the server for a particular resource,
and the server responds to the request.
This repeated questioning is in essence what is happening between two services when their communication relies on polling. One service constantly asks the other for data.
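To make the cost concrete, here is a minimal Python sketch of a polling loop. The `poll` helper and the `fetch_stock` callable are hypothetical stand-ins for an HTTP GET against the store's API; the simulated responses are made up for illustration.

```python
from typing import Callable, Optional, Tuple


def poll(fetch_stock: Callable[[], Optional[str]], max_attempts: int) -> Tuple[Optional[str], int]:
    """Ask repeatedly until fetch_stock returns something or attempts run out.

    Returns (result, number_of_requests_made).
    """
    for attempt in range(1, max_attempts + 1):
        result = fetch_stock()  # hypothetical stand-in for an HTTP GET
        if result is not None:
            return result, attempt
        # A real client would wait here, e.g. time.sleep(interval_seconds).
    return None, max_attempts


# Simulated store: the book only shows up on the fifth call.
responses = iter([None, None, None, None, "in stock"])
result, requests_made = poll(lambda: next(responses), max_attempts=10)
print(result, requests_made)
```

In this simulation, four of the five requests carry no new information, and the client only learns about the book on whichever poll happens to come after it arrived.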
You soon realize that polling (in this case, calling the store each day) is a huge waste.
Traditional REST APIs are designed for when you want to allow a synchronous read or write. This is far from being optimal when you just want your application to be told when something changed, because it would require polling at regular intervals and that just doesn’t scale.
Polling an API is generally a wasteful and messy way of trying to retrieve that piece of information. Some events may happen only once in a blue moon, so you have to figure out how often to make the polling calls, and you might miss it.
According to Zapier, 98.5% of polling requests return no new, actionable information.
Polling is bad for everyone. It’s taxing to servers on both sides, which spend most of their time and resources on requests and responses that contain nothing new.
When there is data, it’s only as new as the polling interval, which means it might not even satisfy a user’s thirst for instant data access.
Polling isn’t a solution for making the web real-time; it’s a hack! It should make your inner engineer cringe.
A much better idea is to have the store notify you when they have a sale. So, you look up the address for the store’s promo department, write to them letting them know your address, and they start sending you fliers for deals.
Our bookstore now has a mailbox for outgoing messages.
When there’s new information, the webhook event provider sends a message with that information by making an HTTP POST request with a JSON payload.
Clients then reply to let the provider know the message has been received.
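The receiving side of this exchange can be sketched with Python's standard library. This is a minimal illustration, not a production receiver: the `handle_webhook` and `make_server` helpers, the in-memory `received_events` list, and port 8000 are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received_events = []  # in-memory store, for illustration only


def handle_webhook(body: bytes) -> int:
    """Parse a JSON webhook body and return the HTTP status code to reply with."""
    try:
        event = json.loads(body)
    except ValueError:  # malformed JSON or undecodable bytes
        return 400
    received_events.append(event)
    return 200  # a 2xx reply tells the provider the message was received


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly as many bytes as the provider said it sent.
        length = int(self.headers.get("Content-Length", 0))
        status = handle_webhook(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def make_server(port: int = 8000) -> HTTPServer:
    """Build (but do not start) a local server for the receiver."""
    return HTTPServer(("127.0.0.1", port), WebhookHandler)
```

Calling `make_server().serve_forever()` would start listening. Keeping the parsing logic in a plain function makes it easy to test without opening a socket.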
An event-driven approach ensures notifications are sent at the right time: as soon as the event occurs.
You can think of webhooks as reverse APIs.
Webhooks in the Real World
Who uses webhooks? Pretty much everyone.
Notably, the messaging app Slack has a large collection of integrations, most of which are powered by webhooks.
Reasons to use Webhooks
1. Efficiency
Webhooks are roughly 66 times more efficient than traditional polling: only 1.5% of polling requests are actionable, whereas with webhooks the expected figure is close to 100% (100 / 1.5 ≈ 66.7).
By switching to webhooks, you can:
- Reduce server load, allowing you to decrease the number of servers you need to support the same number of clients.
- Drop bandwidth usage by orders of magnitude.
2. User experience
Webhooks are a more idiomatic solution for receiving soft real-time updates from third-party services, compared to polling every X minutes.
Most developers prefer webhooks to polling, because they don’t have to spend time on the quirks of polling.
Webhook Design 101
Let’s take a look at how you can design your webhooks. There are three main steps in consuming a webhook: setup, subscriptions, and notifications.
Next, the webhook subscription itself should be managed as a REST resource. (This approach is called RESThooks.)
Minimally, a webhook subscription object includes:
- list of event names
- target url
- status (on / off)
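As a rough illustration, the three fields above could be modelled like this in Python. The class name, the field names, and the `matches` helper are assumptions for the sketch, not part of any particular provider's API.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Subscription:
    target_url: str                                   # where notifications are POSTed
    events: List[str] = field(default_factory=list)   # event names to listen for
    active: bool = True                               # status: on / off

    def matches(self, event_name: str) -> bool:
        """True if this subscription should receive the given event."""
        return self.active and event_name in self.events


# Hypothetical example values.
sub = Subscription("https://example.com/hooks", ["contact.create", "lead.delete"])
print(asdict(sub))
```

Because the subscription is a plain record, it maps naturally onto the usual REST operations: create, read, update (for example switching it off), and delete.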
Having a UI to interact with the Subscriptions API is a tremendous boon for developers.
From the Stripe dashboard, for example, developers can create subscriptions and specify which events each endpoint will listen to.
What’s in an Event?
In a webhook system, events are the messages that are sent from the provider to the subscriber. There could be different types of events that are sent from your webhook provider.
Each event type needs at least two things:
- A name (use the noun.verb dot syntax, e.g. contact.create or lead.delete).
- A payload template (simply mirror the representation from your standard API).
The payload you build for each record should exactly match your REST API’s representation of the same object. That makes it easy to map REST resources to hook resources and vice versa.
Many webhook providers follow the naming convention namespace.noun.verb for their event types.
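A small sketch of an event built along these lines is shown below. The `build_event` helper and the `event` / `data` envelope keys are assumptions chosen for illustration; the only rules taken from the text are the dotted name and a payload that mirrors the API representation.

```python
def build_event(name: str, resource: dict) -> dict:
    """Wrap a resource representation in a named event.

    The name must use dot syntax: noun.verb or namespace.noun.verb.
    """
    parts = name.split(".")
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"event name must look like noun.verb, got {name!r}")
    # The payload is the same dict the REST API would return for this record.
    return {"event": name, "data": resource}


# Hypothetical record, as the REST API would represent it.
contact = {"id": 42, "name": "Ada"}
print(build_event("contact.create", contact))
```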
Event Dispatch & Delivery
The first component a webhook event provider needs is Event Dispatch: a mechanism to specify in your application code that an event is triggered.
Anything in your system could trigger events. You can write an internal library that lets you dispatch messages to a message queue component that handles the actual sending.
For example, an inline notify(<my-event>) function might dispatch a message to your message queue.
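A minimal sketch of such a dispatch function follows. It uses Python's in-process `queue.Queue` as a stand-in for a real message queue, and the `notify` signature and message shape are assumptions for the example.

```python
import queue

# Stand-in for a real message broker; a separate delivery worker would consume this.
event_queue: "queue.Queue[dict]" = queue.Queue()


def notify(event_name: str, payload: dict) -> None:
    """Record that an event happened; sending is left to the delivery component."""
    event_queue.put({"event": event_name, "data": payload})


# Application code calls notify() inline wherever something interesting happens.
notify("contact.create", {"id": 42, "name": "Ada"})
print(event_queue.qsize())
```

The point of the split is that `notify` returns immediately: application code never waits on a subscriber's HTTP endpoint.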
You also need Event Delivery: a mechanism to POST or otherwise deliver the payload for each event in your message queue to the proper target URL for each matching subscription.
The Event Delivery component should handle:
- Compiling and POSTing the combined payload for the triggering resource and hook resource.
- Handling responses like 410 Gone and optionally retrying on connection failures or other 4xx/5xx errors.
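The two responsibilities above might be sketched as follows. The payload layout (`hook` plus `data`), the dict-based subscription, and the three return values are assumptions for illustration; the HTTP call is injectable so the logic can be exercised without a network.

```python
import json
import urllib.error
import urllib.request
from typing import Callable


def http_post(url: str, body: bytes) -> int:
    """POST a JSON body and return the HTTP status code."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # urllib raises on 4xx/5xx; we want the code itself


def deliver(event: dict, subscription: dict,
            post: Callable[[str, bytes], int] = http_post) -> str:
    """Send one event to one subscription: 'delivered', 'unsubscribe', or 'retry'."""
    # Combine the hook (subscription) details with the triggering resource.
    body = json.dumps({
        "hook": {"event": event["event"], "target": subscription["target_url"]},
        "data": event["data"],
    }).encode("utf-8")
    try:
        status = post(subscription["target_url"], body)
    except OSError:  # connection refused, DNS failure, timeout, ...
        return "retry"
    if 200 <= status < 300:
        return "delivered"
    if status == 410:  # Gone: the subscriber no longer wants this hook
        subscription["active"] = False
        return "unsubscribe"
    return "retry"
```

Treating every other 4xx/5xx as retryable is the simplest policy; a stricter one might give up immediately on responses such as 400 or 404.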
If you know you’ll need to scale your solution, use a tool specifically designed for that. You can use open source scalable queueing solutions like RabbitMQ or a service like Amazon Simple Queue Service (SQS).
This way, your interaction is limited to adding and removing “messages,” which tell you what webhooks to call. Like the DB queue, you need a separate process to consume items from the queue and send notifications. In addition to being designed for this purpose, a proper queue also saves database resources for what the database does best: providing data to your primary application.
The key to securing your webhooks is to let clients verify that an event was sent from a legitimate source.
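One widely used way to do this is an HMAC signature computed over the request body with a secret shared between provider and client. The text above does not prescribe a scheme, so treat this as one possible sketch; the function names and the example secret are assumptions.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Provider side: compute a hex HMAC-SHA256 signature of the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Client side: recompute the signature and compare in constant time."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


# Hypothetical shared secret and payload.
secret = b"whsec_example"
body = b'{"event": "contact.create", "data": {"id": 42}}'
signature = sign(secret, body)  # typically sent in a request header

print(verify(secret, body, signature))
print(verify(secret, body + b" ", signature))
```

The client must verify against the raw bytes it received, before any JSON parsing or re-serialization, since even a whitespace change alters the signature.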
HTTP Responses and Retries
In essence, we should retry if the receiver did not reply with a successful (2xx) HTTP status code.
There are many retry policies you can choose from depending on your use case.
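Exponential backoff is one common policy, sketched below. The base delay, growth factor, cap, and attempt count are arbitrary assumptions, and `send` is a hypothetical callable that performs the POST and returns the status code.

```python
import time
from typing import Callable, List


def backoff_delays(base_seconds: float = 1.0, factor: float = 2.0,
                   max_attempts: int = 5, cap_seconds: float = 3600.0) -> List[float]:
    """Delays before each retry: base, base*factor, base*factor^2, ... up to a cap."""
    return [min(base_seconds * factor ** attempt, cap_seconds)
            for attempt in range(max_attempts)]


def send_with_retries(send: Callable[[], int], delays: List[float],
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call send() once, then once more after each delay, until it returns 2xx."""
    for delay in [0.0] + list(delays):
        if delay:
            sleep(delay)
        if 200 <= send() < 300:
            return True
    return False  # retries exhausted; the caller may disable the subscription


print(backoff_delays(max_attempts=4))
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Production systems often add random jitter to each delay so that many failed deliveries do not all retry at the same instant.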