Let's talk about idempotency

Let's have a brief talk about idempotency applied to computer science.


Wikipedia's definition of idempotence says a function is idempotent if the system state remains the same after one or several calls.

A Stripe blog post nicely summarizes that an idempotent API or service endpoint can be called any number of times while guaranteeing that side effects only occur once.

Idempotency, then, is a mechanism that:

  • Makes retries safe. Without it, a retry can make things worse and cause a cascading effect.
  • Helps deal with duplicates. Generalizing the previous point to messaging and/or service interactions, it gives us most of the benefits of exactly-once message delivery in a world of at-least-once.
  • Helps achieve a degree of data integrity without relying on distributed transactions, two-phase locking and other mechanisms that are more reliable but also incur performance penalties.


Definitions are fine, but they do not tell us how to achieve idempotency. The main pillar is the idempotency-key, a way to identify a change request so that we can detect repetitions and duplicates. The key might be something really simple, like a UUID the caller generates and sends as a custom header (e.g. Idempotency-Key for Stripe calls). It could be a hash of the sent data. It could also be a few fields we decide are relevant.
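To make the three options concrete, here is a minimal Python sketch. The function names and the ":" separator are my own choices for illustration and do not come from Stripe or any library; the payload is assumed to be JSON-serializable.

```python
import hashlib
import json
import uuid


def key_from_uuid() -> str:
    """Random key generated by the caller, sent e.g. in an Idempotency-Key header."""
    return str(uuid.uuid4())


def key_from_payload(payload: dict) -> str:
    """Hash of the whole request body; sort_keys makes the JSON form canonical."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def key_from_fields(payload: dict, fields: tuple) -> str:
    """Key built only from the fields we decide are relevant."""
    return ":".join(str(payload[name]) for name in fields)
```

Note the trade-off: a random UUID only detects repetitions from a caller that reuses the same key, while the payload hash and the field-based key also match identical requests coming from different callers.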

We'll focus on a non-trivial scenario: we have a service that can receive calls from multiple sources/clients and that internally keeps an FSM (Finite State Machine). We want to protect this service so that it processes a request to "transition from state A to B" only once. This isn't trivial: it is very easy to protect against the same caller repeating a request (e.g. because the caller has implemented a retry system), but it is harder when client X requests a transition from A to B and then a request arrives from client Y asking for the same thing, transition from A to B, because you somehow need to recognize the second one as a duplicate.

We could start by defining our idempotency-key as simply <target_entity_id> + <new_state>, and that would work for simple scenarios, but what happens if our state graph allows multiple ways of reaching state B? Here comes my suggestion: notice that I didn't say transition to B but transition from A to B. If our idempotency-key is, for example, <target_entity_id> + <new_state> + <current_state>, we can easily differentiate transitions A -> B and D -> B.
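A small sketch of such a key in Python. The TransitionRequest class, its field names and the ":" separator are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionRequest:
    client_id: str
    entity_id: str
    current_state: str
    new_state: str


def transition_key(request: TransitionRequest) -> str:
    # client_id is deliberately left out, so the same transition requested
    # by two different clients produces the same key.
    return f"{request.entity_id}:{request.new_state}:{request.current_state}"


from_x = TransitionRequest("client-x", "order-42", "A", "B")
from_y = TransitionRequest("client-y", "order-42", "A", "B")
other_path = TransitionRequest("client-x", "order-42", "D", "B")
```

Here from_x and from_y map to the same key, while other_path (D -> B) gets a different one even though it targets the same entity and the same new state.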

And now, what do we do with this idempotency-key? We simply use it to keep track of recent calls:

  • if the key is not present in our idempotency storage (Redis or in-memory are common, but any storage you can imagine is fine), we perform the action and store its output in our idempotency storage (pretty much like a cache)
  • if the key is present, we return the cached result and do not execute anything [1]
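The two branches above can be sketched in a few lines of Python. This is only an illustration: a plain dict stands in for the idempotency storage, and run_once and transition_a_to_b are made-up names.

```python
from typing import Any, Callable, Dict


def run_once(store: Dict[str, Any], key: str, action: Callable[[], Any]) -> Any:
    """Run action only if key is unknown; otherwise return the stored output."""
    if key in store:
        return store[key]  # key present: return cached output, execute nothing
    result = action()      # key not present: perform the action...
    store[key] = result    # ...and keep its output for later repetitions
    return result


calls = []


def transition_a_to_b() -> dict:
    calls.append("executed")
    return {"http_status": 200, "response": {"state": "B"}}


store: Dict[str, Any] = {}
first = run_once(store, "order-42:B:A", transition_a_to_b)
second = run_once(store, "order-42:B:A", transition_a_to_b)
```

Both calls return the same output, but transition_a_to_b runs only once. The sketch assumes a single caller at a time; it does nothing to coordinate concurrent requests.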

We shouldn't keep cached responses forever, right? This is why Redis or a similar in-memory cache is such a good fit: you just set a decent TTL on the idempotency items you store and forget about them. The value depends on the operation: for sensitive operations like a purchase I've set it to one hour in the past, but it could be extended for long-running processes or batch jobs, or kept very short (e.g. a few seconds for a delete operation).
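To illustrate the expiry behaviour without needing a Redis server, here is a tiny in-memory stand-in. TTLStore is a hypothetical class written for this post, not a real library; with Redis you would instead rely on its built-in key expiry (for example the EX option of the SET command).

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLStore:
    """In-memory key/value store whose items expire after a per-item TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[Any]:
        item = self._items.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._items[key]  # lazily drop expired items on read
            return None
        return value


now = [0.0]  # fake clock, advanced by hand
ttl_store = TTLStore(clock=lambda: now[0])
ttl_store.set("purchase:order-42", {"http_status": 200}, ttl_seconds=3600)
ttl_store.set("delete:order-7", {"http_status": 204}, ttl_seconds=5)
```

With the fake clock, advancing now[0] past 5 makes the delete entry disappear while the purchase entry survives until the one-hour mark.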

There's one remaining subtlety: what do we do if our system is designed in such a way that there can be concurrent requests to the same service (trivial scenario: you have multiple instances of it)? What if we have a slow endpoint and we get a second request to transition from A to B while the first one is still executing?

Here idempotency alone falls a bit short, because it only works flawlessly in one of the following cases:

  • you have a single service instance (or single point of entry for processing actions/requests)
  • you have an action queue, buffer, etc., so again, actions are processed sequentially
  • you only care about repeated requests from the same caller (like Stripe's idempotency key implementation as a unique hash id)

If we want to support concurrent execution of idempotent requests, we probably need some request management mechanism to detect requests that are executing (but not yet complete) and apply some sort of strategy to them:

  • wait until original finishes?
  • re-enqueue the request at the tail?
  • return an HTTP 307 redirect or a 409 conflict?

We can incorporate this detection into the idempotency middleware/component: instead of storing just the response status and data, we also store whether the request is finished or ongoing (personal advice: if it is ongoing, set a small TTL). Alternatively, we can keep a separate request log that only tracks which requests have finished and which are ongoing. We could even implement most of the idempotency management at the Nginx level with some Lua scripting, although here I advise caution: caching non-GET requests is a dangerous path, and you must be very careful when deciding which HTTP headers to take into account as part of the idempotency "key".

Something along the lines of:

# not present
return None

# ongoing request
return {
    "status": "ongoing"
}

# finished request (non-error)
return {
    "status": "finished",
    "http_status": 200,
    "response": { ... }
}

# finished request (error)
return {
    "status": "finished",
    "http_status": 400,
    "response": None
}


Our previous scenario is one that, with some convention and agreement over the data sent, or via flexible configuration options, can be built for example as a Django REST Framework decorator.

A post from NParticularBus includes a handy list of some alternate approaches:

  • Message de-duplication: Explained above for the case where it is done inside services. In message-based communication it literally means detecting repeated requests within a short time span and removing the duplicates.
  • Natural idempotency: Coding your logic to be as idempotent as possible. This is always desirable and can be done for individual entities with some effort, but with complex services it is really hard to achieve in the upper layers.
  • Entities and messages with version information: If the data carries some kind of version number, we can say "I want to update entity id XXX, which is at version 3"; if the backend then detects that the stored version is no longer 3 (because something else updated it), it can fail the change request. The drawback is extra communication, as the emitter of the change request needs to query the current data and try to apply the modifications again.
  • Side effect checks: A somewhat naive approach for complex systems. Detecting whether the side effect is already present (if our service is already in state B, don't execute the transition from A to B) is something you ought to be doing already.
  • Partner state machines: Having a single source that can issue change requests lets us control execution (and narrow it to exactly-once), but it also creates a single point of failure, and in existing complex systems it might not be easy to achieve.
  • Accept uncertainty: Embracing chaos is always an option, but one that usually doesn't end well 😆.
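Of the approaches in this list, the version-based one is probably the easiest to show in code. Below is a minimal in-memory sketch; Entity, StaleVersionError and update_state are hypothetical names, and a real backend would perform the check and the write atomically in its datastore.

```python
from dataclasses import dataclass


class StaleVersionError(Exception):
    """The caller's expected version no longer matches the stored one."""


@dataclass
class Entity:
    entity_id: str
    state: str
    version: int


def update_state(entity: Entity, expected_version: int, new_state: str) -> Entity:
    if entity.version != expected_version:
        raise StaleVersionError(
            f"{entity.entity_id} is at version {entity.version}, "
            f"not {expected_version}"
        )
    entity.state = new_state
    entity.version += 1  # every successful change bumps the version
    return entity


order = Entity("order-42", "A", version=3)
update_state(order, expected_version=3, new_state="B")  # succeeds, now version 4

try:
    # A duplicate (or a competing update) still claims version 3 and is rejected.
    update_state(order, expected_version=3, new_state="B")
    duplicate_rejected = False
except StaleVersionError:
    duplicate_rejected = True
```

The rejected caller then has to fetch the current data and decide whether its change still applies, which is the extra communication mentioned above.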

Wrapping up

Pure, raw idempotency is hard to achieve in a complex real-world system. If you can, of course go for the simplest approach: wrap your data change request in a transaction and roll back if idempotency must be honoured; you will leverage existing tools and have no unexpected side effects. As for me, the times I've implemented idempotency (at either API level or service endpoint level), a good idempotency key plus TTL-limited caching of the results has proven useful.


[1] Not re-executing logic is an important remark. In a pure mathematical equivalence, where for example multiplying by 1 is idempotent because N*1 = N no matter how many times you do it, we could execute the logic every time and "just make sure the system stays the same". Now imagine how quickly this can spiral into very complex logic. A simple example: ORM models usually keep modified_at timestamps, so if you want to be pure/strict, the timestamp shouldn't update more than once when an idempotent change request runs twice; thus, you will probably need to use transactions everywhere, which can be a huge performance penalty. And this is how we arrive at the alternative: "doing nothing". If we already know the output, the best way to not alter the system is to not touch it and just return the cached output. We respect idempotency theory while ensuring the system and its data are kept exactly the same.

Tags: Development
