Listening to a podcast the other day I learned some interesting things about MongoDB (which I haven't used). I learned about causal consistency, which I had never heard of before. It was curious to also heard that, when asked about how to handle replication issues (the dreaded (replication lag), the podcast guest (a product manager that works at MongoDB) suggested either using master pinning or, if they were using the latest MongoDB versions, then it automatically used session pinning inside to.
That a quite modern and non-relational database had the same suggested approaches to handling replication and consistency issues was peculiar. Then, on the same talk they mentioned that Mongo uses an
Operations Log for replication, which sounded really really familiar... so I did the following tiny research exercise of summarizing a few opensource storage systems and how they replicate their data:
Keeps and emits a stream of commands to the replication nodes.
Has a commit log + in-memory
memTables per node. Not having used the system I'm not totally sure, but it seems that when replicating data using the more advanced strategy (
NetworkTopologyStrategy), the designed coordinator sends a write request, and when each node has it on the commit log & memTables, it replies back with an ACK.
Note that Cassandra is a distributed system, so each data item lives at N nodes only, where N is the replication factor (number of copies of each data item).
I've left out other systems that are not meant for long term storage out, but even some of those have a similar design. For example, Kafka keeps a local command log (with the writes) to send to the partition replicas in order to achieve topic replication.
Different database systems, almost exactly the same solution: Keep a log of commands, propagate that log to replicas.