Transaction isolation guarantees

Natalia Pryntsova
Aug 23, 2020

You have probably seen the “transaction isolation levels” matrix multiple times in the documentation for different databases. Here is the one for PostgreSQL v.12:

Isolation Level  | Dirty Read             | Nonrepeatable Read | Phantom Read           | Serialization Anomaly
Read uncommitted | Allowed, but not in PG | Possible           | Possible               | Possible
Read committed   | Not possible           | Possible           | Possible               | Possible
Repeatable read  | Not possible           | Not possible       | Allowed, but not in PG | Possible
Serializable     | Not possible           | Not possible       | Not possible           | Not possible

Ever wondered why there are so many isolation levels? And what does “serializable” actually mean, beyond the fact that it prevents “serialization anomaly”?

Let’s take a quick look at how isolation is implemented and talk about the serializability and linearizability guarantees.

Each level has its price

The SQL standard defines four transaction isolation levels to fight off all the bad things that can happen in highly loaded, highly concurrent databases: write skews, phantoms, lost updates and other anomalies, the kind of things that cause midnight calls and long investigations into missing or incorrect data.

Increasing the isolation level prevents certain issues from happening, but it does not come for free.

For example, Read Committed is the default level in PostgreSQL. In general, implementing Read Committed only requires storing two versions of a row: one committed, and one updated by some transaction but not yet committed.
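To make this concrete, here is a minimal sketch of what Read Committed looks like from two concurrent psql sessions, assuming a hypothetical accounts table:

    -- Session 1
    BEGIN;  -- Read Committed is the default level
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;

    -- Session 2, while Session 1 is still open:
    SELECT balance FROM accounts WHERE id = 1;  -- sees the committed version

    -- Session 1
    COMMIT;

    -- Session 2
    SELECT balance FROM accounts WHERE id = 1;  -- now sees the updated version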

To level up to Repeatable Read, this simple-ish model is not enough anymore: we need to store multiple versions of each row, marking each version with the transaction that committed it. This approach is called MVCC (multi-version concurrency control).

[Figure: MVCC]

The cool thing is that in PostgreSQL these implementation details are easily visible via the xmin and xmax columns:

[Figure: xmin/xmax hidden columns]

Here xmin returns the id of the transaction that inserted the row, and xmax returns the transaction that updated or deleted it. If you are interested in a deeper dig into MVCC internals, the pageinspect extension comes in handy.
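A quick way to see this for yourself, with a throwaway table:

    CREATE TABLE t (id int, val text);
    INSERT INTO t VALUES (1, 'a');
    SELECT xmin, xmax, * FROM t;
    -- xmin holds the inserting transaction's id; xmax = 0 means this
    -- row version has not been updated or deleted

    UPDATE t SET val = 'b' WHERE id = 1;
    SELECT xmin, xmax, * FROM t;
    -- the visible row version now carries the updating transaction's id
    -- in xmin; the old version (with xmax set) is kept around for any
    -- older snapshots that may still need it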

As you can see from this brief example, Repeatable Read has an extra cost compared to Read Committed: the performance overhead of storing different row versions, cleaning up “deleted” rows and providing ever-increasing transaction ids.
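That overhead is observable: dead row versions pile up until (auto)vacuum reclaims them, and the statistics views let you watch it happen. A small sketch:

    -- How many dead row versions each of your tables is carrying:
    SELECT relname, n_live_tup, n_dead_tup
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC;

    VACUUM t;  -- reclaims the space held by dead row versions of t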

Practical considerations

Despite being defined in the SQL standard, isolation levels may behave differently between databases and even between database versions. For example, the Repeatable Read level is implemented via MVCC in PostgreSQL, but in MSSQL it is implemented via shared locks.
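The keywords are portable even when the machinery underneath is not; a sketch of requesting the same-named level in each:

    -- PostgreSQL (MVCC snapshot):
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    -- ... queries ...
    COMMIT;

    -- MSSQL / T-SQL (shared locks):
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    BEGIN TRANSACTION;
    -- ... queries ...
    COMMIT;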

Do you need to care about these details? It is useful to know them in any case, but it makes a real difference if your database is highly loaded and has concurrent reads and writes. Ask yourself these questions:

  • What are the chances of any of these data corruptions actually happening? Are there spikes in writes which can create race conditions?
  • How bad would it be if some data were corrupted? Is my data transient? Can it be derived from raw data?

Serializability

Due to an unfortunate clash of terminology, “serializable” means different things in different areas of software engineering (like the ISerializable interface or a serializable object).

Serializability comes from “serial execution”: it guarantees that the outcome of a set of transactions is equivalent to some serial (non-overlapping in time) execution of those transactions.

This sounds a bit theoretical but it is a very convenient property!
Think about it from a testing perspective.

If our database guarantees serializability and we have tested that our transaction completes successfully on its own, we are guaranteed it will also complete successfully (or throw a serialization error) in concurrent execution. Our system still needs to handle retries in case of serialization errors, but otherwise it is enough to make sure the transaction works on its own.
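In PostgreSQL a serialization failure surfaces as SQLSTATE 40001 (“could not serialize access...”), and the retry is simply a re-run of the whole transaction. A sketch, reusing the hypothetical accounts table:

    BEGIN ISOLATION LEVEL SERIALIZABLE;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    COMMIT;
    -- If any statement (including COMMIT) fails with SQLSTATE 40001,
    -- the client should ROLLBACK and re-run the block from the top,
    -- typically with a bounded number of attempts.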

Let’s look at the classical example of how a hospital is left with no doctors on call:

[Figure: on-call doctors example]

As you can see, if executed serially these two transactions would leave either Alice or Bob on call, so the logic is solid. But executed concurrently under Repeatable Read isolation, T1 and T2 will leave no one on call, because each transaction reads from its own snapshot where the on_call count is 2.
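A sketch of the two sessions, assuming a hypothetical doctors(name, on_call) table with Alice and Bob both on call:

    -- T1
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT count(*) FROM doctors WHERE on_call;              -- returns 2
    UPDATE doctors SET on_call = false WHERE name = 'Alice';

    -- T2, concurrently
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT count(*) FROM doctors WHERE on_call;              -- also returns 2
    UPDATE doctors SET on_call = false WHERE name = 'Bob';

    -- T1
    COMMIT;
    -- T2
    COMMIT;  -- also succeeds: write skew, nobody is left on call

    -- Under BEGIN ISOLATION LEVEL SERIALIZABLE one of the two
    -- transactions would instead fail with a serialization error.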

A bulletproof approach would be to track dependencies between all concurrent transactions and build a dependency graph where nodes are transactions and edges are different types of read-write dependencies, then check this graph for cycles.

[Figure: on-call example dependency graph]

In practice it proves too costly to maintain a large dependency graph, so PostgreSQL uses a technique called SSI (Serializable Snapshot Isolation) to provide the serializable isolation level.

It still creates a dependency graph but only tracks one type of transaction dependency, called “rw-conflicts”. Check this out for a deeper look.

Instead of searching for cycles, it tracks the addition of new edges and aborts a transaction with a “serialization error” if it sees an edge which may cause a cycle in the future. This can give false positives, but it has better performance and is considered a good compromise, especially compared with the two-phase locking technique.

To track which transaction accessed which objects and effectively build the dependency graph, PostgreSQL uses a special SIREAD type of lock. Naturally, problems occur when there is nothing to lock on, like when inserting into a table with a unique index:

[Figure: unique index insert]

T1 should not fail if executed serially (before or after T2), but it does fail in Serializable isolation, and with a unique constraint error rather than a serialization error. This is because T2 does not select anything, so the database has nothing to put a SIREAD lock on.
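A sketch of this corner case, assuming a hypothetical invites(email) table with a unique index, where the application’s rule is “insert only if the SELECT finds nothing”:

    -- T1
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT 1 FROM invites WHERE email = 'bob@example.com';  -- no rows

    -- T2, concurrently; note it inserts without selecting first
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO invites VALUES ('bob@example.com');
    COMMIT;

    -- T1 continues
    INSERT INTO invites VALUES ('bob@example.com');
    -- ERROR: duplicate key value violates unique constraint
    -- (a unique constraint error, not a serialization error; run
    -- serially after T2, T1's SELECT would have found the row and
    -- the application would have skipped the insert)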

It is important to understand that serializable isolation has nothing to do with the actual time of the transactions: as long as there is some serial order of execution, the guarantee stands. This equivalent serial order can be different from the order in which the transactions were actually run.

Linearizability and strict serializability

The property which adds a time element to transaction ordering is called linearizability, and combined with serializability it gives another guarantee called strict serializability.

Linearizability is a recency (wall-clock) guarantee on reads and writes to a single object: the moment a write has happened, all subsequent reads should return the value of that last write.

As you can guess, a snapshot isolation implementation cannot be made linearizable, because the whole idea of it is that each transaction reads from its own version of the data: when a write has been committed in one transaction, another concurrent transaction will not see it.
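This is easy to observe in PostgreSQL, reusing the throwaway table t from earlier:

    -- T1
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT val FROM t WHERE id = 1;       -- returns 'b'

    -- T2, concurrently
    UPDATE t SET val = 'c' WHERE id = 1;  -- autocommits immediately

    -- T1
    SELECT val FROM t WHERE id = 1;       -- still 'b': a stale,
    COMMIT;                               -- non-linearizable read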

Linearizability trade-offs involve both performance and availability. Here is a well-known example of two data centers suffering from a network partition between them:

[Figure: two data centers with a network partition]

Client C3, connected to a potentially stale database copy, can either make a non-linearizable read from the follower DB or mark itself unavailable because it cannot make a linearizable read.

It is not a trivial task to create a service which provides both a linearizability guarantee and tolerance to network faults and timeouts.

The linearizability property takes us away from traditional databases and into the land of coordinators like ZooKeeper, and to slightly different use cases like data replication or leader election. For the linearizability implementation in ZooKeeper, check this out.

Summary

The Serializable transaction isolation level comes from the idea of serial execution and has no dependency on the actual time the transactions were executed.

There is a stronger guarantee called linearizability, which is not provided by the majority of databases because it is non-trivial to implement and has a performance cost.
