Distributed Cache System

Salem Alqahtani
May 25, 2022


To start, let me remind you what a cache is. A cache is a hardware or software component that stores data so that future requests for the same data can be served faster than by fetching it from the original, slower source. For instance, a simple cache architecture for a web application is illustrated in the figure below.

A single cache on the web application server
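
To make the idea concrete, here is a minimal sketch of the cache-aside pattern behind such a setup. The load_from_database() helper is a hypothetical stand-in for the slow backing store, not part of any real library.

```python
# Minimal cache-aside sketch: check the cache first, fall back to the slow store.
cache = {}

def load_from_database(key):
    # Hypothetical placeholder for an expensive lookup (database query, remote call, ...).
    return f"value-for-{key}"

def get(key):
    if key in cache:                     # cache hit: served from memory
        return cache[key]
    value = load_from_database(key)      # cache miss: fetch from the slow store
    cache[key] = value                   # populate the cache for next time
    return value

print(get("user:42"))  # miss, loads from the "database"
print(get("user:42"))  # hit, served from the cache
```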

If the data is too large to be served by a single cache on a single machine, we need to spread the cache across multiple machines so it can serve clients' requests in a scalable manner. This is called a distributed cache system.

Nowadays, a distributed cache is crucial for web applications. A distributed cache spreads its data across several nodes in a cluster, possibly around the globe. See the figure below for an illustration.

Generally, a distributed cache is based on a distributed hash table (DHT), which is similar to a hash table but spread across multiple nodes. A DHT allows a distributed cache to scale on the fly by managing the addition and removal of nodes. To understand how a distributed cache works, I would like to explain what a hash table and a DHT are. You can read more about hash tables elsewhere.

A distributed hash table (DHT) is a decentralized storage system that provides lookup and storage for key-value pairs, much like an ordinary hash table. Each node in a DHT is responsible for a subset of the keys along with their mapped values, and any node can efficiently retrieve the value associated with a given key. Just as in a hash table, the values mapped to keys in a DHT can be any arbitrary data, and the interface exposes two operations: put(key, value) and get(key).
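
The sketch below illustrates that put/get interface with a deliberately naive placement rule (hash the key, take it modulo the number of nodes). The node names and the in-memory dicts standing in for remote cache servers are assumptions for illustration only.

```python
import hashlib

class NaiveDHT:
    """Toy DHT: each key is owned by exactly one node, chosen by hashing the key."""

    def __init__(self, node_names):
        # Each "node" here is just a local dict standing in for a remote cache server.
        self.nodes = {name: {} for name in node_names}
        self.node_names = list(node_names)

    def _owner(self, key):
        # Hash the key and map it onto one of the nodes (naive modulo placement).
        digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
        return self.node_names[digest % len(self.node_names)]

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)

dht = NaiveDHT(["node-a", "node-b", "node-c"])
dht.put("session:123", {"user": "salem"})
print(dht.get("session:123"))  # -> {'user': 'salem'}
```

The weakness of modulo placement is that adding or removing a node changes the owner of almost every key, which is exactly the problem consistent hashing addresses.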

The advantages of a DHT are that nodes form a peer-to-peer system without any central authority, the system remains reliable while nodes join, leave, and fail, and the system scales efficiently by adding nodes. The most notable design for a DHT is consistent hashing.
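
Here is a minimal consistent-hashing sketch, under some illustrative assumptions (node names, MD5 as the hash function, 100 virtual replicas per node): nodes are placed at many points on a ring, and each key is stored on the first node clockwise from its hash, so removing a node only remaps the keys that node owned.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node):
        # Virtual nodes spread each physical node around the ring more evenly.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def node_for(self, key):
        # Walk clockwise from the key's hash to the next node point on the ring.
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.node_for("user:42"))   # the node that currently stores this key
ring.remove_node("cache-2")       # only keys owned by cache-2 move elsewhere
print(ring.node_for("user:42"))
```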
