Aerospike 101

Archit Joshi
8 min read · Mar 22, 2021


What is Aerospike?

Aerospike is a distributed NoSQL database that supports key-value and document-oriented data models, providing robustness and strong consistency with no downtime. Aerospike is built on a “Shared Nothing” architecture.

  1. Scalability:- Flash and hybrid-memory architectures allow the Aerospike database to scale to petabytes of data.
  2. Speed:- Low latency is maintained at high scale, which enables better decisions in real time.
  3. Ease of Deployment and Management.
  4. Low Total Cost of Ownership:- Fueled by a hybrid memory architecture and compression, Aerospike provides significantly lower (~20%) TCO than first-generation NoSQL and relational databases.

What Is “Shared Nothing” Architecture?

  1. Enables Non-disruptive Upgrades: Instead of incurring downtime while you upgrade infrastructure with shared resources, you can upgrade one node at a time.
  2. Eliminates Single Point of Failure: With shared systems, a single point of failure can take down your site or app entirely.
  3. Avoids Unexpected Downtime: Because nodes are independent, the cluster can self-heal to some degree, which is another line of defense against unexpected downtime.

Keywords

  1. Namespaces: Namespaces are the top-level containers. A namespace contains one or more sets, records, and bins. Compared to an RDBMS, a namespace is similar to a database schema.
  2. Sets: A set is similar to a collection in MongoDB or a table in an RDBMS. It contains many records and bins.
  3. Records: A record is similar to a row in an RDBMS. Each record has one primary key and one or more bins, and a single set/collection may contain many records.
  4. Bins: A bin is similar to a column in an RDBMS, but more flexible and dynamic: a record can have many bins, an index can be added to any bin, and a single bin can store any supported data type (int, string, bytes, etc.). A sketch of this hierarchy follows the list.
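
Below is a minimal sketch of that hierarchy using the Aerospike Python client. The namespace (“test”), set (“users”), bin names, and the server address 127.0.0.1:3000 are placeholder assumptions for illustration, not anything prescribed by Aerospike.

# Hypothetical sketch: one record in namespace "test", set "users", with three bins.
import aerospike

# Connect to an (assumed) local Aerospike node.
client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

# (namespace, set, user key) identifies a record.
key = ("test", "users", "user-42")

# Bins are a dict of name -> value; a bin can hold ints, strings, bytes, lists, maps, ...
client.put(key, {"name": "Archit", "city": "Pune", "visits": 7})

# get() returns (key, metadata, bins).
_, meta, bins = client.get(key)
print(bins)  # {'name': 'Archit', 'city': 'Pune', 'visits': 7}

client.close()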

Technology Behind Aerospike Database

Real-Time Transaction Engine:

  1. Aerospike’s real-time engine delivers the maximum performance possible and can scale to millions of transactions per second at sub-millisecond latency.
  2. It is responsible for reading and writing data upon request while providing consistency and isolation (which involves synchronous and asynchronous replication).
  3. It reroutes requests to an alternate node if a node becomes unavailable, and performs conflict/duplicate resolution after a node rejoins the cluster.
  4. Multi-core system — improves latency by reducing data movement across memory regions.
  5. Context switching is minimized.
  6. Data Structure Design — Safe and concurrent read, write and delete access to index tree without holding multiple locks.
  7. Scheduling and Prioritization — In addition to key-value store operations, Aerospike supports batch queries, scans, and secondary index queries.
  8. Efficient memory allocation.

Data Distribution:

Data partitioning with a uniform distribution of keys in the digest space avoids the creation of hotspots during data access, which helps in achieving high scale and fault tolerance. As a result:

  1. Application workload is uniformly distributed.
  2. The performance of database operations is predictable.

Cross-data center Replication

Cross-datacenter replication (XDR) supports different replication topologies, including active-active, active-passive, chain, star, and multi-hop configurations. Its key mechanisms include:

  1. Load sharing
  2. Remote cluster management
  3. Data shipping
  4. Pipe-lining

Storage Engine

  1. It is not just the throughput and latency characteristics, but also the ability to store and process large swaths of data, that defines the ability of a DBMS to scale up.
  2. Aerospike has been designed from the ground up to leverage SSD technology. This allows Aerospike to manage dozens of terabytes of data on a single machine with sub-millisecond record access times. Aerospike supports three kinds of storage structures: Hybrid-Memory, All-Flash, and In-Memory.

Dynamic Data-rebalancer

The data re-balancing mechanism ensures that the transaction volume is distributed evenly across all nodes and is robust in the event of node failure happening during re-balancing itself. The system is designed to be continuously available, so data re-balancing doesn’t impact cluster behavior.

Smart Cluster Management (Self Healing Cluster Management)

Nodes can be added to and removed from the cluster seamlessly. This is handled by three subsystems:

  1. Heartbeat subsystem.
  2. Clustering subsystem.
  3. Exchange subsystem.

Aerospike Client

Aerospike provides client libraries and we use these libraries to connect to the cluster and perform operations.

The Aerospike client is a first-class observer of the cluster: it polls the server nodes and checks their state every second. It knows when a node is added or removed, and it holds the partition map of each namespace, which gives it an overall picture of the cluster.
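
A small sketch of that cluster view using the Python client; is_connected() and get_nodes() are calls available in recent client versions, and the host address is a placeholder.

# Hypothetical sketch: inspecting the client's view of the cluster.
import aerospike

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

print(client.is_connected())  # True once the client has a view of the cluster
print(client.get_nodes())     # e.g. [('127.0.0.1', 3000), ...] — all nodes the client discovered

client.close()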

How does Aerospike distribute data randomly?

  1. SetName + KeyType + UserKey is hashed into a 20-byte digest using the RIPEMD-160 hash function.
  2. 12 bits of this digest are used as the partition ID.
  3. There are 4096 partitions per namespace, and each namespace has its own partition map.
  4. This digest, along with some additional data, is stored as the primary index in RAM.
  5. The master and replica for a partition are decided when the cluster is formed. (A small sketch of steps 1–2 follows.)
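
A rough sketch of steps 1–2 in Python, assuming only what the list states (a RIPEMD-160 digest and 12 bits for the partition ID). The exact bytes Aerospike reads from the digest and the exact key material it hashes may differ, and RIPEMD-160 support in hashlib depends on the local OpenSSL build.

# Hypothetical sketch: map a (set name, user key) pair to one of 4096 partitions.
import hashlib

N_PARTITIONS = 4096  # fixed number of partitions per namespace

def partition_id(set_name: str, user_key: str) -> int:
    # 20-byte RIPEMD-160 digest of the key material (illustrative input layout).
    digest = hashlib.new("ripemd160", (set_name + user_key).encode()).digest()
    # Use 12 bits of the digest as the partition ID (0..4095).
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS

print(partition_id("users", "user-42"))  # some value in [0, 4096)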

What happens when a node fails?

In an example four-node cluster, if node #3 has a hardware failure, nodes #1, #2, and #4 automatically detect the failure. Node #3 is the master for 1/4th of the data, but those partitions also exist as replicas on nodes #1, #2, and #4. These nodes automatically perform data migration to copy the replica partitions and create data masters. For example, partition 23 is replicated on node #4 and copied to node #2, which becomes the new master for partition 23. At the same time, your application (which includes the Aerospike Smart Client) becomes aware of the node #3 failure and automatically calculates the new partition map. This process occurs in reverse when a node is added to the cluster.

Key Decisions

  1. Persistence modes supported are in-memory and hybrid (memory + persistence). Aerospike recommends SSDs for persistent storage of data.
  2. Each namespace (schema) is configured separately and can use a different persistence type.
  3. Replication factor: the number of copies of the data to keep. For example, a replication factor of 2 means two copies of the data: a master and a replica. The minimum is 1 and the maximum is the number of nodes in the cluster.

Namespace configurations: Default

namespace <namespace-name> {
    memory-size 4G              # 4 GB of memory to be used for index and data
    replication-factor 2        # For multiple nodes, keep 2 copies of the data
    high-water-memory-pct 60    # Evict non-zero-TTL data if capacity exceeds 60% of 4 GB
    stop-writes-pct 90          # Stop writes if capacity exceeds 90% of 4 GB
    default-ttl 0               # Writes from clients that do not provide a TTL default to 0 (never expire)
    storage-engine memory       # Store data in memory only
}

Namespace configurations: Persistent Memory Storage Engine

namespace <namespace-name> {
    memory-size <SIZE>G                     # Maximum memory allocation for secondary indexes (if any).
    storage-engine device {                 # Configure the storage engine to use persistence.
        file /opt/aerospike/<filename>      # Location of data file on server.
        filesize <SIZE>G                    # Max size of each file in GiB. Maximum size is 2 TiB.
        data-in-memory true                 # Indicates that all data should also be in memory.
    }
}

Strongly Consistent (SC) and Available (AP) Modes:

AP: The AP mode has the “eventual consistency” guarantee that a typical NoSQL database provides.

SC: To ensure that a reader gets only the latest committed value (no stale or dirty/uncommitted reads) and that no committed data is lost (durable commits), Aerospike offers the SC mode, which guarantees strong consistency. In this mode, a “linearizable” read is always consistent across replicas. Aerospike’s strong consistency support has been independently confirmed through Jepsen testing.

The consistency mode is set per namespace on the server, and the read consistency level can be configured through the client policy.
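
A minimal sketch with the Python client, assuming a namespace “test” already running in SC mode on a local server; the policy key read_mode_sc and the constant aerospike.POLICY_READ_MODE_SC_LINEARIZE exist in recent client versions but may differ in older ones.

# Hypothetical sketch: request linearizable reads from an SC namespace.
import aerospike

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

key = ("test", "users", "user-42")
read_policy = {"read_mode_sc": aerospike.POLICY_READ_MODE_SC_LINEARIZE}

# With linearizable reads, the value returned reflects the latest committed write.
_, meta, bins = client.get(key, policy=read_policy)
print(meta, bins)

client.close()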

When to Use Aerospike vs. Redis:

Need for scalability and elasticity

To scale Redis, companies often add more nodes and DRAM because it’s a single-threaded system designed for in-memory processing. But DRAM is expensive, and managing increasingly large clusters isn’t easy.

Redis configuration requirements inhibit elasticity as well. Companies can only scale out a cluster by a multiple of the current number of shards, and they can’t remove shards from a cluster once they are created. So scaling up before peak periods or down afterward can be painful and expensive.

If we are building mission-critical applications where data consistency is a must, then Redis is likely not the right choice. Redis has not passed the Jepsen test for strong consistency (whereas Aerospike has). Redis supports eventual consistency, which can result in stale reads and even data loss under certain circumstances. The Jepsen framework works by firing client operations at a cluster while concurrently injecting chaos such as network partitions, killed processes, slowed disks, etc.

Some challenges in Redis:

  1. One master, multiple slaves — i.e. the ‘write’ throughput is limited by the single machine on which the master is running.
  2. Redis is single-threaded, which means there is no vertical scalability in terms of CPU.
  3. Real-time master-slave synchronization issues — with a huge amount of writes on the master, all the changes have to be synchronized with the slaves. This can lead to slaves being taken offline for synchronization, because they cannot sync huge chunks of data and serve incoming requests from the RTB application at the same time.
  4. There is no handy way of storing multiple different types of data in the same database — we had to store different entities in different Redis instances and deal with multiple connections on different ports.

How Aerospike helps:

  1. Partitioning — each namespace has 4096 partitions, which are spread across the nodes in the cluster. This helps with ‘write’ throughput.
  2. Aerospike is multithreaded — it makes the most effective use of resources.
  3. No downtime for master-replica synchronization — you can configure the ‘write’ policy so that a write request is only considered ‘finished’ after the replica write is confirmed (see the sketch after this list).
  4. Namespaces — all different types of data can be stored in the same cluster under different namespaces, leading to the following hierarchy: namespace > set > record.
  5. SSD or in-memory storage — Aerospike can store data on SSD or in memory; Redis is in-memory only, which means it becomes very costly at scale, whereas Aerospike can offer competitive performance with the use of SSDs.
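
Here is a minimal sketch of point 3 using the Python client: the commit_level write-policy option asks the server to acknowledge a write only after the replica(s) also have it. The host, namespace, set, and bin values are placeholders.

# Hypothetical sketch: require replica acknowledgement before a write is reported as finished.
import aerospike

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

# COMMIT_LEVEL_ALL = wait for master and all replicas; COMMIT_LEVEL_MASTER = master only.
write_policy = {"commit_level": aerospike.POLICY_COMMIT_LEVEL_ALL}

key = ("test", "users", "user-42")
client.put(key, {"name": "Archit", "visits": 1}, policy=write_policy)

client.close()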

Thanks for reading the blog. I hope it was somewhat helpful. 😊

References: https://www.youtube.com/watch?v=PA7PGWphW8M&ab_channel=Aerospike

https://medium.com/@me.nayan/aerospike-overview-and-setup-abc1aa110f87
