Easier development & No data loss
Aerospike is optimized to work with the latest storage and database technology to extract as much transaction throughput as possible while still guaranteeing strong consistency (ACID). This means easier development and no lost data.
For read/write operations on a single record, Aerospike makes strict guarantees about the atomicity of those operations.
In addition to single-record operations, Aerospike supports distributed multi-key transactions through a simple, fast iteration interface with which a client can request all or part of the data in a particular set. This mechanism is currently used for database backup and basic analytics on the data, and delivers extremely high throughput. Single-key transactions are serialized with respect to a multi-key transaction, but two multi-key operations may not be serialized with each other.
For operations on single keys, Aerospike provides immediate consistency using synchronous replication.
Multi-key transactions are implemented as a sequence of single key operations and do not hold record locks except for the time required to read a clean copy. Thus the multi-key transaction provides a consistent snapshot of the data in the database (i.e., no “dirty reads” are done).
Aerospike’s support for relaxed consistency models gives operators the ability to maintain high performance during the cluster recovery phase after a node failure. Read and write transactions to records that have unmerged duplicates in the cluster can be sped up by bypassing the duplicate merge phase.
In the presence of failures, the cluster can run in one of two modes – Partition Tolerant or High Consistency. The difference between the two modes is seen for only a brief period during cluster recovery after a node failure. Aerospike is highly consistent when the cluster does not split.
-Partition Tolerant Mode
In Partition Tolerant mode, when a cluster splits, each faction of the cluster continues to operate. One faction – or the other – may not have all of the data, so an application reading data may have successful transactions stating that data is not found in the cluster. Each faction will begin the process of obeying the replication factor rules (replicating data) and may accept writes from clients. Application servers which read from the other faction will not see the applied writes, and may write to the same primary keys. At a later point, if the factions rejoin, data which has been written in both factions will be detected as inconsistent. Two policies may be followed. Either Aerospike will ‘auto-merge’ the data by favoring the last write (the write with the latest server timestamp), or both copies of the data will be retained. If two copies – versions – of the data are available in the cluster, a read of this value will return both versions, allowing the application to resolve the inconsistency. The client application – the only entity with knowledge of how to resolve these differences – must then re-write the data in a consistent fashion.
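The two reconciliation policies can be sketched in a few lines of Python. This is only an illustration of the policy choice, not Aerospike's internal code; the record shapes and function name are invented for the example:

```python
def merge_versions(copy_a, copy_b, policy="auto-merge"):
    """Resolve two divergent copies of a record after a split heals.

    Each copy is a (server_timestamp, value) pair. Under 'auto-merge'
    the write with the later server timestamp wins; under 'keep-both'
    both versions are retained so the application can reconcile them.
    """
    if policy == "auto-merge":
        return [max(copy_a, copy_b, key=lambda c: c[0])]
    return [copy_a, copy_b]  # reader sees both versions and must resolve

# Two factions accepted conflicting writes for the same primary key:
faction_1 = (1000, {"balance": 40})
faction_2 = (1005, {"balance": 55})  # later server timestamp

assert merge_versions(faction_1, faction_2) == [faction_2]          # last write wins
assert len(merge_versions(faction_1, faction_2, "keep-both")) == 2  # app must resolve
```

Under the 'keep-both' policy, the re-write performed by the application replaces both retained versions with a single consistent value.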
-High Consistency Mode
In High Consistency mode, when the cluster splits, the minority quorum can be configured to halt. This prevents any client from receiving inconsistent data, but reduces availability.
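The quorum rule behind this mode can be illustrated with a one-line predicate (a deliberate simplification of the mode, not Aerospike internals):

```python
def faction_may_serve(faction_size, cluster_size):
    """In a High Consistency split, only a strict-majority faction keeps
    serving requests; a minority faction halts rather than risk returning
    inconsistent data. (Simplified illustration of the quorum rule.)"""
    return faction_size > cluster_size // 2

# A 5-node cluster splits 3/2: the 3-node faction serves, the 2-node halts.
assert faction_may_serve(3, 5) is True
assert faction_may_serve(2, 5) is False
```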
Aerospike implements distributed isolation techniques consisting of latches and short-term record locks to ensure isolation between multiple transactions. Therefore, when a read and a write operation for a record are pending simultaneously, they will be internally serialized before completion, though their precise ordering is not guaranteed.
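The effect of short-term record locks can be sketched with a small in-memory store; the class below is an invented illustration of per-record serialization, not Aerospike's implementation:

```python
import threading

class RecordStore:
    """Per-record short-term locking: concurrent reads and writes on the
    same key are internally serialized, though their relative order is
    not guaranteed."""
    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def write(self, key, value):
        with self._lock_for(key):  # held only for the single operation
            self._data[key] = value

    def read(self, key):
        with self._lock_for(key):
            return self._data.get(key)

store = RecordStore()
threads = [threading.Thread(target=store.write, args=("k", i)) for i in range(8)]
for t in threads: t.start()
for t in threads: t.join()
assert store.read("k") in range(8)  # some write won whole; none interleaved
```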
For enabling simple multi-record transactions, Aerospike supports an optimistic concurrency control scheme. This is based on atomic conditional operations (CAS – Check and Set), which makes the very common read-modify-write cycle safe — without the often-crippling overhead of explicit locking. In many simplistic data storage systems, reading a data element, modifying it, and then writing it back exposes a race condition that could lead to data corruption during highly concurrent access to the data item.
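The generation-check idea behind CAS can be sketched in a few lines. The store and method names here are invented for illustration and are not the Aerospike client API:

```python
class GenStore:
    """Optimistic concurrency via a per-record generation counter: a write
    succeeds only if the record's generation still matches the generation
    observed at read time (check-and-set)."""
    def __init__(self):
        self._data = {}  # key -> (generation, value)

    def read(self, key):
        return self._data.get(key, (0, None))

    def cas_write(self, key, expected_gen, value):
        gen, _ = self._data.get(key, (0, None))
        if gen != expected_gen:
            return False  # another writer got there first: re-read and retry
        self._data[key] = (gen + 1, value)
        return True

store = GenStore()
gen, _ = store.read("acct")
assert store.cas_write("acct", gen, 100)      # first writer wins
assert not store.cas_write("acct", gen, 200)  # stale generation rejected
```

A client that loses the race simply re-reads the record (obtaining the new generation) and reapplies its modification, which makes the read-modify-write cycle safe without holding a lock across the round trip.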
For multi-key operations, one of the cluster nodes anchors the iteration operation and requests data from all the other nodes in parallel. Snapshots are taken of the indexes at various points to allow minimal lock hold times. As data is retrieved in parallel from the working nodes, it is forwarded to the client, with client flow control exerting backpressure on the distributed transaction.
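The flow-control behavior can be modeled with a bounded queue: producers stand in for the working nodes streaming records in parallel, and the queue's fixed capacity plays the role of client backpressure. This is a sketch of the pattern, not Aerospike's wire protocol:

```python
import queue
import threading

def scan_namespace(node_data, client_buffer=4):
    """Stream records from several nodes in parallel into a bounded queue.
    Producers block when the buffer is full, so a slow consumer naturally
    throttles the distributed iteration."""
    out = queue.Queue(maxsize=client_buffer)
    done = object()  # per-node end-of-stream sentinel

    def worker(records):
        for r in records:
            out.put(r)  # blocks when the client falls behind
        out.put(done)

    for records in node_data:
        threading.Thread(target=worker, args=(records,)).start()

    finished, results = 0, []
    while finished < len(node_data):
        item = out.get()
        if item is done:
            finished += 1
        else:
            results.append(item)
    return results

nodes = [[("n1", i) for i in range(3)], [("n2", i) for i in range(3)]]
assert len(scan_namespace(nodes)) == 6  # all records arrive, order unspecified
```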
Any client-server system suffers from the client being potentially disconnected from the server at any time. This can result in the client being unable to distinguish whether a transaction in flight has been applied or not. Recovery from this situation may be quite complex. Aerospike supports mechanisms for retrying writes and using a client generated unique persistent transaction identifier that enables clients to properly determine if the transaction has completed properly.
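The idempotent-retry idea can be sketched as follows; the server class and its bookkeeping are invented for the example and do not reflect Aerospike's actual protocol state:

```python
import uuid

class Server:
    """Write retry with a client-generated transaction id: the server
    remembers ids it has applied, so a retried write after a dropped
    reply is recognized as already done rather than applied twice."""
    def __init__(self):
        self.data = {}
        self.applied = set()

    def write(self, txn_id, key, value):
        if txn_id in self.applied:
            return "already-applied"  # safe outcome for a blind retry
        self.data[key] = value
        self.applied.add(txn_id)
        return "applied"

server = Server()
txn = str(uuid.uuid4())  # unique, persistent per transaction attempt
assert server.write(txn, "k", 1) == "applied"
# The reply was lost in transit; the client cannot tell whether the write
# landed, so it retries with the same transaction id:
assert server.write(txn, "k", 1) == "already-applied"
```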
-Data is never lost. To keep your data always available, Aerospike provides multi-server replication of your data. The cluster is configured to contain multiple namespaces (like ‘databases’ within an RDBMS), and each namespace contains the storage-system configuration and storage policies for the data contained in that namespace.
The basic mechanisms for providing durability in Aerospike consist of replication to multiple nodes using both DRAM and persistent storage. Every transaction update is therefore written to multiple locations in the cluster before returning to the client. For example, in a persistent namespace that stores data in both DRAM and disk with replication factor 2, the record is stored in four locations: two copies in DRAM on two nodes and two additional copies on disk.
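The copy count in that example is just the product of the two factors, which can be spelled out explicitly (the function is invented for illustration):

```python
def copy_count(replication_factor, storage_engines):
    """Physical copies of a record: one per storage engine (e.g. DRAM and
    disk) on each of the replication-factor nodes holding the record."""
    return replication_factor * storage_engines

# DRAM + disk namespace with replication factor 2 -> four copies:
assert copy_count(2, 2) == 4
```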
-Aerospike is resilient to simultaneous hardware failures.
In the presence of node failures, clients are able to seamlessly retrieve one of the copies of the data from the cluster with no special effort. This is because, in an Aerospike cluster, the virtual partitioning and distribution of data within the cluster is completely invisible to the client. Therefore, when client libraries make calls using a lightweight Client API to the Aerospike cluster, any node can take requests for any piece of data.
If a cluster node receives a request for a piece of data it does not have locally, it satisfies the request by internally proxying the request, fetching the data from the real owner over the internal cluster interconnect and subsequently replying to the client directly. The Aerospike client-server protocol also caches the latest known locations of client-requested data in the client library itself to minimize the number of network hops required to respond to a client request.
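The routing behavior can be sketched with a toy cluster and client; the class shapes, and the stand-in for the partition map, are invented for the example:

```python
class Cluster:
    """A node asked for a key it does not own proxies to the real owner
    (a second internal hop) and replies to the client directly."""
    def __init__(self, owner_of, data):
        self.owner_of = owner_of  # key -> owning node (stands in for the partition map)
        self.data = data          # node -> {key: value}

    def request(self, node, key):
        owner = self.owner_of[key]
        hops = 1 if node == owner else 2  # second hop = internal proxy
        return self.data[owner][key], owner, hops

class Client:
    """Caches the last known owner of each key to skip the proxy hop."""
    def __init__(self, cluster):
        self.cluster, self.cache = cluster, {}

    def get(self, key):
        guess = self.cache.get(key, "node-0")  # default guess when cold
        value, owner, hops = self.cluster.request(guess, key)
        self.cache[key] = owner                # learn the owner for next time
        return value, hops

cluster = Cluster(owner_of={"k": "node-1"},
                  data={"node-0": {}, "node-1": {"k": 42}})
client = Client(cluster)
assert client.get("k") == (42, 2)  # cold cache: wrong guess, proxied
assert client.get("k") == (42, 1)  # warm cache: answered directly
```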
During the period immediately after a cluster node has been added or removed, the Aerospike cluster automatically transfers data between the nodes to rebalance and achieve data availability. During this time, Aerospike’s internal “proxy” transaction tracking allows high consistency to be maintained by applying reads and writes to the cluster nodes that have the data, even if the data is in motion.
-Aerospike’s backup and restore enables offsite data storage and cross data center portability.
Aerospike provides online backup and restore, which, as the name indicates, can be applied while the cluster is in operation. Even though data replication will solve most real-world data center availability issues, an essential tool of any database administrator is the ability to run backup and restore. An Aerospike cluster has the ability to iterate all data within a namespace (similar to a map/reduce). The backup and restore tools are typically run on a maintenance machine with a large amount of inexpensive, standard rotational disk.
Aerospike backup and restore tools are made available with full source. The file format in use is optimized for high speed but is ASCII, allowing an operator to validate the data inserted into the cluster and use standard scripts to move data from one datastore to another. The backup tool splits the backup into multiple files. This allows restores to occur in parallel from multiple machines when a very rapid response to a catastrophic failure event is needed.
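The benefit of splitting a backup into multiple files can be sketched as a simple round-robin partition of the record stream; this is an invented illustration of the idea, not the tools' actual file layout:

```python
def split_backup(records, n_files):
    """Deal records round-robin across n_files output files so a restore
    can run in parallel from several machines, each consuming a subset."""
    files = [[] for _ in range(n_files)]
    for i, record in enumerate(records):
        files[i % n_files].append(record)
    return files

parts = split_backup([f"rec-{i}" for i in range(10)], 4)
assert sum(len(p) for p in parts) == 10  # no record lost or duplicated
assert max(len(p) for p in parts) <= 3   # balanced for a parallel restore
```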