Flexible Data Model

Aerospike’s data model is structured yet schema-less, and it supports strongly typed data for cross-language compatibility. Rows contain cells that hold strings, integers, blobs, lists, maps, and serialized objects.

Aerospike is fundamentally a key-value store in which values (stored as unstructured binary objects) can be retrieved via a primary key lookup. At the highest level, data is collected into policy containers called namespaces, similar to databases in an RDBMS. Namespaces are configured when the cluster is started and are used to control expiry, replication, and storage settings for a given set of data. For example, keeping more copies of the data trades increased storage requirements for improved availability during unexpected hardware failures that take out one or more nodes in a cluster.
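
The replication trade-off can be made concrete with a little arithmetic. The Python sketch below is illustrative only: NamespacePolicy, stored_bytes, and copies_left are hypothetical names, not Aerospike configuration syntax or client API, and the sketch assumes each copy of a record lives on a different node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespacePolicy:
    """Hypothetical stand-in for per-namespace settings (not Aerospike config syntax)."""
    name: str
    replication_factor: int   # copies kept of every record
    default_ttl_seconds: int  # expiry applied to records; 0 here means "never expire"
    storage: str              # illustrative label, e.g. "memory" or "disk"


def stored_bytes(policy: NamespacePolicy, unique_bytes: int) -> int:
    """Cluster-wide bytes needed to hold unique_bytes of data under this policy."""
    return unique_bytes * policy.replication_factor


def copies_left(policy: NamespacePolicy, failed_nodes: int) -> int:
    """Worst case: each failed node held a different copy of the same record."""
    return max(policy.replication_factor - failed_nodes, 0)


cache = NamespacePolicy("cache", replication_factor=1, default_ttl_seconds=3600, storage="memory")
users = NamespacePolicy("users", replication_factor=2, default_ttl_seconds=0, storage="disk")

for ns in (cache, users):
    print(ns.name, stored_bytes(ns, 100), copies_left(ns, failed_nodes=1))
```

Under these assumptions, the "users" namespace needs twice the storage of its unique data but still has one copy of every record after a single node failure, while the "cache" namespace stores each record once and can lose records when a node fails.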

Within a namespace, data is subdivided into sets, similar to tables, and records, similar to rows. Each record has a unique indexed key and one or more named bins, similar to columns, that hold values associated with the record. Values in the bins are strongly typed and can include strings, integers, and binary data, as well as language-specific binary blobs that are automatically serialized and deserialized by the system. Note that although the values in the bins are typed, the bins themselves are not: the same bin may hold a value of one type in one record and a value of a different type in another record. Each record also has hidden fields such as generation and time-to-live (TTL) that enable the system to efficiently implement check-and-set (CAS) and data expiry.
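
To make the roles of bins, generation, and TTL concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Aerospike client API: ToyStore, Record, GenerationMismatch, and the put/get signatures are all hypothetical, and the sketch assumes that generation increases by one on every write and that a write without a TTL clears any earlier expiry.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


class GenerationMismatch(Exception):
    """A check-and-set write carried a stale generation."""


@dataclass
class Record:
    bins: dict = field(default_factory=dict)  # bin name -> value of any type
    generation: int = 0                       # hidden field: bumped on every write
    expires_at: Optional[float] = None        # hidden field: None means no expiry


class ToyStore:
    """In-memory toy keyed by (namespace, set, key); not the Aerospike client API."""

    def __init__(self, clock=time.monotonic):
        self._records = {}
        self._clock = clock

    def _live(self, address):
        """Return the record at address, dropping it first if its TTL has passed."""
        record = self._records.get(address)
        if record is not None and record.expires_at is not None:
            if self._clock() >= record.expires_at:
                del self._records[address]
                return None
        return record

    def put(self, namespace, set_name, key, bins, ttl=None, expected_generation=None):
        address = (namespace, set_name, key)
        record = self._live(address)
        current = record.generation if record is not None else 0
        # Check-and-set: refuse the write if someone else wrote since we last read.
        if expected_generation is not None and expected_generation != current:
            raise GenerationMismatch(f"expected {expected_generation}, found {current}")
        if record is None:
            record = self._records[address] = Record()
        record.bins.update(bins)
        record.generation = current + 1
        record.expires_at = None if ttl is None else self._clock() + ttl
        return record.generation

    def get(self, namespace, set_name, key):
        record = self._live((namespace, set_name, key))
        return None if record is None else (dict(record.bins), record.generation)


store = ToyStore()
gen = store.put("users", "profiles", "u1", {"name": "Ada", "visits": 1})
store.put("users", "profiles", "u2", {"name": "Bob", "visits": "many"})  # same bin, other type
store.put("users", "profiles", "u1", {"visits": 2}, expected_generation=gen)  # CAS succeeds
try:
    store.put("users", "profiles", "u1", {"visits": 3}, expected_generation=gen)  # stale
except GenerationMismatch as err:
    print("rejected:", err)
```

The second conditional write is rejected because the record's generation moved on after the first one, which is the essence of check-and-set. The "visits" bin holds an integer in one record and a string in another, showing that the type belongs to the value rather than to the bin.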

Although this structure may initially seem to align with familiar RDBMS structures, there are important differences. Most importantly, unlike an RDBMS, Aerospike is entirely schema-less. This means that sets and bins do not need to be defined up front and can instead be added at run time, providing maximum flexibility for applications. Supporting an arbitrary schema increases the run-time overhead of maintaining indexes, so Aerospike provides special optimizations for cases where specific schema simplifications are present. For example, single-column namespaces have been used in several deployments because of their enormous efficiencies in both storage and performance.
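
As a rough illustration of what schema-less means in practice, the Python sketch below models one namespace as nested dictionaries. The write and observed_bins helpers are hypothetical and exist only for this example; the point is that no set or bin is declared before the first write that uses it, so a "schema" can only be observed after the fact.

```python
from collections import defaultdict

# One toy namespace: set name -> {primary key -> {bin name -> value}}.
# Nothing is declared ahead of time; sets and bins appear on first write.
namespace = defaultdict(dict)


def write(set_name, key, **bins):
    """Create or update a record, adding any bins it does not have yet."""
    namespace[set_name].setdefault(key, {}).update(bins)


def observed_bins(set_name):
    """Map each bin name seen in a set to the sorted names of its value types."""
    seen = defaultdict(set)
    for bins in namespace.get(set_name, {}).values():
        for name, value in bins.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


write("profiles", "u1", name="Ada")
write("profiles", "u1", email="ada@example.com")        # new bin on an existing record
write("profiles", "u2", name="Bob", age=41)             # another record, different bins
write("sessions", "s9", user="u1", started=1700000000)  # new set, created by the write

print(sorted(namespace))
print(observed_bins("profiles"))
```

Records in the same set end up with different bins, and the second set exists only because something was written to it. An RDBMS would require a table definition and a schema change for each of these steps.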