
Configure Namespace Storage

Overview

Aerospike stores data according to the storage engine you select for a namespace. Storage engines include solid state drives (SSD), Intel Optane Persistent Memory (PMem), and storage in memory with or without storage-backed persistence. Your choice of storage engine affects the durability, cost, and performance of your cluster.

Enterprise Edition (EE) and Standard Edition (SE) of Aerospike Server 7.0 store in-memory namespace data in shared memory, enabling fast restarts of the namespace. Community Edition (CE) and previous versions of Aerospike EE and SE store in-memory namespace data in volatile process memory, which means that namespace data must be reloaded from a persistence layer when the Aerospike server restarts, or filled over the network from other cluster nodes.

You configure storage engines in the Aerospike Server configuration file, /etc/aerospike/aerospike.conf. Each section describes the minimal configuration to enable a particular storage engine, and the storage sizing parameters used by that engine.

Setup for an SSD storage engine

The minimal configuration for an SSD-backed namespace requires:

  • Setting the storage-engine parameter to device.
  • Adding a device parameter for each SSD device partition to be used by the namespace.

Each storage device must be properly initialized as an Aerospike device, including zeroizing the 8MiB header. See Initializing Solid State Drives (SSDs). The maximum size of a device is 2TiB. Larger devices must be partitioned into multiple equally-sized partitions that are less than 2TiB each.
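As a rough illustration of the partitioning rule above, the following Python sketch computes the smallest number of equally-sized partitions that keeps each partition within the 2TiB limit. The function name is hypothetical, the limit is treated as inclusive, and partition alignment is ignored, so treat the result as a starting point rather than an exact layout.

```python
TIB = 1024 ** 4               # bytes in one tebibyte
MAX_DEVICE_BYTES = 2 * TIB    # per-device limit stated in the text above


def equal_partitions(device_bytes):
    """Return (count, bytes_per_partition) for the fewest equal partitions
    that each fit within the 2TiB limit. Hypothetical helper for illustration."""
    if device_bytes <= 0:
        raise ValueError("device size must be positive")
    # Integer ceiling division: how many 2TiB-or-smaller pieces are needed.
    count = -(-device_bytes // MAX_DEVICE_BYTES)
    # Floor division; any remainder of a few bytes is simply left unused.
    return count, device_bytes // count
```

For example, a 7TiB device works out to four partitions of 1.75TiB each.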

For performance, we recommend reducing the write-block-size from the default of 1MiB to 128KiB on SSD-backed namespaces. The best value may vary with the specific workload and the average record size. Run benchmarks with asbench to find the right setting.

info

A raw SSD device used by an Aerospike namespace should not have any other mount points configured. An SSD device partition must be associated with only a single namespace. How to add, replace and remove disks describes how to add and remove device partitions from namespaces.

Aerospike server 7.0 and later

The following is a configuration file snippet for a namespace with data on SSD:

namespace NSNAME {
    stop-writes-sys-memory-pct 90  # (optional) stop client writes to this namespace when
                                   # total memory consumption reaches 90% of system memory.
    # memory-size SIZE             # (obsolete) Do not use memory-size in server 7.0 or later
    storage-engine device {        # Configure the storage-engine to use persistence
        device /dev/nvme0n1p1      # raw SSD device partition. Maximum size is 2TiB
        device /dev/nvme0n1p2      # (optional) another raw device
        write-block-size 128K      # (optional) adjust block size to make it efficient for SSDs
        max-record-size 124K       # (optional) otherwise write-block-size dictates the max record size
    }
}

Prior to Aerospike server 7.0

For these versions, you can also optionally change memory-size from the default of 4GiB to a size appropriate for the expected primary index size. See the Sizing Guide to learn about sizing memory.

namespace NSNAME {
    stop-writes-sys-memory-pct 90  # (optional) stop client writes to this namespace when
                                   # total memory consumption reaches 90% of system memory.
    memory-size 64G                # Maximum memory allocation for the namespace.
    storage-engine device {        # Configure the storage-engine to use persistence
        device /dev/nvme0n1p1      # raw device. Maximum size is 2TiB
        device /dev/nvme0n1p2      # (optional) another raw device.
        write-block-size 128K      # (optional) adjust block size to make it efficient for SSDs.
        max-record-size 124K       # (optional) otherwise write-block-size dictates the max record size
    }
}

Setup for in-memory with storage-backed persistence

The persistence layer for an in-memory namespace can be configured as either one or more files, or one or more SSD device partitions.

Aerospike server 7.0 and later

To configure an in-memory namespace, set the storage-engine option to memory.

The following is a configuration file snippet for an in-memory namespace with storage-backed persistence.

namespace NSNAME {
    # memory-size 64G                # (obsolete) Do not use memory-size in server 7.0 or later
    evict-sys-memory-pct 60          # (optional) eviction threshold as a percentage of
                                     # total system memory.
    storage-engine memory {
        max-record-size 1M           # (optional) write-block is 8MiB in server 7.0 or later
                                     # so max-record-size can be used to limit the record size.
        file /opt/aerospike/ns1.dat  # Location of a namespace data file on server
        filesize 64G                 # Max size of each file in GiB. Maximum size is 2TiB
        stop-writes-avail-pct 5      # (optional) stop-writes threshold as a percentage of
                                     # devices/files size or data-size.
        stop-writes-used-pct 70      # (optional) stop-writes threshold as a percentage of
                                     # devices/files size, or data-size.
        evict-used-pct 60            # (optional) eviction threshold, as a percentage of
                                     # devices/files size, or data-size.
    }
}

Aerospike allocates an amount of shared memory equal to the total size of the persistence layer. The size of the persistence layer is either the value of the filesize option multiplied by the number of files, or the total storage space of the SSD device partitions for this namespace. The shared memory is divided into segments called stripes: there is one stripe per file or SSD device partition, and each stripe is the same size as its file (the filesize option) or its SSD device partition. This is because, starting with server 7.0, the in-memory data of the namespace is mirrored to the persistence layer. Your capacity planning should reflect a 1:1 ratio between the sizes of memory storage and persistent storage for the namespace.
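The arithmetic in the paragraph above can be sketched in a few lines of Python. The function names are hypothetical and exist only for illustration; the sketch assumes one stripe per file or device partition, each the same size as its backing file or partition, as described above.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def shared_memory_plan(stripe_sizes):
    """stripe_sizes: one entry per file or SSD device partition, in bytes.
    Returns the stripe count and the total shared memory to plan for."""
    return {"stripes": len(stripe_sizes), "total_bytes": sum(stripe_sizes)}


def file_backed_plan(filesize_bytes, file_count):
    """Every file-backed stripe has the same size: the filesize option."""
    return shared_memory_plan([filesize_bytes] * file_count)
```

For instance, four files with filesize 64G would give four stripes and 256GiB of shared memory, matching the 1:1 ratio between memory and persistent storage.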

Prior to Aerospike server 7.0

The minimal configuration for an in-memory namespace includes:

  • Setting storage-engine to device.
  • Setting data-in-memory to true.
  • Entering a list of file parameters to configure where data will be persisted. Use the file parameter with the namespace context, and not the logging context.

The filesize must be large enough to support the size of the data on disk, with a maximum allowed value of 2TiB. For common use cases, this should roughly be four times the memory-size.
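A minimal sketch of this rule of thumb follows. It assumes the "four times" guideline applies to the total size across all data files, which is an interpretation rather than something the text states outright, and the function name is hypothetical.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4
MAX_FILESIZE_BYTES = 2 * TIB  # maximum allowed filesize stated above


def per_file_filesize(memory_size_bytes, file_count, ratio=4):
    """Split ratio * memory-size evenly across file_count files.
    Assumption: the rule of thumb refers to total file capacity."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    total = memory_size_bytes * ratio
    per_file = -(-total // file_count)  # round up so capacity is not undersized
    if per_file > MAX_FILESIZE_BYTES:
        raise ValueError("filesize would exceed 2TiB; configure more files")
    return per_file
```

With a memory-size of 64G and four files, this gives a filesize of 64G per file.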

You may need to change the memory-size from the default of 4GiB to a size appropriate to handle the expected primary index size, and the expected size of the data in memory. See the Sizing Guide to learn about sizing memory.

namespace NSNAME {
    memory-size 64G                  # Maximum memory allocation for the namespace
    high-water-memory-pct 60         # (optional) eviction threshold as a percent of memory-size
    storage-engine device {
        write-block-size 1M
        max-record-size 1M           # (optional) limit the max record size
        file /opt/aerospike/ns1.dat  # Location of data file on server
        file /opt/aerospike/ns2.dat  # (optional) Location of data file on server
        file /opt/aerospike/ns3.dat  # (optional) Location of data file on server
        file /opt/aerospike/ns4.dat  # (optional) Location of data file on server
        filesize 64G                 # Max size of each file in GiB. Maximum size is 2TiB
        data-in-memory true          # Indicates that all data should also be
                                     # in memory.
        min-avail-pct 5              # (optional) stop-writes threshold as a percentage of
                                     # devices/files size or data-size.
        max-used-pct 70              # (optional) stop-writes threshold as a percentage of
                                     # devices/files size
    }
    high-water-disk-pct 60           # (optional) eviction threshold, as a percentage of
                                     # devices/files size
}

Setup for in-memory without storage-backed persistence

Aerospike server 7.0 and later

Aerospike EE and SE store in-memory namespace data in shared memory, enabling fast restarts of the namespace. These editions can also cold restart from namespace data in shared memory. Shared memory can be backed up to the filesystem using the Aerospike Shared Memory Tool (asmt) before restarting the host machine.

To configure an in-memory namespace, set the storage-engine option to memory.

The following is a configuration file snippet for an in-memory namespace without storage-backed persistence.

namespace NSNAME {
    evict-sys-memory-pct 60      # (optional) eviction threshold as a percentage of
                                 # total system memory
    storage-engine memory {
        max-record-size 1M       # (optional) write-block is 8MiB in server 7.0 or later
                                 # so max-record-size can be used to limit the record size.
        data-size 64G            # Memory pre-allocated for the data of this namespace
        stop-writes-avail-pct 5  # (optional) stop-writes threshold as a percentage of
                                 # devices/files size or data-size.
        stop-writes-used-pct 70  # (optional) stop-writes threshold as a percentage of
                                 # devices/files size, or data-size.
        evict-used-pct 60        # (optional) eviction threshold, as a percentage of
                                 # devices/files size, or data-size.
    }
}

Aerospike allocates an amount of memory equal to the value of the data-size. This size is split into 8 stripes equally.
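As a small worked example of the split described above, the following sketch divides a data-size value into eight equal stripes. The names are hypothetical, and the sketch simply assumes the configured size divides evenly.

```python
GIB = 1024 ** 3
STRIPE_COUNT = 8  # number of stripes stated in the text above


def stripe_size(data_size_bytes):
    """Return the size in bytes of each of the eight equal stripes."""
    if data_size_bytes % STRIPE_COUNT != 0:
        raise ValueError("data-size is not evenly divisible by 8 in this sketch")
    return data_size_bytes // STRIPE_COUNT
```

A data-size of 64G therefore corresponds to eight stripes of 8GiB each.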

Aerospike CE stores in-memory namespace data in volatile process memory, which does not survive restarts of the Aerospike daemon (asd).

Prior to Aerospike server 7.0

The minimal configuration for a namespace without persistence is to set storage-engine to memory. If your namespace requires more than the default 4GiB memory-size allocation for the primary index and data in memory, then you must also adjust memory-size. Refer to the Sizing Guide to learn about sizing memory.

namespace NSNAME {
    memory-size 64G           # Maximum memory allocation for the namespace
    storage-engine memory     # Does not use persistence
    max-record-size 1M        # (optional) limit the max record size
    stop-writes-pct 90        # (optional) stop-writes threshold as a percent of memory-size
    high-water-memory-pct 60  # (optional) eviction threshold as a percent of memory-size
}

Setup for data on Intel Optane Persistent Memory (PMem)

The minimal configuration for the persistent memory storage namespace requires setting two parameters in aerospike.conf for each PMem storage file to be used by this namespace:

  • storage-engine
  • file (use the file parameter with the namespace context, and not the logging context)

The filesize must be large enough to support the size of the data, up to the maximum allowed value of 2TiB.

In server 5.1 and later, persistent memory namespaces are treated equivalently to data-in-memory namespaces for the purpose of computing the default number of service-threads. The value of service-threads will default to the number of CPUs, unless there is at least one SSD namespace.

On systems with hyperthreading, only physical cores are counted. In multi-socketed systems, if non-uniform memory access (NUMA) pinning is enabled, each Aerospike instance counts only the CPU cores on the socket it is servicing.

Aerospike server 7.0 and later

namespace NSNAME {
    storage-engine pmem {       # Configure the storage-engine to use
                                # persistence. Maximum size is 2TiB.
        file /mnt/pmem/ns1.dat  # Location of pmem data file on server, where /mnt/pmem is the
                                # mount point of an EXT4 or XFS file system that resides in pmem
                                # and has been mounted with the DAX option.
        file /mnt/pmem/ns2.dat  # (optional) Location of pmem data file on server.
        filesize 64G            # Max size of each file in GiB. Maximum size is 2TiB
    }
}

Prior to Aerospike server 7.0

You may need to change the memory-size from the default of 4GiB to a size appropriate for the expected primary index size. See the Sizing Guide to learn about sizing memory.

namespace NSNAME {
    memory-size 32G             # Maximum memory allocation for the namespace
    storage-engine pmem {       # Configure the storage-engine to use
                                # persistence. Maximum size is 2TiB.
        file /mnt/pmem/ns1.dat  # Location of pmem data file on server, where /mnt/pmem is the
                                # mount point of an EXT4 or XFS file system that resides in pmem
                                # and has been mounted with the DAX option.
        file /mnt/pmem/ns2.dat  # (optional) Location of pmem data file on server.
        filesize 64G            # Max size of each file in GiB. Maximum size is 2TiB
    }
}

Setup for a data-in-index storage engine

danger

As of server 6.4, this configuration is not supported; the single-bin and data-in-index configuration settings are removed.

A data-in-index configuration is a highly-specialized namespace for niche use cases such as counters. Use the data-in-index engine if your data is single-bin, fits in 8 bytes, and you need the performance of an in-memory namespace but do not want to lose the fast restart capability provided in Aerospike Enterprise Edition.

The minimal configuration for a data-in-index namespace includes:

  • Setting single-bin to true.
  • Setting data-in-index to true.
  • Setting data-in-memory to true.
  • Setting storage-engine to device.
  • Configuring the file or device parameters to map to the persisted storage device to be used by this namespace. Use the file parameter with the namespace context, and not the logging context.

You may need to change the memory-size from its default of 4GiB to a size that can accommodate the primary index, and filesize from its 16GiB default to the size of the data on disk, with a maximum allowed value of 2TiB. See the Sizing Guide to learn about sizing memory.

namespace NSNAME {
    memory-size 64G                  # Maximum memory allocation for the namespace
    single-bin true                  # Required true by data-in-index
    data-in-index true               # Enables in index integer store
    storage-engine device {          # Configure the storage-engine to use
                                     # persistence.
        file /opt/aerospike/ns1.dat  # Location of data file on server
        file /opt/aerospike/ns2.dat  # (optional) Location of another data file on server
        filesize 64G                 # Max size of each file in GiB. Maximum size is 2TiB
        data-in-memory true          # Required true by data-in-index
    }
}

Setup for shadow device

The shadow device storage model is designed for cloud environments with extremely high-performance SSDs that are ephemeral (not persistent), and where the persisted devices are not providing the necessary performance.

Shadow devices act as persisted stores, and must be greater than or equal to the size of the primary device. The primary device receives all read/write operations as usual, and all writes are duplicated to a shadow device. This creates a persisted data volume with lower input/output operations per second (IOPS) requirements, while gaining the IOPS benefit of the non-persisted volume without using large amounts of RAM. The shadow device only needs to satisfy the write IOPS requirements of your workload, not reads.

note

The shadow device storage model is an extension of the SSD storage engine.

When you are using network-attached shadow devices, for example on Amazon Elastic Block Store, or re-assigning shadow devices to a different instance, we recommend initially configuring the node-id across all the nodes in the cluster. This preserves the node-id on any potential new instances that will be re-attached to the shadow device of an instance, and avoids re-distribution of the partitions in the cluster.

To use shadow devices, add the persisted volume after the declaration of the non-persisted volume on the same line in aerospike.conf.

namespace NSNAME {
    storage-engine device {
        device /dev/sdb /dev/sdf  # sdb is the fast ephemeral volume,
                                  # and sdf is the slower persisted volume
    }
}

In the example, /dev/sdb is the fast non-persisted device. /dev/sdf is the persisted device. The device order is important, with the fast non-persisted device named first, and the shadow device named second. The two devices must be listed on the same line.
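To make the ordering and sizing rules concrete, here is a hedged Python sketch that builds the device lines for a set of primary and shadow pairs. The function name and the tuple layout are assumptions made for this illustration, and the device sizes are supplied by the caller rather than read from the system.

```python
def shadow_device_lines(pairs):
    """pairs: iterable of (primary_path, primary_bytes, shadow_path, shadow_bytes).
    Returns one 'device PRIMARY SHADOW' line per pair, primary device first."""
    lines = []
    seen_shadows = set()
    for primary, primary_bytes, shadow, shadow_bytes in pairs:
        # The shadow device must be at least as large as its primary.
        if shadow_bytes < primary_bytes:
            raise ValueError(f"{shadow} is smaller than {primary}")
        # A shadow device is paired with exactly one primary device.
        if shadow in seen_shadows:
            raise ValueError(f"{shadow} is paired with more than one primary")
        seen_shadows.add(shadow)
        # Both devices go on the same line: fast ephemeral first, shadow second.
        lines.append(f"device {primary} {shadow}")
    return lines
```

The returned lines are meant to be placed inside the storage-engine device block of the namespace.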

You may configure multiple shadow devices, with each device pair on its own line. Each shadow device must be paired with only one primary device:

storage-engine device {
    device /dev/sdb /dev/sdf
    device /dev/sdc /dev/sdg
    device /dev/sdd /dev/sdh
}

danger

When configuring a namespace to use persistence of any form, a given file or device partition must be associated with a single namespace only. Two namespaces cannot share the same file or partition. Configuration of the same file or partition for multiple namespaces could cause issues with the node starting, and possibly damage existing data in that file or partition.
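One way to catch this mistake before starting a node is to check the planned configuration for paths that appear under more than one namespace. The sketch below is a hypothetical pre-flight check written for illustration. It works on a plain dictionary supplied by the caller and does not parse aerospike.conf itself.

```python
from collections import defaultdict


def shared_storage(namespaces):
    """namespaces: dict mapping namespace name -> list of file or device paths.
    Returns {path: sorted namespace names} for every path that is used by
    more than one namespace. An empty dict means no sharing was found."""
    users = defaultdict(set)
    for name, paths in namespaces.items():
        for path in paths:
            users[path].add(name)
    return {path: sorted(names) for path, names in users.items() if len(names) > 1}
```

Any non-empty result points at a file or partition that should be reassigned so that it belongs to a single namespace.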

Instance recovery

  • If the ephemeral device is damaged (for example, it is missing header information) and there is a valid shadow device, the server loads data from the EBS shadow device onto the ephemeral disk and into memory (primary index, secondary index, data-in-memory). See the instance failure section of the AWS deployment guide.

  • If the ephemeral device fails, the instance repopulates the data from the shadow device when it restarts, and the server processes transactions as usual once the node has rejoined the cluster.

Updating the filesize parameter

If you need to change the filesize parameter of a namespace with storage-engine set to device, use one of the following procedures.

Increase filesize

Perform the following steps on your cluster one node at a time.

  1. Change the filesize parameter in the configuration file. Ensure that the relevant partition has sufficient free disk space. To add a new file to the configuration, place it as the last entry in the Aerospike storage configuration.

    note

    You do not have to delete and recreate the file when increasing the configured size of the file.

  2. Restart Aerospike:

    /etc/init.d/aerospike restart
  3. Wait for port 3000 to open and for the node to rejoin the cluster.

    The server logs a line containing the word "cake" when its service is ready, so the following shell command is useful for discovering whether a node has started successfully:

    grep -i 'cake' /var/log/aerospike/aerospike.log
  4. Proceed with the other nodes in the cluster one by one, repeating the above steps. To avoid data inconsistency, wait for migrations to complete between each restart.
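When scripting a rolling restart, the startup check in step 3 can be expressed as a small helper. The following Python sketch mirrors the case-insensitive search for "cake" used in the shell command. The function name is hypothetical, and the sketch only inspects log lines it is given; it does not check port 3000 or migration status.

```python
def node_started(log_lines, marker="cake"):
    """Return True if any log line contains the marker, ignoring case.
    log_lines can be any iterable of strings, such as an open log file."""
    needle = marker.lower()
    return any(needle in line.lower() for line in log_lines)
```

A script could call this with the open file /var/log/aerospike/aerospike.log and retry until it returns True before moving on to the next node.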

Reduce filesize

Reducing the size of an existing data file may result in the loss of data. Proceed with caution.

To avoid data inconsistency, delete the data file and update the configuration file on one node at a time, allowing the data to migrate to other nodes before proceeding to the next node.

  1. Stop Aerospike:

    /etc/init.d/aerospike stop
  2. Delete the file and update the configuration file with the new filesize.

  3. Start Aerospike:

    /etc/init.d/aerospike start
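Both procedures involve editing the filesize line in the configuration file. If you automate that edit, a minimal sketch might look like the following. The helper name is hypothetical, and it rewrites every uncommented filesize line in the text it receives, so pass it only the stanza for the namespace you intend to change.

```python
import re

# Matches an uncommented "filesize VALUE" line and captures everything
# up to the value so that indentation and trailing comments are kept.
_FILESIZE_LINE = re.compile(r"^([ \t]*filesize[ \t]+)\S+", flags=re.MULTILINE)


def set_filesize(conf_text, new_size):
    """Return conf_text with each filesize value replaced by new_size."""
    updated, count = _FILESIZE_LINE.subn(lambda m: m.group(1) + new_size, conf_text)
    if count == 0:
        raise ValueError("no filesize line found")
    return updated
```

The function returns the modified text and leaves writing it back to disk, and restarting the node, to the caller.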
