Automated Aerospike All Flash Setup (lightning)

Ken Tune, Senior Solutions Architect

Introduction

Aerospike is a key-value database that exploits SSD/Flash technology to offer best-in-class throughput and latency at petabyte scale.

Standard Aerospike usage places the primary key index in DRAM and the data on SSD. Although Aerospike’s DRAM usage is very low at 64 bytes per object, for very large numbers of objects (100bn+) users might wish to consider all-flash mode, in which the primary key index is also placed on disk. At 100bn objects, for example, the index alone would need 6.4 TB of DRAM per copy of the data. More detail at all flash usage.

There are a number of non-trivial steps to go through to set up all flash. For that reason I’ve extended aerospike-ansible to allow automation of this process. This article walks through the automated process. It’s envisaged that this will be useful for those evaluating the feature, or looking to get up and running with it quickly.

A working knowledge of aerospike-ansible is assumed. This introductory article may also be useful.

All Flash Calculations

In order to correctly configure a system for all flash, you need to know the number of partition-tree-sprigs appropriate for the object count you will have in your database. You can think of a partition tree sprig as a mini primary key index. We use these to keep the primary key tree shallow, allowing us to look up a record’s location more rapidly. More detail at sprigs.

This number matters for all flash because we size the system so that each sprig fits inside a single disk block, minimising read and write overhead.

You can find details of the calculation here, but to make life easier a spreadsheet can be found in aerospike-ansible at assets/all-flash-calculator.xlsx.

[Screenshot: all flash calculator spreadsheet]

Populate the yellow cells — # of objects, replication factor and object size.

The spreadsheet will calculate required partition-tree-sprigs.
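
As a rough cross-check of the spreadsheet, the sprig count can be sketched in a few lines of Python. This is my reading of the sizing rule rather than the spreadsheet's actual formula: it assumes a 4 KiB disk block, 64-byte index entries (so 64 entries per block), a target of half-full sprigs, and that the result is rounded up to a power of two. The function names are hypothetical.

```python
import math

PARTITIONS = 4096        # Aerospike uses 4096 partitions per namespace
INDEX_ENTRY_BYTES = 64   # per-object index size quoted in the introduction
BLOCK_BYTES = 4096       # ASSUMPTION: 4 KiB disk block
FILL_FACTOR = 0.5        # ASSUMPTION: aim for sprigs that are about half full


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def partition_tree_sprigs(unique_objects: int) -> int:
    """Estimate partition-tree-sprigs for a given unique object count."""
    entries_per_block = BLOCK_BYTES // INDEX_ENTRY_BYTES      # 64
    target_per_sprig = int(entries_per_block * FILL_FACTOR)   # 32
    objects_per_partition = unique_objects / PARTITIONS
    sprigs_needed = math.ceil(objects_per_partition / target_per_sprig)
    return next_power_of_two(max(sprigs_needed, 1))


print(partition_tree_sprigs(100_000_000))  # 1024
```

For 100m unique objects this gives 1024, which agrees with the value used in the example later in this article. If the sketch and the spreadsheet disagree for your inputs, trust the spreadsheet.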

It will also determine the fraction of available disk space that should be given over to the primary key index, based on the object size. In the screenshot, we can see that for 100m records, replication factor 2, average record size 1024 bytes, the overhead per record is 172 bytes and the overall record footprint is 2220 bytes, so approximately 1/13 of the disk space (172/2220) should be allocated to the index.
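
The arithmetic behind that ratio can be reproduced as follows. This is a sketch only: it takes the 172-byte overhead straight from the spreadsheet rather than deriving it, and it assumes the footprint is the record size multiplied by the replication factor plus that overhead, which is consistent with the 2220 bytes shown. The function name is hypothetical.

```python
def index_share(record_bytes: int, replication_factor: int, overhead_bytes: int):
    """Return (footprint in bytes, fraction of the footprint used by the index).

    ASSUMPTION: footprint = record size * replication factor + overhead,
    which matches the figures quoted from the spreadsheet.
    """
    footprint = record_bytes * replication_factor + overhead_bytes
    return footprint, overhead_bytes / footprint


footprint, fraction = index_share(1024, 2, 172)
print(footprint)            # 2220
print(round(1 / fraction))  # 13
```

The second figure is the one that feeds into partitions_per_device in the next section.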

Using Aerospike-Ansible

In vars/cluster-config.yml

  • Set partitions_per_device to the value given in the spreadsheet – 13 in the example. The first partition on each device is used for the all flash index to ensure the correct index:data disk space ratio.
  • Add partition_tree_sprigs: YOUR_VALUE – YOUR_VALUE would be 1024 for this example

You will also need to

  • Set all_flash: true
  • Set enterprise: true
  • Provide a path to a valid Aerospike feature key using feature_key: /your/path/to/key. You must therefore be either a licensed Aerospike customer, or running an Aerospike trial.

Having done that, run

ansible-playbook aws-setup-plus-aerospike-install.yml

You should check that the aggregate disk space across your cluster exceeds the amount recommended in the spreadsheet.
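
A quick way to sanity-check this is to compare a lower bound on the space needed with the raw capacity of the cluster. The sketch below uses the example figures from the spreadsheet (100m records, 2220 bytes footprint). The cluster shape (3 nodes, one 475 GB device each) is purely hypothetical, as are the function names. The spreadsheet's recommendation may include headroom that this lower bound does not, so prefer the spreadsheet's figure where they differ.

```python
def required_bytes(unique_objects: int, footprint_bytes: int) -> int:
    """Lower bound on disk space: objects * per-record footprint (replicas included)."""
    return unique_objects * footprint_bytes


def cluster_capacity_bytes(nodes: int, devices_per_node: int, device_bytes: int) -> int:
    """Raw aggregate disk capacity across the cluster."""
    return nodes * devices_per_node * device_bytes


need = required_bytes(100_000_000, 2220)
# HYPOTHETICAL cluster: 3 nodes, 1 device per node, 475 GB per device
have = cluster_capacity_bytes(nodes=3, devices_per_node=1, device_bytes=475 * 10**9)

print(need / 10**9)  # 222.0 (GB)
print(have >= need)  # True
```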

Verification

Once the setup process is complete, log in to one of your cluster nodes

./scripts/cluster-quick-ssh.sh

Then start asadm (the admin tool) and run the info command.

[Screenshot: asadm info output]

The index type is reported as ‘flash’, as highlighted in the screenshot.

Data Load

You can follow the instructions in benchmarking to quickly load some data into the new configuration.

As before, we can use asadm to examine the (highlighted) disk footprint of the primary key index, in this case for 10m records (20m including replicas).

[Screenshot: asadm output showing the disk footprint of the primary key index]

Conclusion

The aerospike-ansible tooling makes it easy to set up all flash for Aerospike and benefit from the DRAM saving it offers.

Cover image Michał Mancewicz

About Author

    Ken Tune is a Senior Solutions Architect at Aerospike. He joined the company in June 2018. He advises major companies across the EMEA region on the unique value Aerospike can bring to their business, guiding them from introduction to adoption. Prior to that he spent eight years as a consultant with MarkLogic, a document database, reaching Senior Principal level, being responsible for guidance and implementation of over 20 separate deployments. He additionally has a wealth of experience in finance, having worked for Hambros Bank, HSBC and Markit Group with experience including risk management and major system integration. He has a BA in Maths from Cambridge University and an MSc. in Computer Science from Imperial College, London.