About a year ago, Aerospike published a “recipe” describing how a database can be tuned to deliver 1 M TPS on a $5k server. This year we applied many of the changes to Aerospike and doubled performance in YCSB tests. Here is a link to the independent benchmark study.

Here is how you can recreate these tests.

  • Get the right hardware
    • Aerospike server – You will need a bare-metal (not virtualized) server with:
      • 2 sockets (2 physical CPUs) 2.0+ GHz.
      • 10 Gb ethernet with multiple queues.
      • Recent Linux version (newer versions have new performance optimizations).
    • Enough clients to generate 1 M transactions per second.
      • This is something many people underestimate. It is actually hard to generate 1 M transactions per second on the client side. In general, we find that you will need 4 hosts with specs like the server above to create that much traffic. If you have more basic hosts, you may need 10 or more to generate that much traffic. These client hosts should not be the Aerospike server hosts.
  • Download and install the latest version of Aerospike Community Edition and the Aerospike Monitoring Console (AMC) on the Aerospike server host.
  • Run afterburner.sh in the directory /opt/aerospike/bin to configure service threads in the conf file. This requires a network controller with multiple queues.
  • Download and run the YCSB tool. The Aerospike version of YCSB includes tooling that lets you drive YCSB on many client hosts from a central location.
  • Fire the load and see 1 M TPS in the Aerospike Monitoring Console (AMC).
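The client-sizing advice above comes down to simple arithmetic: divide the target rate by what one client host can sustain and round up. The sketch below is illustrative only. The per-host rates of 250,000 and 100,000 TPS are derived from the "4 hosts" and "10 or more" guidance above rather than measured, and `hosts_needed` is a hypothetical helper name.

```python
TARGET_TPS = 1_000_000  # target load named in the article


def hosts_needed(target_tps: int, per_host_tps: int) -> int:
    """Smallest number of client hosts whose combined rate reaches the target."""
    if per_host_tps <= 0:
        raise ValueError("per_host_tps must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_tps // per_host_tps)


# Assumed per-host rates, back-derived from the guidance above (not measured).
SERVER_CLASS_TPS = 250_000  # 1 M TPS spread over 4 server-class hosts
BASIC_HOST_TPS = 100_000    # 1 M TPS spread over 10 more basic hosts

print(hosts_needed(TARGET_TPS, SERVER_CLASS_TPS))  # 4
print(hosts_needed(TARGET_TPS, BASIC_HOST_TPS))    # 10
```

In practice you would replace the assumed rates with the throughput you actually measure from a single client host before adding more.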

For more on the three design points we kept in mind and five bottlenecks we worked around to achieve this performance, read my other article “The Quest for Database Scale – the 1 M TPS”.

You can download Aerospike, get the updated YCSB test framework, and run this test yourself by following the instructions in Thumbtack’s paper. We used the same server hardware and loaded data into Aerospike using the “load” phase of the YCSB tool. Because only a single database node was used, we used a replication factor of 1 (unreplicated).

Server Hardware

  • 8 core dual Socket Intel motherboard
  • Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
  • 32GB DRAM 1333
  • 1 NIC with 16 queues
  • 10G network (to avoid network related bottlenecks)

Server Software

  • CentOS 6.3
  • Aerospike Server
  • File backed (ext4 filesystem) in-memory namespace
  • 2X replication
  • 4 node Aerospike cluster

Data Set

  • Record contains 10 string fields, each field is 10 bytes long and has a 2-byte name
  • Key: the word “user” followed by a 64-bit Fowler-Noll-Vo hash[1] (in decimal notation)
  • Record size: ~ 120 bytes
  • Hashed index size: 64 bytes
  • Number of Records: 50,000,000
  • Total raw data: ~ 10GB
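The data set figures can be checked with a few lines of arithmetic, and the key format can be illustrated with a small hash function. This is a sketch under stated assumptions. Adding record data and index together to approach the "~ 10GB" total is one interpretation, since the source does not break that number down. The hash shown is standard 64-bit FNV-1a over the record number's bytes, which may differ in detail from the variant YCSB uses. `fnv1a_64` and `make_key` are hypothetical names.

```python
# Figures taken from the data set description above.
FIELDS_PER_RECORD = 10
FIELD_VALUE_BYTES = 10
FIELD_NAME_BYTES = 2
INDEX_BYTES_PER_RECORD = 64
NUM_RECORDS = 50_000_000

record_bytes = FIELDS_PER_RECORD * (FIELD_VALUE_BYTES + FIELD_NAME_BYTES)  # 120
data_bytes = NUM_RECORDS * record_bytes
index_bytes = NUM_RECORDS * INDEX_BYTES_PER_RECORD

print(f"record size: {record_bytes} bytes")
print(f"record data: {data_bytes / 1e9:.1f} GB")
print(f"index:       {index_bytes / 1e9:.1f} GB")
# Assumption: the article's "~ 10GB" roughly corresponds to data plus index.
print(f"data+index:  {(data_bytes + index_bytes) / 1e9:.1f} GB")

# Standard 64-bit FNV-1a constants.
FNV_OFFSET_BASIS = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Standard 64-bit FNV-1a hash of a byte string."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK_64
    return h


def make_key(record_number: int) -> str:
    """Illustrative key: the word "user" plus the hash in decimal notation."""
    return "user" + str(fnv1a_64(record_number.to_bytes(8, "little")))


print(make_key(0))
```

Hashing the record number spreads consecutive inserts across the key space, so the popular records chosen by the request distribution are not clustered together.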

For each of the workloads, records were selected using a random Zipfian distribution. This distribution selects a small subset of popular records very frequently, while the majority are hit infrequently, forming a long tail. The system was preloaded with all the records before the tests were run.
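To get a feel for how skewed a Zipfian request pattern is, the following sketch samples record indexes with probability proportional to 1/rank^s and reports how much traffic the most popular records receive. It is a simplified illustration, not YCSB's generator: the exponent of 0.99, the 1,000-record population, and the function names are assumptions chosen for the example.

```python
import random
from collections import Counter
from itertools import accumulate


def zipfian_cum_weights(n_items: int, exponent: float = 0.99) -> list:
    """Cumulative weights where the item at rank r has weight 1 / r**exponent."""
    return list(accumulate(1.0 / (rank ** exponent) for rank in range(1, n_items + 1)))


def sample_records(n_items: int, n_samples: int, exponent: float = 0.99, seed: int = 42) -> list:
    """Draw record indexes (0 = most popular) from a Zipfian-like distribution."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sample
    cum = zipfian_cum_weights(n_items, exponent)
    return rng.choices(range(n_items), cum_weights=cum, k=n_samples)


picks = sample_records(n_items=1000, n_samples=100_000)
counts = Counter(picks)
top10_share = sum(counts[i] for i in range(10)) / len(picks)
print(f"top 10 of 1000 records received {top10_share:.1%} of requests")
```

With a uniform distribution the top 10 of 1,000 records would see about 1% of requests; under this skew they see a far larger share, which is why caching and hot-key handling matter in such workloads.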

The only differences from the original tests were the version of Aerospike used, the network upgrade from 1 Gb to 10 Gb to eliminate network bottlenecks, and the storage setup: tests were run with data in DRAM, with persistence added through the ext4 file system.

Aerospike ran at 1 M TPS for balanced 50/50 reads and writes and 1.6 M TPS for 95/5 read-heavy workloads.
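As a quick worked example, the totals above can be split into read and write rates using each workload's mix. The helper name `split_rate` is hypothetical and the numbers are plain arithmetic on the figures quoted above, not additional measurements.

```python
def split_rate(total_tps: int, read_pct: int) -> tuple:
    """Split a total transaction rate into (reads, writes) per second."""
    reads = total_tps * read_pct // 100
    return reads, total_tps - reads


# Balanced workload: 1 M TPS at 50/50 reads and writes.
print(split_rate(1_000_000, 50))  # (500000, 500000)

# Read-heavy workload: 1.6 M TPS at 95/5 reads and writes.
print(split_rate(1_600_000, 95))  # (1520000, 80000)
```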

[Figure: 1.6 M TPS on the 95/5 read-heavy workload]

We updated the open source YCSB test framework to support Aerospike 3.x and published the tool on GitHub (https://github.com/aerospike/ycsb). We repeated the tests that Thumbtack Technologies published in “Ultra-High Performance NoSQL Benchmark” and the results showed a 100% increase in performance.

[Figure: Maximum throughput, in-memory data set]