Predictable High Performance
Developing innovative new web-based apps hinges on one key factor: speed. Aerospike Database is the best solution available, delivering consistent, predictable high performance. In real-life situations, you can expect 99% of queries to receive a response in under 1 millisecond. (Check out the independent YCSB benchmark.) In fact, 99.9% of queries return within 5 milliseconds.
You may be skeptical — that’s a pretty big claim. But we’ve got the track record to prove that we deliver. Many databases claim great performance under laboratory conditions. In the real world, hardware will fail. Aerospike was designed from the beginning as a fault-tolerant database.
Testing our Performance
We created a demo video, using real servers, to measure how response times are affected when a server goes down. In most systems, losing a server degrades query response time. Because of Aerospike's automatic load balancing, however, you'll see in the video that stopping one server doesn't affect response time: the remaining servers re-balance the load and users aren't affected.
In the video below, we show the performance of a four-server cluster handling 1 million transactions per second. The big graphs at the top show the aggregate cluster performance. Each server's performance is shown in a smaller graph at the bottom.
Speed under Different Loads
We ran heavy loads (mixed reads and writes, immediate consistency, replication factor 2) through the cluster to establish normal throughput and latency. In the real world, however, disks fail, servers crash, power goes out, and rolling upgrades must be done. So for this demo we pulled the plug on a node to show what happens. Note that no one had to touch the clients or the other Aerospike servers in the cluster; the cluster automatically handled the failure of the downed server.
We simulated the read/write balance that is fairly typical in web-based applications. A client machine ran a single test client written in Java 6; it used 16 threads and issued a load of 95% reads and 5% writes, with each transaction reading or writing a 10-byte string.
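The load mix described above can be sketched as follows. This is a minimal, self-contained illustration of the 95/5 read/write pattern across 16 threads, not Aerospike's actual test harness: the `readRecord` and `writeRecord` methods are stand-ins for real database calls, and a deterministic 1-in-20 write schedule produces the 5% write fraction.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a 95% read / 5% write load generator: 16 threads,
// 10-byte string payloads, database operations stubbed out.
public class MixedLoadSketch {
    static final AtomicLong reads = new AtomicLong();
    static final AtomicLong writes = new AtomicLong();

    // Stand-in for a database read (e.g. a key-value get).
    static void readRecord(int key) { reads.incrementAndGet(); }

    // Stand-in for a database write (e.g. a key-value put).
    static void writeRecord(int key, byte[] value) { writes.incrementAndGet(); }

    public static void main(String[] args) throws InterruptedException {
        final int threads = 16;
        final int opsPerThread = 10_000;
        // A 10-byte string payload, matching the test description.
        final byte[] payload = "abcdefghij".getBytes(StandardCharsets.UTF_8);

        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    if (i % 20 == 0) writeRecord(i, payload); // 1 in 20 ops = 5% writes
                    else readRecord(i);                       // remaining 95% are reads
                }
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();

        System.out.println("reads=" + reads.get() + " writes=" + writes.get());
    }
}
```

In a real run, the stubbed methods would be replaced by calls into the database client library, and throughput and latency would be recorded per operation.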
Cluster Performance During the Test
The two graphs show aggregate cluster performance during the sequence described above.
Consistent Throughput – at 100k TPS, 400k TPS and 1 Million TPS
The graph on the right shows cluster throughput during the experimental sequence. Note how the Aerospike cluster throughput of 400k TPS barely budges when Node 3 goes down and comes back up, in spite of all the re-balancing and data migration that Aerospike Database performs automatically.
Predictable Low Latency – well under 1ms even at 1Million TPS
The other issue of concern is latency — when node 3 goes down, how does it affect the response time to clients?
As you can see from the graph on the right, response times are very consistent, which is critical when revenue is tied to response time. The graph shows response times of 0.3ms with a momentary uptick to 5ms when Node 3 goes down, well within the 5-10ms SLA that most of our customers must meet 99.9% of the time.
The independent YCSB benchmark showed results of 99% responses in under a millisecond and 99.9% of responses in under 5 milliseconds. The graph on the right demonstrates how that actually works, even with 25% of the cluster nodes offline.
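The figures above (99% of responses under 1ms, 99.9% under 5ms) are percentile claims over measured latencies. The sketch below shows one common way such percentiles can be computed from a set of latency samples, using the nearest-rank method. It is illustrative only, with hypothetical sample data; it is not the YCSB benchmark's actual implementation.

```java
import java.util.Arrays;

// Computing latency percentiles from measured response times
// (nearest-rank method). Illustrative only.
public class LatencyPercentiles {
    // Returns the value at or below which `pct` percent of samples fall.
    static double percentile(double[] samplesMs, double pct) {
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length); // 1-based rank
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        // Hypothetical samples: mostly sub-millisecond, plus one slow outlier.
        double[] latencies = new double[1000];
        for (int i = 0; i < 1000; i++) latencies[i] = 0.3 + (i % 10) * 0.05;
        latencies[999] = 4.8; // a single slow response
        System.out.println("p99   = " + percentile(latencies, 99.0) + " ms");
        System.out.println("p99.9 = " + percentile(latencies, 99.9) + " ms");
    }
}
```

A "p99 under 1ms" claim simply means the value returned for the 99th percentile stays below 1ms throughout the run.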
Performance of Individual Servers
The graphs below show what happens to individual server performance. The graph on the left shows Node 3, which we unplugged at around 20 seconds and kept offline until around 60 seconds. The graph on the right shows Node 1 picking up extra traffic to compensate for Node 3's failure. The graphs for Nodes 2 and 4 are identical to the graph for Node 1: each server picks up a higher load, with the load rebalanced evenly over Nodes 1, 2 and 4. Note that no human intervention was required; the nodes automatically re-balanced.
In summary, with traffic from 100,000 to 1,000,000 TPS, Aerospike Database maintained high throughput and fast response times, consistently and predictably, meeting the most essential requirement of big data applications.