BlueKai Does the Math: Flash/SSD Most Cost-Effective Method for Speed at Scale

If you are considering flash/Solid-State Drives (SSDs) as a storage solution, you’ll be happy to know that much of the work has already been done for you! While conventional rotational drives may look cheaper than SSDs at first glance, a closer look reveals that their long-term overhead makes them an unwise choice for companies that want to scale quickly without worrying about latency or equipment failure.

During ‘The BlueKai Playbook’ webinar (video below), Ted Wallace, VP of Data Delivery at data management platform BlueKai, touched on the economics of flash/SSDs and explained how BlueKai concluded they were the most economical choice for processing upwards of 10 trillion data transactions a month:

“The economics of SSDs versus memory are brain dead simple”

“We found early on, especially with Cassandra, we had a lot of smaller flash drives. It was a crazy number of 64 gig flash drives. When you start thinking about failure rates, if you have ten of those machines and you’ve got twenty 64 gig flash drives, that’s 200 flash drives. And if you have a failure rate in a flash drive that’s 1-3% per year, you’re talking about getting a couple failures every month or every other month, you’re talking about a lot of overhead work. Early on we saw that kind of problem with our deployment, so we wound up skewing towards larger drives and now have a small handful of 800 gig [Intel] drives in each of our machines. We’ve certainly found over the course of the past two years, their reliability seems to be pretty good. They fail slightly more frequently than we’ve seen rotational fail in our deployments, but not with such alarming frequency that we spend a whole lot of time spinning servers in and out. Overall, we found both the performance to be great and the reliability to be pretty solid.”
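The failure arithmetic in that example can be sketched in a few lines of Python. Only the 10 machines, 20 drives per machine, and 1-3% annual failure rate come from the quote; the function name and the four-drive "consolidated" layout are illustrative assumptions, since the quote only says "a small handful" of larger drives.

```python
def expected_failures_per_year(machines: int, drives_per_machine: int,
                               annual_failure_rate: float) -> float:
    """Expected number of drive failures per year across a fleet."""
    total_drives = machines * drives_per_machine
    return total_drives * annual_failure_rate


# Figures from the quote: 10 machines x 20 drives each, 1-3% failures per year.
low = expected_failures_per_year(10, 20, 0.01)
high = expected_failures_per_year(10, 20, 0.03)
print(f"Many small drives: {low:.0f} to {high:.0f} expected failures per year")

# ASSUMPTION: 4 drives per machine stands in for "a small handful";
# the source does not give the actual count.
consolidated = expected_failures_per_year(10, 4, 0.03)
print(f"Fewer large drives: about {consolidated:.1f} expected failures per year")
```

For the ten-machine example this works out to an expected 2 to 6 failures per year; at the 3% end, that is roughly one failure every other month. Because the expectation scales linearly with drive count, cutting the number of drives per machine cuts the replacement work proportionally, assuming the per-drive failure rate stays the same.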

“Obviously our profile store boxes with SSDs are quite a bit more expensive than a pizza box that has a good fast CPU on it, a little rotational disk, and 8 gigs of RAM. So it’s something we have to think about, because if you were to move all that data into memory because you needed response times, that would be crazy expensive. I think you’re talking on the order of a couple gig per dollar for SSDs versus five or six dollars per gig of memory. As a result, the economics of doing it on SSDs versus memory are brain dead simple.
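Wallace's price comparison can be made concrete with a short calculation. This is a rough sketch that takes the spoken figures at face value: "a couple gig per dollar" is read as 2 GB per dollar, and memory as 5 to 6 dollars per GB. The 800 GB capacity is borrowed from the drive size mentioned earlier purely as an example data size, and the names are illustrative.

```python
def storage_cost(capacity_gb: float, dollars_per_gb: float) -> float:
    """Cost in dollars of holding capacity_gb at a given price per GB."""
    return capacity_gb * dollars_per_gb


CAPACITY_GB = 800                  # ASSUMPTION: example size only
SSD_DOLLARS_PER_GB = 1 / 2         # "a couple gig per dollar", read as 2 GB per dollar
RAM_DOLLARS_PER_GB = (5.0, 6.0)    # "five or six dollars per gig of memory"

ssd_cost = storage_cost(CAPACITY_GB, SSD_DOLLARS_PER_GB)
ram_cost_low = storage_cost(CAPACITY_GB, RAM_DOLLARS_PER_GB[0])
ram_cost_high = storage_cost(CAPACITY_GB, RAM_DOLLARS_PER_GB[1])

print(f"SSD:    ${ssd_cost:,.0f}")
print(f"Memory: ${ram_cost_low:,.0f} to ${ram_cost_high:,.0f}")
print(f"Memory costs {ram_cost_low / ssd_cost:.0f}x to "
      f"{ram_cost_high / ssd_cost:.0f}x as much as SSD")
```

Under these figures the ratio is about an order of magnitude regardless of the data size chosen, since capacity cancels out of the comparison.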

“The next question is, could I keep economics down on rotational? The answer is no. Because with rotational, if you want to do really well, you’ve got to go for much higher input/outputs. You’re looking at higher rotational disks that are more expensive and then you have to add enough of them so you have lots of spindles. Even once you do that, your performance will not be nearly as good.

“We actually did an experiment where we moved our batch ingest systems off of SSD onto rotational because we had a system where we look it up on a per-customer basis – something like once or twice a day. So if one or two of your hits to us was 30-40 milliseconds worse than what it would be ordinarily, it’s probably not a huge deal. What we found was that when you’re trying to ingest billions of records of data every day into a system with rotational disks, you need so much rotational disk to keep on top of that ingest, that it becomes super costly. We just couldn’t keep up with the amount of data we were trying to ingest from our customers, from a batch perspective, with rotational disks. So we had to go back to SSDs. In the end, it saved us a little bit of money to be on rotational disks, but when we did that we didn’t have a system that kept up to the demand. Once you do the math it made a lot of sense just to stay on SSDs.”
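The "lots of spindles" point can be illustrated with a back-of-the-envelope estimate. The webinar gives no I/O figures, so everything below except "billions of records a day" is an assumption chosen for illustration: one billion records per day as a lower bound, one random I/O per record, and 150 random IOPS per rotational disk as a rule-of-thumb placeholder. This is a sketch, not BlueKai's actual sizing.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def records_per_second(records_per_day: float) -> float:
    """Average ingest rate, ignoring peaks and batch windows."""
    return records_per_day / SECONDS_PER_DAY


def disks_needed(records_per_day: float, ios_per_record: float,
                 iops_per_disk: float) -> int:
    """Minimum number of disks whose combined IOPS cover the average ingest rate."""
    required_iops = records_per_second(records_per_day) * ios_per_record
    return math.ceil(required_iops / iops_per_disk)


RECORDS_PER_DAY = 1_000_000_000  # ASSUMPTION: lower bound of "billions of records"
IOS_PER_RECORD = 1               # ASSUMPTION: one random I/O per record
ROTATIONAL_IOPS = 150            # ASSUMPTION: placeholder rule-of-thumb figure

rate = records_per_second(RECORDS_PER_DAY)
spindles = disks_needed(RECORDS_PER_DAY, IOS_PER_RECORD, ROTATIONAL_IOPS)
print(f"Average ingest rate: {rate:,.0f} records per second")
print(f"Rotational disks needed at {ROTATIONAL_IOPS} IOPS each: {spindles}")
```

Even at the low end of "billions", the average rate is over eleven thousand records per second, so the spindle count climbs fast. Rerunning `disks_needed` with measured IOPS figures for your own rotational disks and SSDs shows how the two compare for a given ingest load.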


Interested in discovering which SSDs will work for your system? Aerospike has partnered with several of the leading SSD vendors to benchmark their drives with our open-source Aerospike Certification Tool (ACT).