Aerospike is a row-oriented database. It’s a client-server system: you build an app server cluster that connects over TCP to your database cluster through Aerospike’s client libraries. Aerospike is well suited to a number of NoSQL use cases.
Many internet applications are designed around large, fast key-value stores. Internet applications like Twitter, Weibo, Uber, and Pinterest are built primarily on large key-value systems rather than SQL. Key-value access is predictable, and in-memory systems like Redis are very fast.
An application like Twitter has user records, where each user record might have a list of people followed or followers. Each user has the list of messages they’ve created – and maybe a list of sponsored messages they should read. With a NoSQL programming model, you’ll need to think about your application differently, but you’ll reap the benefits of speed and scale.
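As a rough illustration of the record-per-user model described above, here is a minimal Python sketch. It does not use the Aerospike client: a plain dict stands in for the key-value store, and the `UserRecord` layout and the `follow` and `post` helpers are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserRecord:
    """One record per user, fetched by a single key (hypothetical layout)."""
    following: List[str] = field(default_factory=list)
    followers: List[str] = field(default_factory=list)
    messages: List[str] = field(default_factory=list)
    sponsored: List[str] = field(default_factory=list)


# A plain dict stands in for the key-value store in this sketch.
store: Dict[str, UserRecord] = {}


def get_user(user_id: str) -> UserRecord:
    """Return the user's record, creating an empty one on first access."""
    return store.setdefault(user_id, UserRecord())


def follow(follower_id: str, followee_id: str) -> None:
    """Update both records: there is no join, so each side keeps its own list."""
    get_user(follower_id).following.append(followee_id)
    get_user(followee_id).followers.append(follower_id)


def post(user_id: str, text: str) -> None:
    """Append a message to the author's own record."""
    get_user(user_id).messages.append(text)


follow("alice", "bob")
post("bob", "hello")
```

Everything a page needs about one user comes back from a single key lookup, which is where the predictable latency of the key-value model comes from. The cost is that relationships such as follower lists are maintained by the application rather than by a join.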
For high-scale internet applications, a key-value database needs persistence, sharding, object operations like list and map manipulation, and great operational characteristics. It also needs to support a variety of application languages: most teams use a combination of languages, not just Java or Node but perhaps also Python, Ruby, or even components in C or Go.
Aerospike’s low latency and high throughput make it an excellent cache replacement. A cache is best when you have static data. However, if your data is dynamically changing, you are forced to either deal with inconsistencies between the cache and the database or to overwhelm the database with writes.
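The inconsistency problem can be seen in a few lines. The sketch below is a toy read-through cache in front of a toy database, both plain Python dicts; the names `database`, `cache`, and `read_through` are illustrative only and do not correspond to any Aerospike API.

```python
# Toy "database" and "cache"; both are plain dicts in this sketch.
database = {"price": 10}
cache = {}


def read_through(key):
    """Serve from the cache, loading from the database only on a miss."""
    if key not in cache:
        cache[key] = database[key]
    return cache[key]


first_read = read_through("price")   # miss: loads the value into the cache

database["price"] = 12               # a write that goes only to the database

stale_read = read_through("price")   # hit: the cache still holds the old value

cache.pop("price", None)             # explicit invalidation
fresh_read = read_through("price")   # miss again: reloads the current value
```

With mostly static data the stale window rarely matters. With frequently changing data, every write needs either an invalidation or a write to both layers, which is the trade-off the paragraph above describes.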
Also, compared to Memcache and Redis, Aerospike has built-in clustering and can use high-performance Flash (SSDs). Benchmarks show Aerospike as comparable in single-server speed to both Redis and Memcache, but Aerospike includes automatic clustering and transparent resharding. This gives you the ability to add capacity by simply starting a node. Continuous defragmentation and eviction, and Memcache’s check-and-set operation, give you familiar and necessary functionality.
Although Aerospike has complex operations like queries, aggregations, and in-database compute, we continue to maintain a high-performance “fast path”: “single bin” mode. Aerospike can also be configured for a RAM-only mode that keeps a single copy of the data. If you need a cache with more reliability, you can turn on data replication, but it’s not necessary.
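For readers unfamiliar with check-and-set, the idea can be sketched as a write that succeeds only if the record has not changed since it was read. The following is an in-memory illustration, not the Memcache or Aerospike API; `VersionedStore` and its methods are hypothetical names.

```python
class VersionedStore:
    """In-memory sketch of check-and-set using a per-key version counter."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        """Return (version, value); (0, None) means the key is absent."""
        return self._data.get(key, (0, None))

    def check_and_set(self, key, expected_version, value):
        """Write only if the stored version still matches the one the caller read."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            return False  # someone else wrote first; caller should re-read and retry
        self._data[key] = (current_version + 1, value)
        return True


store = VersionedStore()
version, _ = store.get("budget")

first = store.check_and_set("budget", version, 100)  # succeeds, version becomes 1
stale = store.check_and_set("budget", version, 50)   # rejected, version 0 is out of date
```

A caller whose write is rejected re-reads the record and retries, so two clients updating the same key cannot silently overwrite each other.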
User Profile Store
When building any form of advertising or marketing application, you’ll need to store user profiles. Those profiles will often have recent user behavior, segments loaded from an analytics system, partner cookies, and a variety of other data. Small profiles, around 1 KB to 10 KB each, are common. Besides pure profiles, you’ll need cookie matching, campaign budgets and status, and other front-end data.
Aerospike shines in this use case, and has formed the core user store for many successful advertising companies like AppNexus, BrightRoll, and Exelate. Optimized for Flash, Aerospike is far cheaper to operate at multi-terabyte scale than other databases. Critical features include automatic expiration and in-place “rolling updates” for maintenance without service downtime, backed by years in service with minimal downtime.
A recommendation engine uses innovative math, combined with domain-specific knowledge, to increase online engagement. If you’re designing one, you’ll need a data layer that’s fast – to support multiple requests per recommendation – and flexible, since you’ll need either more throughput or more data as your system evolves. You’ll want one that supports high write throughput when loading data from your data scientists through ETL, or when recording recent behaviors that your algorithms will use.
Aerospike is an excellent database for recommendation engines. Key features are large lists (for efficiently recording behavior), optimized Flash support to handle datasets from terabytes to petabytes, queries and aggregations for real-time reporting, and strong support for languages such as Python and Go. To get you started, we open sourced a real-time behavioral engine that uses recent cosine similarities.
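To make "cosine similarity over recent behavior" concrete, here is a minimal sketch. It is not the code of the open-sourced engine mentioned above. It assumes each user's recent behavior is reduced to a count of items viewed, and the function name and sample data are invented for the example.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    # Only items present in both vectors contribute to the dot product.
    dot = sum(a[item] * b[item] for item in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no recorded behavior is similar to no one
    return dot / (norm_a * norm_b)


# Hypothetical recent-behavior lists, one per user.
alice = Counter(["shoes", "shoes", "hats"])
bob = Counter(["shoes", "hats", "hats"])
carol = Counter(["books"])

alice_bob = cosine_similarity(alice, bob)      # overlapping interests
alice_carol = cosine_similarity(alice, carol)  # nothing in common
```

Each similarity needs the recent-behavior lists of both users, which is why a recommendation typically costs several reads against the data layer.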
Fraud Detection and Intervention
Detecting fraud, and hopefully intervening, is a goal every time money is at stake. In some industries, like advertising, the most effective algorithms can be quite simple: record the ads recently placed, and pay only once for clicks on each ad. That simple case needs a few hours of recent ads served, a high read-write rate, and accuracy. In financial services, large monolithic SQL systems execute rules created by data analytics, creating “fraud scores” that can be applied to trigger a greater level of verification.
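The simple advertising rule can be sketched as follows. This is an in-memory illustration under stated assumptions, not a production design: the `ClickDeduplicator` class, the one-hour window, and keying by user and ad are all choices made for the example, and timestamps are passed in explicitly to keep it deterministic.

```python
class ClickDeduplicator:
    """Pay for at most one click per served ad, within a recent time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._served = {}  # (user_id, ad_id) -> [served_at, paid]

    def record_served(self, user_id, ad_id, now):
        """Remember that this ad was shown to this user at time `now`."""
        self._served[(user_id, ad_id)] = [now, False]

    def should_pay(self, user_id, ad_id, now):
        """Return True only for the first in-window click on a served ad."""
        key = (user_id, ad_id)
        entry = self._served.get(key)
        if entry is None:
            return False  # click on an ad we never served
        served_at, paid = entry
        if now - served_at > self.window_seconds:
            del self._served[key]  # expired: drop the record
            return False
        if paid:
            return False  # repeat click
        entry[1] = True
        return True


dedup = ClickDeduplicator(window_seconds=3600)
dedup.record_served("u1", "ad9", now=0)
first_click = dedup.should_pay("u1", "ad9", now=10)
second_click = dedup.should_pay("u1", "ad9", now=20)
unserved_click = dedup.should_pay("u2", "ad9", now=20)

dedup.record_served("u3", "ad9", now=0)
late_click = dedup.should_pay("u3", "ad9", now=7200)
```

Every impression is a write and every click is a read followed by a write, so the store behind this logic sees the high read-write rate mentioned above. Record expiration in the database can take the place of the manual window check.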
When working with more advanced algorithms – like machine learning – the best tech stacks will use advanced libraries. They won’t be able to easily push compute into the database using SQL – SQL just gets in the way. Streaming and distributed processing frameworks – like Spark, Impala, Storm, and Akka – need shared stores of recent behavior. Aerospike is an excellent database for these projects.