A real-time transactional-analytical system needs to combine transactional and streaming data in a single high-performance database that can keep pace with inbound data streams. It also needs to work with various analytics frameworks, including machine learning and artificial intelligence. Aerospike Connect for Spark addresses these requirements by combining streaming data with historical data for enhanced real-time decisioning and insights.

Have questions before you get started?

How it works

Aerospike Connect for Spark (Figure 1) supports streaming APIs that use Spark Structured Streaming to provide very low latency for both reads and writes, enabling AI/ML use cases that use Aerospike as a system of engagement in their Spark Streaming pipeline. Aerospike Connect for Spark, coupled with the Aerospike Database scan-by-partition capability, predicate filtering, and the mapping of Aerospike partitions to Spark partitions, allows massive parallelization.

Figure 1. Aerospike Connect for Spark
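
To make the partition-mapping idea above more concrete, here is a minimal sketch in plain Python. It relies on one fact about Aerospike (each namespace is divided into 4096 partitions) and one assumption for illustration: that the partitions are handed out to Spark tasks in contiguous, near-equal ranges. The function name `split_partitions` and its parameters are hypothetical, and this is not the connector's actual assignment logic.

```python
# Illustrative sketch only: NOT the connector's real partition-assignment code.
AEROSPIKE_PARTITIONS = 4096  # number of partitions in an Aerospike namespace


def split_partitions(num_spark_partitions, total=AEROSPIKE_PARTITIONS):
    """Split partition IDs 0..total-1 into contiguous, near-equal ranges.

    Returns a list of (first_id, count) tuples, one per hypothetical Spark task.
    """
    if not 1 <= num_spark_partitions <= total:
        raise ValueError("num_spark_partitions must be between 1 and total")
    base, extra = divmod(total, num_spark_partitions)
    ranges = []
    start = 0
    for i in range(num_spark_partitions):
        # The first `extra` tasks take one additional partition each.
        count = base + (1 if i < extra else 0)
        ranges.append((start, count))
        start += count
    return ranges


if __name__ == "__main__":
    for first_id, count in split_partitions(8):
        last_id = first_id + count - 1
        print(f"Spark task scans Aerospike partitions {first_id}..{last_id}")
```

Because each range can be scanned independently, every Spark task in this sketch could read its own slice of the namespace in parallel, which is the property that scan-by-partition makes possible.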

Why Aerospike Connect for Spark

Aerospike Connect for Spark enables the creation of data-intensive applications, such as AI/ML and ETL, with familiar, easy-to-use Spark tools.

Real-time Analytics
Drastically reduce time to insight by combining massively parallel computation in Spark with the massively parallel reads from Aerospike.
Gain Closed-loop Business Insights
Gain closed-loop business insights by operating on transactional data and streaming the resulting insights back into the database using Aerospike Connect for Spark.
Rapid Development
Save time developing analytics and AI/ML applications that use data in Aerospike by using a Spark-supported language of your choice and the rich ecosystem of libraries already available with Spark.
Lower TCO
Lower total cost of ownership (TCO) by enabling analysis of massive datasets with a smaller storage cluster footprint.

Get started with Aerospike
