A real-time transactional-analytical system must combine transactional and streaming data in a single high-performance database that can keep pace with inbound data streams. It must also work with a range of analytics frameworks, including those for machine learning and artificial intelligence. Aerospike Connect for Spark addresses these requirements by combining streaming data with historical data for enhanced real-time decisioning and insights.
How it works
Aerospike Connect for Spark (Figure 1) supports streaming APIs that use Spark Structured Streaming to provide very low latency for both reads and writes, enabling AI/ML use cases that use Aerospike as a system of engagement in their Spark Streaming pipeline. The connector, coupled with the Aerospike Database scan-by-partition capability, predicate filtering, and the mapping of Aerospike partitions to Spark partitions, allows massive parallelization.
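To make the partition-mapping idea concrete, here is a minimal conceptual sketch in plain Python. It assumes only that an Aerospike namespace is divided into 4096 partitions and shows one plausible way to assign contiguous partition ranges to Spark partitions so that each Spark task can scan its own range independently. The helper name split_partitions and the even-split policy are illustrative assumptions, not the connector's actual implementation, which may map partitions differently.

```python
from typing import List, Tuple

# Number of data partitions in an Aerospike namespace.
AEROSPIKE_PARTITIONS = 4096


def split_partitions(
    num_spark_partitions: int, total: int = AEROSPIKE_PARTITIONS
) -> List[Tuple[int, int]]:
    """Hypothetical helper: divide Aerospike partition IDs 0..total-1 into
    contiguous (begin, count) ranges, one range per Spark partition.

    This is an illustration of the general idea only; the real connector
    may use a different assignment policy.
    """
    if not 1 <= num_spark_partitions <= total:
        raise ValueError("num_spark_partitions must be between 1 and total")

    # Spread any remainder over the first `extra` ranges so that range
    # sizes differ by at most one partition.
    base, extra = divmod(total, num_spark_partitions)
    ranges = []
    begin = 0
    for i in range(num_spark_partitions):
        count = base + (1 if i < extra else 0)
        ranges.append((begin, count))
        begin += count
    return ranges


if __name__ == "__main__":
    # Each (begin, count) pair could drive one independent partition scan.
    for begin, count in split_partitions(8):
        print(f"scan Aerospike partitions {begin}..{begin + count - 1}")
```

Because the ranges do not overlap and together cover every partition, the scans can run in parallel without coordination, which is the property that makes the mapping useful for parallel reads.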
Figure 1. Aerospike Connect for Spark
Why Aerospike Connect for Spark
Aerospike Connect for Spark enables the creation of data-intensive applications, such as AI/ML and ETL pipelines, with familiar, easy-to-use Spark tools.