Widely adopted in the demanding AdTech space, Aerospike is diversifying its customer base by catering to real-time, mission-critical use cases in other industries, including financial services, e-commerce, telecommunications, and retail. This expansion brings with it a new set of requirements. For instance, financial companies require the ability to filter input data by a given set of predicates. Thus, Aerospike is moving beyond its origins as a key-value store and adding database functionality such as predicate filtering in order to serve these enterprises over the widest range of use cases.
Accordingly, we are excited to announce the release of Aerospike 3.12. This release builds on our core mission of providing customers a highly scalable and performant database. In v3.12, we’ve added exciting features such as in-database filtering and several other heavily requested operational features, boosted scalability, and further optimized our already impressive database performance.
While Aerospike focuses on predictable, high-performance key-value access, its query and scan facilities provide many necessary features for maintenance and analytics. To enhance these capabilities, Aerospike now contains in-database filtering, which is a key internal database component necessary for more extended analytics.
This heavily requested feature enables you to write arbitrary expressions and to filter both query and scan results based on contents of individual columns. The expressions are executed in C for high performance.
Our initial programming interface to this logic (a basic, internal API representing our low-level network protocols) can be found in the PredExp Java documentation. Aerospike is working to provide a friendlier “high-level” query interface, but in the interest of satisfying developers who need access to this functionality now, we are making this low-level interface available immediately.
The greatest speed increase will be seen for metadata operations like “find most recently changed items” or “items that will expire soon”. While speed is difficult to judge because it’s dependent on the complexity of the operation and on potential network bottlenecks, many simple operations will see speed increases greater than 2x—with more complex operations seeing a greater benefit.
With filters, there is no need to learn Lua, nor to write, debug, manage, or deploy user-defined functions (UDFs). Supported filter data include metadata—such as when the record was most recently written—values of individual bins, and data inside lists and sets. Operations include standard equality and inequality operations, as well as AND and OR operations and regular expressions.
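To make the idea concrete, here is a minimal, self-contained Python sketch of how a filter expression can be evaluated against each record during a scan. It is an illustration only, not the Aerospike PredExp API: the `Record` class, the `evaluate` function, and the operation names are all hypothetical, and the postfix (operands first, operator last) layout is simply a convenient assumption for the sketch.

```python
import operator
import re
from dataclasses import dataclass

# Hypothetical record model: bin values plus one piece of metadata.
@dataclass
class Record:
    bins: dict
    last_update_ns: int  # last write time, nanoseconds since the epoch

COMPARE = {"eq": operator.eq, "ne": operator.ne,
           "gt": operator.gt, "lt": operator.lt}

def evaluate(predicate, record):
    """Evaluate a postfix predicate (list of (op, arg) pairs) on one record."""
    stack = []
    for op, arg in predicate:
        if op == "bin":                      # push the value of a bin
            stack.append(record.bins.get(arg))
        elif op == "value":                  # push a constant
            stack.append(arg)
        elif op == "last_update":            # push record metadata
            stack.append(record.last_update_ns)
        elif op in COMPARE:                  # compare the top two entries
            right, left = stack.pop(), stack.pop()
            stack.append(left is not None and COMPARE[op](left, right))
        elif op == "regex":                  # match a string bin to a pattern
            pattern, text = stack.pop(), stack.pop()
            stack.append(isinstance(text, str)
                         and re.search(pattern, text) is not None)
        elif op in ("and", "or"):            # combine the top `arg` results
            operands = [stack.pop() for _ in range(arg)]
            stack.append(all(operands) if op == "and" else any(operands))
        else:
            raise ValueError(f"unknown operation: {op}")
    if len(stack) != 1:
        raise ValueError("malformed predicate")
    return bool(stack[0])

records = [
    Record({"age": 42, "city": "San Jose"}, last_update_ns=100),
    Record({"age": 25, "city": "San Diego"}, last_update_ns=200),
    Record({"age": 51, "city": "Austin"}, last_update_ns=300),
]

# age > 30 AND city matches ^San
predicate = [
    ("bin", "age"), ("value", 30), ("gt", None),
    ("bin", "city"), ("value", "^San"), ("regex", None),
    ("and", 2),
]
matches = [r.bins for r in records if evaluate(predicate, r)]
print(matches)
```

The same evaluator handles a metadata-only filter such as "changed after a given time": `[("last_update", None), ("value", 150), ("gt", None)]`. In the real server this work happens inside the database, so records that fail the filter never cross the network.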
By building this core functionality, Aerospike takes a key step on its way to providing even more flexibility for high-performance transactional analytics. This feature is generally available (GA) immediately in both our Community Edition (CE) and our Enterprise Edition (EE).
Performance Optimizations with CPU Pinning and NUMA Enhancements
Aerospike continues to extend its lead in database performance with optimizations related to data locality and CPU pinning in both Aerospike editions—the CE and the EE. These changes are targeted to DRAM configurations, where database performance is not limited by storage devices.
By simply using the new “auto-pin” configuration, performance can increase between 10% and 45%. This feature should be used in conjunction with the “sprigs” feature introduced in Aerospike 3.11, which greatly decreases memory bus bandwidth consumption for DRAM indexes.
By setting the auto-pin configuration variable to “cpu”, the machine’s network interrupts will be configured to route to the same core as the threads servicing that port. This provides additional performance whenever Aerospike is the only process on a server. The auto-pin variable can also be set to “numa”, which can provide even greater performance boosts; however, this setting must be used with care with a rack-aware configuration in order to provide correct reliability guarantees. If you wish to use this feature, we recommend that you contact Aerospike support for further details.
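As a rough illustration, the setting described above would appear in the server configuration file along these lines. This is a sketch that assumes `auto-pin` lives in the `service` context of `aerospike.conf`; please confirm the exact placement and prerequisites in the configuration reference for your server version before using it.

```
service {
    # Assumed placement: route network interrupts to the same cores
    # as the threads servicing them.
    auto-pin cpu
    # Alternative, with the rack-aware caveats noted above:
    # auto-pin numa
}
```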
In order to support new devices and higher-density RAM configurations, we have increased the number of objects per namespace per server from 4 billion to 32 billion. At 64 bytes of primary index per object, the index for each namespace can thus consume 2 TB of DRAM in each node.
This feature has been requested by our highest-scale customers, who had been working around the limitation of 4 billion by creating multiple namespaces. With the new limit of 32 billion objects per server and 128 nodes per cluster, assuming a replication factor of two (two copies of each record), a practical limit of 2 trillion objects per namespace can be achieved.
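The arithmetic behind these figures can be checked in a few lines of Python. This is a back-of-the-envelope sketch; the 64-byte index entry size is an assumption about the primary index that the text above does not state, and the constant and function names are ours.

```python
# Assumption: each object costs 64 bytes of primary index in DRAM.
INDEX_ENTRY_BYTES = 64
OBJECTS_PER_NAMESPACE_PER_NODE = 32 * 10**9   # new per-node limit
MAX_NODES = 128                               # nodes per cluster
REPLICATION_FACTOR = 2                        # two copies of each record

def index_dram_bytes(objects):
    """DRAM consumed by the primary index for `objects` records."""
    return objects * INDEX_ENTRY_BYTES

def unique_objects(per_node, nodes, replication_factor):
    """Distinct records a namespace can hold across the whole cluster."""
    return per_node * nodes // replication_factor

index_tb = index_dram_bytes(OBJECTS_PER_NAMESPACE_PER_NODE) / 10**12
cluster_objects = unique_objects(
    OBJECTS_PER_NAMESPACE_PER_NODE, MAX_NODES, REPLICATION_FACTOR
)

print(f"index DRAM per namespace per node: {index_tb:.3f} TB")
print(f"unique objects per namespace: {cluster_objects / 10**12:.3f} trillion")
```

Both results come out at about 2.05, which matches the rounded figures of 2 TB per node and 2 trillion objects per namespace.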
This feature is available in the Aerospike Enterprise Edition.
Namespace and Set Truncation
One of the most requested features lately has been the ability to remove data quickly from individual sets and namespaces. Indeed, the ability to truncate a set or namespace has a wide variety of applications. While it was possible in the past to truncate using the scan and delete APIs, these were slow and cumbersome to use.
With the 3.12 release, in both our Community Edition and our Enterprise Edition, truncating data from sets and namespaces can be accomplished quickly and efficiently with a single administration (asinfo) command or a single API call. These calls rapidly remove entries from the in-DRAM index. Also, in the Enterprise Edition, an entry is added in the SMD (System Meta Data) subsystem so that a full restart does not cause the data to return.
The API and ‘asinfo’ commands also take an optional timestamp. When it is supplied, only records last updated before that time are removed, so newer data in the same set or namespace is kept. As always with time-based operations, please make sure your database clocks are synchronized to an appropriate level of precision.
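The sketch below models the effect of a timestamped truncate on an in-memory index, and shows how the corresponding info command string might be assembled. Several things here are assumptions rather than statements from this post: that the timestamp is a cutoff expressed in nanoseconds since the Unix epoch, that the command takes the form `truncate:namespace=...;set=...;lut=...`, and all helper names. Please check the info command reference before relying on the exact syntax.

```python
from datetime import datetime, timezone

def to_lut_ns(moment):
    """Convert a datetime to whole-second nanoseconds since the Unix epoch."""
    return int(moment.timestamp()) * 1_000_000_000

def truncate_command(namespace, set_name, before=None):
    """Build the (assumed) info command string for truncating one set."""
    parts = [f"truncate:namespace={namespace}", f"set={set_name}"]
    if before is not None:
        parts.append(f"lut={to_lut_ns(before)}")
    return ";".join(parts)

def truncate(index, set_name, before_ns=None):
    """Model: drop index entries of `set_name` last updated before the cutoff.

    With no cutoff, every entry in the set is dropped.
    """
    return {
        key: entry for key, entry in index.items()
        if not (entry["set"] == set_name
                and (before_ns is None or entry["lut"] < before_ns))
    }

cutoff = datetime(2017, 1, 1, tzinfo=timezone.utc)
index = {
    "k1": {"set": "demo", "lut": to_lut_ns(datetime(2016, 12, 1, tzinfo=timezone.utc))},
    "k2": {"set": "demo", "lut": to_lut_ns(datetime(2017, 2, 1, tzinfo=timezone.utc))},
    "k3": {"set": "other", "lut": to_lut_ns(datetime(2016, 6, 1, tzinfo=timezone.utc))},
}

print(truncate_command("test", "demo", cutoff))
print(sorted(truncate(index, "demo", to_lut_ns(cutoff))))
```

In this model, `k1` is removed because it was last written before the cutoff, `k2` survives because it is newer, and `k3` is untouched because it belongs to a different set.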
Incremental Backup and Restore
Aerospike now supports incremental backup and restore in both the Community Edition and the Enterprise Edition. Using the core filtering capabilities introduced in Aerospike server v3.12, it is now possible to back up only the records that have changed since a certain time. Documentation about this feature is available here; please see the new “data filtering options” regarding backup time, which allow you to specify a time range. Naturally, you still have access to the existing data filtering options, such as the bins, set, and namespace you seek to back up.
As this feature uses timestamps, its results are only as precise as the synchronization of the clocks on your servers. Aerospike does not include time synchronization, so industry-standard ntpd or similar should be configured and maintained. The time on the server running the backup tool must also be synchronized.
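As an illustration of how a nightly job might drive an incremental backup, the Python sketch below assembles an `asbackup` command line with a time range. It assumes the time-range options are spelled `--modified-after` and `--modified-before` and take a `YYYY-MM-DD_HH:MM:SS` value interpreted in the local time of the machine running the tool; the helper name, host, namespace, and directory are hypothetical. Please verify the options against the backup tool documentation for your version.

```python
from datetime import datetime

# Assumed timestamp format for the time-range options.
STAMP = "%Y-%m-%d_%H:%M:%S"

def incremental_backup_args(host, namespace, directory,
                            modified_after, modified_before=None):
    """Build an argument list for backing up records changed in a time range."""
    args = [
        "asbackup",
        "--host", host,
        "--namespace", namespace,
        "--directory", directory,
        "--modified-after", modified_after.strftime(STAMP),
    ]
    if modified_before is not None:
        args += ["--modified-before", modified_before.strftime(STAMP)]
    return args

# Hypothetical values: back up everything changed since the last run.
last_run = datetime(2017, 3, 1, 0, 0, 0)
args = incremental_backup_args("127.0.0.1", "test", "/backups/incr", last_run)
print(" ".join(args))

# To execute it for real, pass the list to subprocess.run(args, check=True).
```

Recording the start time of each run and passing it as the lower bound of the next run leaves no gap between backups, provided the clocks are synchronized as described above.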
In supporting this feature, Aerospike realizes that with both greater scale-up and scale-out technology, single point-in-time backups have become impractical for most commercial deployments. By adding this feature, we encourage more deployments to back up data more frequently.
Aerospike Management Console v4.0
The Aerospike Management Console (AMC) is a web-based tool to monitor and manage an Aerospike cluster. It provides live updates to the current status of a cluster and includes features to let you see at a glance the throughput, storage usage, and configuration of a cluster.
Although not strictly part of Aerospike Server v3.12, AMC v4.0 was released alongside it, and because it is a complete overhaul of AMC, it deserves a mention in this blog post. AMC version 4.0 was rewritten in Go, which brings many performance and stability improvements. It uses less memory, makes configuring and using AMC in non-root mode much easier, and uses exactly one connection to each database node. Consequently, it is easier to deploy, snappier, and able to handle large Aerospike clusters. Its new features include the ability to do the following:
- Enable basic HTTP authentication (CE and EE).
- Configure clusters that are permanently monitored (CE and EE).
- Connect to a TLS-enabled secure Aerospike cluster (EE only).
- Persist notifications and show them even after you close your browser (EE only).
- Configure emails to receive the Aerospike cluster alerts (EE only).
Reminder: Large Data Type (LDT) Deprecation
The large data type functionality has been deprecated in both the Community Edition and the Enterprise Edition; it will be removed in a future release. Please migrate any application functionality from this interface; contact Aerospike if you need any advice regarding the best application design for meeting your need for large data types.
Other Notable Improvements and Fixes
Release 3.12 incorporates other improvements, such as:
- Leak removal: The removal of a server file descriptor leak on queries and of a memory leak during data migration.
- Intracluster message transport improvement: We separated out the delivery of different message types, such as control, transaction, and migration, resulting in more reliable message delivery.
- PredExp filter APIs in Java and C.
- A new version of Aerospike Admin (an interface for Aerospike users to view the stats of their Aerospike cluster): Asadm v0.1.8, which now has health check capabilities.
With our next releases, we intend to introduce several new features. Please consider the list below a preview, not a commitment!
- Improved Paxos clustering algorithms for larger clusters and greater stability in public cloud environments.
- Addition and removal of namespaces without downtime.
If you’d like to download Aerospike 3.12, visit our download page.
As always, we look forward to your input and help as we continue to improve and enhance Aerospike. Feel free to contribute your feedback, ideas, and questions to our user forum, file GitHub issues, or create a pull request for the next great feature you’d like to contribute to the Aerospike user community!