Aerospike Connect for Spark Release Notes

  • 2.3.1
    Release Date: July 16, 2020
    • Fixed a broken API for creating an AerospikeConfig instance.

  • 2.3.0
    Release Date: June 19, 2020
    • Added nested updateByKey support and prioritization of the __digest, __ttl, and __generation filters.

    Known Issues

    • Aerospike Connect for Spark version 2.0 and above is only compatible with Aerospike server version 4.9 and above.
    • updateByKey supports only keys that are accepted by the Java client.

    New Features

    • Records can be inserted through nested updateByKey.
    • Spark filters are reordered so that __digest, __ttl, and __generation, if present, always come first.
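
    The filter reordering described above can be pictured with a small sketch. This is only an illustration and not the connector's code: the (column, operator, value) tuple layout and the prioritize_filters helper are hypothetical, and keeping the original relative order within each group is an assumption.

```python
# Columns that the release note says are moved to the front when present.
PRIORITY_COLUMNS = {"__digest", "__ttl", "__generation"}


def prioritize_filters(filters):
    """Return filters with priority-column filters first.

    Each filter is a hypothetical (column, operator, value) tuple.
    sorted() is stable, so filters keep their original relative order
    within the priority group and within the remaining group.
    """
    return sorted(filters, key=lambda f: f[0] not in PRIORITY_COLUMNS)


filters = [
    ("age", ">", 30),
    ("__ttl", "<", 3600),
    ("name", "=", "a"),
    ("__digest", "=", "abc"),
]
ordered = prioritize_filters(filters)
for column, op, value in ordered:
    print(column, op, value)
```

    Filters on other columns follow in their original order.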

  • 2.2.0
    Release Date: May 12, 2020

    Known Issues

    • Aerospike Connect for Spark version 2.0 and above is only compatible with Aerospike server version 4.9 and above.
    • The default value of aerospike.partition.factor has changed from 0 to 12.
      • Before version 2.2, the number of Aerospike partitions was computed as 4096 >> f, where f is the aerospike.partition.factor.
      • From version 2.2 onwards, the number of Aerospike partitions is computed as 2^f, where f is the aerospike.partition.factor.
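
    The two formulas above can be compared with a short worked example. The function names are hypothetical and exist only for illustration. The arithmetic shows that both defaults yield 4096 partitions, while any other explicit factor carried over from an earlier version gives a different partition count.

```python
def partitions_before_2_2(factor: int) -> int:
    """Partition count before version 2.2: 4096 >> f."""
    return 4096 >> factor


def partitions_from_2_2(factor: int) -> int:
    """Partition count from version 2.2 onwards: 2^f."""
    return 2 ** factor


# Old default (0) and new default (12) both give 4096 partitions.
print(partitions_before_2_2(0))   # 4096
print(partitions_from_2_2(12))    # 4096

# The same explicit factor means something different across versions.
print(partitions_before_2_2(2))   # 1024
print(partitions_from_2_2(2))     # 4

# Factor 15 gives the 2^15 maximum listed for version 2.2.
print(partitions_from_2_2(15))    # 32768
```

    To keep the same partition count after upgrading, a factor f set before version 2.2 corresponds to 12 - f under the new formula, since 4096 >> f equals 2^(12 - f) for f between 0 and 12.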

    New Features

    • Ability to extend Aerospike partitions up to 32768 (2^15).
    • Ability to specify the target set for Spark write operations through the aerospike.writeset flag.

  • 2.1.0
    Release Date: April 28, 2020

    Known Issues

    • Aerospike Connect for Spark version 2.0 and above is only compatible with Aerospike server version 4.9 and above.

    New Features

    • Added capability of streaming writes to Aerospike.

  • 2.0.0
    Release Date: April 15, 2020

    Known Issues

    • Aerospike Connect for Spark version 2.0 and above is only compatible with Aerospike server version 4.9 and above.

    New Features

    • Ability to fine-tune up to 4096 scan partitions concurrently.
      • This can be further tuned by setting the aerospike.partition.factor value appropriately.
    • TLS and LDAP support.
    • Ability to query multiple primary keys through the connector.

    Improvements

    • Query engine improvements.
    • Ability to specify seed nodes through Aerospike configuration.
    • Ability to specify feature file from configuration or HDFS.
    • Improved error handling in case of write/save failure.
    • Ability to enable client-server compression in the Spark connector.
    • Ability to set records per second for scans.
    • Fixed duplicate data accumulation in primary key calls.

  • 1.1.2
    Release Date: October 21, 2019

    Known Issues

    • A primary key call fetches multiple copies of a record, which accumulates duplicate data.

    New Features

    • Added explicit schema for saves.

  • 1.1.0
    Release Date: March 26, 2019
    • Initial Standalone Connector General Availability release.
    • Embedded Spark update.

    New Features

    • Spark 2.4.0 support.
    • Added the Dataset aeroIncrease function, which lets a Dataset send add/increment operations to the Aerospike server.

  • 1.0.0
    Release Date: March 12, 2019
    • Initial Embedded Spark General Availability release.

    New Features

    • Reading from Aerospike to a DataFrame/Dataset.
    • Saving a DataFrame/Dataset to Aerospike.
    • Spark SQL multiple filters pushed down to the Aerospike cluster.
    • Support for Geo points-within-region queries using Aerospike.
    • Join a Spark Dataset that contains record keys to record data stored in Aerospike.