Drone-nado: Building Scalable Geospatial Applications with Aerospike

Today’s Geospatial Problem

Alvin Richards

We are accustomed to Geolocation-aware applications like Foursquare, Yelp & Uber. These applications rely on performing proximity searches based on your current location to provide what’s interesting, open or fun near you. You can “check-in” to provide a feedback loop to your friends or group about your current location and what you are up to. Advertisers can then use your location information to serve you targeted offers and advert campaigns. But like everything else, the world of Geospatial data is changing, both in variety and velocity. Just like the Internet age grew data by orders of magnitude with e-Commerce and other applications, we are going from human-generated data (such as a check-in at a restaurant) to use cases where machines and algorithms are generating this Geospatial data – and at a much faster pace. Geospatial data is no longer a static set of points, regions or shapes to inquire about; it’s now dynamic and rich in content. We are, in a very real way, entering the age of the Internet of Moving Things.

Drone-nado

To build engaging applications and experiences, it is critical for you to also be able to respond to – and integrate with – a broader, changing set of Geospatial data. You need a database that can scale and maintain its throughput and latency, not only with traditional structured and unstructured data, but now, with Geospatial data. At Aerospike, our marching orders are “Speed at Scale”. We think Geospatial data deserves the same treatment.

Introducing Aerospike’s Geospatial Features

Aerospike can now store GeoJSON objects and execute various types of queries. Internally, we use Google’s S2 library and Geohashing to encode and index these points and regions. Aerospike can store Points & Polygons (including holes) and support the following types of queries:
  • Points within a Region
  • Points within a Radius
  • Regions a Point is in
This can be combined with a User Defined Function (UDF) to filter the results – i.e., to further refine them to only include, for instance, Bars, Restaurants or Places of Worship near you – even ones that have such granular attributes as being currently open or having availability. Additionally, finding the Region a point is in allows, for example, an advertiser to figure out campaign regions that a mobile user is in, and hence serve the user a geospatially targeted advertisement. Internally, the same storage mechanisms are used, which enables highly concurrent reads and writes to the Geospatial data or to other data held on the record.
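To make the shapes concrete, here is a minimal sketch of building the two GeoJSON object types mentioned above as plain strings. The `GeoJson` class and its methods are hypothetical helpers written for this illustration, not part of any Aerospike client. The one rule worth remembering is from the GeoJSON format itself: coordinates are written longitude first, then latitude, and a Polygon's first ring is its outer boundary while any further rings are holes.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not an Aerospike API.
class GeoJson {

    // GeoJSON positions are [longitude, latitude], in that order.
    static String point(double lng, double lat) {
        return "{\"type\":\"Point\",\"coordinates\":[" + lng + "," + lat + "]}";
    }

    // Each ring is a closed list of [lng, lat] positions (first == last).
    // The first ring is the outer boundary; any further rings are holes.
    static String polygon(double[][]... rings) {
        StringJoiner ringJoiner = new StringJoiner(",", "[", "]");
        for (double[][] ring : rings) {
            StringJoiner positions = new StringJoiner(",", "[", "]");
            for (double[] p : ring) {
                positions.add("[" + p[0] + "," + p[1] + "]");
            }
            ringJoiner.add(positions.toString());
        }
        return "{\"type\":\"Polygon\",\"coordinates\":" + ringJoiner + "}";
    }
}
```

A string produced this way is the kind of value that a GeoJSON bin, such as the ones created with `Bin.asGeoJSON` later in this post, would hold.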

The Aerospike Drone Delivery Service (ADDS)

We have put the demo code on Github so you can play along at home and see how easy it is to build rich, complex applications with the simple APIs that Aerospike delivers for all the popular languages. Imagine that you are building a drone delivery system. The goal of your Drone Delivery Service (DDS) is to track – in real time – fast-moving objects while your Drone fleet scurries around the city picking up and dropping off letters, packages and other essential items for modern-day living. It’s like Uber for Drones, but Drones are just WAY cooler than cars – just ask my 10-year-old son. To be fair, his Drone is stuck up a tree, clearly showing the need for better mapping of dangerous objects and places to avoid!

The Problem

The mission of the Drone Delivery Service is to efficiently pick up and deliver letters, packages and other items with our modern Drone fleet. To accomplish this, we must be able to do the following:
  • Accept new delivery requests, capturing the pickup and drop-off latitude/longitude coordinates
  • Provide the current state of the item – whether the package is waiting to be picked up, a Drone is en route to pick it up, it’s currently on the delivery route, or it has been delivered
  • Allow each Drone to understand which items requiring delivery are closest, and reserve one for itself (so that another Drone will not try to pick up the same item)
  • Allow our customers to track, in real time, where their item is currently located
  • Allow our NOC to monitor all the outstanding packages and Drone locations and determine if we need to add further Drone capacity
We needed a database solution that not only provides the critical Geospatial features, but will maintain Speed at Scale as the business grows. And this is where Aerospike came in: enter the Aerospike Drone Delivery Service (ADDS).

The Algorithm

The full code base is on Github; below are the key functional flows and code snippets to illustrate how the Geospatial data is manipulated.

Accepting new items for delivery

We track each new item that needs to be delivered in a Jobs set, with a timestamp of when the request was received, its current state (e.g. awaiting pickup) and the lat/long for pickup and delivery. New requests need to be received in real time, and the information must immediately be available for the Drone fleet to act upon. A record looks like this:
aql> select * from demo1.jobs
[
 {
   "id": 1,
   "state": "Waiting",
   "Waiting": "\"{\"type\":\"Point\",\"coordinates\":[-0.09999999999999998,-0.3625]}\"",
   "destination": "\"{\"type\":\"Point\",\"coordinates\":[-0.43374999999999997,-0.59125]}\"",
   "driverid": 0
   ...
 }
]
Here’s the Java code snippet for the ingestion of new Jobs:
public boolean put(Job job) {
   Key key = new Key(database.namespace, setName, job.id);

   // The origin location is stored in a bin named after the Job's current state
   String originBinName = job.state.name();
   Bin idBin          = new Bin("id",        job.id);
   Bin stateBin       = new Bin("state",     originBinName);
   Bin originBin      = Bin.asGeoJSON(originBinName,
                                      job.getOrigin().toGeoJSONPointDouble());
   Bin destinationBin = Bin.asGeoJSON("destination",
                                      job.getDestination().toGeoJSONPointDouble());
   Bin locationBin    = Bin.asGeoJSON("location",
                                      job.getLocation().toGeoJSONPointDouble());
   Bin candidateBin   = new Bin("candidate", job.isCandidate());
   Bin droneIdBin     = new Bin("droneid",   job.droneid);
   // Plus other attributes...

   // The policy carries the generation check described later in this post
   WritePolicy writePolicy = makePutWritePolicy(job);

   try {
     ++Metering.jobPuts;
     database.client.put(writePolicy, key, idBin, stateBin, originBin,
                         destinationBin, locationBin, candidateBin, droneIdBin);
     // The write succeeded, so advance our cached copy of the generation too
     ++((Metadata)job.metadata).generation;
     return true;
   } catch (AerospikeException e) {
     // Includes the case where the generation check rejected the write
     e.printStackTrace();
     return false;
   }
}
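One detail in the snippet above is easy to miss: the origin is written to a bin whose name is the Job's state. As far as can be inferred from the snippets in this post, that is why the record shown earlier has a bin called "Waiting", and why a query filtered on the "Waiting" bin (as in the radius query further down) would only match Jobs that are still waiting, given that writes use REPLACE. The sketch below models the bins as a plain map to show the effect; `JobBins` and `binsFor` are hypothetical names for this illustration and do not talk to a database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of a Job record's bins; not an Aerospike API.
class JobBins {

    // Builds the bin map for a Job, naming the origin bin after the state.
    static Map<String, Object> binsFor(int id, String state,
                                       String originGeoJson,
                                       String destinationGeoJson) {
        Map<String, Object> bins = new LinkedHashMap<>();
        bins.put("id", id);
        bins.put("state", state);
        bins.put(state, originGeoJson);            // e.g. bin "Waiting" holds the origin
        bins.put("destination", destinationGeoJson);
        return bins;
    }
}
```

Because the whole record is replaced on each write in this model, a Job that moves from "Waiting" to another state no longer has a "Waiting" bin at all, so it drops out of any lookup keyed on that bin.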

What item should be picked up next?

When a Drone becomes available (it could have come online or have just dropped off an item), it needs to figure out which package needs to be picked up next. This is achieved by:
  • Performing a WithinRadius query
  • If no item is available for pickup:
    • Increasing the radius of the search
    • If the maximum radius has been exceeded,
      • resetting to the minimum radius
      • sleeping for a while
  • If an item is available for pickup:
    • reserving the item so it can’t be picked up by another Drone
    • moving to the pickup location from the current position “as the crow flies”, periodically updating the current position
    • updating the status of the item to “InProcess”
    • moving to the drop-off location from the current position “as the crow flies”, periodically updating the current position
    • updating the status of the item to “Delivered”
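The search half of the steps above can be sketched as a small loop. This is a minimal illustration rather than the demo's actual code: `ExpandingRadiusSearch` is a hypothetical name, the radius query is passed in as a function so the sketch stays self-contained, and doubling the radius on each miss is an assumption (the post only says the radius is increased).

```java
import java.util.Optional;
import java.util.function.DoubleFunction;

// Hypothetical sketch of the "widen the search until something turns up" loop.
class ExpandingRadiusSearch {

    // Tries minRadius, then 2x, 4x, ... while the radius stays within maxRadius.
    // Returns the first candidate found, or empty if every radius came up dry.
    static <T> Optional<T> search(double minRadius, double maxRadius,
                                  DoubleFunction<Optional<T>> queryWithinRadius) {
        for (double radius = minRadius; radius <= maxRadius; radius *= 2) {
            Optional<T> candidate = queryWithinRadius.apply(radius);
            if (candidate.isPresent()) {
                return candidate;
            }
        }
        // Maximum radius exceeded: the caller sleeps for a while,
        // then calls search() again, which restarts at minRadius.
        return Optional.empty();
    }
}
```

Keeping the first radius small means a Drone in a busy area gets an answer from a cheap, narrow query, and only Drones in quiet areas pay for the wider ones.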
Here’s a code snippet for performing the WithinRadius query:
private void queryCircle(Location droneLocation, double radius, Predicate<? super Job> action) {
   Statement stmt = new Statement();
   stmt.setNamespace(database.namespace);
   stmt.setSetName(setName);
   // Only Jobs whose location sits in the "Waiting" bin are candidates for pickup
   String binName = Job.State.Waiting.name();
   // geoWithinRadius takes longitude, then latitude, then the radius in meters
   stmt.setFilters(Filter.geoWithinRadius(binName, droneLocation.x, droneLocation.y, radius));

   QueryPolicy policy = new QueryPolicy();
   ++Metering.jobQueryWithinRadius;
   RecordSet rs = database.client.query(policy, stmt);
   try {
     while (rs.next()) {
       ++Metering.jobRadiusResults;
       Job job = get(rs.getKey(), rs.getRecord());
       // The caller's predicate returns false to stop iterating early
       if (!action.test(job)) {
         break;
       }
     }
   } finally {
     rs.close();   // always release the query's resources
   }
}
We need to ensure that one and only one drone gets assigned to pick up the Job. To do this, we use the concept of Generation of the record, an internal piece of metadata that Aerospike maintains. The generation count gets incremented each time a record is changed. We can create a WritePolicy that automatically compares the Generation number from the record we read to that on the Aerospike server. This ensures that another Drone has not reserved the same Job between the current thread querying and updating the record. This method gets called from the put method defined above.
private WritePolicy makePutWritePolicy(Job job) {
   WritePolicy writePolicy = new WritePolicy();
   writePolicy.recordExistsAction = RecordExistsAction.REPLACE;
   if (((Metadata)job.metadata).generation != 0) {
     // We care only when we're changing the job's state,
     // but it doesn't hurt to leave this enabled.
     writePolicy.generationPolicy = GenerationPolicy.EXPECT_GEN_EQUAL;
     writePolicy.generation = ((Metadata)job.metadata).generation;
   }
   return writePolicy;
}
We can now use this mechanism to claim the Job record and update the State:
public boolean putWithNewState(Job job, Job.State from, Job.State to) {
   Database.assertWriteLocked(job.lock);
   if (from == to && from != Job.State.Init) {
     return false;
   }
   if (job.getState() != from) {
     return false;
   }
   ((Metadata) job.metadata).previousState = job.state;
   job.state = to;
   boolean success = put(job);
 

   if (!success) {
     // The write was rejected, so restore the local state
     job.state = from;
   }

   return success;
}
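To see why the generation check leaves exactly one winner, here is a small in-memory stand-in for a record. It is not the Aerospike client: `FakeJobRecord` and `putIfGenerationEquals` are hypothetical names, and the class only mimics the behaviour described above for `EXPECT_GEN_EQUAL`, where a write is applied only if the generation the writer read still matches the stored one.

```java
// Hypothetical in-memory stand-in for a record with a generation counter.
class FakeJobRecord {
    private String state = "Waiting";
    private int generation = 1;

    synchronized int generation() { return generation; }

    synchronized String state() { return state; }

    // Applies the write only if the caller's generation is still current,
    // then bumps the generation so any other stale writer is rejected.
    synchronized boolean putIfGenerationEquals(int expectedGeneration, String newState) {
        if (expectedGeneration != generation) {
            return false;   // someone else changed the record since we read it
        }
        state = newState;
        generation++;
        return true;
    }
}
```

If two Drones both read the record at generation 1 and both try to claim it, the first write succeeds and moves the record to generation 2. The second write still expects generation 1, so it is rejected, and that Drone goes back to searching for another Job.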

Scaling

Naturally, as the number of items that need to be picked up and dropped off rises and our Drone fleet gets larger, our Drone Delivery Service needs to be able to scale. This means handling more reads and writes, particularly against the Geospatial indexes that support the query operations. For the application code, no changes are required. This simply becomes an operational task to:
  • Provision additional hardware and install Aerospike, and
  • Configure Aerospike and start the daemon process
When new node(s) join the existing cluster, an automatic migration will occur as the system rebalances the data across the Smart Partitions™. Built into each Smart Client™ is the ability to understand the topology of the cluster and ensure that operations are routed to the appropriate node as the cluster changes.

Demo

We have put together this short video to give some background on the technology and use cases, and to show off the code of the Aerospike Drone Delivery Service. It’s no Steven Spielberg epic, but we kinda like it!

Where to Get More Information

For more information about Aerospike’s Geospatial capabilities, please refer to the following:
  • Aerospike’s Geospatial features are described in our documentation
  • The source code for this example is on Github
  • Tell us about your awesome project that uses Aerospike’s Geospatial features by posting your story in our user forum