Computer Science behind Geospatial Databases

Simon Grätzer

FrOSCon 2019

Modern databases generally implement some level of support for geospatial queries. In this talk I want to explore the techniques and frameworks that are commonly used by a variety of NoSQL and SQL databases.

Starting with the mass-market availability of GPS-enabled smartphones and fitness watches, and continuing with IoT devices and self-driving cars, data with geospatial information is generated and used in an increasing number of applications. Some of the most popular apps rely on the ability to access geospatial data in an instant: just look at Google Maps, Uber, Tinder and many more.
Searching single-dimensional data is essentially a solved problem, but indexing and searching multidimensional geospatial data requires specialized algorithms and data structures.
At ArangoDB we have gained some experience with this topic, as we recently implemented geospatial capabilities in our distributed database product.

In this talk I want to discuss the foundations of geospatial datastores. Topics are:

- A short introduction to quadtrees, space-filling curves and more

- An introduction to Google's widely used S2 geometry library

- A comparison of different implementation strategies in NoSQL datastores like MongoDB, RethinkDB and ArangoDB

- Characteristics of different geo-query types and use cases, and the typical performance traps that the average developer or database user might run into
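
To give a flavor of the first topic, here is a minimal sketch of a space-filling curve key: a Z-order (Morton) encoding that turns a latitude/longitude pair into a single integer, so that an ordinary one-dimensional index can store it. This is an illustration only, not the method used by any of the databases named above; S2, for instance, is built on the Hilbert curve rather than the Z-order curve. The function names `interleave_bits` and `morton_key` and the 16-bit grid resolution are assumptions made for this example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


def morton_key(lat: float, lon: float, bits: int = 16) -> int:
    """Quantize a latitude/longitude pair to a grid and return its Z-order key."""
    cells = 1 << bits
    # Map lon from [-180, 180] and lat from [-90, 90] onto integer grid cells,
    # clamping the upper edge so that lon=180 / lat=90 stay inside the grid.
    x = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    y = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    return interleave_bits(x, y, bits)


def quadtree_cell(key: int, level: int, bits: int = 16) -> int:
    """Return the key prefix identifying the quadtree cell at the given level."""
    return key >> (2 * (bits - level))
```

Each pair of bits in the key, read from the most significant end, selects one of four quadrants, which is why a key prefix corresponds to a quadtree cell. Points in the same cell share that prefix and therefore sit in one contiguous key range, but points that are close together can still land in different cells, so a real index typically has to cover a query region with several key ranges.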