A Tensor Based Framework For Large Scale Spatio-Temporal Raster Data Processing

Ulrich Leopold, Christian Braun and Dr. Sukriti Bhattacharya

In this paper, we address the curse of dimensionality and the scalability issues that arise when managing vast volumes of multidimensional raster data in renewable energy modeling, in an appropriate spatial and temporal context. Tensor representation provides a convenient way to capture inter-dependencies along multiple dimensions. In this direction, we propose an approach for handling large-scale, multi-layered spatio-temporal data, adapted to raster-based geographic information systems (GIS). Moreover, it can serve as an extension of map algebra to multiple dimensions for spatio-temporal data processing. We use the multidimensional tensor framework to model such problems and apply computational graphs to execute the calculations efficiently. In this approach, spatio-temporal data are represented as non-overlapping, regular tiles of 2-D raster data, stacked according to the time of data capture. As a case study, we use the proposed framework to quantify the spatio-temporal dynamics of solar irradiation and 2.5-D shadow calculations for cities at very high space-time resolution. For that, we chose TensorFlow, an open-source software library developed by Google that uses data-flow graphs and the tensor data structure. We provide a comprehensive performance evaluation of the proposed model against r.sun, which is based on GRASS GIS. Benchmarking shows that the tensor-based approach outperforms r.sun by up to 60% in overall execution time for high-resolution datasets and fine-grained time intervals when computing daily sums of solar irradiation [Wh·m⁻²·day⁻¹]. Specifically, the main characteristics of the proposed framework are: defining, optimizing and efficiently evaluating mathematical expressions involving multidimensional arrays (tensors); transparent use of GPU computing, so that the same code can run on either CPUs or GPUs; and implicit parallelism and distributed execution with high scalability, offered by the data-flow-based implementation.
Moreover, the Python implementation of the proposed model makes it GRASS GIS ‘Add-on’ compatible.
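
As a rough illustration of the data layout described above, the following sketch stacks 2-D rasters by capture time, cuts the resulting cube into non-overlapping regular tiles, and reduces along the time axis to obtain a daily sum. It is not the authors' implementation: NumPy stands in for TensorFlow to keep the example dependency-light, the function names (stack_by_time, split_into_tiles, daily_sum) are hypothetical, and the rectangle-rule integration with a fixed time step is an assumption.

```python
import numpy as np


def stack_by_time(rasters):
    """Stack 2-D rasters (already ordered by capture time) into a
    (time, rows, cols) tensor."""
    return np.stack(rasters, axis=0)


def split_into_tiles(cube, tile_rows, tile_cols):
    """Cut a (time, rows, cols) cube into non-overlapping regular tiles.

    Returns an array of shape
    (n_tiles_y, n_tiles_x, time, tile_rows, tile_cols).
    Assumption: the raster size is an exact multiple of the tile size.
    """
    t, rows, cols = cube.shape
    if rows % tile_rows or cols % tile_cols:
        raise ValueError("raster size must be a multiple of the tile size")
    tiles = cube.reshape(
        t, rows // tile_rows, tile_rows, cols // tile_cols, tile_cols
    )
    # Reorder axes to (tile_y, tile_x, time, row_in_tile, col_in_tile).
    return tiles.transpose(1, 3, 0, 2, 4)


def daily_sum(values, step_hours, time_axis):
    """Integrate irradiance [W/m^2] over the time axis with a simple
    rectangle rule, giving a daily sum in [Wh/m^2/day]."""
    return values.sum(axis=time_axis) * step_hours


# Toy data: three time steps of a 4 x 6 raster, sampled every 0.5 h.
cube = stack_by_time(
    [np.arange(24, dtype=float).reshape(4, 6) + 100.0 * k for k in range(3)]
)
tiles = split_into_tiles(cube, tile_rows=2, tile_cols=3)

whole_raster_sum = daily_sum(cube, step_hours=0.5, time_axis=0)   # shape (4, 6)
per_tile_sum = daily_sum(tiles, step_hours=0.5, time_axis=2)      # shape (2, 2, 2, 3)
```

Because each tile carries its own time axis, the reduction is independent per tile, which is what makes this layout a natural fit for the parallel, graph-based execution the abstract describes. In TensorFlow the same steps would be expressed with tensor operations inside a computational graph rather than eager NumPy calls.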