Wangjun He
Chinese Academy of Surveying and Mapping, China
Title: A spatial overlay method for massive vector data based on Spark
Biography
Biography: Wangjun He
Abstract
As the volume of geographic data grows, the typical spatial overlay methods for vector data in current GIS platforms can no longer cope with massive vector datasets. This paper therefore presents a novel spatial overlay method for vector data based on a distributed in-memory computing framework. First, following the map and reduce principle of distributed computing, the vector data are divided into several grids. Each grid forms a partition that can be processed in parallel, and unnecessary calculations between spatial objects that lie far apart are avoided. Second, an STR-tree index is constructed in each grid, which addresses the uneven distribution of data within a grid and improves the efficiency of the overlay operation in that grid. A final comparison between this method and other typical methods shows that it significantly improves overlay performance for large-scale vector data.
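The two steps described in the abstract can be illustrated with a minimal single-machine sketch in plain Python. It is not the authors' implementation and does not use Spark: the function names (`partition`, `str_build`, `str_query`, `overlay_pairs`), the cell size, and the node capacity are all hypothetical, and features are reduced to bounding boxes `(minx, miny, maxx, maxy)`. The sketch only finds candidate pairs whose boxes intersect; computing the actual overlay geometry would need a geometry library. In a distributed setting, each grid cell in the final loop would presumably be handled by its own task.

```python
import math
from collections import defaultdict
from itertools import product

def intersects(a, b):
    """True if two boxes (minx, miny, maxx, maxy) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def union(boxes):
    """Smallest box that encloses all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def partition(boxes, cell):
    """Map step (assumed form): group (box, id) entries by grid cell key."""
    parts = defaultdict(list)
    for i, box in enumerate(boxes):
        cols = range(math.floor(box[0] / cell), math.floor(box[2] / cell) + 1)
        rows = range(math.floor(box[1] / cell), math.floor(box[3] / cell) + 1)
        for key in product(cols, rows):  # a box is copied to every cell it touches
            parts[key].append((box, i))
    return parts

def str_build(entries, cap=4):
    """Sort-Tile-Recursive packing of a non-empty list of (box, payload) entries.

    A node is (mbr, children). A child is either a leaf entry (box, int id)
    or another node, whose second element is a list.
    """
    if len(entries) <= cap:
        return (union([e[0] for e in entries]), entries)
    n_slices = math.ceil(math.sqrt(math.ceil(len(entries) / cap)))
    per_slice = n_slices * cap
    # Sort by x-center, cut into vertical slices, then sort each slice by y-center.
    entries = sorted(entries, key=lambda e: e[0][0] + e[0][2])
    parents = []
    for s in range(0, len(entries), per_slice):
        strip = sorted(entries[s:s + per_slice], key=lambda e: e[0][1] + e[0][3])
        for t in range(0, len(strip), cap):
            group = strip[t:t + cap]
            parents.append((union([g[0] for g in group]), group))
    return str_build(parents, cap)  # pack the new level until one root remains

def str_query(node, box):
    """Yield ids of leaf entries whose box intersects the query box."""
    mbr, children = node
    if not intersects(mbr, box):
        return
    for child_box, payload in children:
        if isinstance(payload, list):  # internal node: descend
            yield from str_query((child_box, payload), box)
        elif intersects(child_box, box):
            yield payload

def overlay_pairs(layer_a, layer_b, cell=10.0):
    """Candidate (id_a, id_b) pairs, found cell by cell with a per-cell STR-tree."""
    parts_a, parts_b = partition(layer_a, cell), partition(layer_b, cell)
    pairs = set()  # a set removes pairs reported by more than one cell
    for key, items_b in parts_b.items():
        if key not in parts_a:
            continue  # no layer A feature in this cell, nothing to compare
        tree = str_build(parts_a[key])
        for box_b, j in items_b:
            for i in str_query(tree, box_b):
                pairs.add((i, j))
    return pairs

layer_a = [(0, 0, 5, 5), (20, 20, 25, 25), (8, 8, 12, 12)]
layer_b = [(4, 4, 9, 9), (30, 30, 31, 31), (11, 11, 21, 21)]
print(sorted(overlay_pairs(layer_a, layer_b)))
# prints [(0, 0), (1, 2), (2, 0), (2, 2)]
```

Copying a feature into every cell it touches keeps each cell independent, at the cost of duplicate results that must be removed afterwards; the set in `overlay_pairs` does this here. The per-cell tree limits comparisons inside crowded cells, which is the role the abstract assigns to the STR-tree.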