The geospark from tociek

geospark's Introduction


GeoSpark@Twitter || GeoSpark Discussion Board

GeoSpark is listed as an Infrastructure Project on the Apache Spark official Third Party Projects page.

GeoSpark is a cluster computing system for processing large-scale spatial data. GeoSpark extends Apache Spark / SparkSQL with a set of out-of-the-box Spatial Resilient Distributed Datasets (SRDDs) / SpatialSQL that efficiently load, process, and analyze large-scale spatial data across machines.

GeoSpark contains three modules:

Name	API	Spark compatibility	Dependency
GeoSpark-core	RDD	Spark 2.X/1.X	Spark-core
GeoSpark-SQL	SQL/DataFrame	SparkSQL 2.1 and later	Spark-core, Spark-SQL, GeoSpark-core
GeoSpark-Viz	RDD	Spark 2.X/1.X	Spark-core, GeoSpark-core

Core: GeoSpark SpatialRDDs and Query Operators.
SQL: SQL interfaces for GeoSpark core.
Viz: Visualization extension of GeoSpark core.
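
To give a feel for what a spatial query operator does, here is a minimal conceptual sketch in plain Python. It is not GeoSpark's API or implementation: the names `cell_of`, `partition`, and `range_query`, and the uniform grid itself, are assumptions made for illustration. The sketch groups points into grid cells (each cell standing in for one partition) and answers a rectangular range query by scanning only the cells that overlap the query window.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)
Cell = Tuple[int, int]


def cell_of(p: Point, cell_size: float) -> Cell:
    """Map a point to the grid cell that contains it (hypothetical helper)."""
    return (int(p[0] // cell_size), int(p[1] // cell_size))


def partition(points: List[Point], cell_size: float) -> Dict[Cell, List[Point]]:
    """Group points by grid cell; each cell stands in for one partition."""
    cells: Dict[Cell, List[Point]] = defaultdict(list)
    for p in points:
        cells[cell_of(p, cell_size)].append(p)
    return cells


def range_query(cells: Dict[Cell, List[Point]], box: Box, cell_size: float) -> List[Point]:
    """Return the points inside `box`, scanning only cells that overlap it."""
    min_x, min_y, max_x, max_y = box
    lo = cell_of((min_x, min_y), cell_size)
    hi = cell_of((max_x, max_y), cell_size)
    hits: List[Point] = []
    for cx in range(lo[0], hi[0] + 1):
        for cy in range(lo[1], hi[1] + 1):
            # .get avoids creating empty cells in the defaultdict
            for (x, y) in cells.get((cx, cy), []):
                if min_x <= x <= max_x and min_y <= y <= max_y:
                    hits.append((x, y))
    return sorted(hits)


points = [(1.0, 1.0), (2.5, 3.5), (7.0, 8.0), (9.5, 0.5)]
cells = partition(points, cell_size=5.0)
result = range_query(cells, (0.0, 0.0, 5.0, 5.0), cell_size=5.0)
print(result)
```

In a distributed setting each cell would live on a different machine, so pruning non-overlapping cells is what keeps the query from touching the whole dataset. GeoSpark's actual partitioning and indexing options are described on the GeoSpark website.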

Please visit the GeoSpark website for details and documentation.

News!

~~GeoSpark 1.1.1 is released~~. GeoSpark 1.1.2 is released. This release contains several bug fixes. Thanks for the patch from Lucas C.! Release notes || Maven Coordinate.
GeoSpark 1.1.0 is released. This release contains new SQL functions, custom Quad-Tree/R-Tree index serializers, and bug fixes. GeoSpark 1.1.0 supports Apache Spark 2.3. Note: the GeoSparkSQL Maven Coordinate has changed. Release notes || Maven Coordinate (Thanks to Zongsi Zhang for contributing the index serializer patch!)
The GeoSpark wiki has moved to the new GeoSpark website! Users are welcome to contribute tutorials and stories by making a PR!
