
mosaic's Introduction

Databricks

An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.

Mosaic provides:

  • easy conversion between common spatial data encodings (WKT, WKB and GeoJSON);

  • constructors to easily generate new geometries from Spark native data types;

  • many of the OGC SQL standard ST_ functions implemented as Spark Expressions for transforming, aggregating and joining spatial datasets;

  • high performance through implementation of Spark code generation within the core Mosaic functions;

  • optimisations for performing point-in-polygon joins using an approach we co-developed with Ordnance Survey (blog post); and

  • a choice of Scala, SQL and Python APIs.

Image 1: Mosaic logical design.
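To make the encoding conversion and point-in-polygon ideas above concrete, here is a minimal pure-Python sketch. It does not use Mosaic or Spark and is not how Mosaic implements these features. The function names (`wkt_to_geojson`, `polygon_contains`, `point_in_polygon_join`) are hypothetical, the parser handles only `POINT` and single-ring `POLYGON` WKT, and the containment test is a plain even-odd ray-casting check rather than the optimised approach mentioned in the list.

```python
import re


def wkt_to_geojson(wkt: str) -> dict:
    """Convert a WKT POINT or single-ring POLYGON into a GeoJSON-style dict."""
    text = wkt.strip()
    point = re.fullmatch(r"POINT\s*\(\s*(\S+)\s+(\S+)\s*\)", text, re.IGNORECASE)
    if point:
        return {
            "type": "Point",
            "coordinates": [float(point.group(1)), float(point.group(2))],
        }
    polygon = re.fullmatch(r"POLYGON\s*\(\(\s*(.+?)\s*\)\)", text, re.IGNORECASE)
    if polygon:
        # Each "x y" pair in the ring becomes an [x, y] coordinate.
        ring = [[float(v) for v in pair.split()] for pair in polygon.group(1).split(",")]
        return {"type": "Polygon", "coordinates": [ring]}
    raise ValueError(f"Unsupported WKT in this sketch: {wkt!r}")


def polygon_contains(polygon: dict, point: dict) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going right from the point."""
    x, y = point["coordinates"]
    ring = polygon["coordinates"][0]
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_polygon_join(points: dict, polygons: dict) -> list:
    """Naive nested-loop join returning (point_id, polygon_id) pairs."""
    return [
        (point_id, polygon_id)
        for point_id, point_wkt in points.items()
        for polygon_id, polygon_wkt in polygons.items()
        if polygon_contains(wkt_to_geojson(polygon_wkt), wkt_to_geojson(point_wkt))
    ]


# Illustrative data only.
example_points = {"a": "POINT (1 1)", "b": "POINT (5 1)"}
example_polygons = {"square": "POLYGON ((0 0, 4 0, 4 4, 0 4, 0 0))"}
matches = point_in_polygon_join(example_points, example_polygons)
```

The nested loop compares every point with every polygon, which is exactly the cost that a distributed, index-assisted join is meant to avoid on very large datasets.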

Getting started

Requirements

The only requirement to start using Mosaic is a Databricks cluster running Databricks Runtime 10.0 (or later) with either of the following attached:

  • (for Python API users) the Python .whl file; or
  • (for Scala or SQL users) the Scala JAR.

Both the .whl and JAR can be found in the 'Releases' section of the Mosaic GitHub repository.

Instructions for how to attach libraries to a Databricks cluster can be found here.
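As a rough illustration of the runtime requirement above, the following sketch compares a Databricks Runtime version string against the 10.0 minimum. The helper names are hypothetical, and reading the version from a `DATABRICKS_RUNTIME_VERSION` environment variable is an assumption about the cluster environment, not something stated in this README.

```python
import os

# Minimum Databricks Runtime version stated in the requirements.
MIN_RUNTIME = (10, 0)


def parse_runtime_version(version: str) -> tuple:
    """Extract (major, minor) from strings such as '10.4' or '10.4.x-scala2.12'."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 and parts[1].isdigit() else 0
    return major, minor


def meets_requirement(version: str) -> bool:
    """Return True when the version is 10.0 or later."""
    return parse_runtime_version(version) >= MIN_RUNTIME


def current_runtime_ok():
    """Check the running cluster; returns None when the variable is not set.

    Assumption: the cluster exposes its runtime version in the
    DATABRICKS_RUNTIME_VERSION environment variable.
    """
    version = os.environ.get("DATABRICKS_RUNTIME_VERSION")
    return None if version is None else meets_requirement(version)
```

A notebook could call `current_runtime_ok()` before attaching or importing the library and stop early when it returns `False`.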

Releases

You can access the latest artifacts and binaries here.

Ecosystem

Mosaic is intended to augment the existing system and unlock its potential by integrating Spark, Delta and third-party frameworks into the Lakehouse architecture.

Image 2: Mosaic ecosystem - Lakehouse integration.

Example notebooks

This repository contains several example notebooks in notebooks/examples. You can import them into your Databricks workspace using the instructions here.

Project Support

Please note that all projects in the databrickslabs GitHub space are provided for your exploration only, and are not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS and we do not make any guarantees of any kind. Please do not submit a support ticket relating to any issues arising from the use of these projects.

Any issues discovered through the use of this project should be filed as GitHub Issues on the Repo. They will be reviewed as time permits, but there are no formal SLAs for support.

mosaic's People

Contributors

sllynn, milos-colic, edurdevic, dependabot[bot], mjohns-databricks
