Delta Live Tables Example Notebooks


Delta Live Tables is a new framework that lets customers declaratively define, deploy, test, and upgrade data pipelines, eliminating the operational burden of managing them.

This repo contains Delta Live Tables examples designed to get customers started with building, deploying, and running pipelines.

Getting Started

  • Connect your Databricks workspace to this repo using the Repos feature

  • Choose one of the examples and create your pipeline!

Examples

Wikipedia

The Wikipedia clickstream sample is a great way to jump-start using Delta Live Tables (DLT). It is a simple bifurcating pipeline that creates a table on your JSON data, cleanses the data, and then creates two tables from the cleansed data.

This sample is available for both SQL and Python.
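
The shape of this pipeline (raw table, cleansing step, then two downstream tables) can be sketched in plain Python. This is only a model of the data flow, not the DLT API and not the notebook's actual code: the record fields (`prev_title`, `curr_title`, `n`), the sample rows, and the second table name `top_pages` are assumptions made for illustration. Only `top_spark_referers` is a name taken from this README.

```python
from collections import Counter

# Hypothetical raw clickstream records; field names and values are illustrative only.
RAW_CLICKS = [
    {"prev_title": "Google", "curr_title": "Apache_Spark", "n": "120"},
    {"prev_title": "Apache_Hadoop", "curr_title": "Apache_Spark", "n": "30"},
    {"prev_title": "Google", "curr_title": "Python", "n": "200"},
    {"prev_title": "Google", "curr_title": "Apache_Spark", "n": None},  # dropped by clean()
]


def clean(rows):
    """Cleansing step: drop rows without a count, cast counts to int, rename fields."""
    return [
        {"referrer": r["prev_title"], "page": r["curr_title"], "clicks": int(r["n"])}
        for r in rows
        if r["n"] is not None
    ]


def top_spark_referers(rows, limit=10):
    """Branch 1: referrers to the Apache Spark page, ordered by total clicks."""
    totals = Counter()
    for r in rows:
        if r["page"] == "Apache_Spark":
            totals[r["referrer"]] += r["clicks"]
    return totals.most_common(limit)


def top_pages(rows, limit=10):
    """Branch 2 (hypothetical name): pages ordered by total clicks."""
    totals = Counter()
    for r in rows:
        totals[r["page"]] += r["clicks"]
    return totals.most_common(limit)


# Both downstream tables read from the same cleansed table: that is the bifurcation.
cleaned = clean(RAW_CLICKS)
spark_referers = top_spark_referers(cleaned)
pages = top_pages(cleaned)
print(spark_referers)
print(pages)
```

In the real pipeline each of these functions would correspond to a table definition in the SQL or Python notebook, and DLT would work out the dependency between them.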

Running your pipeline

1. Create your pipeline using the following parameters

  • From your Databricks workspace, click Jobs, then Delta Live Tables, then Create Pipeline

  • Fill in the Pipeline Name, e.g. Wikipedia

  • For Notebook Libraries, enter the path of the notebook, such as /Repos/[email protected]/delta-live-tables-notebooks/SQL/Wikipedia

  • To publish your tables, add the target parameter to specify the database in which to persist your tables, e.g. wiki_demo.

2. Edit your pipeline JSON

  • Once you have set up your pipeline, click Edit Settings near the top to view and edit the pipeline's JSON settings
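
As a rough guide to what those settings contain, the sketch below assembles a minimal settings document from the values entered in step 1. The field layout (`name`, `libraries`, `target`, `continuous`) is an assumption about the settings format, the helper `build_pipeline_settings` is hypothetical, and `<user>` is a placeholder for your own Repos folder. The JSON in your workspace will likely include additional fields such as an id, storage location, and cluster configuration.

```python
import json


def build_pipeline_settings(name, notebook_path, target):
    """Assemble a minimal settings dict; the field layout is an assumption."""
    return {
        "name": name,
        "libraries": [{"notebook": {"path": notebook_path}}],
        "target": target,      # database the tables are published to
        "continuous": False,   # assumed default: run once when started
    }


settings = build_pipeline_settings(
    name="Wikipedia",
    # <user> is a placeholder, not a real path component
    notebook_path="/Repos/<user>/delta-live-tables-notebooks/SQL/Wikipedia",
    target="wiki_demo",
)
settings_json = json.dumps(settings, indent=2)
print(settings_json)
```

Compare the printed structure with what Edit Settings shows before relying on any field name.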

3. Click Start

  • To view the progress of your pipeline, refer to the progress flow near the bottom of the pipeline details UI.

4. Review the results

  • Once your pipeline has completed processing, you can review the data by opening up a new Databricks notebook and running the following SQL statements:

    %sql
    -- Review the top referrers to Wikipedia's Apache Spark articles
    SELECT * FROM wiki_demo.top_spark_referers
    
  • Unsurprisingly, the top referrer is "Google", which you can see graphically by converting the result table into an area chart.
