A Day Full of Azure Data Factory

Hello friends and welcome to this full-day workshop on Azure Data Factory. Today we will all be becoming advanced factory workers! And we completely recommend this description when describing your job to family members. But be warned, if you go on to tell them that the factory is in the cloud you are likely to be branded as crazy. However, here and now that is OK. You are amongst like-minded geeky friends who all want to become cloud factory workers as well :-)

On a more serious note, throughout our day of training you will quickly notice that, as with most technologies, there are an awful lot of different ways to implement this Azure orchestration service, and understanding the best way to do something is often the biggest challenge. That said, if you only take away one thing from today, I would ask that it is an appreciation of this fact. Then, when delivering solutions, take a step back from the requirements and think about the overall technical design and how Azure Data Factory should fit into your platform as a core component.

All too often with new and shiny services we start playing around and then try to make the technology fit our solution, rather than thinking about the solution requirements and which technology meets our needs. This is true of all developers. I don't want to preach, so I am simply asking for a little bit of mindfulness.


Session Abstract

A Day Full of Azure Data Factory

To achieve any data processing in Azure you need an umbrella service to manage, monitor and schedule your solution. For a long time when working on premises, the SQL Agent has been our go-to tool, combined with T-SQL and SSIS packages. It’s now time to upgrade our skills and start using cloud native services to achieve the same thing on the Microsoft Cloud Platform. Within a PaaS-only Modern Data Platform, the primary component for delivering that orchestration is Azure Data Factory, combined with various other compute resources.

In this full day of training you’ll start with the basics and learn how to orchestrate your Azure Data Platform end to end. You will learn how to build Azure ETL/ELT pipelines using all Data Factory has to offer. You will also consider hybrid architectures and dynamic design patterns, think about lifting and shifting legacy packages, and explore complex bootstrapping to orchestrate everything within your solution.

We’ll break down the content for this rich Azure PaaS resource as follows:

  • Azure Data Factory fundamentals. What is it and why use it?
  • Uploading data from on-premises to Azure.
  • Using SSIS packages in Azure.
  • Data Factory Mapping & Wrangling Data Flows.
  • Dynamic metadata driven pipelines.
  • Data Factory alerting, security and monitoring.
  • Pipeline pricing.
  • Data Factory CI/CD using Azure DevOps.
  • Using Azure Data Factory in production.

If that's not enough content for one day, you will also get access to a set of hands-on labs that you can work through at your own pace. Whether you are new to Azure Data Factory or have some experience, you will leave this workshop with new skills and ideas for your projects.


Full Agenda

  • Module 1: Data Factory Fundamentals

    • What is it and why use it?
    • Resource Components
    • Common Activities
    • Execution Dependencies
  • Module 2: Uploading Data to Azure

    • Integration Runtimes
      • Azure IR
      • Hosted IR
    • Hosted IR Patterns
      • Demo - Linked IRs
      • Demo - Simple Data Upload
    • Private Endpoints
  • Module 3: Using SSIS Packages in Azure

    • SSIS Integration Runtime
    • Packages Running on PaaS
    • Scaling Out Package Execution
      • Demo - Scale Out Execution of Anything
  • Module 4: Data Flows

    • Mapping Data Flows
      • Demo - Building a Mapping Data Flow
    • Wrangling Data Flows
      • Demo - Using a Wrangling Data Flow
    • Configuration
    • Use Cases
  • Module 5: Metadata Driven Pipelines

    • Expressions
    • Dynamic Pipelines
      • Demo - Data Discovery and Upload
      • Demo - Simple Metadata and Upload
      • Demo - Lazy SQLDB Replication
    • Orchestration Framework - procfwk.com
      • Demo - Framework Failure Handling & Restart
  • Module 6: Monitoring, Alerting & Security

    • Logging
    • Alerting
      • Demo - How To Build Alerting
    • Using Azure Key Vault
    • Access & Permissions
  • Module 7: Pricing & Limitations

    • Cost
      • Activities
      • Data Integration Units
      • Data Flow Compute
      • Wider Platform Orchestration
    • Resource Limitations
  • Module 8: CI/CD with Azure DevOps

    • Source Control vs Developer UI
    • ARM Template Deployments
      • Demo - Basic Deployment via Azure DevOps
    • PowerShell Deployments
  • Module 9: Data Factory in Production

    • Testing
      • Demo - Running NUnit Tests
    • Bootstrapping
    • Best Practices
  • Module 10: Wrap Up

    • Conclusions
    • Questions
    • Homework

Training Day Contributors

Paul Andrew

Principal Consultant - Solution Architect & Data Platform MVP @ Altius Consulting Ltd

Paul is a Microsoft Data Platform MVP with 15+ years’ experience working with the complete on-premises SQL Server stack in a variety of roles and industries. Now an industry-leading consultant, he has turned his keyboard to big data solutions on the Microsoft cloud platform, specialising in all things data engineering (Data Factory, Databricks, Data Lake and Stream Analytics). Paul is also a STEM Ambassador for the networking education in schools’ programme, a PASS chapter leader, a member of the Data Relay committee, and a speaker and helper at SQL Bits, SQL Saturday, SQL Day, SQLGLA and PASS Summit.

You can contact Paul via:

Richard Swinbank

Senior Data Engineer @ Boomin

Richard is the author of the lab materials provided as part of the workshop. He is an experienced data engineer specialising in the Microsoft Azure and SQL Server data platforms. An active member of the Microsoft data platform community, Richard is a speaker, blogger, volunteer and Data Relay event organiser. His book Azure Data Factory by Example will be published by Apress in early 2021.

You can contact Richard via:

