Data Engineering and DataOps

Data Engineering and DataOps Course: IDS 706, Fall 2023 by Noah Gift

Quick Links

Weekly Rust Mini-Project References

Course Description

Data Engineering is applied software engineering. It is not data science or computer science. As a result, this course focuses on building software systems within the domain of data. This course serves as a method of encapsulating many courses in the program. Students learn to apply Data Engineering to a real-world project. This manifests itself through several goals: development of non-linear, lifelong learning; community building; portfolio development; and software engineering best practices, including the use of AI Pair Programming assistants, DevOps, and Cloud Computing.

Course Goals

Upon course completion, you'll be able to:

  • Create data engineering solutions using Rust and the Linux environment.
  • Design binary executables for interfacing with SQL systems such as Snowflake, Databricks, and BigQuery.
  • Construct robust, efficient, and safe systems with low carbon footprints, leveraging the inherent properties of Rust for scalable efficiency.
  • Use AI Pair Programming tools like GitHub Copilot, ChatGPT, AWS CodeWhisperer, and Google Bard for building sophisticated, reliable systems.
  • Cultivate non-linear, lifelong learning skills.
  • Assemble, share, and present persuasive portfolios using platforms like GitHub, YouTube, and LinkedIn.
  • Obtain certifications in Cloud Computing and a Big Data SQL solution.

Prerequisites

Basic programming skills and basic Linux skills are required. See the optional readings and media to self-learn before class starts. You will also be required to complete a five-week Rust bootcamp.

Pedagogy

Expect to spend between 10 and 20 hours per week on this class, including the five-week bootcamp. This class is required and teaches material that prepares you for a job doing software engineering tasks in the field of data and machine learning. It is challenging and time-intensive, so please plan your Fall schedule accordingly. This class uses weekly demos because they are common in the software industry, and the class prepares you to hit the ground running in a high-pressure, demanding tech job. Additionally, by doing demos you increase your metacognitive ability, i.e., you learn what you know and what you don't know. Increasing your metacognition skills is a shortcut to mastery in real-world software engineering.

Finally, at the end of the class you will have 5 substantial projects and 15 mini-projects. This means you will have a robust portfolio of work to share with a future employer. The amount of work we do in this class is very similar to a real-world software engineering job, but you have the guard rails of tremendous support from the faculty and TAs at a world-class institution.

Tech Stack

Answers on why this course uses Rust, GitHub Codespaces and Copilot:

Diversity Statement

We, as educators and students, are dedicated to fostering diversity and equity, ensuring everyone's full participation by eliminating educational obstacles. This course values the diverse experiences, backgrounds, identities, learning styles, and academic interests of each individual. The array of perspectives from our students enriches all, and we aim to approach each with openness and respect.

Required Readings & Media

The primary resources for this course are the following Coursera Specializations by Noah Gift:

These specializations provide comprehensive coverage of the key concepts and skills needed for this course. They include a combination of video lectures, readings, quizzes, and hands-on projects to reinforce learning and build practical skills.

Optional Supplementary Readings & Media

Assignment Overview and Grading Breakdown

Course Technology

This course involves several types of interaction, which take place primarily through Microsoft Teams, GitHub, and Zoom. Please take the time to navigate through the course, become familiar with the course syllabus, structure, and content, and review the list of resources below.

Required Technical Skills

Students in an online program should be able to do the following:

  • Communicate via Teams discussion forums.
  • Use web browsers, navigate the World Wide Web, and use tools like ChatGPT.
  • Use the learning management system Teams.
  • Use GitHub.
  • Create demo videos.
  • Write Rust code.
  • Use Cloud Computing and Cloud Computing Labs.

Course Outline

The course is structured into several key sections, each designed to provide you with the skills and knowledge necessary to excel in the field of data engineering. The sections are as follows:

  1. Introduction to Data Engineering and DataOps
  2. Rust Programming for Data Engineering
  3. SQL Systems and Data Engineering
  4. Cloud Computing and Data Engineering
  5. AI Pair Programming and Data Engineering
  6. DevOps and Data Engineering
  7. Final Project and Course Review

Throughout the course, you will engage in hands-on projects, both individually and in groups, that will reinforce the concepts covered in the lectures and readings. These projects will also provide you with valuable experience in designing and implementing data engineering solutions.

Conclusion

Data engineering is a rapidly growing field that plays a crucial role in the modern data-driven world. By the end of this course, you will have gained a solid foundation in data engineering principles and practices, as well as the ability to apply these skills to real-world problems. Whether you are looking to advance your career in data engineering or simply want to broaden your understanding of this important field, this course will provide you with the tools and knowledge you need to succeed.
