Git Product home page Git Product logo

appscript-bigquery-csv's Introduction

Loading CSVs into BigQuery from Google Drive

OVERVIEW

This provides an easy method to look for CSV files in a specified Google Drive directory, then parse them and load into their relative/correct BigQuery table. These CSVs can be different types, intended to be loaded into different tables.

Premise: You want an easy method to allow someone to drop CSVs into a directory and have them loaded into BQ without having to worry about dataflow or dataprep, or cloud functions, or pubsub. You just want a low-code way to get data into BQ.

ASSUMPTIONS:

  • The CSVs are stored in Google Drive or Google Shared Drives (the was a limitation before, but now it works with both)
  • This likely won't work for massive files, or things over a certain (unknown at this time) file size.
  • You will need to update the code / table functions if the CSVs change structure at any point.
  • Error handling needs to be improved of course, but that's for a later phase. Most of the time, the expectation is that you'd check the job history, or the execution history, and default error notifications to determine if you have an issue with the process described here.

INSTRUCTIONS

  • Create a folder in Google Drive to store your CSV files that are pending processing, and then another folder to store processed CSV files.
  • Create a new project in google appscript (script.google.com)
  • Copy the files from this repo appscript
  • Make a copy of config.sample.gs and rename as config.gs...then make all your variable/customizations for your BQ project in there. You will NOT make any changes to any other files.
  • Update/write some functions to set up the tables you want. Follow the examples provided, ie: table_characters() and table_places(). Just change what's there, and copy/paste to create new table definitions based on what the CSV file(s) will contain.
  • Look in the Jobs.gs file -- it is the one you'll use to run the jobs / schedule executions for loads.
  • Run the create_tables_one_time() function once, which will prompt for permissions, and then create your BQ tables (if you did the above steps correctly).
  • Accept the permissions (asking for access for your script to read/write to google drive, bigquery etc)
  • Make sure there are some CSV files into the PENDING google drive folder you set up.
  • Run the process_all_pending_csv_files() function (once, or set a trigger)
  • Look in BigQuery at the jobs, and then the datasets and tables, and you should see data in there pretty quickly.

RECOMMENDATION:

  • If you run this using a triggered schedule, then all you need to do, is move any CSVs into your pending folder, and they'll automatically get aggregated.
  • This is then easy to add as a source to a Data Studio Dashboard.

appscript-bigquery-csv's People

Contributors

usaussie avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.