rendybjunior / barbeque
Incremental processing framework on BigQuery
License: MIT License
Automate sample data generation to a CSV file.
When generate_data.py is executed, it produces barbeque_sales.csv with the schema shown in #1.
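A minimal sketch of what such a generator could look like, not the repository's actual generate_data.py. It assumes the columns of the sample source table in this document (id, brand_id, amount, timestamp); the brand list, amount range, and function names are hypothetical.

```python
import csv
import random
from datetime import datetime, timedelta

# Column names follow the sample source table in this document.
FIELDS = ["id", "brand_id", "amount", "timestamp"]
# Hypothetical brand list, borrowed from the sample rows.
BRANDS = ["super_bbq", "just_ok_bbq"]


def generate_rows(n, start, days, seed=0):
    """Yield n sales rows with timestamps spread over `days` days from `start`."""
    rng = random.Random(seed)  # seeded so repeated runs give the same rows
    for i in range(1, n + 1):
        ts = start + timedelta(seconds=rng.randrange(days * 24 * 3600))
        yield {
            "id": i,
            "brand_id": rng.choice(BRANDS),
            "amount": rng.randint(100, 999),  # assumed amount range
            "timestamp": ts.strftime("%Y-%m-%d %H:%M:%S"),
        }


def write_csv(path, rows):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling write_csv("barbeque_sales.csv", generate_rows(100, datetime(2018, 6, 17), 13)) would then write 100 rows spread over the 13 days starting 2018-06-17.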
Automatically load data to BQ using script
Create a BigQuery SQL builder from config and command-line parameters
Given a job config object and the command-line parameters, the necessary SQL can be generated.
Example:
See the command and YAML in #1.
It produces the following BigQuery Standard SQL:
SELECT
DATE(_PARTITIONTIME) AS day,
brand_id,
SUM(amount) AS amount_sum,
COUNT(*) AS cnt
FROM `barbeque.sales`
WHERE _PARTITIONTIME >= "2018-06-17 00:00:00" AND _PARTITIONTIME < "2018-06-30 00:00:00"
GROUP BY 1, 2
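One way such a builder could be sketched, assuming the job config has already been loaded into a dict with the table, keys, and aggr entries of sales.yml. The function name build_sql and the `<field>_<func>` column naming are assumptions inferred from the query above, not the project's actual code.

```python
from datetime import date


def build_sql(config, start_dt, end_dt):
    """Build a day-partition-preserving summary query from a job config dict."""
    keys = config["keys"]
    # One aggregate column per aggr entry, e.g. SUM(amount) AS amount_sum.
    aggs = [
        "{}({}) AS {}_{}".format(a["func"].upper(), a["field"], a["field"], a["func"])
        for a in config["aggr"]
    ]
    select_cols = ["DATE(_PARTITIONTIME) AS day"] + keys + aggs + ["COUNT(*) AS cnt"]
    # Group by position: the day column plus every key column.
    group_by = ", ".join(str(i) for i in range(1, len(keys) + 2))
    lines = [
        "SELECT",
        ",\n".join(select_cols),
        "FROM `{}`".format(config["table"]),
        'WHERE _PARTITIONTIME >= "{:%Y-%m-%d} 00:00:00" '
        'AND _PARTITIONTIME < "{:%Y-%m-%d} 00:00:00"'.format(start_dt, end_dt),
        "GROUP BY " + group_by,
    ]
    return "\n".join(lines)
```

The values are interpolated directly into the SQL text, so this sketch is only suitable for trusted config files; it does no escaping or validation of field names.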
Stitch all the components together in a main flow
Equivalent to #1.
Robust error handling and logging are still out of scope.
Read config from file.
Given a config in YAML format, it can be read when the user executes barbeque sales.yml
Handling daterange parameter
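A possible sketch of the date range handling with argparse, assuming the `--start_dt` and `--end_dt` flags of the MVP command and treating the end date as exclusive, as the generated SQL does. The function names and the validation rule are assumptions.

```python
import argparse
from datetime import datetime


def parse_date(value):
    """Parse a YYYY-MM-DD string, reporting bad input as an argparse error."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError("expected YYYY-MM-DD, got {!r}".format(value))


def parse_args(argv=None):
    """Parse the config path and the [start_dt, end_dt) date range."""
    parser = argparse.ArgumentParser(prog="bbq")
    parser.add_argument("config", help="path to the job YAML file")
    parser.add_argument("--start_dt", type=parse_date, required=True)
    parser.add_argument("--end_dt", type=parse_date, required=True)
    args = parser.parse_args(argv)
    # The end date is exclusive, so an empty or inverted range is rejected.
    if args.start_dt >= args.end_dt:
        parser.error("--start_dt must be earlier than --end_dt")
    return args
```

Parsing the dates up front means every later step receives `datetime.date` objects instead of raw strings.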
Enable integration with BQ
Given a SQL statement, we can run it on BQ and load the result into a destination table with a partition config, replacing the existing data.
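A hedged sketch of one way to do this with the google-cloud-bigquery client, not necessarily how barbeque does it. It assumes an ingestion-time partitioned destination and one query per day, each written to a `table$YYYYMMDD` partition decorator with WRITE_TRUNCATE so that only that day's partition is replaced. Both function names are hypothetical, and `destination` is assumed to be given as `project.dataset.table` plus the decorator suffix.

```python
from datetime import timedelta


def partition_targets(table, start_dt, end_dt):
    """Yield one `table$YYYYMMDD` decorator per day in [start_dt, end_dt)."""
    day = start_dt
    while day < end_dt:
        yield "{}${:%Y%m%d}".format(table, day)
        day += timedelta(days=1)


def run_replace(client, sql, destination):
    """Run sql and overwrite `destination` with its result."""
    # Imported lazily so partition_targets works without the client library.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig(
        destination=destination,
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    job = client.query(sql, job_config=job_config)
    return job.result()  # blocks until the query job finishes
```

A caller would loop over `partition_targets(...)`, build a one-day query for each target, and pass both to `run_replace` together with a `bigquery.Client()`.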
Keep track of job status in bq
MVP of barbeque: create a partition-preserving summary, with assumptions:
Given this command: bbq sales.yml --start_dt="2018-06-17" --end_dt="2018-06-30"
barbeque will read the source table and write into a target table (in replace mode) named after the job, with partitions preserved.
sales.yml:
name: sales_count_by_brand_id_day
type: day_partition_preserving
table: barbeque.sales
keys:
  - brand_id
aggr:
  - field: amount
    func: sum
Sample data source: (timestamp is partition time)
id | brand_id | amount | timestamp |
---|---|---|---|
1 | super_bbq | 123 | 2018-06-17 08:00:00 |
2 | super_bbq | 456 | 2018-06-17 14:00:00 |
3 | super_bbq | 142 | 2018-06-18 13:00:00 |
4 | just_ok_bbq | 542 | 2018-06-20 09:00:00 |
Sample data result:
day | brand_id | cnt | amount_sum | _PARTITIONTIME |
---|---|---|---|---|
2018-06-17 | super_bbq | 2 | 579 | 2018-06-17 00:00:00 |
2018-06-18 | super_bbq | 1 | 142 | 2018-06-18 00:00:00 |
2018-06-20 | just_ok_bbq | 1 | 542 | 2018-06-20 00:00:00 |
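The result table can be checked by hand with a few lines of plain Python. This is only a worked illustration of the grouping, not part of barbeque. It takes row 4 as falling on 2018-06-20, the date the result table shows for it, and the name summarize is hypothetical.

```python
from collections import defaultdict

# Sample rows from the source table: (id, brand_id, amount, timestamp).
# Row 4 is assumed to fall on 2018-06-20, as the result table implies.
ROWS = [
    (1, "super_bbq", 123, "2018-06-17 08:00:00"),
    (2, "super_bbq", 456, "2018-06-17 14:00:00"),
    (3, "super_bbq", 142, "2018-06-18 13:00:00"),
    (4, "just_ok_bbq", 542, "2018-06-20 09:00:00"),
]


def summarize(rows):
    """Group rows by (day, brand_id) and compute cnt and amount_sum per group."""
    groups = defaultdict(lambda: {"cnt": 0, "amount_sum": 0})
    for _id, brand_id, amount, ts in rows:
        day = ts[:10]  # the date part of the timestamp selects the partition
        group = groups[(day, brand_id)]
        group["cnt"] += 1
        group["amount_sum"] += amount
    return dict(groups)


for (day, brand_id), agg in sorted(summarize(ROWS).items()):
    print(day, brand_id, agg["cnt"], agg["amount_sum"])
```

The two super_bbq rows on 2018-06-17 collapse into one group with cnt 2 and amount_sum 123 + 456 = 579, matching the first row of the result table.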
Additional feature planned (written here for self-note purpose):