Comments (8)

imamdigmi commented on May 29, 2024

UPDATE SOLUTION:
Creating the table beforehand is not necessary. You can pass the parameter time_partitioning={'type': 'DAY'} to avoid this error.
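
As a rough sketch of where that parameter goes, the snippet below builds the keyword arguments for a BigQuery write task as a plain dict, so it runs without Airflow installed. The helper name build_write_kwargs and the example project, dataset, and table values are assumptions for illustration; in Airflow 1.10.x the resulting dict would be unpacked into BigQueryOperator together with task_id, sql, and dag.

```python
# Hypothetical helper (not part of the tutorial): collects the keyword
# arguments that matter for writing into a day-partitioned table.
def build_write_kwargs(project, dataset, table, ds_nodash):
    return {
        # "$YYYYMMDD" is the partition decorator: it targets a single day.
        "destination_dataset_table": "{}.{}.{}${}".format(
            project, dataset, table, ds_nodash
        ),
        "write_disposition": "WRITE_TRUNCATE",
        # Lets the load job create the table as day-partitioned if it is missing.
        "time_partitioning": {"type": "DAY"},
    }


kwargs = build_write_kwargs(
    "my-project", "github_trends", "github_daily_metrics", "20181202"
)
print(kwargs["destination_dataset_table"])
# Assumed usage inside a DAG file (requires Airflow, not executed here):
#   BigQueryOperator(task_id="bq_write_to_github_daily_metrics",
#                    sql="...", dag=dag, **kwargs)
```

In a real DAG the date would usually come from a template such as {{ ds_nodash }} rather than a hard-coded string.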

from airflow-tutorial.

tuanavu commented on May 29, 2024

Hi @imamdigmi,

I just re-ran the bigquery_github_trends DAG, and all the tasks ran through without any errors. Can you check whether you are using the correct Airflow version?

Try this command to rebuild the image for the BigQuery tutorial:
docker-compose -f docker-compose-gcloud.yml up --build

Let me know if you still have the error.

imamdigmi commented on May 29, 2024

Hi @tuanavu, thanks for your fast response. I am using the latest stable version of Airflow (1.10.2), installed on my local machine with Anaconda, so there is no Docker container involved. I wrote the DAG and the other files myself, and I am fairly sure there is no problem with my DAG or my Airflow setup. I confirmed it in the following way:

python $DAGS_FOLDER/bigquery_github_trends.py

There is no error. I then tested each task individually with the following commands:

$ airflow test bigquery_github_trends bq_check_githubarchive_day 2018-12-02
$ airflow test bigquery_github_trends bq_check_hackernews_full 2018-12-02

Eventually I got stuck testing the third task, which fails with the errors above. I then followed the instructions from how-to-aggregate-data-for-bigquery-using-apache-airflow, which say to create empty tables in BigQuery with the following commands in Google Cloud Shell:

$ bq mk --time_partitioning_type=DAY my-project:github_trends.github_daily_metrics
$ bq mk --time_partitioning_type=DAY my-project:github_trends.github_agg
$ bq mk --time_partitioning_type=DAY my-project:github_trends.hackernews_agg
$ bq mk --time_partitioning_type=DAY my-project:github_trends.hackernews_github_agg

After that, I re-ran the third test with:

airflow test bigquery_github_trends bq_write_to_github_daily_metrics 2018-12-02

No error appears, but those tables are still empty, which means the data is not stored in the destination table even though all the tasks run successfully.

Any advice would be appreciated! Thanks.

tuanavu commented on May 29, 2024

Hi @imamdigmi,

I understand your frustration. That is a common problem with open source projects, and it is why I use Docker to pin versions: different Airflow versions may ship different versions of the operators, which can make the tasks fail.

The Airflow version used in this tutorial is 1.10.1. I will take a look and see if I can reproduce your errors in the new Airflow version 1.10.2.

imamdigmi commented on May 29, 2024

Thank you very much @tuanavu for your willingness to help me

imamdigmi commented on May 29, 2024

Hi @tuanavu, I just re-ran the DAG using Airflow version 1.10.1. It successfully creates the partitioned table automatically, but unfortunately the data is still not stored in the destination table (it is empty).

tuanavu commented on May 29, 2024

Hi @imamdigmi,

I believe your query result is not stored in the destination table because of this setting: partition_expiration_days=3. It means every partition older than 3 days expires and is deleted. So when you run a test for 2018-12-02, which is more than 3 days in the past, the data expires immediately after it is inserted into the table.

Try deleting and recreating the partitioned table without partition_expiration_days. Alternatively, pick a date that falls within your partition_expiration_days window, and you should see the output in the destination table.
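
To make the expiration rule concrete, here is a small sketch in plain Python. It assumes that a daily partition's lifetime is counted from the partition's own date at 00:00 UTC rather than from the moment the rows are written. The function name partition_is_expired and the "now" value are hypothetical and chosen only for illustration.

```python
from datetime import date, datetime, timedelta, timezone


def partition_is_expired(partition_day, expiration_days, now):
    """Return True if a daily partition is already past its expiration."""
    # Assumption: the clock starts at the partition's date (00:00 UTC),
    # not at the time the data was loaded.
    partition_start = datetime(
        partition_day.year, partition_day.month, partition_day.day,
        tzinfo=timezone.utc,
    )
    return now >= partition_start + timedelta(days=expiration_days)


# Hypothetical "current time" well after the tested execution date.
now = datetime(2019, 2, 1, tzinfo=timezone.utc)

old_partition = partition_is_expired(date(2018, 12, 2), 3, now)
recent_partition = partition_is_expired(date(2019, 1, 31), 3, now)
print(old_partition, recent_partition)  # True False
```

Under this assumption, a backfill for 2018-12-02 lands in a partition that is already expired, while a run for a date inside the 3-day window keeps its rows.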

imamdigmi commented on May 29, 2024

SOLVED! Thanks @tuanavu! I am still wondering, though, why in Airflow 1.10.2 I have to create the table first, while in 1.10.1 it is created automatically when Airflow executes a task.
