nifi-bigquery-bundle's Introduction

NiFi BigQuery Bundle

BigQuery bundle for Apache NiFi

Processors

PutBigQuery

Saves flow file content to a BigQuery table. The content must be in JSON format.

Required Properties
  • Read Time Out: how long to wait for a response from the BigQuery service
  • Connection Time Out: how long to wait while establishing a connection to the BigQuery service
  • Project Id: the Google Cloud project ID. If not specified, the processor tries to obtain it from the provided credentials.

Deploy Bundle

Clone this repository

    git clone https://github.com/theShadow89/nifi-bigquery-bundle

Build the bundle

    cd nifi-bigquery-bundle
    mvn clean install

Copy the NAR file to $NIFI_HOME/lib

    cp nifi-bigquery-nar/target/nifi-bigquery-nar-0.1.nar $NIFI_HOME/lib/

Start or restart NiFi

    $NIFI_HOME/bin/nifi.sh start

TODO

  • Add Get Processor

nifi-bigquery-bundle's People

Contributors

allanbatista, theshadow89

nifi-bigquery-bundle's Issues

putBigQuery - Failed to Invoke onScheduled Task

Hi,

I am unable to push data. The problem may be in invoking the onScheduled task.

[Image: screenshot of the error, "Screenshot 2019-03-12 at 9 39 32 PM"]

[Image: screenshot of the configuration, "Screenshot 2019-03-12 at 9 43 14 PM"]

Please look into this and let me know what went wrong.

Add Get Processor

Add a Get processor to read data from BigQuery and use it in NiFi flows.

Write a batch of flow files to improve performance

Problem

Currently the Put processor takes one flow file at a time and writes it to BigQuery. This can hurt performance.

Solution

Take a batch of flow files (of configurable size?) from the NiFi session and insert them into the BigQuery table as a single batch.
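The batching idea can be sketched independently of NiFi. The snippet below is only an illustration in Python, not the processor's code: `batches` and `batch_size` are hypothetical names, and the list of strings stands in for flow files pulled from the session. In a real processor the batch would more likely come from NiFi's `ProcessSession.get(int maxResults)`, which returns a list of flow files.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists holding at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            # A full batch is ready: hand it over and start a new one.
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


# Stand-ins for flow file contents; each batch would become one insert request.
flow_files = ['{"id": 1}', '{"id": 2}', '{"id": 3}']
for group in batches(flow_files, batch_size=2):
    print(len(group), "rows in this insert request")
```

Making the batch size a processor property would let users trade latency against the number of requests sent to BigQuery.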

Compiling fails

The "mvn clean install" command fails because of an inconsistent version of the nifi-bigquery-processors dependency in the pom file of the nifi-bigquery-nar module. The part below causes an error and compilation fails.

    <dependency>
        <groupId>org.apache.nifi</groupId>
        <artifactId>nifi-bigquery-processors</artifactId>
        <version>0.1-SNAPSHOT</version>
    </dependency>

To fix the issue, remove the "-SNAPSHOT" part from the version.

Creating tables automatically using template tables

A common usage pattern for streaming data into BigQuery is to split a logical table into many smaller tables to create smaller sets of data.

To use a template table via PutBigQuery, add a way to compose the table suffix to the processor's configuration options.
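As a rough sketch of what this could look like on the wire: BigQuery's streaming `tabledata.insertAll` request accepts a `templateSuffix` field, and rows are then written to a table named after the target table plus that suffix. The Python helpers below (`daily_suffix`, `build_insert_all_body`) are hypothetical, and the per-day `_YYYYMMDD` suffix is just one assumed way to compose it, not something this bundle defines.

```python
import json
from datetime import date
from typing import Any, Dict, List


def daily_suffix(day: date) -> str:
    """Compose a per-day table suffix such as '_20190312' (assumed format)."""
    return day.strftime("_%Y%m%d")


def build_insert_all_body(rows: List[Dict[str, Any]], template_suffix: str) -> Dict[str, Any]:
    """Build a streaming insert request body that targets a template table."""
    return {
        # BigQuery appends this suffix to the target (template) table name.
        "templateSuffix": template_suffix,
        "rows": [{"json": row} for row in rows],
    }


body = build_insert_all_body([{"user": "alice"}], daily_suffix(date(2019, 3, 12)))
print(json.dumps(body, indent=2))
```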

Not writing to BigQuery

Hi,

I am consuming JSON from a Kafka topic with ConsumeKafka_0_11 and then transferring it to PutBigQuery. I created the table and have the required permissions, but I do not see any records in the table.

In NiFi Data Provenance I see only events of type DROP.

Any idea how to solve it?

Thank you

Ensuring data consistency

BigQuery helps ensure data consistency for streaming inserts by using the insertId supplied with each row for best-effort de-duplication.

Add a configuration field to the Put processor for extracting the unique ID from incoming data. We can use JSON Path to extract the ID.
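A minimal sketch of the extraction step, in Python for illustration only: it uses a simple dotted path such as `event.id` as a stand-in for a real JSON Path expression, and `extract_by_path` and `to_insert_row` are hypothetical helpers. The `insertId` and `json` keys match the row format of BigQuery's streaming insert request.

```python
import json
from typing import Any, Dict


def extract_by_path(document: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path like 'event.id' through nested JSON objects."""
    current: Any = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"path {path!r} not found at {key!r}")
        current = current[key]
    return current


def to_insert_row(flow_file_content: str, id_path: str) -> Dict[str, Any]:
    """Turn one JSON flow file into a streaming insert row with an insertId."""
    record = json.loads(flow_file_content)
    return {
        # The insertId is sent as a string alongside the row data.
        "insertId": str(extract_by_path(record, id_path)),
        "json": record,
    }


row = to_insert_row('{"event": {"id": 42}, "value": "x"}', "event.id")
print(row["insertId"])
```

Raising an error when the path is missing keeps the failure visible; a processor could instead route such flow files to a failure relationship.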
