
HTTP

Introduction

A collection of HTTP Source and Sink Plugins.

Getting Started

Prerequisites

CDAP version 4.0.x or higher.

Building Plugins

You can get started with the HTTP plugins by building directly from the latest source code::

   git clone https://github.com/data-integrations/http.git
   cd http
   mvn clean package

After the build completes, you will have a JAR for each plugin under the corresponding <plugin-name>/target/ directory.

Deploying Plugins

You can deploy a plugin using the CDAP CLI::

   load artifact <target/plugin-jar> config-file <resources/plugin-config>

For example::

   load artifact target/http-plugin-<version>.jar config-file target/http-plugin-<version>.json

You can also build without running the tests::

   mvn clean install -DskipTests

Limitations

  • The UI doesn't support schemas with hyphens (-), so the plugin currently transforms all hyphens in schema names into underscores (_), as illustrated in the sketch below. This change will be reverted after this is fixed: https://issues.cask.co/browse/HYDRATOR-1125
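
Concretely, a field name like customer-id reaches the UI as customer_id. A minimal sketch of that rename in plain Java (illustration only, not the plugin's actual code)::

    import java.util.List;
    import java.util.stream.Collectors;

    public class HyphenRenameDemo {
      public static void main(String[] args) {
        List<String> fieldNames = List.of("customer-id", "first-name", "age");
        // Replace every hyphen with an underscore, as the plugin does for UI compatibility.
        List<String> uiSafe = fieldNames.stream()
            .map(name -> name.replace('-', '_'))
            .collect(Collectors.toList());
        System.out.println(uiSafe); // [customer_id, first_name, age]
      }
    }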

Mailing Lists

CDAP User Group and Development Discussions:

The cdap-user mailing list is primarily for users developing applications or building plugins with the product. You can expect questions from users, release announcements, and any other discussions that we think will be helpful to users.

IRC Channel

CDAP IRC Channel: #cdap on irc.freenode.net

License and Trademarks

Copyright © 2017 Cask Data, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Cask is a trademark of Cask Data, Inc. All rights reserved.

Apache, Apache HBase, and HBase are trademarks of The Apache Software Foundation. Used with permission. No endorsement by The Apache Software Foundation is implied by the use of these marks.


http's People

Contributors

ajainarayanan, albertshau, aonischuk, aryan-verma, bdmogal, curiousvini, dli357, elfenheart, flakrimjusufi, itsankit-google, liu-joe, mlozbin-cybervisiontech, mrahanjam, mrrahulsharma, nikitapaliwal123, nitinmotgi, psainics, rmstar, roni98, sgarg-cs, shashankmoghe, sheivin, shubhangi-cs, sumengwang, taljaards, vikasrathee-cs


http's Issues

Limitation: CSV delimiter

The settings for CSV delimiters could use a custom option.
Use case: the European standard is a semicolon, not a comma.

JSON Parser can't handle the root of the response being a JSON List

New JSON pages try to parse the response entity as a JSON object rather than as arbitrary JSON.

It could be solved roughly like this::

    // Parse the response body as arbitrary JSON instead of assuming a JSON object.
    super(httpResponse);
    this.config = config;
    JsonElement jsonElement = new JsonParser().parse(httpResponse.getBody());

    // Only navigate the configured result path when the root is not already an array.
    if (!jsonElement.isJsonArray()) {
      JSONUtil.JsonQueryResponse queryResponse =
          JSONUtil.getJsonElementByPath(jsonElement, config.getResultPath());
      insideElementJsonPathPart = queryResponse.getUnretrievedPath();
      jsonElement = queryResponse.get();
    }

    if (jsonElement.isJsonArray()) {
      iterator = jsonElement.getAsJsonArray().iterator();
    } else if (jsonElement.isJsonObject()) {
      iterator = Collections.singleton(jsonElement).iterator();
    } else {
      throw new IllegalArgumentException(String.format(
          "Element found by '%s' json path is expected to be an object or an array. Primitive found",
          config.getResultPath()));
    }

    fieldsMapping = config.getFullFieldsMapping();
    schema = config.getSchema();

Leading 's' characters are dropped from custom message field names

I was using the HTTP sink plugin and had a 'startInterval' property in my custom message. Running the pipeline failed with a "Field tartInterval doesn't exist in the input schema." exception. I checked the Matcher and found that it drops leading 's' characters from field names; maybe there is a better regex to use. The pattern in question:

private static final String REGEX_HASHED_VAR = "#s*(\\w+)";
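
The "s*" in the pattern matches literal 's' characters after the '#' marker instead of whitespace, so any leading 's' is consumed before the capture group. A minimal demonstration, using "#\s*(\w+)" as one plausible intended pattern (an assumption, not a confirmed fix)::

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class HashedVarRegexDemo {
      public static void main(String[] args) {
        String message = "#startInterval";

        // Buggy pattern: "s*" consumes the leading 's' of the field name.
        Matcher buggy = Pattern.compile("#s*(\\w+)").matcher(message);
        if (buggy.find()) {
          System.out.println(buggy.group(1)); // prints "tartInterval"
        }

        // Plausible fix: "\\s*" skips whitespace rather than 's' characters.
        Matcher fixed = Pattern.compile("#\\s*(\\w+)").matcher(message);
        if (fixed.find()) {
          System.out.println(fixed.group(1)); // prints "startInterval"
        }
      }
    }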

Doesn't support PATCH method

The HTTP plugin doesn't support the PATCH method. Looking at the code, it uses HttpURLConnection, which doesn't support PATCH, so another package may be needed. A separate question is that the URL parameter doesn't support macros.
Hope PATCH can be supported in a future release, thanks.
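
For context, HttpURLConnection rejects PATCH in setRequestMethod() with a ProtocolException, while java.net.http.HttpClient (Java 11+) accepts arbitrary verbs. A minimal sketch of a PATCH request with HttpClient (the endpoint URL and body are hypothetical)::

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PatchDemo {
      public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://example.com/resource/1")) // hypothetical endpoint
            .header("Content-Type", "application/json")
            // method() accepts any verb, unlike HttpURLConnection.setRequestMethod().
            .method("PATCH", HttpRequest.BodyPublishers.ofString("{\"enabled\":true}"))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
      }
    }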

Can't send form data in a POST HTTP call

Can you please explain how to send form data using this plugin? I want to send a username and password to my custom auth endpoint as form parameters. Is it possible to do this using the plugin?
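
For reference, an application/x-www-form-urlencoded POST body is just URL-encoded key=value pairs joined with '&'. A minimal sketch of building one (the field values are made up)::

    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;

    public class FormBodyDemo {
      public static void main(String[] args) {
        // Each value must be URL-encoded; pairs are joined with '&'.
        String body = "username=" + URLEncoder.encode("alice", StandardCharsets.UTF_8)
            + "&password=" + URLEncoder.encode("p@ss word", StandardCharsets.UTF_8);
        System.out.println(body); // username=alice&password=p%40ss+word
      }
    }

The request would also need its Content-Type header set to application/x-www-form-urlencoded.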

Can the doc on limitations be updated?

On the GitHub page we find the following:

https://github.com/data-integrations/http#limitations

which says:

The UI doesn't support schemas with hyphens (-), so the plugin currently transforms all hyphens in schema names into underscores (_). This change will be reverted after this is fixed: https://issues.cask.co/browse/HYDRATOR-1125

If we follow the issue link, we see that the issue was flagged as resolved in 2016. Can this limitation now be removed from the documentation?

Maven build Could not resolve dependencies - Failed to collect dependencies at

Hi guys, I really need to be able to build this, but I get this issue every time I try::

   BUILD FAILURE
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time: 06:09 min
   [INFO] Finished at: 2021-09-06T00:32:28+01:00
   [INFO] ------------------------------------------------------------------------
   [ERROR] Failed to execute goal on project http-plugins: Could not resolve dependencies for project io.cdap:http-plugins:jar:1.4.0-SNAPSHOT: Failed to collect dependencies at io.cdap.cdap:hydrator-test:jar:6.1.1 -> io.cdap.cdap:cdap-unit-test:jar:6.1.1 -> io.cdap.cdap:cdap-explore:jar:6.1.1 -> org.apache.hive:hive-exec:jar:1.2.1 -> org.apache.calcite:calcite-core:jar:1.2.0-incubating -> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read artifact descriptor for org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: [datanucleus (http://www.datanucleus.org/downloads/maven2, default, releases), glassfish-repository (http://maven.glassfish.org/content/groups/glassfish, default, disabled), glassfish-repo-archive (http://maven.glassfish.org/content/groups/glassfish, default, disabled), apache.snapshots (http://repository.apache.org/snapshots, default, snapshots), central (http://repo.maven.apache.org/maven2, default, releases), conjars (http://conjars.org/repo, default, releases+snapshots)] -> [Help 1]
   [ERROR]
   [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
   [ERROR] Re-run Maven using the -X switch to enable full debug logging.
   [ERROR]
   [ERROR] For more information about the errors and possible solutions, please read the following articles:
   [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException

Please help!

Batching is not working correctly in HttpSink

I used the HttpSink plugin version 1.2.5 in Cloud Data Fusion (version 6.7.2) on Google Cloud. While running my pipeline, data is sent out in batches, but records that do not fill a complete batch are never sent.

For example:
If I have 100 records and my batch size is 23, then 92 records are sent out while the pipeline is running, but the remaining 8 records are never sent.
Having gone through the code, I think destroy() is not getting invoked; see the sketch below.
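
A minimal sketch of the expected flush-on-shutdown behavior (class and method names are hypothetical, not the plugin's actual code)::

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical batching sink: buffers records, flushes full batches,
    // and must flush the remainder on shutdown.
    class BatchingSink {
      private final int batchSize;
      private final List<String> buffer = new ArrayList<>();

      BatchingSink(int batchSize) {
        this.batchSize = batchSize;
      }

      void write(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
          flush();
        }
      }

      // If this is never invoked, the final partial batch
      // (the 8 records in the example above) is silently lost.
      void destroy() {
        if (!buffer.isEmpty()) {
          flush();
        }
      }

      private void flush() {
        System.out.println("Sending batch of " + buffer.size() + " records");
        buffer.clear();
      }
    }

    public class BatchingDemo {
      public static void main(String[] args) {
        BatchingSink sink = new BatchingSink(23);
        for (int i = 0; i < 100; i++) {
          sink.write("record-" + i);
        }
        sink.destroy(); // flushes the remaining 8 records
      }
    }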

Question: Using HTTP plugin to fetch multiple REST endpoints and sink data to BigQuery Multi table sink

I would like to use Cloud Data Fusion to load data from multiple REST endpoints and store it in BigQuery in multiple tables (one per endpoint).

I made it work using the HTTP plugin as the source and a BigQuery sink. However, I have to define a pipeline for each endpoint, which seems like overkill.

I noticed that Data Fusion has a BigQuery Multi Table sink, so I expected to connect multiple HTTP sources to it so that BigQuery creates a table per endpoint and loads the data into it. However, when I run the pipeline I get the error "Two different input schema were set for the stage BigQuery Multi Table". Apparently every endpoint has a different schema.

The questions are: Is the BigQuery Multi Table sink appropriate for this problem? If yes, how should I configure it to make it work? If not, are there other ways to do this besides defining a pipeline per endpoint?

Bug: HTTP Batch Source, macro schema not working

When we parametrize the output schema with a macro in the HTTP Batch Source plugin, it throws an error:

Output schema cannot be empty Provide valid value for config property 'schema'.

Plugin version: v1.3.3
Data Fusion version: v6.8.2

The error does not occur in older versions, for example in v1.2.7.
Might be caused by #105

Cannot resolve maven build dependencies

Hello! We followed the exact instructions for creating the JAR file using Maven, but we get the following error. We have tried to resolve it without success. We are on the latest version of Maven::

   Failed to execute goal on project http-plugins: Could not resolve dependencies for project io.cdap:http-plugins:jar:1.4.0-SNAPSHOT: Failed to collect dependencies at io.cdap.cdap:hydrator-test:jar:6.1.1 -> io.cdap.cdap:cdap-unit-test:jar:6.1.1 -> io.cdap.cdap:cdap-explore:jar:6.1.1 -> org.apache.hive:hive-exec:jar:1.2.1 -> org.apache.calcite:calcite-core:jar:1.2.0-incubating -> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde
