dataflowsdk-examples's Introduction

Google Cloud Dataflow Examples

Google Cloud Dataflow is a service for executing Apache Beam pipelines on Google Cloud Platform.

Getting Started

We moved to Apache Beam!

The Apache Beam Python SDK and its code development have moved to the Apache Beam repo.

If you want to contribute to the project (please do!), use this Apache Beam contributor's guide.

Contact Us

We welcome all usage-related questions on Stack Overflow tagged with google-cloud-dataflow.

Please use the issue tracker on Apache JIRA to report any bugs, comments or questions regarding SDK development.

Additional Resources

For more information on Google Cloud Dataflow, see the following resources:

dataflowsdk-examples's People

Contributors

aaltay, davorbonaci, dhalperi, francesperry, kennknowles, robertwb

dataflowsdk-examples's Issues

Ambiguous method call

When I use the examples, I get the error "Ambiguous method call": both via(SimpleFunction<KV<String,Long>,String>) in MapElements and via(SerializableFunction) in MapElements match.
Why do the examples have this problem?
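
One possible reading of this report is that the tool cannot decide between the two via overloads for the argument it is given. The sketch below uses hypothetical stand-in types (OverloadDemo and its nested SerializableFunction and SimpleFunction are not the SDK's classes). It only illustrates how declaring the argument's static type makes the choice of overload explicit in plain Java. Whether this resolves the error in a particular IDE or SDK version is an assumption.

```java
import java.io.Serializable;

class OverloadDemo {
    // Hypothetical stand-ins for the two SDK types named in the error message.
    // They only mirror the relationship "a SimpleFunction is a SerializableFunction".
    interface SerializableFunction<I, O> extends Serializable {
        O apply(I input);
    }

    abstract static class SimpleFunction<I, O> implements SerializableFunction<I, O> {
    }

    // Two overloads shaped like the pair quoted in the report.
    static <I, O> String via(SimpleFunction<I, O> fn) {
        return "via(SimpleFunction)";
    }

    static <I, O> String via(SerializableFunction<I, O> fn) {
        return "via(SerializableFunction)";
    }

    // Declared type is SimpleFunction: both overloads apply,
    // and the compiler picks the more specific one.
    static String callWithSimpleFunction() {
        SimpleFunction<Long, String> fn = new SimpleFunction<Long, String>() {
            @Override
            public String apply(Long count) {
                return "count: " + count;
            }
        };
        return via(fn);
    }

    // Declared type is SerializableFunction: only that overload applies.
    static String callWithSerializableFunction() {
        SerializableFunction<Long, String> fn = count -> "count: " + count;
        return via(fn);
    }

    public static void main(String[] args) {
        System.out.println(callWithSimpleFunction());
        System.out.println(callWithSerializableFunction());
    }
}
```

Assigning the function to a typed local variable before the call keeps the intent visible to both the compiler and the reader, at the cost of one extra line.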

Quickstart Using Java and Apache Maven error

Following the getting-started tutorial, I get this error in Cloud Shell with both the JDK and OpenJDK:

mvn compile exec:java \
    -Dexec.mainClass=com.google.cloud.dataflow.examples.WordCount \
    -Dexec.args="--project=alpha-life-line \
    --stagingLocation=word-count \
    --runner=BlockingDataflowPipelineRunner"

[INFO] --- exec-maven-plugin:1.5.0:java (default-cli) @ first-dataflow ---
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:294)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Failed to construct instance from factory method BlockingDataflowPipelineRunner#fromOptions(interface com.google.cloud.dataflow.sdk.options.PipelineOptions)
at com.google.cloud.dataflow.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:233)
at com.google.cloud.dataflow.sdk.util.InstanceBuilder.build(InstanceBuilder.java:162)
at com.google.cloud.dataflow.sdk.runners.PipelineRunner.fromOptions(PipelineRunner.java:57)
at com.google.cloud.dataflow.sdk.Pipeline.create(Pipeline.java:134)
at com.example.WordCount.main(WordCount.java:193)
... 6 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.google.cloud.dataflow.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:222)
... 10 more
Caused by: java.lang.IllegalArgumentException:
at com.google.api.client.json.JsonParser.parseValue(JsonParser.java:889)
at com.google.api.client.json.JsonParser.parseArray(JsonParser.java:648)
at com.google.api.client.json.JsonParser.parseArray(JsonParser.java:628)
at com.google.api.client.json.JsonParser.parseArray(JsonParser.java:597)
at com.google.api.client.json.JsonParser.parseArray(JsonParser.java:577)
at com.google.api.client.googleapis.auth.oauth2.CloudShellCredential.executeRefreshToken(CloudShellCredential.java:88)
at com.google.api.client.auth.oauth2.Credential.refreshToken(Credential.java:489)
at com.google.api.client.auth.oauth2.Credential.intercept(Credential.java:217)
at com.google.cloud.hadoop.util.ChainingHttpRequestInitializer$2.intercept(ChainingHttpRequestInitializer.java:98)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:859)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.util.ResilientOperation$AbstractGoogleClientRequestExecutor.call(ResilientOperation.java:166)
at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:66)
at com.google.cloud.dataflow.sdk.util.GcsUtil.bucketExists(GcsUtil.java:325)
at com.google.cloud.dataflow.sdk.util.GcsUtil.bucketExists(GcsUtil.java:312)
at com.google.cloud.dataflow.sdk.util.DataflowPathValidator.verifyPathIsAccessible(DataflowPathValidator.java:79)
at com.google.cloud.dataflow.sdk.util.DataflowPathValidator.validateOutputFilePrefixSupported(DataflowPathValidator.java:62)
at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.fromOptions(DataflowPipelineRunner.java:255)
at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.fromOptions(BlockingDataflowPipelineRunner.java:82)
... 15 more
Caused by: java.lang.IllegalArgumentException: expected numeric type but got class java.lang.String
at com.google.api.client.json.JsonParser.parseValue(JsonParser.java:844)
... 35 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
at java.lang.reflect.Method.invoke(Method.java:606)
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 10.595 s
[INFO] Finished at: 2016-06-10T16:52:53+08:00
[INFO] Final Memory: 14M/33M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.5.0:java (default-cli) on project first-dataflow: An exception occured while executing the Java class. null: InvocationTargetException: Failed to construct instance from factory method BlockingDataflowPipelineRunner#fromOptions(interface com.google.cloud.dataflow.sdk.options.PipelineOptions): IllegalArgumentException: expected numeric type but got class java.lang.String -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

POM breakage due to latest commit

SDK version 1.9.0 hasn't been pushed to Maven Central yet, so commit 5a81bf1 breaks the build. Is there another repository where I can find dataflow-java-sdk in the meantime?

503 Service Unavailable Backend Error

Hi there,
I'm trying to run the DatastoreWordCount from the cookbook examples and encountered this exception.
I was hoping to get some help with this issue.

Stack Trace:

2016-02-15T17:43:36.674Z: Error: (2edb046603c29179): com.google.api.services.datastore.client.DatastoreException: Backend Error
at com.google.api.services.datastore.client.RemoteRpc.makeException(RemoteRpc.java:115)
at com.google.api.services.datastore.client.RemoteRpc.call(RemoteRpc.java:81)
at com.google.api.services.datastore.client.BaseDatastoreFactory$RemoteRpc.call(BaseDatastoreFactory.java:41)
at com.google.api.services.datastore.client.Datastore.commit(Datastore.java:85)
at com.google.cloud.dataflow.sdk.io.DatastoreIO$DatastoreWriter.flushBatch(DatastoreIO.java:768)
at com.google.cloud.dataflow.sdk.io.DatastoreIO$DatastoreWriter.write(DatastoreIO.java:726)
at com.google.cloud.dataflow.sdk.io.DatastoreIO$DatastoreWriter.write(DatastoreIO.java:668)
at com.google.cloud.dataflow.sdk.io.Write$Bound$2.processElement(Write.java:168)
Caused by: com.google.api.client.http.HttpResponseException: 503 Service Unavailable
Backend Error
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1054)
at com.google.api.services.datastore.client.RemoteRpc.call(RemoteRpc.java:78)
at com.google.api.services.datastore.client.BaseDatastoreFactory$RemoteRpc.call(BaseDatastoreFactory.java:41)
at com.google.api.services.datastore.client.Datastore.commit(Datastore.java:85)
at com.google.cloud.dataflow.sdk.io.DatastoreIO$DatastoreWriter.flushBatch(DatastoreIO.java:768)
at com.google.cloud.dataflow.sdk.io.DatastoreIO$DatastoreWriter.write(DatastoreIO.java:726)
at com.google.cloud.dataflow.sdk.io.DatastoreIO$DatastoreWriter.write(DatastoreIO.java:668)
at com.google.cloud.dataflow.sdk.io.Write$Bound$2.processElement(Write.java:168)
at com.google.cloud.dataflow.sdk.util.DoFnRunner.invokeProcessElement(DoFnRunner.java:189)
at com.google.cloud.dataflow.sdk.util.DoFnRunner.processElement(DoFnRunner.java:171)
at com.google.cloud.dataflow.sdk.runners.worker.ParDoFnBase.processElement(ParDoFnBase.java:213)
at com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.sdk.runners.worker.ParDoFnBase$1.output(ParDoFnBase.java:174)
at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnContext.outputWindowedValue(DoFnRunner.java:333)
at com.google.cloud.dataflow.sdk.util.DoFnRunner$DoFnProcessContext.output(DoFnRunner.java:487)
at com.google.cloud.dataflow.examples.cookbook.DatastoreWordCount$CreateEntityFn.processElement(DatastoreWordCount.java:148)
at com.google.cloud.dataflow.sdk.util.DoFnRunner.invokeProcessElement(DoFnRunner.java:189)
at com.google.cloud.dataflow.sdk.util.DoFnRunner.processElement(DoFnRunner.java:171)
at com.google.cloud.dataflow.sdk.runners.worker.ParDoFnBase.processElement(ParDoFnBase.java:213)
at com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
at com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:223)
at com.google.cloud.dataflow.sdk.util.common.worker.ReadOperation.start(ReadOperation.java:169)
at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:69)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.executeWork(DataflowWorker.java:254)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:191)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:144)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.doWork(DataflowWorkerHarness.java:180)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:161)
at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:148)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Thanks!

Examples demonstrate code, but not project configuration

(This is filed as BEAM-442 for Beam, but the scope and timing of the issue probably mean it deserves an issue here as well.)

From http://stackoverflow.com/questions/38317428/how-do-i-resolve-java-lang-nosuchmethoderror-com-google-api-services-dataflow-m

The above Stack Overflow question shows that this repository demonstrates code but not proper project configuration. It would probably be quite useful to have a starter project that actually has the dependencies set up correctly, without referencing the Dataflow SDK parent pom, so users can clone it and get started.
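
As a rough illustration of what such a starter project's configuration might look like, here is a minimal pom.xml sketch that declares the SDK as an ordinary dependency and has no parent element. The project's own coordinates (com.example, dataflow-starter) are hypothetical. The SDK coordinates and the version number are assumptions that should be checked against what is actually published on Maven Central.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates for the starter project itself. -->
  <groupId>com.example</groupId>
  <artifactId>dataflow-starter</artifactId>
  <version>0.1.0</version>

  <properties>
    <!-- Assumption: replace with an SDK release that is available on Maven Central. -->
    <dataflow.version>1.9.0</dataflow.version>
  </properties>

  <!-- No <parent> element: the SDK is pulled in as a plain dependency. -->
  <dependencies>
    <dependency>
      <!-- Assumed SDK coordinates; verify before use. -->
      <groupId>com.google.cloud.dataflow</groupId>
      <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
      <version>${dataflow.version}</version>
    </dependency>
  </dependencies>
</project>
```

Keeping the SDK version in a single property makes it easy to move to a newer release without touching the dependency block.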
