koupler's Introduction

Koupler

This project provides TCP, HTTP, UDP and pipe interfaces for Amazon Kinesis. Under the covers, it uses the Kinesis Producer Library (KPL). The daemon listens on TCP, UDP or HTTP, or takes input from a pipe. Regardless of the mode, it handles the stream line by line: it splits each line on the supplied delimiter, uses the specified field as the Kinesis partition key, and queues the message with KPL.
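
To make the split-and-select step concrete, here is a minimal sketch of how a partition key could be pulled out of one input line. It is not Koupler's actual code: the class and method names are hypothetical, and treating the delimiter as a literal string (rather than a regular expression) is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not part of the Koupler code base.
public class PartitionKeySketch {

    /**
     * Splits a line on a literal delimiter and returns the field at the
     * given zero-based index, mirroring the -delimiter and
     * -partitionKeyField options.
     */
    public static String partitionKey(String line, String delimiter, int fieldIndex) {
        // Pattern.quote makes delimiters such as "|" or "." match literally.
        // The -1 limit keeps trailing empty fields instead of dropping them.
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (fieldIndex < 0 || fieldIndex >= fields.length) {
            throw new IllegalArgumentException(
                "Line has " + fields.length + " field(s); no field at index " + fieldIndex);
        }
        return fields[fieldIndex];
    }

    public static void main(String[] args) {
        String line = "asdf123,{\"key\": \"value\"}";
        // With the defaults (delimiter ',' and field 0) this prints asdf123.
        System.out.println(partitionKey(line, ",", 0));
    }
}
```

Note that a line with a single field, such as the names used in the examples further down, yields the whole line as its own partition key.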

Koupler also tracks metrics using Coda Hale's most excellent metrics library. Those metrics are then published to Amazon CloudWatch, allowing you to see per-host behavior and throughput information. For more information, see the Metrics section below.

Building

Koupler uses Gradle as its build system. To build koupler with Gradle, run the following:

   gradle clean dist

This will build a zip-file artifact in build/distributions.

Unzip that file with:

   unzip build/distributions/*.zip

Possibly-convenient docker run command to build the artifact:

   docker run --name koupler --mount type=bind,source=$(pwd),target=/home/gradle --rm --entrypoint /usr/bin/bash gradle:6-jdk8 /home/gradle/build.sh

And you are ready to use koupler.

Usage

After a successful build, simply run the following to get usage information:

   $ ./koupler.sh

You should see the following:

   $ ./koupler.sh
   Must specify either: udp, tcp or pipe
   Must specify stream name.
   usage: java -jar koupler*.jar
    -delimiter <arg>          delimiter between fields (default: ',')
    -partitionKeyField <arg>  zero-based index of field containing partition key (default: 0)
    -format <arg>             format for which partitionKey will be extracted (default: split)
    -pipe                     pipe mode
    -port <arg>               listening port (default: 4242)
    -propertiesFile <arg>     kpl properties file (default: ./conf/kpl.properties)
    -streamName <arg>         kinesis stream name
    -tcp                      tcp mode
    -http                     http mode
    -udp                      udp mode

The parameters are fairly straightforward, but be sure to have a look at conf/kpl.properties. You can also control logging levels by changing conf/log4j2.xml.

The Consumer

To kick the tires a bit, you can start the built-in consumer. The built-in consumer will output messages from the stream to the console.

   $ ./koupler.sh -consumer -streamName boneill-dev-test
   [INFO] 2015-10-14 23:36:43,254 producer.KinesisProducerConfiguration.fromPropertiesFile - Attempting to load config from file ./conf/kpl.properties
   [2015-10-14 23:36:43.583341] [0x00007fff7120e000] [info] [metrics_manager.h:148] Uploading metrics to monitoring.us-east-1.amazonaws.com:443
   [INFO] 2015-10-14 23:36:43,915 producer.KinesisProducerConfiguration.fromPropertiesFile - Attempting to load config from file ./conf/kpl.properties
   ...
   INFO: Initializing shard shardId-000000000000 with TRIM_HORIZON

TCP

Next, fire up the TCP server and throw some data at it! The following is an example command-line.

   $ ./koupler.sh -tcp -streamName boneill-dev-test

You can sling data at the TCP listener with the following:

   $ telnet localhost 4242
   Trying ::1...
   Connected to localhost.
   Escape character is '^]'.
   lisa
   collin
   owen

And in the consumer you should see:

   [DEBUG] 2015-10-14 23:50:24,456 koupler.KinesisEventConsumer.processRecords - Received [lisa]
   [DEBUG] 2015-10-14 23:50:24,456 koupler.KinesisEventConsumer.processRecords - Received [collin]
   [DEBUG] 2015-10-14 23:50:24,456 koupler.KinesisEventConsumer.processRecords - Received [owen]
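
For sending from code rather than telnet, a minimal Java client might look like the sketch below. The class name is hypothetical, port 4242 is the default from the usage output, and terminating each event with a newline is an assumption based on the line-by-line handling described in the introduction.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical client; not part of the Koupler code base.
public class TcpLineSender {

    /** Opens one TCP connection and writes each event as a newline-terminated line. */
    public static void sendLines(String host, int port, List<String> lines) throws IOException {
        try (Socket socket = new Socket(host, port);
             OutputStream out = socket.getOutputStream()) {
            for (String line : lines) {
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            }
            // Push any buffered bytes before the socket is closed.
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        // Same three events as the telnet session above.
        sendLines("localhost", 4242, Arrays.asList("lisa", "collin", "owen"));
    }
}
```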

UDP

Next, fire up the UDP server! The following is an example command-line.

   $ ./koupler.sh -udp -streamName boneill-dev-test

You can sling data at the UDP listener with the following:

   $ nc -u localhost 4242
   murphy
   bailey
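
The same thing can be done from Java with a datagram socket. This is only a sketch: the class name is hypothetical, and sending one newline-terminated event per datagram is an assumption chosen to mirror what nc sends for each typed line.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical client; not part of the Koupler code base.
public class UdpLineSender {

    /** Sends a single event as one newline-terminated UDP datagram. */
    public static void sendLine(String host, int port, String line) throws IOException {
        byte[] payload = (line + "\n").getBytes(StandardCharsets.UTF_8);
        // The no-argument constructor binds to any free local port.
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(
                payload, payload.length, InetAddress.getByName(host), port));
        }
    }

    public static void main(String[] args) throws IOException {
        // Same two events as the nc session above; 4242 is the default port.
        sendLine("localhost", 4242, "murphy");
        sendLine("localhost", 4242, "bailey");
    }
}
```

UDP gives no delivery guarantee, so a sender like this cannot tell whether an event was actually received.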

HTTP

Next, fire up the HTTP server! The server takes a POST, and queues the body of the HTTP request. The following is an example command-line.

   $ ./koupler.sh -http -streamName boneill-dev-test

You can sling data at the HTTP listener with the following:

   $ curl -d "drago" http://localhost:4567/event
   ACK
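
A rough Java equivalent of the curl call, using only the JDK's HttpURLConnection, is sketched below. The class name is hypothetical; the URL and the "drago" body are taken from the curl example, and nothing is assumed about the response beyond what the server sends back.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client; not part of the Koupler code base.
public class HttpEventPoster {

    /** POSTs the body to the given URL and returns the response body as text. */
    public static String post(String url, String body) throws IOException {
        byte[] payload = body.getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setFixedLengthStreamingMode(payload.length);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload);
            }
            // getInputStream() throws an IOException on a 4xx/5xx response.
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[1024];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Same request as the curl example above.
        System.out.println(post("http://localhost:4567/event", "drago"));
    }
}
```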

Pipe

Finally, for those who like pipes, we have the always-versatile pipe version:

   $ printf "hello\nworld\n" | ./koupler.sh -pipe -streamName boneill-dev-test
   [INFO] 2015-10-15 00:18:05,031 producer.KinesisProducerConfiguration.fromPropertiesFile - Attempting to load config from file ./conf/kpl.properties
   [INFO] 2015-10-15 00:18:05,058 producer.KinesisProducer.extractBinaries - Extracting binaries to /var/folders/2f/wqb5702967s58rtsgb5kzd940000gp/T/amazon-kinesis-producer-native-binaries
   [2015-10-15 00:18:05.360559] [0x00007fff7120e000] [info] [metrics_manager.h:148] Uploading metrics to monitoring.us-east-1.amazonaws.com:443
   [INFO] 2015-10-15 00:18:05,699 koupler.KinesisEventProducer.<init> - Firing up pipe listener
   [DEBUG] 2015-10-15 00:18:05,703 koupler.Koupler.call - Queueing event [hello]
   [DEBUG] 2015-10-15 00:18:05,704 koupler.Koupler.call - Queueing event [world]

Metrics

Koupler keeps track of the following metrics. They are available in CloudWatch under 'Custom Metrics' and let you see status by host. Use the "-metrics" switch to enable them.

   Metric                     Description
   BytesPerEvent              Average bytes per event / message
   CompletedEventsPerSecond   Events per second successfully ack'd by Kinesis
   QueuedEventsPerSecond      Events per second queued with the Kinesis Producer Library (KPL)
   EventQueueCount            The size of the queue/backlog within KPL

koupler's People

Contributors

boneill42, gajohnson, hanleyhansen, jcantara-work, jjpersch, xaxiomatic

koupler's Issues

Clarification on fields and partition key usage

Hi there, and thanks for koupler - it is just what we were looking for.

A question on usage:

    -delimiter <arg>          delimiter between fields (default: ',')
    -partitionKeyField <arg>  zero-based index of field containing partition key (default: 0)

Might be just terminology confusion on my part, but the docs say Kinesis message data is just a blob of data (which, in my case, will be a JSON-encoded string). What is the fields concept here?

Additionally, if I wanted to use asdf123 as my partition key, and a JSON blob as my data, would the call (over HTTP) look like:

 curl -d "asdf123,{\"key\": \"value\"}" http://192.168.99.123:4321/event

I appreciate the clarification here.

Any metrics around errors?

Hi there.

Reading through the metrics section of the README, I noticed there were no metrics around errors. Surely there are error conditions in Koupler, such as with issues writing to a stream (I suppose throttling is a separate concern), and having error metrics exposes a method for monitoring the health of what's happening within Koupler.

Since this metric is not currently instrumented, how do you (or any other users of Koupler) ensure the health of Koupler, other than a basic monit entry?

I appreciate your time and feedback.

Intermittent build failure

I am by no definition a Java developer, nor do I have any experience with Gradle, but I noticed an intermittent build (test?) failure. It's pretty uncommon, but by simply looping gradle clean build copyRuntimeLibs I've been able to recreate it several times.

I'll quantify what I mean by "uncommon". I had it happen twice within about 5 builds, so I set out to forcibly recreate it; the next one didn't happen until after more than an hour of continuous builds (averaging 12 seconds apiece), but it does always eventually happen.

(this is what I used, pretty simple really)

while gradle clean build copyRuntimeLibs; do echo 'AGAIN!'; done

java.lang.AssertionError: Did not queue all records! expected:<5> but was:<3>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at com.monetate.koupler.PipeKouplerTest.test(PipeKouplerTest.java:29)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
    at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:49)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
    at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
    at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:106)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
    at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:360)
    at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:54)
    at org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:40)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

As for the numbers at the top (expected:<5> but was:<3>): the 3 is not constant; I've gotten it with 4 or 2 as well (I guess that might be obvious, but it might not be, I'm not too familiar with the project).

I'm on OS X 10.11.3, but I've also recreated the same issue (identical stack trace) on Linux Mint, and with the java Docker container (which I think is Debian Jessie?).

How can we change the URL of the Kinesis server?

Where in this repository/project can we modify the Kinesis endpoint similar to this:

client = new AmazonKinesisClient();
client.setEndpoint(endpoint, serviceName, regionId);

Thanks in advance!

Outdated documentation

I'm trying to build/run koupler, and following the steps from the readme, I ran
gradle clean build copyRuntimeLibs, which succeeded, but running ./koupler.sh fails, due to a missing lib directory:

~/g/k/sh ❯❯❯ ./koupler.sh
ls: ./lib/*.jar: No such file or directory
Error: Could not find or load main class com.monetate.koupler.Koupler

How should I go about running this? Is the script outdated, or is some setup failing to create a lib directory? Any help would be appreciated.
