kafka-connect-marklogic's Introduction

I am Sanju Thomas. I grew up in Kerala, a southern Indian state, and moved to the United States in 2013. I now live in Westfield, NJ. I am an avid learner and scrum practitioner: a delivery-focused, highly disciplined, business-savvy software craftsman who writes clean, loosely coupled, robust, and maintainable designs and code by practicing agile engineering. My latest interests include, but are not limited to, reactive programming, event-sourcing architecture, CQRS, and Kafka-centric ETL/streaming solutions.

kafka-connect-marklogic's People

Contributors

dependabot[bot], sanjuthomas

Stargazers

18 stargazers

Watchers

7 watchers

kafka-connect-marklogic's Issues

Add .editorconfig file

EditorConfig is a plugin available for many common editors that ensures a variety of settings match those used in your project. For instance, it can help make sure that indentation is the way you want it; see the example below.
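A minimal .editorconfig might look like the following; the specific values (UTF-8, LF line endings, two-space indentation) are placeholder assumptions rather than settings this project has agreed on:

# Top-most EditorConfig file for the repository.
root = true

# Defaults for every file.
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
indent_style = space
indent_size = 2

# Java sources commonly use 4-space indentation; adjust to taste.
[*.java]
indent_size = 4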

On starting the app with Kafka

On starting the app with Kafka, it throws the errors below:

[2018-05-30 16:13:46,276] INFO [Consumer clientId=consumer-8, groupId=connect-marklogic-sink] Revoking previously assigned partitions [] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:411)
[2018-05-30 16:13:46,276] INFO [Consumer clientId=consumer-8, groupId=connect-marklogic-sink] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:442)
[2018-05-30 16:13:46,276] INFO [Consumer clientId=consumer-9, groupId=connect-marklogic-sink] Revoking previously assigned partitions [] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:411)
[2018-05-30 16:13:46,279] INFO [Consumer clientId=consumer-9, groupId=connect-marklogic-sink] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:442)
[2018-05-30 16:13:46,289] INFO [Consumer clientId=consumer-8, groupId=connect-marklogic-sink] Successfully joined group with generation 23 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:409)
[2018-05-30 16:13:46,290] INFO [Consumer clientId=consumer-9, groupId=connect-marklogic-sink] Successfully joined group with generation 23 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:409)
[2018-05-30 16:13:46,290] INFO [Consumer clientId=consumer-9, groupId=connect-marklogic-sink] Setting newly assigned partitions [] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:256)
[2018-05-30 16:13:46,290] INFO [Consumer clientId=consumer-8, groupId=connect-marklogic-sink] Setting newly assigned partitions [quote-request-0] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:256)
[2018-05-30 16:13:46,298] INFO [Consumer clientId=consumer-8, groupId=connect-marklogic-sink] Resetting offset for partition quote-request-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher:561)
[2018-05-30 16:13:46,303] INFO Received 1 records. kafka coordinates from record: Topic - quote-request, Partition - 0, Offset - 0 (kafka.connect.marklogic.sink.MarkLogicSinkTask:43)
[2018-05-30 16:13:46,310] INFO (withForestConfig) Using [localhost] hosts with forests for "Documents" (com.marklogic.client.datamovement.impl.WriteBatcherImpl:745)
[2018-05-30 16:13:46,310] INFO Adding DatabaseClient on port 8000 for host "localhost" to the rotation (com.marklogic.client.datamovement.impl.WriteBatcherImpl:759)
[2018-05-30 16:13:46,311] INFO threadCount=8 (com.marklogic.client.datamovement.impl.WriteBatcherImpl:259)
[2018-05-30 16:13:46,311] INFO batchSize=100 (com.marklogic.client.datamovement.impl.WriteBatcherImpl:260)
[2018-05-30 16:13:46,311] ERROR WorkerSinkTask{id=marklogic-sink-7} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. (org.apache.kafka.connect.runtime.WorkerSinkTask:544)
java.lang.NullPointerException
at java.util.HashMap.putMapEntries(HashMap.java:501)
at java.util.LinkedHashMap.&lt;init&gt;(LinkedHashMap.java:384)
at kafka.connect.marklogic.MarkLogicBufferedWriter.lambda$flush$2(MarkLogicBufferedWriter.java:74)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at kafka.connect.marklogic.MarkLogicBufferedWriter.flush(MarkLogicBufferedWriter.java:73)
at kafka.connect.marklogic.MarkLogicBufferedWriter.write(MarkLogicBufferedWriter.java:59)
at kafka.connect.marklogic.sink.MarkLogicSinkTask.put(MarkLogicSinkTask.java:47)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:524)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:302)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:205)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:173)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2018-05-30 16:13:46,311] INFO Flush - Topic quote-request, Partition 0, Offset 0, Metadata (kafka.connect.marklogic.sink.MarkLogicSinkTask:100)
[2018-05-30 16:13:46,312] ERROR WorkerSinkTask{id=marklogic-sink-7} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:172)
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:546)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:302)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:205)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:173)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at java.util.HashMap.putMapEntries(HashMap.java:501)
at java.util.LinkedHashMap.&lt;init&gt;(LinkedHashMap.java:384)
at kafka.connect.marklogic.MarkLogicBufferedWriter.lambda$flush$2(MarkLogicBufferedWriter.java:74)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at kafka.connect.marklogic.MarkLogicBufferedWriter.flush(MarkLogicBufferedWriter.java:73)
at kafka.connect.marklogic.MarkLogicBufferedWriter.write(MarkLogicBufferedWriter.java:59)
at kafka.connect.marklogic.sink.MarkLogicSinkTask.put(MarkLogicSinkTask.java:47)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:524)
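Judging from the trace, the NullPointerException is thrown while copying a record value into a LinkedHashMap: HashMap.putMapEntries fails when the source map is null, which would happen if a record's value is null (a tombstone) or otherwise not a map. A minimal defensive sketch follows; it uses only the public Kafka Connect API, and the helper name and the idea of skipping such records are assumptions, not the connector's actual code:

// Null-safe conversion of a SinkRecord value into a document map.
// Only the Kafka Connect types are real; the helper itself is hypothetical.
import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.kafka.connect.sink.SinkRecord;

public final class RecordValues {

  @SuppressWarnings("unchecked")
  public static Map<String, Object> toDocument(SinkRecord record) {
    final Object value = record.value();
    if (!(value instanceof Map)) {
      // Covers null (tombstone) and non-map payloads; callers can skip these records
      // instead of letting new LinkedHashMap<>((Map) null) throw a NullPointerException.
      return null;
    }
    return new LinkedHashMap<>((Map<String, Object>) value);
  }
}

Guarding the copy this way (or filtering such records out in MarkLogicSinkTask.put) would keep one bad message from killing the whole task.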

ML9 version - No ability to define ML endpoint to write to

In the ML8 version of the connector, there is an ml.connection.url property that lets you explicitly define the endpoint you'd like to write to.

However, the ML9 version instead has ml.connection.host and ml.connection.port properties, which by default point to the /v1/documents endpoint. The documentation says:
"By default the /v1/documents endpoint at port 8000 is used. You may change that in the marklogic-sink.properties file. You may use your own REST/Service extension instead of the out-of-the-box document API to do any transformation on the way in."

But unless I'm missing something, I can't see a way to define a REST extension to use instead of the default. Am I overlooking it?
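For reference, the ML9-style connection settings described in this issue would look roughly like this in marklogic-sink.properties. Only ml.connection.host and ml.connection.port come from the text above; the connector class is inferred from the package names in the stack trace, and the remaining keys are standard Kafka Connect sink settings, so treat the whole snippet as a sketch rather than confirmed configuration:

name=marklogic-sink
connector.class=kafka.connect.marklogic.sink.MarkLogicSinkConnector
tasks.max=1
topics=quote-request
# Host and port of the MarkLogic REST instance; the /v1/documents endpoint on this
# port is used by default, and there is no documented key here for pointing the
# connector at a custom REST/service extension, which is the gap this issue raises.
ml.connection.host=localhost
ml.connection.port=8000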
