
Comments (4)

lefthandmagic commented on August 17, 2024

I guess you can write your custom parser and override the configs to use it. Maybe also submit a PR so that it can be included in the default parsers Secor ships with, if it's reusable.
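As far as I recall, Secor selects its parser through a property in its configuration files, so wiring in a custom class would look roughly like the sketch below (`com.example.MyAvroMessageParser` is a made-up class name, and the exact property file layout may differ by Secor version):

```properties
# secor.common.properties (sketch): point Secor at a custom message parser.
# com.example.MyAvroMessageParser is a hypothetical class name.
secor.message.parser.class=com.example.MyAvroMessageParser
```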

from secor.

commented on August 17, 2024

I've looked into building a custom parser using Avro's GenericRecord. We need a schema repository that associates Kafka topics with Avro schemas, so that a single record parser can deserialize the records. Camus uses the same concept, and Avro has a very active ticket for implementing the Schema Repo: https://issues.apache.org/jira/browse/AVRO-1124

To get something up and running now, I think the best approach would be to introduce an interface for the schema repo and have an implementation set up by local configuration. I'll work on a PR for this.
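The interface-plus-local-configuration idea could be sketched in Java roughly like this (the `SchemaRepo` and `ConfigSchemaRepo` names are hypothetical illustrations, not actual Secor classes):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: maps a Kafka topic to the Avro schema (as JSON)
// used to deserialize that topic's records.
interface SchemaRepo {
    String schemaFor(String topic);
}

// "Basically an interface and a HashMap": a repo populated from local
// configuration at startup, with no dynamic schema registry behind it.
class ConfigSchemaRepo implements SchemaRepo {
    private final Map<String, String> schemasByTopic = new HashMap<>();

    public void register(String topic, String schemaJson) {
        schemasByTopic.put(topic, schemaJson);
    }

    @Override
    public String schemaFor(String topic) {
        String schema = schemasByTopic.get(topic);
        if (schema == null) {
            throw new IllegalArgumentException("No schema configured for topic " + topic);
        }
        return schema;
    }
}
```

A dynamic implementation (e.g. backed by AVRO-1124's schema repo) could later be swapped in behind the same interface.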


silasdavis commented on August 17, 2024

I am looking at exactly this problem and have run Camus. I would like to flush to S3 based on a size-based upload policy and also partition based on properties of the (Avro) message, which it looks like Secor allows whereas Camus does not.

In our case we have a static in-memory repository of Avro schemas indexed by an 8-bit schema fingerprint. We use this fingerprint as the first 8 bits of the Kafka message and can use it to look up the schema to decode the rest of the message. Since schema creation and migration are linked to our source control process, this works for us. Clearly a dynamically updatable schema repository is necessary if you want to aggregate arbitrary Avro messages without redeploying code.
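That fingerprint-prefix scheme can be sketched as follows, assuming a one-byte fingerprint followed by the Avro-encoded payload (class and method names here are illustrative, not from Secor or Camus):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Illustrative static fingerprint-to-schema lookup: the first byte of each
// Kafka message selects the Avro schema (JSON) for the remaining payload.
class FingerprintSchemaRepo {
    private final Map<Byte, String> schemasByFingerprint = new HashMap<>();

    public void register(byte fingerprint, String schemaJson) {
        schemasByFingerprint.put(fingerprint, schemaJson);
    }

    // Look up the schema for a message by its one-byte fingerprint prefix.
    public String schemaFor(byte[] message) {
        String schema = schemasByFingerprint.get(message[0]);
        if (schema == null) {
            throw new IllegalArgumentException("Unknown schema fingerprint: " + message[0]);
        }
        return schema;
    }

    // The Avro-encoded record body follows the fingerprint byte.
    public static byte[] payloadOf(byte[] message) {
        return Arrays.copyOfRange(message, 1, message.length);
    }
}
```

The payload bytes would then be handed to an Avro `DatumReader` together with the looked-up schema; that part is omitted here to keep the sketch dependency-free.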

@upio I am about to look into implementing an Avro message parser for Secor. If you get in touch soon we may be able to avoid duplicating effort.


commented on August 17, 2024

@silasdavis It sounds like you're further along with this than I am. I haven't used Camus, only looked at how they use the schema repositories. So far all I've done is hook up a basic "repository" that uses a hard-coded configuration to associate a Kafka topic with a schema. Basically an interface and a HashMap ;) I'd be interested in hearing what you have learned from using Camus.

