kafkaclient's People

Contributors

awr, ievsiukovi, jroland, kichristensen, ligu, micahzoltu, pzang, randa1, sixeyed, warrenfalk, wsimmonds, xtofs, yoniabrahamy

kafkaclient's Issues

Command-line tool

A command-line tool can be included as part of the NuGet package (in tools). Ideally it would mimic the canonical tool's parameters, at least where it makes sense to do so.

Max count configuration to avoid buffer overruns

When decoding objects received from the server, it is possible to run into scenarios where we allocate too much memory (because of bad transmission, a bug on the server, a man-in-the-middle attack, etc.). There should be a way for the user to specify a maximum size for arrays that are allocated when decoding.

This could be a single value across the board, or it could be more specific (perhaps message counts are much larger than topic counts in a response, for example).
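
A rough illustration of such a limit follows, written in Python rather than the library's C#. It assumes the Kafka wire convention of a big-endian int32 element count in front of each array, with -1 denoting a null array. The names `read_array_count`, `max_count`, and `DecodeLimitError` are hypothetical and not part of the client.

```python
import struct
from typing import Tuple


class DecodeLimitError(Exception):
    """Raised when a decoded array count exceeds the configured maximum."""


def read_array_count(buffer: bytes, offset: int, max_count: int) -> Tuple[int, int]:
    """Read a big-endian int32 array count and validate it before allocating.

    Returns (count, new_offset). A count of -1 (null array) is reported as 0.
    """
    if offset + 4 > len(buffer):
        raise DecodeLimitError("buffer too short to contain an array count")
    (count,) = struct.unpack_from(">i", buffer, offset)
    if count == -1:
        return 0, offset + 4
    if count < 0 or count > max_count:
        raise DecodeLimitError(
            f"array count {count} outside allowed range 0..{max_count}"
        )
    return count, offset + 4
```

A per-type limit, as suggested above, would amount to passing a different `max_count` for message arrays than for topic arrays.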

Stress testing

Simulation of load with a multi server environment is necessary to ensure that the client doesn't fall down.

Producer retries

Producer.SendBatchWithCodecAsync should retry when a send fails because of metadata changes.
It should likely take an approach similar to BrokerRouterExtensions.GetTopicOffsetsAsync.
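
The issue does not show how BrokerRouterExtensions.GetTopicOffsetsAsync is written, so the following is only a generic sketch of "refresh metadata and retry", in Python rather than C#. `send_with_retry`, `MetadataChangedError`, and the `send` / `refresh_metadata` callables are hypothetical names used for illustration.

```python
import asyncio


class MetadataChangedError(Exception):
    """Hypothetical error: cached topic/partition metadata is stale."""


async def send_with_retry(send, refresh_metadata, max_attempts=3, delay_seconds=0.1):
    """Await send(); on a metadata error, refresh metadata and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await send()
        except MetadataChangedError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original failure
            await refresh_metadata()
            await asyncio.sleep(delay_seconds)
```

Only the metadata-related failure is retried here; any other exception propagates immediately, so unrelated errors are not masked.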

Publish to NuGet

Currently, this is just code that you can build yourself, which is not very discoverable.

Support .NET Standard

This only works on .NET 4.5.2 at this point.

The simple solution is to move to .NET 4.6 and .NET Core 1.1. Note that this will require adopting other versions of the AsyncEx.* projects.

SSL support

It seems this library lacks SSL support.
I need it, so I would be willing to implement it and contribute the change.
Any suggestions, ideas, or guidance?

Topic Management

The purpose is to be able to add/remove topics and consumer groups.
This can be used for tooling, but more importantly for test setup (currently, integration tests fail the first time through because of the autocreation lag).

Ideally this is done with the "internal" apis so it's legit and there are no other external dependencies for tests and operational usage.

Coverage reports are not working

The goal is to keep coverage for tests on CI >= 90% in unit + integration tests, and >= 80% for unit tests.

To verify that this is happening, we need coverage running from CI.

Currently, OpenCover is failing with the new pdb format -- see OpenCover/opencover#732. Once they publish an updated NuGet package and we adopt it, we should start seeing coverage numbers.

Performance tuning

The library needs to be tuned from a performance point of view, especially with respect to sending and receiving messages at high volume.

Profiling will likely reveal some low-hanging fruit. It would also be good to benchmark against a couple of other C# libraries, at least for the simple scenarios.

Support for updated Protocol

There are a number of new APIs (21-33) as well as many new versions of existing APIs. The on-the-wire representation needs to be fleshed out for these, along with tests.

Improve performance for ProduceRequest and FetchResponse

A dependent of #16

Encoding of ProduceRequest

Using random, consistent message bytes.

BenchmarkDotNet=v0.10.1, OS=Windows
Processor=?, ProcessorCount=8
Frequency=2435767 Hz, Resolution=410.5483 ns, Timer=TSC
dotnet cli version=1.0.0-preview2-1-003177
  [Host]     : .NET Core 4.6.24628.01, 64bit RyuJIT
  DefaultJob : .NET Core 4.6.24628.01, 64bit RyuJIT

Benchmark Baseline (CodecNone)

Note that speed is important, but so is memory allocation and byte size. Byte size will directly impact network traffic costs (especially in terms of overall speed).

| Method | Messages | MessageSize | Codec | Mean | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| Encode | 100 | 1 | CodecNone | 82.9727 us | 0.8513 us | 7.4056 | - | - | 63.63 kB |
| Encode | 100 | 1000 | CodecNone | 1,211.5400 us | 9.5125 us | 31.7708 | 31.7708 | 31.7708 | 429.46 kB |
| Encode | 10000 | 1 | CodecNone | 8,695.0426 us | 46.7663 us | 702.0833 | 47.9167 | 45.8333 | 6.43 MB |
| Encode | 10000 | 1000 | CodecNone | 131,300.6830 us | 845.3136 us | 1454.1667 | 641.6667 | 641.6667 | 46.84 MB |

| Codec | Level | Messages | MessageSize | Bytes |
|---|---|---|---|---|
| CodecNone | - | 100 | 1 | 2746 |
| CodecNone | - | 100 | 1000 | 102646 |
| CodecNone | - | 10000 | 1 | 270046 |
| CodecNone | - | 10000 | 1000 | 10260046 |
| CodecGzip | Fastest | 100 | 1 | 147 |
| CodecGzip | Optimal | 100 | 1 | 132 |
| CodecGzip | Fastest | 100 | 1000 | 2272 |
| CodecGzip | Optimal | 100 | 1000 | 1816 |
| CodecGzip | Fastest | 10000 | 1 | 1718 |
| CodecGzip | Optimal | 10000 | 1 | 782 |
| CodecGzip | Fastest | 10000 | 1000 | 100861 |
| CodecGzip | Optimal | 10000 | 1000 | 56002 |

Benchmark Gzip (CodecGzip with byte[])

| Method | Messages | MessageSize | Codec | Level | Mean | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| Encode | 100 | 1 | CodecGzip | Fastest | 108.3579 us | 0.5767 us | 7.0313 | - | - | 73.96 kB |
| Encode | 100 | 1 | CodecGzip | Optimal | 103.9860 us | 0.2018 us | 7.0313 | - | - | 73.94 kB |
| Encode | 100 | 1000 | CodecGzip | Fastest | 1,304.7041 us | 9.7604 us | 33.8542 | 33.8542 | 33.8542 | 442.34 kB |
| Encode | 100 | 1000 | CodecGzip | Optimal | 1,331.5181 us | 8.1201 us | 36.9792 | 36.9792 | 36.9792 | 442.37 kB |
| Encode | 10000 | 1 | CodecGzip | Fastest | 8,978.8525 us | 78.0883 us | 710.4167 | 56.2500 | 54.1667 | 6.45 MB |
| Encode | 10000 | 1 | CodecGzip | Optimal | 8,897.3779 us | 88.5309 us | 718.7500 | 64.5833 | 62.5000 | 6.45 MB |
| Encode | 10000 | 1000 | CodecGzip | Fastest | 133,817.3749 us | 1,517.0871 us | 1075.0000 | 262.5000 | 262.5000 | 45.02 MB |
| Encode | 10000 | 1000 | CodecGzip | Optimal | 138,365.5674 us | 305.4618 us | 1104.1667 | 291.6667 | 291.6667 | 46.15 MB |

Benchmark Gzip (CodecGzip with Stream)

| Method | Messages | MessageSize | Codec | Level | Mean | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| Encode | 100 | 1 | CodecGzip | Fastest | 108.4666 us | 1.6359 us | 6.2826 | - | - | 73.51 kB |
| Encode | 100 | 1 | CodecGzip | Optimal | 108.6534 us | 1.1747 us | 6.5430 | - | - | 73.51 kB |
| Encode | 100 | 1000 | CodecGzip | Fastest | 1,291.8933 us | 21.6539 us | 32.8125 | 32.8125 | 32.8125 | 437.78 kB |
| Encode | 100 | 1000 | CodecGzip | Optimal | 1,288.3538 us | 9.6701 us | 33.8542 | 33.8542 | 33.8542 | 437.78 kB |
| Encode | 10000 | 1 | CodecGzip | Fastest | 8,635.0205 us | 74.1442 us | 718.7500 | 64.5833 | 62.5000 | 6.44 MB |
| Encode | 10000 | 1 | CodecGzip | Optimal | 8,621.5494 us | 82.3327 us | 677.0833 | 22.9167 | 20.8333 | 6.44 MB |
| Encode | 10000 | 1000 | CodecGzip | Fastest | 137,845.2882 us | 5,184.4836 us | 1158.3333 | 345.8333 | 345.8333 | 44.82 MB |
| Encode | 10000 | 1000 | CodecGzip | Optimal | 137,451.9864 us | 1,243.2845 us | 1187.5000 | 375.0000 | 375.0000 | 45.72 MB |

The stream approach seems to be marginally better for memory allocation (consistent across runs), with timing in the same ballpark (varying across runs). I'm going to opt for the streaming approach.

The level isn't quite as clear -- it appears that Optimal is likely worth using, but I'd like to see what happens when unzipping first.

Extensible SASL support

SASL support is needed at the Connection level (to be triggered right after the API version support check), with a default implementation of PLAIN. This should be extensible, so that any user of the client can provide their own mechanism if it isn't included in the library.
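
To make the extensibility idea concrete, here is a small sketch in Python (not the library's C#). The `SaslMechanism` interface and `PlainMechanism` class are hypothetical names; the only part taken from a specification is the PLAIN message layout from RFC 4616, which is an optional authorization identity, a NUL byte, the username, a NUL byte, and the password, all UTF-8 encoded.

```python
from abc import ABC, abstractmethod


class SaslMechanism(ABC):
    """Hypothetical extension point: one implementation per SASL mechanism."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Mechanism name announced to the broker."""

    @abstractmethod
    def initial_response(self) -> bytes:
        """Bytes sent as the first client message."""


class PlainMechanism(SaslMechanism):
    """SASL PLAIN (RFC 4616): [authzid] NUL authcid NUL passwd."""

    def __init__(self, username: str, password: str, authorization_id: str = ""):
        self._username = username
        self._password = password
        self._authorization_id = authorization_id

    @property
    def name(self) -> str:
        return "PLAIN"

    def initial_response(self) -> bytes:
        parts = (self._authorization_id, self._username, self._password)
        return "\0".join(parts).encode("utf-8")
```

With an interface along these lines, a user could supply another mechanism by implementing the same two members, and the connection code would not need to know which mechanism is in use.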

Add missing automated tests

Consumer:

  • do not lose message when blocked
  • querying should move to the next offset automatically
  • cancellation interrupts consumption
  • coordinator state changes are managed correctly (when heartbeating, etc)
  • when in the middle of processing a message and changing state, resolution for current position is satisfied

Assignment:

  • only assigned partitions are read from
  • if more members are subscribed than partitions, additional members gracefully exit (and do not cause excessive chatter)
  • if same cardinality, one member is assigned to one partition
  • if more partitions than members, all partitions are still assigned (as evenly as possible)
  • works with 1 or many subscriptions
  • extensible to a priority based assignment
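
The cardinality cases listed above can be illustrated with a simple round-robin assignment. This is a Python sketch under assumed names (`assign_partitions` is hypothetical), not the library's assignor, and it does not cover the priority-based extension.

```python
def assign_partitions(members, partitions):
    """Round-robin assignment over sorted members and partitions.

    Returns a dict mapping each member to its list of partitions; members
    beyond the partition count receive an empty list.
    """
    ordered_members = sorted(members)
    assignment = {member: [] for member in ordered_members}
    if not ordered_members:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        member = ordered_members[index % len(ordered_members)]
        assignment[member].append(partition)
    return assignment
```

Sorting both inputs keeps the result deterministic, so every group member that runs the same function on the same inputs arrives at the same assignment.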

Management / monitoring:

  • can list topics
  • can list consumer groups and offsets
  • throughput telemetry numbers lie in correct range
