
dotnet-avro's Introduction

Chr.Avro

Chr.Avro is an Avro implementation for .NET. It’s designed to serve as a flexible alternative to the Apache implementation and integrate seamlessly with Confluent’s Kafka and Schema Registry clients.

For more information, check out the documentation.

Quick start

To use the command line interface: Install Chr.Avro.Cli as a global tool:

$ dotnet tool install Chr.Avro.Cli --global
You can invoke the tool using the following command: dotnet-avro
Tool 'chr.avro.cli' (version '10.3.0') was successfully installed.
$ dotnet avro help
Chr.Avro 10.3.0
...

To use the Kafka producer/consumer builders in your project: Add Chr.Avro.Confluent as a project dependency. After that, check out this guide or read on for some other examples.
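
For example, the package can be added from the project directory with the .NET CLI:

$ dotnet add package Chr.Avro.Confluent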

Examples

The CLI can be used to generate Avro schemas for .NET types (both built-in and from compiled assemblies):

$ dotnet avro create -t System.Int32
"int"
$ dotnet avro create -t System.Decimal
{"type":"bytes","logicalType":"decimal","precision":29,"scale":14}
$ dotnet avro create -a out/example.dll -t ExampleRecord
{"name":"ExampleRecord","type":"record","fields":[{"name":"Number","type":"long"}]}

It can also verify that a .NET type can be mapped to a Schema Registry schema (useful for both development and CI):

$ dotnet avro registry-test -a out/example.dll -t ExampleRecord -r http://registry:8081 -i 242
A deserializer cannot be created for ExampleRecord: ExampleRecord does not have a field or property that matches the correlation_id field on example_record.

Extensions to the Confluent.Kafka ProducerBuilder and ConsumerBuilder configure Kafka clients to produce and consume Avro-encoded CLR objects:

using Chr.Avro.Confluent;
using Confluent.Kafka;
using Confluent.SchemaRegistry;
using System;
using System.Collections.Generic;
using System.Threading;

namespace Example
{
    class ExampleRecord
    {
        public Guid CorrelationId { get; set; }
        public DateTime Timestamp { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var consumerConfig = new ConsumerConfig()
            {
                BootstrapServers = "broker1:9092,broker2:9092",
                GroupId = "example_consumer_group"
            };

            var registryConfig = new SchemaRegistryConfig()
            {
                SchemaRegistryUrl = "http://registry:8081"
            };

            var builder = new ConsumerBuilder<string, ExampleRecord>(consumerConfig);

            using (var registry = new CachedSchemaRegistryClient(registryConfig))
            {
                builder.SetAvroKeyDeserializer(registry);
                builder.SetAvroValueDeserializer(registry);

                using (var consumer = builder.Build())
                {
                    var result = consumer.Consume(CancellationToken.None);
                    Console.WriteLine($"Consumed message! {result.Key}: {result.Value.Timestamp}");
                }
            }
        }
    }
}

Under the hood, SchemaBuilder is responsible for generating schemas from CLR types:

using Chr.Avro.Abstract;
using Chr.Avro.Representation;
using System;

namespace Example
{
    enum Fear
    {
        Bears,
        Children,
        Haskell,
    }

    struct FullName
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

    class Person
    {
        public Guid Id { get; set; }
        public Fear GreatestFear { get; set; }
        public FullName Name { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var builder = new SchemaBuilder();
            var writer = new JsonSchemaWriter();

            Console.WriteLine(writer.Write(builder.BuildSchema<double>()));
            // "double"

            Console.WriteLine(writer.Write(builder.BuildSchema<DateTime>()));
            // "string"

            Console.WriteLine(writer.Write(builder.BuildSchema<Fear>()));
            // {"name":"Fear","type":"enum","symbols":["Bears","Children","Haskell"]}

            Console.WriteLine(writer.Write(builder.BuildSchema<Person>()));
            // {"name":"Person","type":"record"...}
        }
    }
}
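
Schemas can also be read back from their JSON representation and used to build binary serializers and deserializers. The sketch below is pieced together from the JsonSchemaReader and serde builder usage that appears in the issue reports further down, so treat the exact API shapes as assumptions:

using Chr.Avro.Representation;
using Chr.Avro.Serialization;
using System;

namespace Example
{
    class Program
    {
        static void Main(string[] args)
        {
            // read an Avro schema from its JSON representation
            var schema = new JsonSchemaReader().Read("\"string\"");

            var serializer = new BinarySerializerBuilder().BuildSerializer<string>(schema);
            var deserializer = new BinaryDeserializerBuilder().BuildDeserializer<string>(schema);

            // round-trip a value through the binary encoding
            var bytes = serializer.Serialize("hello");
            Console.WriteLine(deserializer.Deserialize(bytes));
            // hello
        }
    }
}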

For more complex examples, see the examples directory.

Contributing

Check out the contribution guidelines prior to opening an issue or creating a pull request. More information about the benchmark applications and documentation site can be found in their respective directories.

dotnet-avro's People

Contributors

chr-leeolsen, chwebdude, darren-clark, dstelljes, fabianoliver, gurry, jonasby, kalev-k, kingwill27, mteng, nicodeslandes, promontis, runerys, sikta, thomasbrueggemann, woodlee, xontab

dotnet-avro's Issues

Invalid record alias; a definition for XYZ.ABC was already read.

@dstelljes While consuming a message with Chr.Avro (the message was produced by Apache.Avro), we're running into an aliases issue on consume: System.IO.InvalidDataException: Invalid record alias; a definition for XYZ.ABC was already read.

{
	"type": "record",
	"name": "ABC",
	"namespace": "XYZ",
	aliases":["XYZ.ABC"]
	"fields": [{
			"name": "name",
			"type": ["null", "string"]
		}, {
			"name": "code",
			"type": ["null", "string"]
		},
		{
			"name": "cancel_date",
			"default": null,
			"type": ["null", {
				"type": "int",
				"logicalType": "date"
			}]
		}
	]
}

The same schema works fine after removing the aliases, but in our case we require them.
Can you suggest a workaround?

Support default values when building serialization functions

Open questions:

  • The serializer builder currently throws when a field on a record does not have a matching property on the .NET type being mapped. If a default value were used instead, would that obscure errors (like typos on member names)?

  • How would default values be represented on the Schema class?

Failing to provide values for enum-typed properties can silently produce corrupt Avro data

In integrating Chr.Avro with the Confluent stack and producing from POCOs, we've run across a case where Avro-encoded messages are corrupt when enum-typed properties are left unset. Specifically, if an enum-typed property in the POCO is not Nullable and no value is assigned, the serializer may either default that property to the first value of the enum or write no bytes for it at all. This simplified code sample, run against v5.0.1, demonstrates:

using System;
using Chr.Avro.Representation;
using Chr.Avro.Serialization;

namespace IssueExample
{
    public enum MyExplicitEnum
    {
        MyEnumOne = 2,
        MyEnumTwo = 3
    }

    public enum MyImplicitEnum
    {
        MyEnumOne,
        MyEnumTwo
    }

    public class TestRecord
    {
        public bool MyBool { get; set; }
        public MyExplicitEnum MyExplicitEnum { get; set; }
        public string MyString { get; set; }
        public MyImplicitEnum MyImplicitEnum { get; set; }
    }

    internal static class Program
    {
        private const string SchemaJson = @"
        {
            ""type"": ""record"",
            ""name"": ""TestRecord"",
            ""fields"": [
                {
                    ""name"": ""MyBool"",
                    ""type"":  ""boolean""
                },
                {
                    ""name"": ""MyExplicitEnum"",
                    ""type"": {
                        ""type"": ""enum"",
                        ""name"": ""MyExplicitEnum"",
                        ""symbols"": [
                            ""MyEnumOne"",
                            ""MyEnumTwo""
                        ]
                    }
                },
                {
                    ""name"": ""MyString"",
                    ""type"": ""string""
                },
                {
                    ""name"": ""MyImplicitEnum"",
                    ""type"": {
                        ""type"": ""enum"",
                        ""name"": ""MyImplicitEnum"",
                        ""symbols"": [
                            ""MyEnumOne"",
                            ""MyEnumTwo""
                        ]
                    }
                }
            ]
        }
        ";

        private static void Main()
        {
            var schema = new JsonSchemaReader().Read(SchemaJson);
            var serializer = new BinarySerializerBuilder().BuildSerializer<TestRecord>(schema);

            var enumsSpecified = new TestRecord
            {
                MyBool = true,
                MyExplicitEnum = MyExplicitEnum.MyEnumTwo,
                MyString = "abcd",
                MyImplicitEnum = MyImplicitEnum.MyEnumTwo
            };

            // This prints 01-02-08-61-62-63-64-02... all OK:
            Console.WriteLine(BitConverter.ToString(serializer.Serialize(enumsSpecified)));

            var enumsOmitted = new TestRecord
            {
                MyBool = true,
                MyString = "abcd"
            };

            // But this prints 01-08-61-62-63-64-00... MyImplicitEnum has been "defaulted" to 00 (i.e. "MyEnumOne") and
            // MyExplicitEnum is not encoded at all, leading to corrupt data:
            Console.WriteLine(BitConverter.ToString(serializer.Serialize(enumsOmitted)));

            // If the JSON Avro schema is changed to make the two enumeration fields union types of [null, <the enum type>],
            // and we run again we get similar troubles:
            //   enumsSpecified: 01-02-02-08-61-62-63-64-02-02
            //   enumsOmitted:   01-02-08-61-62-63-64-02-00

            // Only if the types of the enums in the `TestRecord` POCO are made nullable does it behave as expected:
            //   Nullable in POCO, NOT nullable in Avro schema:
            //     Throws "System.InvalidOperationException: The binary operator Equal is not defined for the types
            //     'System.Nullable`1[IssueExample.MyExplicitEnum] and 'IssueExample.MyExplicitEnum'."
            //   Nullable in POCO, nullable in Avro schema:
            //     enumsSpecified: 01-02-02-08-61-62-63-64-02-02
            //     enumsOmitted:   01-00-08-61-62-63-64-00
        }
    }
}

When a non-Nullable property is left null on a POCO passed to Chr.Avro, I think I would expect to see a null reference exception instead. Does that seem reasonable? In the meantime we're just going to make the problem fields nullable in the POCOs we are passing to Chr.Avro.

Support adding default values to generated schema

@dstelljes I'm creating this issue based on your comment on #7

Since default values aren't accessible via reflection, maybe an easier way to implement this functionality would be via an annotation? Something to the effect of

public class Message
{
  [AvroDefaultValue(null)]
  public int? Property { get; set; } = null;
}

Schema Registry serdes are incompatible with Confluent.SchemaRegistry 1.4.0

ISchemaRegistryClient.GetLatestSchemaAsync returns RegisteredSchema starting with 1.4.0:

Method not found: 'System.Threading.Tasks.Task`1<Confluent.SchemaRegistry.Schema> Confluent.SchemaRegistry.ISchemaRegistryClient.GetLatestSchemaAsync(System.String)'.

The new lower bound for Chr.Avro.Confluent should be 1.4.0, and since the Confluent clients follow librdkafka versioning, not semver, we should narrow the range of allowed versions (probably to patch instead of minor).

With this change, when fetching a schema, the serdes should check the type to ensure it's Avro.
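
A rough sketch of that check against the 1.4.0-style client (the SchemaType and SchemaString members on RegisteredSchema are assumptions here, not something this issue confirms):

using System;
using System.Threading.Tasks;
using Confluent.SchemaRegistry;

static class SchemaTypeGuard
{
    // fetch the latest schema for a subject and reject anything that isn't Avro
    public static async Task<string> GetLatestAvroSchemaAsync(ISchemaRegistryClient registry, string subject)
    {
        var registered = await registry.GetLatestSchemaAsync(subject);

        if (registered.SchemaType != SchemaType.Avro)
        {
            throw new NotSupportedException($"The latest schema for {subject} is not an Avro schema.");
        }

        return registered.SchemaString;
    }
}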

Exceptions in new version

Hi @dstelljes!

We see a big performance degradation when creating schemas after switching to master (I think this is 3.0). I think this is because of the many exceptions being thrown in code like:

foreach (var @case in Cases)
{
    try
    {
        return @case.Read(element, cache, scope);
    }
    catch (UnknownSchemaException exception)
    {
        exceptions.Add(exception);
    }
}

throw new AggregateException($"No schema reader case matched {element.ToString()}", exceptions);

Such code is located in JsonSchemaReader, TypeResolver, BinarySerializerBuilder.

This is especially performance heavy when the type/schema is located at the end of the array.

For this POCO...

	public class Foo
	{
		public string Bar { get; set; }
		public DateTime Date { get; set; }
	}

... I'm currently getting 391 exceptions. All handled, but it takes a long time.

Account for nullable reference types when generating schemas

At present, the type resolvers take an all-or-nothing approach to determining nullability of reference types. The schema generator would be more useful if the resolvers took nullable reference type metadata into account.

This would be a breaking change given that it’s incongruent with the existing resolveReferenceTypesAsNullable option on the TypeResolver. One possible implementation:

  • Remove the resolveReferenceTypesAsNullable boolean and introduce a NullableReferenceTypeBehavior enum (like TemporalBehavior, TombstoneBehavior, etc.).
    • None (never generate nullable union schemas; equivalent to resolveReferenceTypesAsNullable: false)
    • Semantic (always generate nullable union schemas; equivalent to resolveReferenceTypesAsNullable: true)
    • FromMetadata (look for nullable metadata, falling back to None behavior if oblivious)
  • Make FromMetadata the default behavior. This matches the current default behavior (resolveReferenceTypesAsNullable: false) for oblivious types and transparently enables better behavior for non-oblivious types.
  • Keep the default behavior of dotnet avro generate as long as nullable reference types are opt-in, but add a flag to support generating non-oblivious code.

Open questions:

  • What’s the safest way to reflect on NullableAttribute/NullableContextAttribute? Direct use of those types isn’t allowed in source, but it’s still possible to grab them by name (see the sketch after this list):
    var attribute = type.CustomAttributes.SingleOrDefault(attribute => attribute.AttributeType.FullName == "...");
  • For consistency, should we also enum-ify resolveUnderlyingEnumTypes?
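
A rough sketch of the by-name reflection approach mentioned in the first question (the attribute name is the one the compiler emits; the eventual Chr.Avro implementation may differ):

using System.Linq;
using System.Reflection;

static class NullabilityProbe
{
    // NullableAttribute can't be referenced directly in source, so match it by its full name
    public static bool HasNullableMetadata(PropertyInfo property)
    {
        return property.CustomAttributes.Any(attribute =>
            attribute.AttributeType.FullName == "System.Runtime.CompilerServices.NullableAttribute");
    }
}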

DateTimeOffset serialization does not preserve local date/time

Because DateTimeOffset is converted to a UTC-based DateTime prior to serialization, DateTimeOffsets created from local times are not deserialized back to the same representation.

The unit tests do not catch this because the equality operator for DateTimeOffset returns true if the represented points in time are the same. It does not care how they are represented as noted here.

To resolve this, DateTimeOffset should be handled separately from DateTime. The serializer and deserializer should use the ToString() and Parse() methods, respectively, on DateTimeOffset itself to preserve the original semantics of the DateTimeOffset value.

Reference:

BinarySerializerBuilder.cs / StringSerializerBuilderCase.BuildDelegate()
https://github.com/ch-robinson/dotnet-avro/blob/master/src/Chr.Avro.Binary/BinarySerializerBuilder.cs#L1953

BinaryDeserializerBuilder.cs / StringDeserializerBuilderCase.BuildDelegate()
https://github.com/ch-robinson/dotnet-avro/blob/development/src/Chr.Avro.Binary/BinaryDeserializerBuilder.cs#L1996

I'm still familiarizing myself with the code base here, but it appears that this implementation is consistent across the master, development, and 3.x branches as of this writing. I was focused on the string serializer because strings are the recommended schema type for handling DateTimes and DateTimeOffsets in this library. I have not looked closely at the micro- and millisecond logical type implementations to see if / how they might be affected by this.

This code illustrates the problem:

var loc = DateTime.Now;
var utc = loc.ToUniversalTime();
var dtoFromLocal = new DateTimeOffset(loc);
var dtoFromUtc = new DateTimeOffset(utc);

Although .NET considers dtoFromLocal and dtoFromUtc to be "equal", note that the DateTime and Offset properties differ, so they are not identical, and this can affect how they are rendered and persisted by other applications.

This code demonstrates how to round-trip the value identically:

var dtoIn = DateTimeOffset.Now;
var dtoStr = dtoIn.ToString("O");
var dtoOut = DateTimeOffset.Parse(dtoStr);

Missing support for loading additional support assemblies during avro create

If you are using extra libraries around the models that use attributes, dotnet avro create -a ASSEMBLY -t TYPE throws assembly loading exceptions. For example, if the models are also used with the System.Text.Json serializer and have properties marked with [JsonPropertyName], running the avro create command throws a "could not load file or assembly" error. What I think is necessary is to make the -a flag a list value, allowing the additional assemblies to be loaded. Alternatively, add support for loading a csproj file instead (which would include all the NuGet references), or allow setting the current working directory. I'm guessing that assemblies are being loaded from the global install path of dotnet-avro.

In the steps to reproduce, the app wants to load System.Text.Json version 4.0.1.2; this fails.
On my machine, not using the NuGet install sets the assembly to require System.Text.Json version 4.0.1.0; this succeeds.
I don't know why, but project references load fine.

Steps to reproduce:

  1. Create new dotnet core console application
  2. Use nuget to add System.Text.Json version 4.7.2
  3. Create a dummy class to be the avro model
  4. Add [JsonPropertyName()] attribute to a property of the model
  5. Build
  6. Run the command to build a schema from that assembly and type: dotnet avro create -a assembly -t type
  7. The error will reproduce

Decimal serialization fails with large numbers/scales

Able to reproduce with this test case:

using Chr.Avro.Representation;
using Chr.Avro.Serialization;
using Xunit;

namespace Chr.Avro.Bugs
{
    public class DecimalTriage
    {
        public const string Schema = @"{
  ""type"": ""bytes"",
  ""logicalType"": ""decimal"",
  ""precision"": 29,
  ""scale"": 14
}";

        [Fact]
        public void TestOverflow()
        {
            var schema = new JsonSchemaReader().Read(Schema);
            var deserializer = new BinaryDeserializerBuilder().BuildDeserializer<decimal>(schema);
            var serializer = new BinarySerializerBuilder().BuildSerializer<decimal>(schema);

            var value = 999246978759766M;
            Assert.Equal(value, deserializer.Deserialize(serializer.Serialize(value)));
        }
    }
}

Fix is probably to create two BigIntegers, one for the whole part and one for the fractional part, instead of multiplying the decimal value by the scale: https://github.com/ch-robinson/dotnet-avro/blob/master/src/Chr.Avro.Binary/BinarySerializerBuilder.cs#L696
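
A rough sketch of that whole/fractional split as a standalone method (an illustration of the idea, not the library's actual fix):

using System.Numerics;

static class DecimalEncoding
{
    // Convert a decimal to its unscaled integer representation without multiplying the whole
    // decimal by 10^scale up front, which is what overflows for large values.
    public static BigInteger ToUnscaled(decimal value, int scale)
    {
        var whole = decimal.Truncate(value);
        var fraction = value - whole;

        // the whole part is scaled as a BigInteger, so it can't overflow decimal's range
        var scaledWhole = new BigInteger(whole) * BigInteger.Pow(10, scale);

        // the fractional part is less than 1, so scaling it by 10^scale stays representable
        var scaledFraction = new BigInteger(fraction * Pow10(scale));

        return scaledWhole + scaledFraction;
    }

    private static decimal Pow10(int exponent)
    {
        var result = 1M;

        for (var i = 0; i < exponent; i++)
        {
            result *= 10M;
        }

        return result;
    }
}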

Schemas for flag enums should match underlying type

Currently, the schema builder always produces "long" for flag enums. This is inconsistent with the resolver’s resolveUnderlyingEnumTypes option, which will result in "int" or "long" depending on the underlying type.

Clean up builder/case contracts

  1. IDictionary/ConcurrentDictionary use should be consistent.
  2. Properties should be public/read-only instead of protected when appropriate.
  3. Case implementations should throw if not compatible (no IsMatch).
  4. There should be a consistent way for cases to be constructed with whatever instance is using them:
    public static readonly IEnumerable<Func<SomeBuilder, SomeBuilderCase>> DefaultCaseBuilders;
    
    public IEnumerable<SomeBuilderCase> Cases { get; }
    
    public SomeBuilder(IEnumerable<Func<SomeBuilder, SomeBuilderCase>> caseBuilders)
    {
        Cases = (caseBuilders ?? DefaultCaseBuilders).Select(builder => builder(this));
    }

Eliminate codec method calls

Replace the binary codec with a codec builder (should return Expressions given stream/value ParameterExpressions).

Investigate:

  • cost of ArrayLength in Read
    • assigning count to a variable instead of using ArrayLength results in a slim (< 2%) boost for small arrays (12-byte benchmark) and an appreciable (~ 10%) boost for large arrays (2048-byte benchmark), regardless of whether count is a constant
  • possible use of ArrayPool in Read
  • Math.Abs vs. * -1 in ReadBlocks
  • NotEqual in ReadBoolean
    • GreaterThan appears to be about 2% slower
  • widths in ReadInteger

Avro Generate CLI Message Incorrect

The usage wording is incorrect when parameters are missing from the avro generate command.

dotnet avro generate --registry-url http://localhost:8081
Either --id or --schema (and optionally --version) must be provided.

--schema is not correct, however; it should be --subject, as the error output shows when --schema is actually passed:

dotnet avro generate --registry-url http://localhost:8081 --schema testTopic-value --version 1
Chr.Avro.Cli 7.0.2
Copyright (C) 2020 C.H. Robinson

ERROR(S):
  Option 'schema' is unknown.
USAGE:
Generate code for a schema by ID:
  dotnet avro generate --id 120 --registry-url http://registry:8081

  -r, --registry-url    The URL of the schema registry.

  -i, --id              If a subject/version is not specified, the ID of the schema.

  -s, --subject         If an ID is not specified, the subject of the schema.

  -v, --version         The version of the schema.

  --help                Display this help screen.

  --version             Display version information.

Support for polymorphism

I'm adding support for polymorphism via the use of an interface together with a SchemaKnownType attribute.

This will result in a union with multiple records.

Just checking how you feel about this.
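
A rough sketch of what that usage might look like (SchemaKnownTypeAttribute here is the attribute being proposed, not an existing Chr.Avro type):

using System;

// the proposed attribute: each occurrence names a concrete type to include in the union
[AttributeUsage(AttributeTargets.Interface | AttributeTargets.Class, AllowMultiple = true)]
public class SchemaKnownTypeAttribute : Attribute
{
    public SchemaKnownTypeAttribute(Type knownType)
    {
        KnownType = knownType;
    }

    public Type KnownType { get; }
}

// building a schema for IShape would then produce a union of the Circle and Square records
[SchemaKnownType(typeof(Circle))]
[SchemaKnownType(typeof(Square))]
public interface IShape { }

public class Circle : IShape
{
    public double Radius { get; set; }
}

public class Square : IShape
{
    public double Side { get; set; }
}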

Tombstone support

Hi @dstelljes!

Maybe you could help us out... we are trying to produce a tombstone message to Kafka. From what I understand, it is not yet supported by confluent-kafka-dotnet: see confluentinc/confluent-kafka-dotnet#905.

We tried sending a null value to an Avro topic that has a union of null and a record (the POCO), but the subscriber (a JDBC sink) doesn't seem to treat this as a tombstone message and crashes with a null-reference exception trying to deserialize the Avro message. Therefore, I think we shouldn't be sending the tombstone as an Avro-serialized null value, but should instead short-circuit the serialization and send a plain (non-Avro-serialized) null, like the Java implementation does:

https://github.com/confluentinc/schema-registry/blob/master/avro-serializer/src/main/java/io/confluent/kafka/serializers/AbstractKafkaAvroSerializer.java#L56

Is this currently possible with this library?

Add high-level serde builder cases to support polymorphic mapping

To support polymorphic mapping (i.e., mapping an interface or abstract class to concrete classes), an application has to provide custom cases for the serde builders. In practice, this is really onerous—building a custom deserializer case, for instance, entails copying and pasting the union deserializer case and tweaking it to work with a specific interface.

We should provide some high-level cases that enable most of that union logic to be recycled (or make the existing cases more extensible), something like:

public class UnionDeserializerBuilderCase : IDeserializerBuilderCase
{
    public Delegate BuildDelegate(TypeResolution resolution, Schema schema, ConcurrentDictionary<(Type, Schema), Delegate> cache)
    {
        // all of the complicated stuff here
    }

    protected virtual TypeResolution SelectType(TypeResolution resolution, Schema schema)
    {
        // resolution is the same as the one passed to BuildDelegate (the resolution for the interface or abstract type)
        // schema is a member of the union (SelectType is called for each member)

        // determine which concrete type applies; return the resolution for that concrete type

        // default implementation just returns the existing resolution
        return resolution;
    }
}

Then, it’d be easier to build cases that disambiguated interfaces:

public class EventDeserializerBuilderCase : UnionDeserializerBuilderCase
{
    protected override TypeResolution SelectType(TypeResolution resolution, Schema schema)
    {
        if (!(resolution is RecordResolution recordResolution) || recordResolution.Type != typeof(IEvent))
        {
            throw new UnsupportedTypeException(resolution.Type);
        }

        switch ((schema as RecordSchema)?.Name)
        {
            case "Concrete1":
                return Resolver.ResolveType<Concrete1>();

             // ...

            default:
                throw new UnsupportedSchemaException(schema);
        }
    }
}

Support parameterized constructors for record deserialization

Constructors could be a fallback for public setters. Proposed flow:

  1. The deserializer builder looks for a matching publicly-writable field or property for each record field.

  2. If exactly one match is not found for any record field, the deserializer builder looks for exactly one public constructor with exactly one matching parameter for each field (and no other non-optional parameters).
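
As an illustration of step 2, a hypothetical type like this one (no public setters, a single public constructor whose parameters match the record fields by name) would become deserializable:

public class Measurement
{
    public Measurement(string sensor, double value)
    {
        Sensor = sensor;
        Value = value;
    }

    public string Sensor { get; }

    public double Value { get; }
}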

cc @evanbb

Improve documentation home page

Some art/layout work would be nice, as well as answering some of these questions:

  • What makes Chr.Avro different from other Avro libraries? (mapping to POCOs, schema builder, CLI)
  • Where is it used? (dotnet avro, Chr.Avro.Confluent)
  • How do I get started? (consumer/producer guide, CLI guide)

Unseal schema classes

Right now, the sealed Chr.Avro.Abstract classes could be a barrier to extensibility (an application may want to extend the abstract schema model based on some custom metadata).

Type could not be found error using assembly and type console arguments

Using the --assembly and --type arguments on the command line of the Chr.Avro.Cli tool (e.g. create --type KafkaTests.Models.TestAvro --assembly C:\Projects\KafkaTests\bin\Debug\netcoreapp3.1\KafkaTests.Models.dll), I'm getting the error "The type could not be found. You may need to provide additional assemblies." Looking into the code, I can see that the ResolveType extension method loads assemblies, but those assemblies are not saved to any variable, so when the code gets to this line...

return Type.GetType(options.TypeName, ignoreCase: true, throwOnError: true);

... it's not actually using the assemblies that were loaded by the previous lines. I'm proposing this be fixed by collecting the assemblies returned by Assembly.Load or Assembly.LoadFrom and then iterating through that collection to find the type that belongs to one of them.

    internal static class TypeOptionExtensions
    {
        public static Type ResolveType(this IClrTypeOptions options)
        {
            List<Assembly> assemblies = new List<Assembly>();
            foreach (var assembly in options.AssemblyNames)
            {
                try
                {
                    // If found, save this assembly to the assemblies collection.
                    assemblies.Add(Assembly.Load(assembly));
                    continue;
                }
                catch (FileNotFoundException)
                {
                    // nbd
                }
                catch (FileLoadException)
                {
                    // also nbd
                }

                try
                {
                    // If found, save this assembly to the assemblies collection.
                    assemblies.Add(Assembly.LoadFrom(Path.GetFullPath(assembly)));
                }
                catch (FileNotFoundException)
                {
                    throw new ProgramException(message: $"{assembly} could not be found. Make sure that you’ve provided either a recognizable name (e.g. System.Runtime) or a valid assembly path.");
                }
                catch (BadImageFormatException)
                {
                    throw new ProgramException(message: $"{assembly} is not valid. Check that the path you’re providing points to a valid assembly file.");
                }
            }

            try
            {
                // Iterate through the loaded assemblies to find the first one that contains the given type.
                foreach (var assembly in assemblies)
                {
                    Type type = assembly.GetType(options.TypeName, throwOnError: false, ignoreCase: true);
                    if (type != null)
                    {
                        return type;
                    }
                }

                return Type.GetType(options.TypeName, throwOnError: true, ignoreCase: true);
            }
            catch (TypeLoadException)
            {
                throw new ProgramException(message: "The type could not be found. You may need to provide additional assemblies.");
            }
        }
    }

Evaluate support for Generic/Immutable collection classes

Follow up from #59: Also support immutable array/list types? More comprehensive deserialization support for (both mutable and immutable) stacks/sets/etc.?

Current support:

System.Collections.Generic

  • ICollection<T> (assignable from List<T>)
  • IDictionary<TKey, TValue> (assignable from Dictionary<TKey, TValue>)
  • IEnumerable<T> (assignable from List<T>)
  • IList<T> (assignable from List<T>)
  • IReadOnlyCollection<T> (assignable from List<T>)
  • IReadOnlyDictionary<TKey, TValue> (assignable from Dictionary<TKey, TValue>)
  • IReadOnlyList<T> (assignable from List<T>)
  • ISet<T>
  • Dictionary<TKey, TValue> (assignable from Dictionary<TKey, TValue>)
  • HashSet<T>
  • LinkedList<T>
  • List<T> (assignable from List<T>)
  • Queue<T>
  • SortedDictionary<TKey, TValue>
  • SortedList<TKey, TValue>
  • SortedSet<T>
  • Stack<T>

System.Collections.Immutable

  • IImmutableDictionary<TKey, TValue>
  • IImmutableList<T>
  • IImmutableQueue<T>
  • IImmutableSet<T>
  • IImmutableStack<T>
  • ImmutableArray<T>
  • ImmutableDictionary<TKey, TValue>
  • ImmutableHashSet<T>
  • ImmutableList<T>
  • ImmutableQueue<T>
  • ImmutableSortedDictionary<TKey, TValue>
  • ImmutableSortedSet<T>
  • ImmutableStack<T>

Usage with confluent-kafka-dotnet's DependentProducerBuilder?

Hi--thank you for putting this library out there. It looks like it could help us get past several limitations we've encountered with the current Apache/Confluent versions.

The confluent-kafka-dotnet library provides a DependentProducerBuilder which can be used to create additional Avro producers for messages of different types, while still using a single underlying librdkafka handle.

Between this library and the Confluent one, I am not seeing a way to create Avro producers of multiple types that use a single handle. Am I missing anything, or is this something that would require extending your existing ProducerBuilder extension methods to cover DependentProducerBuilders as well?
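
For context, a minimal sketch of the single-handle pattern in confluent-kafka-dotnet (plain producers only; wiring Avro serdes into the dependent builder is the gap this question is about):

using System;
using Confluent.Kafka;

namespace Example
{
    class Program
    {
        static void Main(string[] args)
        {
            var config = new ProducerConfig { BootstrapServers = "broker1:9092" };

            using (var stringProducer = new ProducerBuilder<Null, string>(config).Build())
            // a second producer for a different value type, sharing the first producer's librdkafka handle
            using (var intProducer = new DependentProducerBuilder<Null, int>(stringProducer.Handle).Build())
            {
                stringProducer.Produce("topic-a", new Message<Null, string> { Value = "hello" });
                intProducer.Produce("topic-b", new Message<Null, int> { Value = 42 });

                stringProducer.Flush(TimeSpan.FromSeconds(10));
            }
        }
    }
}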

Add readme info for benchmarks

The benchmark project doesn't appear in the solution explorer when opening the solution in Visual Studio. This issue is for adding some documentation to the readme on how to run the benchmarks.

Output more friendly exception messages in case of serialization failure, like what property failed to serialize according to the schema

When I try to serialize an object that has a non-null string property in the Avro schema but I set it to null in code, I'm getting:

System.ArgumentNullException: String reference not set to an instance of a String. (Parameter 's')
at System.Text.Encoding.GetBytes(String s)
at Transaction serializer(Closure , TransactionAvro )
at Chr.Avro.Serialization.BinarySerializer`1.Serialize(T value)

It's not clear which property failed. It would be better to have an exception like "Serialization failed; expected property {nameOfTheProperty} to be non-null but it was null", or "expected property type to be string but was int", etc.

Support multidimensional arrays

Jagged (T[][]) arrays are currently supported; multidimensional (T[,]) are not. This would require some major changes to how the binary serdes treat arrays. Currently we just build a list and feed it to Enumerable.ToArray, which sidesteps the problem of knowing array size in advance.

Clarify auto registration behavior

The registerAutomatically parameter can be deceptive. Even if true, the Schema Registry producer builder won’t attempt to register a schema that matches the type unless (1) there’s no existing schema or (2) mapping the type to the schema fails.

Proposal: Make registerAutomatically an enum instead of a boolean, something like:

enum AutomaticRegistrationBehavior
{
    Never,
    WhenIncompatible,
    Always
}

Schema registry HTTP error responses are cached

We've recently been getting spates of tracebacks like the one below in an app of ours that uses Chr.Avro:

[("HResult": -2146233088), ("Message": "System.Net.Http.HttpRequestException: [https://schema-registry.***.com/] GatewayTimeout[https://schema-registry.***.com/] GatewayTimeout -1 
   at Confluent.SchemaRegistry.RestService.ExecuteOnOneInstanceAsync(Func`1 createRequest)
   at Confluent.SchemaRegistry.RestService.RequestAsync[T](String endPoint, HttpMethod method, Object[] jsonBody)
   at Confluent.SchemaRegistry.RestService.GetLatestSchemaAsync(String subject)
   at Confluent.SchemaRegistry.CachedSchemaRegistryClient.GetLatestSchemaAsync(String subject)
   at Chr.Avro.Confluent.AsyncSchemaRegistrySerializer`1.<SerializeAsync>b__24_0(String subject)
   at Chr.Avro.Confluent.AsyncSchemaRegistrySerializer`1.SerializeAsync(T data, SerializationContext context)
   at Confluent.Kafka.SyncOverAsync.SyncOverAsyncSerializer`1.Serialize(T data, SerializationContext context)
   at Confluent.Kafka.Producer`2.Produce(TopicPartition topicPartition, Message`2 message, Action`1 deliveryHandler)"), ...<snip>...
   at Confluent.SchemaRegistry.RestService.ExecuteOnOneInstanceAsync(Func`1 createRequest)
   at Confluent.SchemaRegistry.RestService.RequestAsync[T](String endPoint, HttpMethod method, Object[] jsonBody)
   at Confluent.SchemaRegistry.RestService.GetLatestSchemaAsync(String subject)
   at Confluent.SchemaRegistry.CachedSchemaRegistryClient.GetLatestSchemaAsync(String subject)
   at Chr.Avro.Confluent.AsyncSchemaRegistrySerializer`1.<SerializeAsync>b__24_0(String subject)
   at Chr.Avro.Confluent.AsyncSchemaRegistrySerializer`1.SerializeAsync(T data, SerializationContext context)
   at Confluent.Kafka.SyncOverAsync.SyncOverAsyncSerializer`1.Serialize(T data, SerializationContext context)
   at Confluent.Kafka.Producer`2.Produce(TopicPartition topicPartition, Message`2 message, Action`1 deliveryHandler)"), ("IsError": True), ("IsLocalError": True), ("IsBrokerError": False)]), ("Type": "Confluent.Kafka.ProduceException`2[[...<snip>...]]")]

After digging a bit, I think what is happening is that our application is--for reasons unrelated to this library or any C# code in general--receiving HTTP 504 responses in some of its initial attempts to contact the schema registry. When this happens, I think Chr.Avro caches this error result, since it adds a single task to the cache and, after the initial add, subsequent cache hits await that same task, leading to the HttpRequestException being raised on every access.

Of course, solving the 504s is a thing we should work on, but more specific to Chr.Avro: does it sound like I'm reading the code right, there? If so, would it make sense to try to come up with a way of skipping addition to the cache for HTTP 5xx response statuses?
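
A generic sketch of that mitigation (not Chr.Avro's actual cache code): cache the in-flight fetch, but evict it if it faults so a transient 5xx isn't replayed forever.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class SchemaCache
{
    private readonly ConcurrentDictionary<string, Task<string>> cache = new ConcurrentDictionary<string, Task<string>>();

    public async Task<string> GetAsync(string subject, Func<string, Task<string>> fetch)
    {
        var task = cache.GetOrAdd(subject, fetch);

        try
        {
            return await task;
        }
        catch
        {
            // drop the faulted task so the next call retries the registry
            cache.TryRemove(subject, out _);
            throw;
        }
    }
}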

Thank you!

Update NuGet icon definition

iconUrl is deprecated. Use icon instead.

In Chr.Avro.Build.props:

  • update PackageIconUrl -> PackageIcon
  • include docs/static/nuget-icon.png in the package (like the readme)
