Git Product home page Git Product logo

Comments (4)

dstelljes avatar dstelljes commented on August 25, 2024

Hmm, I would guess that this is because inner LambdaExpressions aren't compiled when the outer expression is compiled. Chr.Avro generates a LambdaExpression for each record type to support recursive references. (To confirm, you should be able to verify that (de)serializers for non-record types do not call CreateDelegate.)

Given this, we might have to revisit how we build expression trees.

from dotnet-avro.

fabianoliver avatar fabianoliver commented on August 25, 2024

I'm still trying to understand what .NET is doing under the hood there. I don't think it's an issue of compilation per se - in the sense that the nested lamda is compiled fine I think.
I'm not even sure it'd really count as a proper bug, whether delegates are or are not cached is definitely more art than science in C# I think, with the how and ifs even having changed over language versions. So probably "just" more of a performance downside particularly noticeable in scenarios with fairly high message rates.

As for delegate creation, my understanding was that non-capturing lambdas are cached, but that doesn't seem to be happening for expressions (though I'd need to double check if the language specs say they can be cached, or must be cached).

The simplest example I could condense it down to so far is the following:

var param = Expression.Parameter(typeof(int));
var innerParam = Expression.Parameter(typeof(Action<int>));
var inner = Expression.Lambda(innerParam.Type, Expression.Block(), "test", new[] { param });
var action = Expression.Lambda<Action<int>>(Expression.Block(new[] { innerParam }, Expression.Assign(innerParam, inner), Expression.Invoke(innerParam, param)), param).Compile();

// Each invocation will trigger delegate creation
action(456);
action(567);

Interestingly, I don't observe the same behaviour when creating an equivalent function/invocation structure directly in code outside of the expression system:

Action<int> inner = _ => { };
Action<int> action= x =>
{
  Action<int> innerParam = inner;
  innerParam(x);
};

// Delegates are cached just fine
action(456);
action(567);

Even more interestingly, FastExpressionCompiler seems to be doing a better job here. If I compile any of my samples with that, delegates do seem to be cached fine - though I may have to run this through for more complex examples.

from dotnet-avro.

fabianoliver avatar fabianoliver commented on August 25, 2024

After a bit more testing - FastExpressionCompiler isn't able to compile more complex lambda expressions / throws in those cases, so unfortunately doesn't seem to be a great workaround.

Thinking about this a little further, I suspect that in the current way expression trees are created, re-creating delegates might be unavoidable in some cases. Not in the simple examples above, but at least once you have types whose properties are other complex types (whether self-referential or not) - if I understand the current generators right, their lambda would reference variables from the outer scope to access their nested types' deserializers. I think that would inevitably make the lambda capture the outer parameter, and capturing lambdas are never cached. Though hard to verify at the moment given even the simple, non-capturing versions aren't cached in the default compilation path.

I think a few options in increasing order of madness that come to mind at the moment are:

  • Keep everything as-is. Which might be a perfectly sane choice if the alternatives are too complex / not worth the gain in performance and reduced allocs!!
  • Possibly do a lazy workaround that'd generate a simpler expression tree in case where the record is a simple type that contains only primitive fields. That might be the quickest win that'd probably benefit a reasonable percentage of real use cases
  • Try to wire up circular references in the expression tree in another way. An interesting technique for anonymous recursion is described here for example - maybe it'd similarly be able to create some variation of delegate object DeserializeRecord(Type dtoType, string recordName, ref BinaryReader reader), then have each generated record deserializer take a delegate of that signature that'd allow it to deserialize nested types without having to capture/reference something from the outer scope? Hard to say for sure though if that'd really do the trick
  • Instead of an expression tree, explicitly create an AnonymousType with one AnonymousMethod to deserialize each record type. Emitting IL directly sounds like a nightmare though and I'm not sure if there's much in terms of supporting libraries out there that could make it much easier.
  • Ditching all that; generate "proper" full source for a deserializer, and use Roslyn to compile and load up the assembly. It wouldn't get more flexible than that, and leaving behind "almost-abandoned-by-microsoft" expression trees could be nice as well. However.. pulling in those giant dependencies, plus dealing with possibly breaking changes (eg definite no for AOT or mobile devices; though not quite sure if expression trees work there either) probably makes this a non-starter anyway

Maybe the first option isn't so bad after all.. :-)

from dotnet-avro.

dstelljes avatar dstelljes commented on August 25, 2024

Thanks for the detailed writeup! A few additional thoughts:

Possibly do a lazy workaround that'd generate a simpler expression tree in case where the record is a simple type that contains only primitive fields.

Of the options you laid out, something like that is probably the most pragmatic. (A more sophisticated variation might be to walk the type graph and only do the circular reference trick for types that appear in a cycle.)

Maybe the first option isn't so bad after all.. :-)

This is more or less where we're at too. I'm going to leave this as a bug since it's an unintended consequence of how the serdes are built, but it's unlikely that we'll be able to prioritize working on it.

from dotnet-avro.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.