Git Product home page Git Product logo

Comments (4)

jll63 avatar jll63 commented on July 23, 2024

Hello and thanks for the wonderful library.

Thanks for the kind words. May I ask what you use YOMM2 for? I am trying to build a collection of use cases.

Is there any way to dispatch on arbitary values, not just types? Coming from a clojure background, I am trying to replicate some of the behaviour of clojures multimethods.

The YOMM (11 and 2) libraries were inspired by CLOS and the Solodkyy et al papers like this one. You can see YOMM as an implementation of the proposed standard extension in a library, and also an attempt at popularizing the idea so it finally makes it into the language (unlikely to happen soon, as it seems that to be cool these days, it has to be compile-time). Thus, they firmly embrace the design philosophy of C++, in particular these principles: (P1) emphasis on speed; and (P2) "don't pay for what you don't use". As a result, method dispatch is very fast, in the same ballpark as native virtual functions, which themselves are almost as fast as straight function calls (see my CppCon presentation and the slides). In particular, note that there are no conditionals or loops in the dispatch code.

Regarding P1, value-based dispatch would but much slower, even assuming that all the values are hashable. However, P2 can be respected iff the method declaration states, for each virtual argument, whether it can be specialized by type, by value, or both.

After some cleanup and making some of YOMM2's internals public, it might become possible to turn the library into a framework that would natively support fast open methods, while allowing pluggable dispatcher extensions. At the moment, my main interest is to support templatized method declarations and definitions in some way, and C++'s Byzantine syntax is not helping (too bad that C++20's abbreviated template functions can substitute for only a small subset of the full function template syntax). So the solution may indeed involve that sort of work.

In clojure the method declaration takes a function some_fn and before dispatching, it calls some_fn(arg1,args2, ...) and then dispatches based on the return value of that function.

Actually, YOMM2 does something similar to handle smart pointers. By the way, this is a point where I diverge from the Solodkyy/Stroustrup proposal, which strictly requires virtual parameters to have polymorphic types. I think that smart pointers are too important. Without built-in support for them, users would have to go through hoops like this (using the proposal's syntax):

// assume a hierarchy of immutable matrix classes
shared_ptr<matrix> times(
    shared_ptr<matrix> a, virtual matrix&, 
    shared_ptr<matrix> b, virtual matrix&) {
    // default: do it the slow, hard way
}

shared_ptr<matrix> times(
    shared_ptr<matrix> a, virtual matrix&, 
    shared_ptr<identity_matrix>, virtual identity_matrix&) {
    assert(a.is_square());
    return a;
}

shared_ptr<matrix> operator *(shared_ptr<matrix> a, shared_ptr<matrix> b) {
    return times(a, *a.get(), b, *b.get());
}

Some pseudo code of what I wanted to write:
[...]
My current workaround idea is just to create a lot of classes, so I can have the type dispatch. But it gets quite ugly. I would like for example to dispatch on whatever is the first item in a std::vector and I feel bad creating subtypes of that. Or is your advice to just go with this?

You mean subtypes of vector? Yes, that is way too ugly ;-) Do you want to dispatch on the first item's type (and it is polymorphic), or its value? In the first case, you can use the trick outlined above. If it happens a lot in your use case, you can look at how shared_ptr is supported, and use the same technique.

And to dispatch on values, indeed you can express these values as a class hierarchy (pretty much like in Smalltalk, where true is an instance of True, false an instance of False, and both classes are subtypes of Boolean).

This approach has some benefits though:

  1. It gives you fast, constant time lookup - each value, or rather, its associated &typeid, will be taken into consideration when YOMM2 calculates the collision-free hash functions.
  2. It allows grouping via sub-classing (e.g. -3 is an instance of negative_integer which is a subclass of integer), and, if you do that, it will also give you dispatch table compression.

Can you give me an idea (or point me to resources) of how Clojure implements method dispatch, the performance, and the Big O? I googled around a bit, but Clojure is not as popular as, say, Python.

What if the existing class I want to dispatch on has no virtual functions? Should I just wrap them in one with for example a virtual destructor?

Yes; again it's C++'s philosophy: "virtual" has a cost that you need not pay if you don't use it. Thus, two types of types ;-)

from yomm2.

rutenkolk avatar rutenkolk commented on July 23, 2024

Thank you for the quick answer!

May I ask what you use YOMM2 for? I am trying to build a collection of use cases.

It's not one use-case in particular and I have just started using your library. But there have been several occasions where I needed some kind of dynamic dispatch and was at a loss how to implement this in c++ without loosing my mind. For example writing a transpiler / code generator can get pretty ugly in my opinion without any kind of multimethods.

Another is, that I sometimes deal with a stream of uniform data. Lets just say I read a file and return a std::vector<pod_type> vec for simplicity. I have no way of knowing what kind of potential subtype of pod_type each element vec[i] could be. But I want to do different things based on what is inside the pod_type. I cannot decide which potential subtype to create up front, I don't control the input.

Lately I have also been thinking about writing a helper library for functional programming. None of the options I know like functionalPlus satisfy me. A pipe dream of mine is to be able to write a function, whose pointer I can pass around and call with different arguments. This isn't possible with templated functions or lambdas as they are inherently different things for different template arguments.

The YOMM (11 and 2) libraries were inspired by CLOS and the Solodkyy et al papers like this one.

Thank you for pointers. I was aware that there was an old proposal but I never read it as it never made it into the language. Your talk is great :)

Do you want to dispatch on the first item's type (and it is polymorphic), or its value?

I would like the option for both but the values is really what I am after. I want to dispatch on a int 1, or a std::string "foo". I had a look at yomm2.hpp and the shared_ptr_traits SFINAE trick with using the base type only if it hits the specialized template is a really neat trick!

And to dispatch on values, indeed you can express these values as a class hierarchy (pretty much like in Smalltalk, where true is an instance of True, false an instance of False, and both classes are subtypes of Boolean).

What if I want to define_method at runtime? I can't create a class at runtime. That is why I wanted to be able to dispatch on values.

Can you give me an idea (or point me to resources) of how Clojure implements method dispatch, the performance, and the Big O?

Sure. As far as I can see, yomm2 currently behaves more like clojures protocols, which are a special case of multimethods, that are faster and only dispatch on the type of an object.
Clojure is implemented in Java and relies heavily on the JVM to strip a lot of overhead away. Multimethods themselves are implemented as a Java Class that simply has an untyped (object to object) map called methodTable.
When you register a new implementation of the multimethod via (defmethod my-multi ...) clojures MultiFn literally just stuffs the key and value into the map.

methodTable = getMethodTable().assoc(dispatchVal, method);

Clojure gets away with this because of the faster protocols and because it's written in a way that it often doesn't perform too terrible on the JVM and its JIT (a bit of caching for example. Also all the different invoke functions are quite something 😅 ).

Clojures multimethods also allow for an explicit ad-hoc hierarchy that you can build out of arbitrary values on which you can dispatch, but that gets a bit out of scope.

The faster protocols work by defining a java interface in memory at runtime (or actually creating the bytecode for the interface when AOT compiling). The gist is that protocols are written in a way that they make use of basically as fast a dispatch as you can get on the JVM with all the JIT goodies. Using reflection in regular multimethods and dispatching on the Class is really slow, but thanks to the java interface generation the JVM makes the dispatch for protocols consistently very fast.

Have a look at this benchmark from 2015 where protocol dispatch is <10 ns according to alex miller (one of the core maintainers of clojure)

If you want more information on this, I recommend you Rich Hickeys (inventor of clojure) recently released A History of clojure. Specifically chapter 4.3

Under the hood, protocols define Java interfaces, and deftype/defrecords that have inline protocol
definitions establish an inheritance relationship. At an invocation point of a protocol, the compiler
creates a call site with a fast path for interface implementations and a polymorphic inline cache
for types externally extended to satisfy the protocol. In this way, if programmers use either inline
definition in deftype or defrecord, or use reify, they get performance on par with Java’s native
polymorphic dispatch, and otherwise the performance is still good.

I'm not really that familiar with the internals of clojure other that, and I'm sorry if the answer is a lot of "jvm magic". To answer the Big O question: everything is amortized O(1) as far as I am aware.

EDIT: The last statement is likely not correct. The clojure devs like to call it O("1"). clojure map lookup is actually O(log_32(n)), which for all reasonable purposes is just as good. Clojure map insert is actually O(1) making use of persistent data structures.

from yomm2.

jll63 avatar jll63 commented on July 23, 2024

Thanks for the links. I am going to take a look at them. My hunch is that Protocols are much easier to implement efficiently though. FWIW, I have some exposure to Clojure (I implemented a Jepsen test for a proprietary distributed system at work), but I am far from being an advanced user, even less an expert.

from yomm2.

rutenkolk avatar rutenkolk commented on July 23, 2024

My hunch is that Protocols are much easier to implement efficiently though.

Definitely. And I can totally understand if my "problem" is simply out of scope :)

from yomm2.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.