
kiwaku's Introduction

⚡ Short Intro ⚡

I am Joel Falcou, Destroyer of Worlds, Terror of the Compilers.

In my spare time, I am an associate professor at the University Paris-Saclay and a researcher at the Laboratoire de Recherche d’Informatique in Orsay, France. My research focuses on studying generative programming idioms and techniques to design tools for parallel software development.

I also have a rather personal take on humor as you may have noticed already.

❓ Research Activities ❓

The main parts of my research topic are:

  • the exploration of Embedded Domain Specific Language design for parallel computing on various architectures;
  • the definition of a formal framework for reasoning about meta-programs.

As I need something to pad my academic papers up to at least eight pages, I usually play around with various application fields like real-time image processing on embedded architectures or HPC on multi-core clusters.

👯C++ Community 👯

I am the co-host of the C++FRUG Meetup, president of the C++FRUG Association and I co-organize the CPPP Conference.

You can find me on Mastodon or on the #include Discord.

kiwaku's People

Contributors

jfalcou, psiha, sylvainjoube

kiwaku's Issues

Compilation failure

Hey Joel :)

I finally got around to trying your thingies again and you hit me with compiler errors out of the gate ;P

So the basic examples from your CppCon lecture fail to compile for me, as does this one from your tests:

MB_DISABLE_WARNING_CLANG( "-Wgnu-string-literal-operator-template" ) <<-- please silence this warning in the library itself
#include <kiwaku/container/array.hpp>
#include <kiwaku/container/view.hpp>

kwk::array< float, kwk::_2D > y({ 4, 6 }); // so this compiles after adding the extra braces (compared to the presentation - I guess this was a later change)

float ref[7] = { 1,2,3,4,5,6,7 };
kwk::view<float, kwk::extent[7]> view(ref);

with Clang 13.0.0 on Linux I get:

In file included from kiwaku/include/kiwaku/container/array.hpp:11:
In file included from kiwaku/include/kiwaku/detail/container/array_builder.hpp:13:
In file included from kiwaku/include/kiwaku/detail/container/heap_storage.hpp:13:
In file included from kiwaku/include/kiwaku/container/view.hpp:13:
kiwaku/include/kiwaku/detail/container/view_builder.hpp:26:27: error: constexpr variable 'shape_' must be initialized by a constant expression
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kiwaku/include/kiwaku/container/view.hpp:21:19: note: in instantiation of template class 'kwk::detail::view_builder<float, {{{7}}}>' requested here
: detail::view_builder<Type,Settings...>::access_base
^
<internal.cpp>:39:38: note: in instantiation of template class 'kwk::view<float, {{{7}}}>' requested here
kwk::view<float, kwk::extent[7]> view(ref);
^
kiwaku/include/kiwaku/detail/container/view_builder.hpp:26:27: note: subobject of type 'std::array<long, 0>::_CharType' (aka 'char') is not initialized
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^
/usr/local/bin/../include/c++/v1/array:252:46: note: subobject declared here
_ALIGNAS_TYPE(_ArrayInStructT) _CharType _elems[sizeof(_ArrayInStructT)];

Similarly, with Clang-CL 13.0.1 I get:

In file included from kiwaku\include\kiwaku/container/array.hpp:11:
In file included from kiwaku\include\kiwaku/detail/container/array_builder.hpp:13:
In file included from kiwaku\include\kiwaku/detail/container/heap_storage.hpp:13:
In file included from kiwaku\include\kiwaku/container/view.hpp:13:
kiwaku\include\kiwaku\detail\container\view_builder.hpp(26,27): error : constexpr variable 'shape_' must be initialized by a constant expression
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kiwaku\include\kiwaku\container\view.hpp(21,19): note: in instantiation of template class 'kwk::detail::view_builder<float, {{{7}}}>' requested here
: detail::view_builder<Type,Settings...>::access_base
^
.cpp(39,38): note: in instantiation of template class 'kwk::view<float, {{{7}}}>' requested here
kwk::view<float, kwk::extent[7]> view(ref);
^
kiwaku\include\kiwaku\detail\container\view_builder.hpp(26,27): note: subobject of type 'std::array<long long, 0>::_CharType' (aka 'char') is not initialized
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^
llvm\13.0.1...\include\c++\v1\array(252,46): note: subobject declared here
_ALIGNAS_TYPE(_ArrayInStructT) _CharType _elems[sizeof(_ArrayInStructT)];

Indexing debug performance

If you remember, this is something I fixed in the previous version (before the last major PR): we have a serious issue with indexing performance in builds that have asserts and sanitizers enabled, even when all other optimizations are turned on. The attached image shows this for a reference convolution implementation (the usual triple loop with indexing), and that is with inline and flatten attributes slapped all over the place. It actually causes unit tests on Xeon servers to time out.
This is also connected to:

  • #39 (necessity to do the stride reversing dance)
  • #63

image

Optional iostream usage

Somewhat related to #83.

Please provide a build-time option to (not) use/provide iostream functionality (e.g. ifdef wrapping of std::ostream& operator<<).

[FEATURE] General 'simultaneous static&dynamic' ops on axes

I often have a need to perform some transformation on a shape (or a pair of shapes) (implicitly when doing transformations on an ndarray which affects its shape) - and when we have the ability to have both static and dynamic ones I usually have to duplicate the logic. Some motivating examples:

  • merge two shapes (example an add operation): verify that the two shapes are the same (mathematically) and produce a new one which maximizes compile time information from both input shapes (e.g. one can have a fixed 2nd axis while the other a fixed 3rd axis so the result should have two fixed axes)
  • concatenation of two shapes across an axis - you have to perform the addition in ct and rt space (and again maximize ct information)
  • reshaping with a placeholder/free axis (this is an operation in itself that you could add to the library)
  • what convolutions (in ml space) do with their inputs - for in[N, H, W, C] - the N gets forwarded while H and W get forwarded if padding is on, otherwise they are slightly reduced and input C is fully replaced with a different value
    etc....

So provide a generic way (which compiles today ;P) to specify and perform these operations w/o duplication ;D
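
One possible shape for such a mechanism, sketched per axis and under stated assumptions: a compile-time extent is modelled as std::integral_constant and a runtime extent as a plain int. The names fixed, merge_axis and concat_axis are hypothetical and not part of kiwaku:

```cpp
#include <cassert>
#include <type_traits>

namespace demo
{
  // A compile-time extent; a plain int stands for a runtime extent.
  template<int N> using fixed = std::integral_constant<int, N>;

  template<typename T> inline constexpr bool is_fixed_v = false;
  template<int N>      inline constexpr bool is_fixed_v<fixed<N>> = true;

  // Merge two extents that must be equal, keeping whichever one is static.
  template<typename A, typename B>
  constexpr auto merge_axis(A a, B b)
  {
    if constexpr (is_fixed_v<A> && is_fixed_v<B>)
    {
      static_assert(A::value == B::value, "static extents differ");
      return a;
    }
    else
    {
      assert(static_cast<int>(a) == static_cast<int>(b) && "extents differ");
      if constexpr (is_fixed_v<A>) return a;
      else                         return b;  // static if B is fixed, else runtime
    }
  }

  // Concatenate along one axis: the result is static only if both inputs are.
  template<typename A, typename B>
  constexpr auto concat_axis(A a, B b)
  {
    if constexpr (is_fixed_v<A> && is_fixed_v<B>) return fixed<A::value + B::value>{};
    else return static_cast<int>(a) + static_cast<int>(b);
  }
}
```

The same function body serves the static and the dynamic case, so the logic is written once. Applying merge_axis axis by axis to two shapes would give the "maximize compile-time information" behaviour from the first bullet.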

stride uses size_type from shape

When using shorts for shapes this is a problem - 16 bits are easily overflowed (for our use cases we'd need uint32_t for strides - although strictly speaking the first/rightmost stride is fixed at one, the next is a copied short, and only the third and further ones need 32 bits)...
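
To make the overflow concrete: for a 16-bit shape such as {8, 300, 300, 3} the row-major strides are 1, 3, 900 and 270000, and the last one no longer fits in 16 bits. The sketch below computes strides in a caller-chosen type; strides_of is a hypothetical helper, not a kiwaku function:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

namespace demo
{
  // Row-major strides, computed in a caller-chosen (possibly wider) type.
  template<typename Stride, typename Extent, std::size_t N>
  constexpr std::array<Stride, N> strides_of(std::array<Extent, N> const& shape)
  {
    std::array<Stride, N> s{};
    Stride running = 1;
    for (std::size_t i = N; i-- > 0;)
    {
      s[i]    = running;
      // With a 16-bit Stride this cast silently wraps once the product is too large.
      running = static_cast<Stride>(running * shape[i]);
    }
    return s;
  }
}
```

With std::uint32_t strides the leftmost stride is 270000 as expected; with std::uint16_t strides it wraps to 270000 mod 65536 = 7856.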

linear_index does colmajor/inverse axis order?

  1. I have a kwk::shape< 1, 0, 0 > my_shape{ 1, 4, 4 } i.e. a shape with the first/major/leftmost dimension fixed at 1 and the other two dynamically set to 4.
  2. I then try to iterate over the shape (effectively a matrix) using kwk::linear_index( my_shape, 0, h, w ) but it keeps returning index values as if h and w were swapped (i.e. as if it were returning col-major linearized indices).
kwk::shape<{{{1, 0, 0}}}>
  std::__2::array<unsigned short,2>	{__elems_=0x000000e8910fdb20 {4, 4} }
	__elems_	0x000000e8910fdb20 {4, 4}	unsigned short[2]
        [0]	4	unsigned short
        [1]	4	unsigned short

this is what i have in the debugger in linear_index.hpp line 29

template<auto Shaper, std::integral... Index>
auto linear_index( shape<Shaper> const& sh, Index... idx )
{
  return sh.as_stride().index(idx...);
}
kwk::detail::linearize<kwk::stride<{}>,0,1,2,int,unsigned short,unsigned short> returned	1	int
this	0x000000e8910fd8c0 {storage_={__elems_=0x000000e8910fd8c0 {1, 4, 583} } }	const kwk::stride<{}> *
is	0	int
is	1	unsigned short
is	0	unsigned short

IOW: linearize returned 1 instead of 4 - and when I pass 0 0 1 then it returns 4 instead of 1...
...plus there is this weird big/garbage value (583) which changes with each restart/recompile...

shape operator[] and index() members

hi @jfalcou, finally back to trying to integrate kiwaku ;)
missing some interfaces, would it be possible to add:

  • the ability to access values of individual shape dimensions with constexpr operator[] instead of/in addition to with the get<>() template member function - so that it works for runtime/dynamic and static arguments

  • add an index() or offset() member function that returns a linear/flattened index or offset for a given set of coordinates, i.e. generalized equivalent of
    std::uint32_t Dimensions::index( shape_t const dim0, shape_t const dim1, shape_t const dim2, shape_t const dim3 ) const noexcept
    {
        BOOST_ASSERT( dim0 < (*this)[ 0 ] );
        BOOST_ASSERT( dim1 < (*this)[ 1 ] );
        BOOST_ASSERT( dim2 < (*this)[ 2 ] );
        BOOST_ASSERT( dim3 < (*this)[ 3 ] );

        std::uint32_t const result( ( ( dim0 * (*this)[ 1 ] + dim1 ) * (*this)[ 2 ] + dim2 ) * (*this)[ 3 ] + dim3 );
        BOOST_ASSERT( result < count() );
        return result;
    }

  • also 🙈 maybe behind a 'EIGEN_INTEROP' define
    // implicit conversion to Eigen Tensor dimensions w/o Eigen includes
    constexpr operator std::array< int, 4 >() const noexcept;
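
A sketch of how the four-dimensional index() above might generalize to any rank. It is written as a free function over std::array purely for illustration; the name linear_offset and this signature are assumptions, not the kiwaku interface:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace demo
{
  // Row-major (rightmost index varies fastest) offset via Horner's scheme:
  // ((i0 * d1 + i1) * d2 + i2) * d3 + i3, for any number of dimensions.
  template<typename Extent, std::size_t N, typename... Index>
  constexpr std::uint32_t linear_offset(std::array<Extent, N> const& dims, Index... idx)
  {
    static_assert(sizeof...(Index) == N, "one index per dimension is required");
    std::array<std::uint32_t, N> const is{static_cast<std::uint32_t>(idx)...};

    std::uint32_t result = 0;
    for (std::size_t d = 0; d < N; ++d)
    {
      assert(is[d] < dims[d] && "index out of range for this dimension");
      result = result * dims[d] + is[d];
    }
    return result;
  }
}
```

For a {1, 4, 4} shape this yields 4 for coordinates (0, 1, 0) and 1 for (0, 0, 1), which is the row-major ordering the request assumes.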

------------ [EDIT JFALCOU]
Other elements have been moved to their own issues:

[FEATURE] Dependencies as submodules

Switch to using dependencies (such as kumi) as submodules (or packages or...) instead of as 'flattened detail headers'.
(makes collaboration easier)

sizeof static shape

Currently a fully static shape is not an empty class (and I rely on this being the case for cross-compilation ABI reasons) - its size is not zero when used with EBO - this is due to the storage_ data member.
For some compilers this should/could be fixed by simply slapping [[ no_unique_address ]] on the member however this does not help with, you guessed it, MSVC or even Clang-CL - for that usual pain in the behind you have to derive from the storage_ type and slap __declspec( empty_bases ) on the shape struct to be sure that EBO will kick in (yes 🙄)
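
A minimal sketch of the inheritance-based workaround described above. All type names and the DEMO_EMPTY_BASES macro are hypothetical; only __declspec(empty_bases) itself is an existing MSVC extension:

```cpp
#include <type_traits>

#if defined(_MSC_VER)
#  define DEMO_EMPTY_BASES __declspec(empty_bases)  // also applies to Clang-CL
#else
#  define DEMO_EMPTY_BASES
#endif

namespace demo
{
  // Storage for a fully static shape: nothing to store at run time.
  struct empty_storage {};

  // Storage for a shape with one runtime extent.
  struct dynamic_storage { int extent; };

  // Deriving from the storage (instead of holding it as a member)
  // lets the empty base optimization apply.
  template<typename Storage>
  struct DEMO_EMPTY_BASES shape : Storage {};

  // A user type that inherits from a shape and adds its own payload.
  template<typename Shape>
  struct DEMO_EMPTY_BASES view : Shape { float* data; };
}
```

With this layout a fully static shape satisfies std::is_empty_v, and a view built on it is no larger than its data pointer.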

Better constraints checking on shape

From #18

Also something more exotic like fits_static_constraints - a function that compares a 'less static' (dynamic_shape) with a 'more static' (reference_shape) shape by comparing/verifying that corresponding dimensions from dynamic_shape are equal to the corresponding statically defined/fixed dimensions from reference_shape...
for example
dynamic_shape = extent()()()()
reference_shape = extent()[100][200]()

bool fits_static_constraints() { return ( dynamic_shape[1] == reference_shape[1] ) && ( dynamic_shape[2] == reference_shape[2] ); }

This would be used in __builtin_assume() statements to tell the optimizer what it can assume about a given dynamic shape. For this to work (without compiler warnings about 'lost side effects') the function has to be marked with __attribute__(( const )) (or __attribute__(( pure )) for member functions, due to the access of this)...
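
A rough sketch of the idea, assuming shapes are plain std::array values and that 0 marks a dynamic axis in the reference shape. The DEMO_ASSUME macro and rows_times_cols are hypothetical. Because this free function takes its arguments by reference, the sketch uses the weaker pure attribute rather than const:

```cpp
#include <array>
#include <cstddef>

#if defined(__clang__)
#  define DEMO_ASSUME(cond) __builtin_assume(cond)
#else
#  define DEMO_ASSUME(cond) ((void)0)  // no-op fallback for other compilers
#endif

namespace demo
{
  // Illustrative convention: 0 in reference_shape marks a dynamic axis.
  // Every static axis of the reference must match the dynamic shape.
  template<typename T, std::size_t N>
  [[gnu::pure]] constexpr bool fits_static_constraints( std::array<T, N> const& dynamic_shape
                                                      , std::array<T, N> const& reference_shape )
  {
    for (std::size_t i = 0; i < N; ++i)
      if (reference_shape[i] != 0 && dynamic_shape[i] != reference_shape[i]) return false;
    return true;
  }

  // Example consumer: tells the optimizer which extents are already known.
  inline int rows_times_cols(std::array<int, 4> const& s)
  {
    constexpr std::array<int, 4> reference{0, 100, 200, 0};
    DEMO_ASSUME(fits_static_constraints(s, reference));
    return s[1] * s[2];
  }
}
```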

#if _MSC_VER

Clang-CL also defines _MSC_VER - so consider using something like #if ( defined(_MSC_VER) && !defined(__clang__) ) for stuff that is really MSVC-specific (what BOOST_MSVC is for).

ps. saw this when reviewing the KWK_CONST change - so two 'btw' points here:

  • member (non static) functions can at most be pure (because of the this pointer)
  • MSVC has something very similar __declspec( noalias )
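
Both points could be combined along these lines. The DEMO_* macro names are hypothetical, and mapping "pure" onto __declspec( noalias ) is only an approximation, since the two do not promise exactly the same thing:

```cpp
// Hypothetical macro names (DEMO_*), used only for illustration.
#if defined(_MSC_VER) && !defined(__clang__)
#  define DEMO_MSVC_ONLY 1   // genuine MSVC front end
#else
#  define DEMO_MSVC_ONLY 0   // GCC, Clang and Clang-CL all land here
#endif

#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_PURE [[gnu::pure]]        // may read memory, has no side effects
#elif defined(_MSC_VER)
#  define DEMO_PURE __declspec(noalias)  // closest MSVC counterpart (approximate)
#else
#  define DEMO_PURE
#endif

namespace demo
{
  struct extent
  {
    int value;

    // A non-static member reads through `this`, so "pure" is the strongest fit.
    DEMO_PURE constexpr int doubled() const noexcept { return 2 * value; }
  };
}
```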

nbdims&co return/value type

Is it really necessary/reasonable to use signed 64bit type for the number of dimensions?
8bits should pro'lly be enough for 99% of everybody 😁 or at least go with uint32 - to avoid using 64bit instructions...
(the required type could in fact be inferred from the number of static_order but for starters i'd be pleased with at least dropping down to 32bits)

Indexing sanity checks

Add assertions/sanity checks for indexing (that verify that individual indices aren't out of the extent range of a given dimension).

Non-trivial data sources

Can you make your allocator machinery support allocators (e.g. auto defragmenting allocators) that return handles - objects that contain pointers which can be changed under-the-hood so you cannot store the pointer but rather have to ask the handle for the pointer (call get() or data() on it) every time you need the pointer to the data (it is up to the user to make sure and pin the handle or allocator when data access is required)?

Similarly for the detail::view_builder<>::data_block machinery - can it support data sources that are:

  • handles (i.e. one extra level of indirection)
  • CRTP derived classes (i.e. the view does not store the source object but casts itself to it - that is the way I 'handle the handle' pardon the pun :D)
template < typename DimensionsParam, typename DataSource >
struct ViewImpl : private DimensionsParam
{
    using Dimensions = DimensionsParam;

    constexpr ViewImpl() noexcept = default;
    constexpr ViewImpl( Dimensions const & dimensions ) noexcept { this->setDimensions( dimensions ); }

    constexpr auto count() const noexcept { return this->dimensions().count(); }

    constexpr auto dimension( std::uint8_t const index ) const noexcept { return this->dimensions()[ index ]; }
    constexpr auto shape    ( std::uint8_t const index ) const noexcept { return dimension( index ); }

    template < typename ... Indices > auto const & operator()( Indices ... indices ) const noexcept { return const_cast< ViewImpl & >( *this ).operator()( indices... ); }
    template < typename ... Indices > auto       & operator()( Indices ... indices )       noexcept { return getData()[ this->dimensions().index( indices... ) ]; }

    ....

private:
    BOOST_FORCEINLINE auto getData()       noexcept { return static_cast< DataSource * >( this )->data(); }
    BOOST_FORCEINLINE auto getData() const noexcept { return const_cast< ViewImpl & >( *this ).getData(); }

protected:
    constexpr void setDimensions( Dimensions const & newDimensions ) noexcept { static_cast< Dimensions & >( *this ) = newDimensions; }
}; // struct ViewImpl

(might be related to #7 WRT storing 'non trivial pointers' in views)

Layout support

A classic ML example is NHWC (batch index, height, width, channel: channel-interleaved layout) versus NCHW (channel-separated layout).
Consider some sort of builtin support (generic layout definition mechanism and mapping and converting between compatible layouts).
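
For reference, the two layouts differ only in the order in which the same four coordinates are linearized. A small sketch with hypothetical names (dims4, nhwc_offset, nchw_offset):

```cpp
#include <cstddef>

namespace demo
{
  struct dims4 { std::size_t n, h, w, c; };

  // NHWC: channels are interleaved, so c is the fastest-varying index.
  constexpr std::size_t nhwc_offset( dims4 d, std::size_t n, std::size_t h
                                   , std::size_t w, std::size_t c )
  {
    return ((n * d.h + h) * d.w + w) * d.c + c;
  }

  // NCHW: each channel is a separate plane, so w is the fastest-varying index.
  constexpr std::size_t nchw_offset( dims4 d, std::size_t n, std::size_t h
                                   , std::size_t w, std::size_t c )
  {
    return ((n * d.c + c) * d.h + h) * d.w + w;
  }
}
```

Converting between the two layouts then amounts to reading with one function and writing with the other for every (n, h, w, c).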

Questions about this?

Hello,

I'm not finding any more information on here about this project. I was curious about what this project is and what makes it better than, say, the std:: containers. If you can just give me some more information, I'd appreciate it, and you can just remove this post after. I found the description of this library from Compiler-Explorer.

Fix or silence warnings

currently we need to disable:
"-Wconversion"
"-Wdocumentation"
"-Wunused-variable"
before
#include <kwk/utility/container/shape.hpp>
#include <kwk/utility/linear_index.hpp>

Investigate how we can use `int` for all the sizes that are far far below the size requirements of `std::size_t` et al.

Perhaps rather do not use 64bit types unconditionally - probably a size_t-like type would do in most of those situations...
Also, even using explicitly 32bit types (instead of size_t) can give smaller codegen (if there is no mixing with 64bit types) when you know that you don't need the range (e.g. for 'certainly tiny' numbers like ranks/numbers of dimensions).

Originally posted by @psiha in #53 (comment)

Configurable dimension type

For example 16bit ints are quite enough for our use case and an array of 4 shorts (for 4d arrays - typical "tensors" used in ML) fits into a single 64bit register - so a fully dynamic view can be passed by value with only two 64bit registers (as opposed to five if you use size_ts)...
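
The size argument can be checked with a small sketch in which the extent type is a template parameter. view4d is a hypothetical stand-in, not a kiwaku type, and the byte counts assume a typical 64-bit target:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

namespace demo
{
  // A fully dynamic 4-D view whose extent type is a template parameter.
  template<typename T, typename Extent = std::size_t>
  struct view4d
  {
    T*                    data;
    std::array<Extent, 4> extents;
  };
}
```

On such a target view4d<float, std::uint16_t> occupies 16 bytes (one pointer plus four shorts), whereas the std::size_t default occupies 40 bytes (one pointer plus four 64-bit extents).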

View pointer attributes

  • alignment
  • aliasing: 'certainly aliased' (GNU [[ may_alias ]]), 'may alias' (C++ default/no explicit attribute), 'not aliased' (__restrict)

[FEATURE] for_each_index redesign

  • decoupling from a container: only accept a function and a shape (because you don't know what the function will do with the indices - it may access several containers, or none)
  • pass the indices as a variadic pack instead of a tuple (varargs -> pack is easy, the other direction not so much)
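
A sketch of what the two points could look like together, assuming the shape is a plain std::array of extents. The signature is hypothetical and only illustrates "function plus shape in, index pack out":

```cpp
#include <array>
#include <cstddef>
#include <utility>

namespace demo
{
  namespace detail
  {
    // Expand the index array into a pack: f(idx[0], idx[1], ..., idx[N-1]).
    template<typename F, std::size_t N, std::size_t... I>
    constexpr void invoke(F& f, std::array<std::size_t, N> const& idx, std::index_sequence<I...>)
    {
      f(idx[I]...);
    }
  }

  // Visit every coordinate of `shape` in row-major order; no container involved.
  template<typename F, std::size_t N>
  constexpr void for_each_index(F f, std::array<std::size_t, N> const& shape)
  {
    for (auto e : shape) if (e == 0) return;  // empty iteration space

    std::array<std::size_t, N> idx{};
    while (true)
    {
      detail::invoke(f, idx, std::make_index_sequence<N>{});

      // Advance like an odometer: bump the rightmost index, carry leftwards.
      std::size_t d = N;
      while (d > 0 && ++idx[d - 1] == shape[d - 1]) { idx[d - 1] = 0; --d; }
      if (d == 0) return;  // carried out of the leftmost axis: done
    }
  }
}
```

The callable receives plain indices, for example for_each_index([&](std::size_t i, std::size_t j) { /* ... */ }, shape), and is free to touch several containers or none.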

[FEATURE] axis::is_dynamic

or some such shorthand for
if constexpr ( kwk::concepts::dynamic_axis< decltype( dim ) > )
vs
if constexpr ( dim.dynamic )

[FEATURE] Coordinate system modifiers

As per the discussion with our physicist friends:
We want:

for_each( f, view{v, translate({3_c,3.5}, interpolate)} );
for_each( f, view{v, scaler({0.35,1}, interpolate)} );
for_each( f, view{v, rotate(0.256, interpolate)} );
for_each( f, view{v, affine(matrix = { 0.5,0   ,-3
                                     , 0  ,-0.5,-3
                                     }, interpolate)} );

with interpolate one of : floor, ceil, round, linear, cubic

coordinates() (ct) efficiency

@ "This save some 1-200ms of compile-time for those very common cases" comment

In my nestedFors( auto && f, std::uint32_t const startIteration, std::uint32_t const endIteration ) I used this (to convert startIteration to a set of indices):

            /* leftmost/outer dimension does not require the modulo operation, example for 4 dimensions:
                  startIteration / ( dimensions[ 1 ] * dimensions[ 2 ] * dimensions[ 3 ] )                    ,
                ( startIteration / (                   dimensions[ 2 ] * dimensions[ 3 ] ) ) % dimensions[ 1 ],
                ( startIteration /                                       dimensions[ 3 ]   ) % dimensions[ 2 ],
                  startIteration                                                             % dimensions[ 3 ],
            */
            [ & ]< int... i >( std::integer_sequence< int, i... > )
            {
                auto const strides{ this->as_stride() };
                return std::array< shape_t, cardinality >
                {
                    static_cast< shape_t >(   startIteration / strides[ 0     ]                      ),
                    static_cast< shape_t >( ( startIteration / strides[ i + 1 ] ) % (*this)[ i + 1 ] )...
                };
            }( std::make_integer_sequence< int, cardinality - 1 >{} ),

would this not be faster than going through kumi fold?
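
For readers who want to try the formula in isolation, here is a self-contained sketch of the same divide-and-modulo scheme. It uses a plain loop and std::array instead of the member context above, and the function name is chosen for illustration; it makes no claim about compile-time cost relative to the kumi-based version:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

namespace demo
{
  // Row-major coordinates of a linear position, using one division per axis
  // and a modulo for every axis except the leftmost one.
  template<std::size_t N>
  constexpr std::array<std::uint32_t, N>
  coordinates(std::array<std::uint32_t, N> const& dims, std::uint32_t linear)
  {
    static_assert(N > 0, "at least one dimension is required");

    // strides[i] is the product of all extents to the right of axis i.
    std::array<std::uint32_t, N> strides{};
    std::uint32_t running = 1;
    for (std::size_t i = N; i-- > 0;)
    {
      strides[i] = running;
      running *= dims[i];
    }

    std::array<std::uint32_t, N> out{};
    out[0] = linear / strides[0];  // no modulo needed while linear is in range
    for (std::size_t i = 1; i < N; ++i)
      out[i] = (linear / strides[i]) % dims[i];
    return out;
  }
}
```

For dimensions {2, 3, 4, 5} the strides are {60, 20, 5, 1}, so position 119 maps to (1, 2, 3, 4).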

BOOST_ASSERT as KIWAKU_ASSERT no longer works

Now that you have started using iostream constructs in invocations of KIWAKU_ASSERT this fails to compile when BOOST_ASSERT is used
https://user-images.githubusercontent.com/340735/222702614-350a6e2f-67aa-438f-97a5-dc62f82c2084.png

The first fix is to remove BOOST_ASSERT support and let us/others provide our own KIWAKU_ASSERT implementation. The core issue for us is the use of iostreams (yuck :D) and also a different 'abort' point (more than one place to place a break point and/or stacktrace printouts for CI environments).

[BUG] axis assignment

auto ba{ kwk::height[ 3 ] };
ba = 2;

error : use of overloaded operator '=' is ambiguous (with operand types 'axis_<rbr::literals::str{"height", 7}, decltype(v)>' (aka 'axis_<rbr::literals::str{"height", 7}, int>') and 'int')
kiwaku\include\kwk\detail\raberu.hpp(397,20): message : candidate function [with Type = int]
kiwaku\include\kwk\settings\axis.hpp(24,10): message : candidate function (the implicit move assignment operator)
kiwaku\include\kwk\settings\axis.hpp(24,10): message : candidate function (the implicit copy assignment operator)

[FEATURE] Policy for default dynamic axis values

(in shape default construction): your current logic/behaviour:
//! @brief Constructs a default @ref kwk::shape equals to [1 1 ... 0]
does not 'work' for me - in my wrapper I have to override it by resetting everything to zero.

I know this is another oh crap facepalm moment but 🤷 :/

Easier shape construction

Construction syntax for partially static shapes is IMO really cumbersome - the axis placeholder syntax is cool and useful certainly for some situations, in others (most ones in our use cases) however it would be easier if we could have the 'old school less cool eigen' syntax ;D +i'd expect this syntax to be faster to compile ;D
i.e. so that one can simply pass values to the constructor using plain C syntax - and the library will simply assert that the values passed for fixed/static axis match the static values...
A companion version that takes only as many arguments as there are dynamic dimensions (and assigns them in order) would also be useful...

Tiled/blocked layout support

Support sort-of-embedded nD-arrays ('blocks') in mD-arrays where n <= m (e.g. tile iteration, transparent indexing/coordinate translation...).
A special case of this would be the case of 2D arrays or matrices which are laid out in tiles in memory (n==2 and m==2) as is frequently the case in intermediate results (e.g. in matrix multiplication).
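
For the 2D special case, the coordinate translation could look roughly like this. tiled_layout is a hypothetical type, and the sketch assumes that the matrix extents are exact multiples of the tile extents:

```cpp
#include <cstddef>

namespace demo
{
  // A rows x cols matrix stored as row-major tiles of tile_rows x tile_cols,
  // each tile itself stored row-major. Extents are assumed to be exact
  // multiples of the tile size.
  struct tiled_layout
  {
    std::size_t rows, cols, tile_rows, tile_cols;

    constexpr std::size_t offset(std::size_t r, std::size_t c) const
    {
      std::size_t const tiles_per_row = cols / tile_cols;
      std::size_t const tile_index    = (r / tile_rows) * tiles_per_row + (c / tile_cols);
      std::size_t const in_tile       = (r % tile_rows) * tile_cols + (c % tile_cols);
      return tile_index * (tile_rows * tile_cols) + in_tile;
    }
  };
}
```

For a 4x4 matrix with 2x2 tiles, element (0, 2) is the first element of the second tile and lands at offset 4, while (2, 0) starts the third tile at offset 8.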
