jfalcou / kiwaku

C++20 and onward collection of high performance data containers and related tools

Home Page: https://jfalcou.github.io/kiwaku

License: Boost Software License 1.0

CMake 4.55% C++ 64.42% HTML 10.95% CSS 3.73% JavaScript 16.35%

cpp cpp-library cpp20 matrix parallel-computing

kiwaku's Introduction

⚡ Short Intro ⚡

I am Joel Falcou, Destroyer of World, Terror of the Compilers.

In my spare time, I am an associate professor at the University Paris-Saclay and researcher at the Laboratoire de Recherche d’Informatique in Orsay, France. My research focuses on studying generative programming idioms and techniques to design tools for parallel software development.

I also have a rather personal take on humor as you may have noticed already.

❓ Research Activities ❓

The main parts of my research topic are:

the exploration of Embedded Domain Specific Language design for parallel computing on various architectures;
the definition of a formal framework for reasoning about meta-programs.

As I need something to pad my academic papers up to at least eight pages, I usually play around with various application fields like real-time image processing on embedded architectures or HPC on multi-core clusters.

👯C++ Community 👯

I am the co-host of the C++FRUG Meetup, president of the C++FRUG Association and I co-organize the CPPP Conference.

You can find me on Mastodon or on the #include Discord

kiwaku's People

Forkers

clayne microblink thomasretornaz nxirda

kiwaku's Issues

Compilation failure

Hey Joel :)

I finally got around to trying your thingies again and you hit me with compiler errors right out of the gate ;P

So the basic examples from your cppcon lecture fail to compile for me, or this one from your tests:

MB_DISABLE_WARNING_CLANG( "-Wgnu-string-literal-operator-template" ) <<-- please silence this warning in the library itself
#include <kiwaku/container/array.hpp>
#include <kiwaku/container/view.hpp>

kwk::array< float, kwk::_2D > y({ 4, 6 }); // so this compiles after adding the extra braces (compared to the presentation - I guess this was a later change)

float ref[7] = { 1,2,3,4,5,6,7 };
kwk::view<float, kwk::extent[7]> view(ref);

with Clang 13.0.0 on Linux I get:

In file included from kiwaku/include/kiwaku/container/array.hpp:11:
In file included from kiwaku/include/kiwaku/detail/container/array_builder.hpp:13:
In file included from kiwaku/include/kiwaku/detail/container/heap_storage.hpp:13:
In file included from kiwaku/include/kiwaku/container/view.hpp:13:
kiwaku/include/kiwaku/detail/container/view_builder.hpp:26:27: error: constexpr variable 'shape_' must be initialized by a constant expression
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kiwaku/include/kiwaku/container/view.hpp:21:19: note: in instantiation of template class 'kwk::detail::view_builder<float, {{{7}}}>' requested here
: detail::view_builder<Type,Settings...>::access_base
^
<internal.cpp>:39:38: note: in instantiation of template class 'kwk::view<float, {{{7}}}>' requested here
kwk::view<float, kwk::extent[7]> view(ref);
^
kiwaku/include/kiwaku/detail/container/view_builder.hpp:26:27: note: subobject of type 'std::array<long, 0>::CharType' (aka 'char') is not initialized
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^
/usr/local/bin/../include/c++/v1/array:252:46: note: subobject declared here
_ALIGNAS_TYPE(_ArrayInStructT) _CharType _elems[sizeof(_ArrayInStructT)];

similarly with Clang-CL 13.0.1 I get:

In file included from kiwaku\include\kiwaku/container/array.hpp:11:
In file included from kiwaku\include\kiwaku/detail/container/array_builder.hpp:13:
In file included from kiwaku\include\kiwaku/detail/container/heap_storage.hpp:13:
In file included from kiwaku\include\kiwaku/container/view.hpp:13:
kiwaku\include\kiwaku\detail\container\view_builder.hpp(26,27): error : constexpr variable 'shape_' must be initialized by a constant expression
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kiwaku\include\kiwaku\container\view.hpp(21,19): note: in instantiation of template class 'kwk::detail::view_builder<float, {{{7}}}>' requested here
: detail::view_builder<Type,Settings...>::access_base
^
.cpp(39,38): note: in instantiation of template class 'kwk::view<float, {{{7}}}>' requested here
kwk::view<float, kwk::extent[7]> view(ref);
^
kiwaku\include\kiwaku\detail\container\view_builder.hpp(26,27): note: subobject of type 'std::array<long long, 0>::CharType' (aka 'char') is not initialized
static constexpr auto shape_ = kwk::shape<opt_[ option::shape | shaper_t{} ] >{};
^
llvm\13.0.1...\include\c++\v1\array(252,46): note: subobject declared here
_ALIGNAS_TYPE(_ArrayInStructT) _CharType _elems[sizeof(_ArrayInStructT)];

Indexing debug performance

If you remember this is something I fixed in the previous version (before the last major PR) - we have a serious issue with indexing performance in builds that have asserts and sanitizers enabled, even if all other optimizations are turned on (the attached image shows this in the case of a reference convolution implementation - that does the triple loop and indexing - and this is with inline and flatten attributes slapped all over the place) - it actually causes unit tests on xeon servers to timeout.
This is also connected to:

#39 (necessity to do the stride reversing dance)
#63

Optional iostream usage

Somewhat related to #83.

Please provide a build-time option to (not) use/provide iostream functionality (e.g. ifdef wrapping of std::ostream& operator<<).

[FEATURE] General 'simultaneous static&dynamic' ops on axes

I often have a need to perform some transformation on a shape (or a pair of shapes) (implicitly when doing transformations on an ndarray which affects its shape) - and when we have the ability to have both static and dynamic ones I usually have to duplicate the logic. Some motivating examples:

merge two shapes (example an add operation): verify that the two shapes are the same (mathematically) and produce a new one which maximizes compile time information from both input shapes (e.g. one can have a fixed 2nd axis while the other a fixed 3rd axis so the result should have two fixed axes)
concatenation of two shapes across an axis - you have to perform the addition in ct and rt space (and again maximize ct information)
reshaping with a placeholder/free axis (this is an operation in itself that you could add to the library)
what convolutions (in ml space) do with their inputs - for in[N, H, W, C] - the N gets forwarded while H and W get forwarded if padding is on, otherwise they are slightly reduced and input C is fully replaced with a different value
etc....

So provide a generic way (which compiles today ;P) to specify and perform these operations w/o duplication ;D
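A minimal sketch of how such an operation could be written once for both static and dynamic axes. Everything here (`fixed`, `is_fixed_v`, `merge_axis`, `concat_axis`) is a hypothetical illustration and not kiwaku's API: a static axis is modelled as a `std::integral_constant`, a dynamic one as a plain `std::size_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins: static axis = integral_constant, dynamic axis = std::size_t.
template<std::size_t N> using fixed = std::integral_constant<std::size_t, N>;

template<typename T>    inline constexpr bool is_fixed_v           = false;
template<std::size_t N> inline constexpr bool is_fixed_v<fixed<N>> = true;

// Merge two axes that must describe the same extent (e.g. operands of an add).
// The result is static as soon as either input is static.
template<typename A, typename B>
constexpr auto merge_axis(A a, B b)
{
  if constexpr (is_fixed_v<A> && is_fixed_v<B>)
  {
    static_assert(A::value == B::value, "static extents differ");
    return a;
  }
  else if constexpr (is_fixed_v<A>) { assert(b == A::value); return a; }
  else if constexpr (is_fixed_v<B>) { assert(a == B::value); return b; }
  else                              { assert(a == b);        return a; }
}

// Concatenate along an axis: the sum stays static only if both inputs are static.
template<typename A, typename B>
constexpr auto concat_axis(A a, B b)
{
  if constexpr (is_fixed_v<A> && is_fixed_v<B>) return fixed<A::value + B::value>{};
  else return static_cast<std::size_t>(a) + static_cast<std::size_t>(b);
}
```

The same logic serves the compile-time and run-time cases because the branch is chosen by `if constexpr` on the axis type, so nothing has to be duplicated per combination.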

[FEATURE] Simplify access to container with different number of dimensions

We want to access an nD container with p indices, where p < n.

stride uses size_type from shape

When using shorts for shapes, this is a problem - 16 bits is easily overflowed (for our use cases we'd need uint32_t for strides - although strictly the first/rightmost one is fixed at one, the next is a copy of a short, and only the third and further ones need 32 bits)...
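To make the overflow concrete, here is a small sketch that computes row-major strides with a stride type chosen independently of the extent type. `row_major_strides` is a hypothetical helper written for this illustration, not a kiwaku function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Row-major strides: the rightmost stride is 1, each stride to the left is the
// product of all extents to its right. StrideT is chosen by the caller.
template<typename StrideT, typename ExtentT, std::size_t N>
constexpr std::array<StrideT, N> row_major_strides(std::array<ExtentT, N> const& extents)
{
  std::array<StrideT, N> strides{};
  StrideT running = 1;
  for (std::size_t i = N; i-- > 0;)
  {
    strides[i] = running;
    // The cast truncates when StrideT is too narrow for the product.
    running = static_cast<StrideT>(running * extents[i]);
  }
  return strides;
}
```

With 16-bit extents {4, 300, 300}, the outermost stride is 300 * 300 = 90000, which does not fit in 16 bits; requesting `std::uint32_t` strides keeps the value intact.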

linear_index does colmajor/inverse axis order?

I have a kwk::shape< 1, 0, 0 > my_shape{ 1, 4, 4 } i.e. a shape with the first/major/leftmost dimension fixed at 1 and the other two dynamically set to 4.
I then try to iterate over the shape (effectively a matrix) using kwk::linear_index( my_shape, 0, h, w ) but it keeps returning index values as if h and w were swapped (i.e. as if it were returning col-major linearized indices)

kwk::shape<{{{1, 0, 0}}}>
  std::__2::array<unsigned short,2>	{__elems_=0x000000e8910fdb20 {4, 4} }
	__elems_	0x000000e8910fdb20 {4, 4}	unsigned short[2]
        [0]	4	unsigned short
        [1]	4	unsigned short

this is what i have in the debugger in linear_index.hpp line 29

 template<auto Shaper, std::integral... Index>
  auto linear_index( shape<Shaper> const& sh, Index... idx )
  {
    return sh.as_stride().index(idx...);
  }

kwk::detail::linearize<kwk::stride<{}>,0,1,2,int,unsigned short,unsigned short> returned	1	int
this	0x000000e8910fd8c0 {storage_={__elems_=0x000000e8910fd8c0 {1, 4, 583} } }	const kwk::stride<{}> *
is	0	int
is	1	unsigned short
is	0	unsigned short

IOW: linearize returned 1 instead of 4 - and when I pass 0 0 1 then it returns 4 instead of 1...
...plus there is this weird big/garbage value (583) which changes with each restart/recompile...
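For reference, the row-major result expected above can be sketched independently of the library. `row_major_index` below is a hypothetical free function using plain `std::array` arguments; it only illustrates the expected arithmetic, not kiwaku's implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major (rightmost index varies fastest) linearization via Horner's scheme.
template<std::size_t N>
constexpr std::size_t row_major_index(std::array<std::size_t, N> const& shape,
                                      std::array<std::size_t, N> const& idx)
{
  std::size_t result = 0;
  for (std::size_t d = 0; d < N; ++d)
  {
    assert(idx[d] < shape[d]);            // each index must lie within its extent
    result = result * shape[d] + idx[d];
  }
  return result;
}
```

For a shape of {1, 4, 4}, the coordinates (0, 1, 0) map to 4 and (0, 0, 1) map to 1, which is the behaviour the report expects.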

[FEATURE] How to define a kwk type using options as a member

Turns out the all-auto, settings-based interface for building view and table is rad for building them in the wild, but not so much for declaring members without an ugly decltype.

We need a way to provide aliases that help build those types.

shape operator[] and index() members

hi @jfalcou, finally back to trying to integrate kiwaku ;)
missing some interfaces, would it be possible to add:

the ability to access values of individual shape dimensions with constexpr operator[] instead of/in addition to with the get<>() template member function - so that it works for runtime/dynamic and static arguments
add an index() or offset() member function that returns a linear/flattened index or offset for a given set of coordinates, i.e. generalized equivalent of
`std::uint32_t Dimensions::index( shape_t const dim0, shape_t const dim1, shape_t const dim2, shape_t const dim3 ) const noexcept
{
BOOST_ASSERT( dim0 < (*this)[ 0 ] );
BOOST_ASSERT( dim1 < (*this)[ 1 ] );
BOOST_ASSERT( dim2 < (*this)[ 2 ] );
BOOST_ASSERT( dim3 < (*this)[ 3 ] );

std::uint32_t const result( ( ( dim0 * (*this)[ 1 ] + dim1 ) * (*this)[ 2 ] + dim2 ) * (*this)[ 3 ] + dim3 );
BOOST_ASSERT( result < count() );
return result;
}`
also 🙈 maybe behind a 'EIGEN_INTEROP' define
// implicit conversion to Eigen Tensor dimensions w/o Eigen includes
constexpr operator std::array< int, 4 >() const noexcept;

------------ [EDIT JFALCOU]
Other elements have been put in their own issues:

[FEATURE] Dependencies as submodules

Switch to using dependencies (such as kumi) as submodules (or packages or...) instead of as 'flattened detail headers'.
(makes collaboration easier)

sizeof static shape

Currently a fully static shape is not an empty class (and I rely on this being the case for cross-compilation ABI reasons) - its size is not zero when used with EBO - this is due to the storage_ data member.
For some compilers this should/could be fixed by simply slapping [[ no_unique_address ]] on the member however this does not help with, you guessed it, MSVC or even Clang-CL - for that usual pain in the behind you have to derive from the storage_ type and slap __declspec( empty_bases ) on the shape struct to be sure that EBO will kick in (yes 🙄)
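A minimal sketch of the base-class approach described above. The type names and the `DEMO_EMPTY_BASES` macro are made up for this illustration; only `__declspec( empty_bases )` comes from the discussion.

```cpp
#include <cassert>
#include <type_traits>

// On MSVC-compatible compilers, ask for the empty base optimization explicitly.
#if defined(_MSC_VER)
#  define DEMO_EMPTY_BASES __declspec(empty_bases)
#else
#  define DEMO_EMPTY_BASES
#endif

struct empty_storage {};   // stands in for the storage of a fully static shape
struct other_empty   {};   // a second empty base, to exercise multiple inheritance

// Storage kept as a data member: the shape is no longer an empty class.
struct member_shape { empty_storage storage_; };

// Storage as a base class: the shape itself stays empty.
struct DEMO_EMPTY_BASES base_shape : empty_storage {};

// A user type deriving from the shape pays nothing for it.
struct DEMO_EMPTY_BASES view_like : base_shape, other_empty { int* data; };

static_assert(!std::is_empty_v<member_shape>);
static_assert( std::is_empty_v<base_shape>);
static_assert(sizeof(view_like) == sizeof(int*));
```

The static assertions document the intent; they are expected to hold on the common ABIs, but the layout of empty bases is ultimately implementation-defined.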

[FEATURE] Use pure and const attribute whenever possible

In general it would be helpful if you could make it a practice to slap those attributes on all appropriate functions (at least constexpr ones should be no brainers)..

kwk::shape should be aware of its original storage_order

Tasks

Fix #8 by implementing kwk::storage_order + associated concepts
Add support for storage order in kwk::shape
Fix #39
Fix #33

Depends on

Better constraints checking on shape

From #18

Also something more exotic like fits_static_constraints - a function that compares a 'less static' (dynamic_shape) with a 'more static' (reference_shape) shape by comparing/verifying that corresponding dimensions from dynamic_shape are equal to the corresponding statically defined/fixed dimensions from reference_shape...
for example
dynamic_shape = extent()()()()
reference_shape = extent()[100][200]()

bool fits_static_constraints() { return ( dynamic_shape[1] == reference_shape[1] ) && ( dynamic_shape[2] == reference_shape[2] ); }

This would be used in __builtin_assume() statements to tell the optimizer what it can assume about a given dynamic shape. For this to work (without compiler warnings about 'lost sideeffects') the function has to be marked with __attribute__(( const )) (or __attribute__(( pure )) for member functions, due to the access of this)...
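A rough sketch of the idea, under stated assumptions: the shapes are plain `std::array`s, a fixed extent of 0 marks a dynamic axis, and `DEMO_ASSUME` / `DEMO_PURE` are hypothetical portability macros introduced only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical portability macros (not part of kiwaku).
#if defined(__clang__)
#  define DEMO_ASSUME(cond) __builtin_assume(cond)
#  define DEMO_PURE [[gnu::pure]]
#elif defined(_MSC_VER)
#  define DEMO_ASSUME(cond) __assume(cond)
#  define DEMO_PURE
#elif defined(__GNUC__)
#  define DEMO_ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (false)
#  define DEMO_PURE [[gnu::pure]]
#else
#  define DEMO_ASSUME(cond) ((void)0)
#  define DEMO_PURE
#endif

// True when every statically fixed extent (non-zero entry of Fixed...) is
// matched by the corresponding extent of the dynamic shape.
template<std::size_t... Fixed>
DEMO_PURE constexpr bool fits_static_constraints(
    std::array<std::size_t, sizeof...(Fixed)> const& dynamic_shape)
{
  constexpr std::array<std::size_t, sizeof...(Fixed)> reference{Fixed...};
  for (std::size_t i = 0; i < reference.size(); ++i)
    if (reference[i] != 0 && dynamic_shape[i] != reference[i]) return false;
  return true;
}

// Example use: the optimizer may now treat s[1] and s[2] as 100 and 200.
// Calling this with a shape that does not fit is undefined behaviour.
inline std::size_t plane_size(std::array<std::size_t, 4> const& s)
{
  DEMO_ASSUME((fits_static_constraints<0, 100, 200, 0>(s)));
  return s[1] * s[2];
}
```

The function takes its argument by reference and reads memory, so `pure` rather than `const` is the applicable attribute here.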

#if _MSC_VER

Clang-CL also defines _MSC_VER - so consider using something like #if ( defined(_MSC_VER) && !defined(__clang__) ) for stuff that is really MSVC-specific (what BOOST_MSVC is for).

ps. saw this when reviewing the KWK_CONST change - so two 'btw' points here:

member (non static) functions can at most be pure (because of the this pointer)
MSVC has something very similar __declspec( noalias )

nbdims&co return/value type

Is it really necessary/reasonable to use signed 64bit type for the number of dimensions?
8bits should pro'lly be enough for 99% of everybody 😁 or at least go with uint32 - to avoid using 64bit instructions...
(the required type could in fact be inferred from the number of static_order but for starters i'd be pleased with at least dropping down to 32bits)

Indexing sanity checks

Add assertions/sanity checks for indexing (that verify that individual indices aren't out of the extent range of a given dimension).

Non-trivial data sources

Can you make your allocator machinery support allocators (e.g. auto defragmenting allocators) that return handles - objects that contain pointers which can be changed under-the-hood so you cannot store the pointer but rather have to ask the handle for the pointer (call get() or data() on it) every time you need the pointer to the data (it is up to the user to make sure and pin the handle or allocator when data access is required)?

Similarly for the detail::view_builder<>::data_block machinery - can it support data sources that are:

handles (i.e. one extra level of indirection)
CRTP derived classes (i.e. the view does not store the source object but casts itself to it - that is the way I 'handle the handle' pardon the pun :D)

template < typename DimensionsParam, typename DataSource >
struct ViewImpl : private DimensionsParam
{
    using Dimensions = DimensionsParam;

    constexpr ViewImpl() noexcept = default;
    constexpr ViewImpl( Dimensions const & dimensions ) noexcept { this->setDimensions( dimensions ); }

    constexpr auto count() const noexcept { return this->dimensions().count(); }

    constexpr auto dimension( std::uint8_t const index ) const noexcept { return this->dimensions()[ index ]; }
    constexpr auto shape    ( std::uint8_t const index ) const noexcept { return dimension( index ); }

    template < typename ... Indices > auto const & operator()( Indices ... indices ) const noexcept { return const_cast< ViewImpl & >( *this ).operator()( indices... ); }
    template < typename ... Indices > auto       & operator()( Indices ... indices )       noexcept { return getData()[ this->dimensions().index( indices... ) ]; }

    ....

private:
    BOOST_FORCEINLINE auto getData()       noexcept { return static_cast< DataSource * >( this )->data(); }
    BOOST_FORCEINLINE auto getData() const noexcept { return const_cast< ViewImpl & >( *this ).getData(); }

protected:
    constexpr void setDimensions( Dimensions const & newDimensions ) noexcept { static_cast< Dimensions & >( *this ) = newDimensions; }
}; // struct ViewImpl

(might be related to #7 WRT storing 'non trivial pointers' in views)
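A minimal sketch of the 'handle' flavour of this request, with made-up names (`pool_handle`, `handle_view`): the view stores the handle and asks it for the pointer on every access, so the data can move between accesses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical handle: it knows where the data lives, but the address may change
// (here because the backing vector can reallocate).
struct pool_handle
{
  std::vector<float>* pool;
  std::size_t         offset;
  float* data() const noexcept { return pool->data() + offset; }
};

// A 1D view over any handle-like type exposing data().
template<typename Handle>
struct handle_view
{
  Handle      source;
  std::size_t count;

  // The pointer is re-fetched on every access and never cached.
  float& operator[](std::size_t i) const noexcept
  {
    assert(i < count);
    return source.data()[i];
  }
};
```

Pinning the handle or the allocator while an access is in flight remains the user's responsibility, as stated in the request.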

(no) explicitly qualifiable shape/stride/prefilled element getter

something like kwk::get<>() and/or std::get

required for disambiguation

Layout support

A classic ML example are the NHWC (batch index, height, width, channel - channel interleaved layout) and/vs NCHW (channel separated layout) layouts.
Consider some sort of builtin support (generic layout definition mechanism and mapping and converting between compatible layouts).
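To illustrate the two layouts, here is a small sketch of their index mappings and a naive conversion between them. The names (`tensor_dims`, `index_nhwc`, `index_nchw`, `nhwc_to_nchw`) are hypothetical and only serve this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct tensor_dims { std::size_t n, h, w, c; };

// NHWC: channels are interleaved, c varies fastest.
constexpr std::size_t index_nhwc(tensor_dims d, std::size_t n, std::size_t h,
                                 std::size_t w, std::size_t c)
{
  return ((n * d.h + h) * d.w + w) * d.c + c;
}

// NCHW: each channel is a separate plane, w varies fastest.
constexpr std::size_t index_nchw(tensor_dims d, std::size_t n, std::size_t h,
                                 std::size_t w, std::size_t c)
{
  return ((n * d.c + c) * d.h + h) * d.w + w;
}

// Naive layout conversion: same logical element, different linear position.
inline std::vector<float> nhwc_to_nchw(tensor_dims d, std::vector<float> const& src)
{
  assert(src.size() == d.n * d.h * d.w * d.c);
  std::vector<float> dst(src.size());
  for (std::size_t n = 0; n < d.n; ++n)
    for (std::size_t h = 0; h < d.h; ++h)
      for (std::size_t w = 0; w < d.w; ++w)
        for (std::size_t c = 0; c < d.c; ++c)
          dst[index_nchw(d, n, h, w, c)] = src[index_nhwc(d, n, h, w, c)];
  return dst;
}
```

A generic layout mechanism would replace the two hand-written index functions with a permutation of axes applied to a single linearization routine.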

Klangfarts heads up

Just a heads up on Clang's codegen issues I've come across so far - so that you can be aware of those and design around them when possible/appropriate :D

llvm/llvm-project#58899
llvm/llvm-project#58790
llvm/llvm-project#58789
llvm/llvm-project#52691

Implement linear_index() -> coordinates conversion

also nice to have would be the inverse of index() -> coordinates_from_index()

Originally posted by @psiha in #18 (comment)
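A minimal sketch of the inverse mapping for a row-major shape. The name `coordinates_from_index` is taken from the request; the signature with plain `std::array`s is an assumption made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Inverse of a row-major linear index: peel off the rightmost coordinate with a
// modulo, then divide and continue to the left.
template<std::size_t N>
constexpr std::array<std::size_t, N> coordinates_from_index(
    std::array<std::size_t, N> const& shape, std::size_t linear)
{
  std::array<std::size_t, N> coords{};
  for (std::size_t d = N; d-- > 0;)
  {
    coords[d] = linear % shape[d];
    linear   /= shape[d];
  }
  return coords;
}
```

For a shape of {2, 3, 4}, the linear index 23 maps back to (1, 2, 3), since 1 * 12 + 2 * 4 + 3 = 23.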

[FEATURE] shape construction from std span and initializer_list

via constructors or make/factory function (to avoid std::vector style () vs {} construction ambiguities)

Questions about this?

Hello,

I'm not finding any more information on here about this project? I was curious of what this project is and what makes it better than say the std:: containers? If you can just give me some more information, I'd appreciate it and you can just remove this post after. I found the description of this library from Compiler-Explorer.

[FEATURE] shape insert modifiers

add insert and (push/pop)(front/back) functions to shape - which function as 'factories' - they return a new object
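A sketch of what such 'factory' modifiers could look like on a plain `std::array` of extents. The `shape_insert`, `shape_push_back` and `shape_pop_back` names are hypothetical; the point is that each call returns a new object of a different rank and leaves its input untouched.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Returns a new shape of rank N + 1 with `value` placed at position `pos`.
template<std::size_t N>
constexpr std::array<std::size_t, N + 1> shape_insert(
    std::array<std::size_t, N> const& s, std::size_t pos, std::size_t value)
{
  assert(pos <= N);
  std::array<std::size_t, N + 1> r{};
  for (std::size_t i = 0, j = 0; i < N + 1; ++i)
    r[i] = (i == pos) ? value : s[j++];
  return r;
}

template<std::size_t N>
constexpr std::array<std::size_t, N + 1> shape_push_back(
    std::array<std::size_t, N> const& s, std::size_t value)
{
  return shape_insert(s, N, value);
}

// Returns a new shape of rank N - 1 without the last extent.
template<std::size_t N>
constexpr std::array<std::size_t, N - 1> shape_pop_back(std::array<std::size_t, N> const& s)
{
  static_assert(N > 0, "cannot pop from an empty shape");
  std::array<std::size_t, N - 1> r{};
  for (std::size_t i = 0; i + 1 < N; ++i) r[i] = s[i];
  return r;
}
```

The front variants follow the same pattern with position 0.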

Generalize numel() that counts the volume of a slice

Fix or silence warnings

currently we need to disable:
"-Wconversion"
"-Wdocumentation"
"-Wunused-variable"
before
#include <kwk/utility/container/shape.hpp>
#include <kwk/utility/linear_index.hpp>

[FEATURE] Shape type construction from a dimensions variadic pack

without the need for the decltype( kwk::of_size( dims... ) ) 'hack'

Investigate how we can use `int` for all the sizes that are far far below the size requirements of `std::size_t` et al.

Perhaps rather do not use 64bit types unconditionally - probably a size_t-like type would do in most of those situations...
Also, even using explicitly 32bit types (instead of size_t) can give smaller codegen (if there is no mixing with 64bit types) when you know that you don't need the range (e.g. for 'certainly tiny' numbers like ranks/numbers of dimensions)..

Originally posted by @psiha in #53 (comment)

Configurable dimension type

For example 16bit ints are quite enough for our use case and an array of 4 shorts (for 4d arrays - typical "tensors" used in ML) fits into a single 64bit register - so a fully dynamic view can be passed by value with only two 64bit registers (as opposed to five if you use size_ts)...

View pointer attributes

alignment
aliasing: 'certainly aliased' (GNU [[ may_alias ]]), 'may alias' (C++ default/no explicit attribute), 'not aliased' (__restrict)

[FEATURE] Configurable shape conversion compatibility check

A macro-based global configuration is fine by me :)

stride and shape should be factorized via kwk::prefilled_array

Tasks:

Split detail/shaper.hpp into the detail and non-detail parts
Implement kwk::prefilled_array<Type, Desc>
Rewrite kwk::shape with kwk::prefilled_array
Rewrite kwk::stride with kwk::prefilled_array
Fix #41
Fix #40

stride::get() return for fixed/static values

concretely the line:

return std::integral_constant<size_type,1>{};

always returns one - shouldn't it return the static value for the given I?

[FEATURE] for_each_index redesign

decoupling from a container: only accept a function and a shape (because you don't know what the function will do with the indices - it may access several containers, or none)
pass the indices as a variadic pack instead of a tuple (varargs -> pack is easy, the other direction not so much)
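A minimal sketch of that interface, assuming a shape represented as a `std::array` of extents. `for_each_index` here is a hypothetical stand-in written for this example; it takes only a function and a shape, and calls the function with the indices as separate arguments.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Recursive helper: each level adds one loop and one index to the pack.
template<std::size_t N, typename F, typename... Idx>
void for_each_index_impl(F& f, std::array<std::size_t, N> const& shape, Idx... idx)
{
  if constexpr (sizeof...(Idx) == N)
  {
    f(idx...);   // all indices collected: invoke with a variadic pack
  }
  else
  {
    for (std::size_t i = 0; i < shape[sizeof...(Idx)]; ++i)
      for_each_index_impl(f, shape, idx..., i);
  }
}

// No container involved: the callable decides what to do with the indices.
template<std::size_t N, typename F>
void for_each_index(F f, std::array<std::size_t, N> const& shape)
{
  for_each_index_impl(f, shape);
}
```

The loops are nested so that the rightmost index varies fastest, which matches a row-major traversal.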

[FEATURE] axis::is_dynamic

or some such shorthand for
if constexpr ( kwk::concepts::dynamic_axis< decltype( dim ) > )
vs
if constexpr ( dim.dynamic )

[FEATURE] Coordinate system modifiers

As per the discussion with our physicist friends:
We want:

for_each( f, view{v, translate({3_c,3.5}, interpolate)} );
for_each( f, view{v, scaler({0.35,1}, interpolate)} );
for_each( f, view{v, rotate(0.256, interpolate)} );
for_each( f, view{v, affine(matrix = { 0.5,0   ,-3
                                     , 0  ,-0.5,-3
                                     }, interpolate)} );

with interpolate one of : floor, ceil, round, linear, cubic

coordinates() (ct) efficiency

@ "This save some 1-200ms of compile-time for those very common cases" comment

in my nestedFors( auto && f, std::uint32_t const startIteration, std::uint32_t const endIteration ) i used this (to convert startIteration to a set of indices):

            /* leftmost/outer dimension does not require the modulo operation, example for 4 dimensions:
                  startIteration / ( dimensions[ 1 ] * dimensions[ 2 ] * dimensions[ 3 ] )                    ,
                ( startIteration / (                   dimensions[ 2 ] * dimensions[ 3 ] ) ) % dimensions[ 1 ],
                ( startIteration /                                       dimensions[ 3 ]   ) % dimensions[ 2 ],
                  startIteration                                                             % dimensions[ 3 ],
            */
            [ & ]< int... i >( std::integer_sequence< int, i... > )
            {
                auto const strides{ this->as_stride() };
                return std::array< shape_t, cardinality >
                {
                    static_cast< shape_t >(   startIteration / strides[ 0     ]                      ),
                    static_cast< shape_t >( ( startIteration / strides[ i + 1 ] ) % (*this)[ i + 1 ] )...
                };
            }( std::make_integer_sequence< int, cardinality - 1 >{} ),

would this not be faster than going through kumi fold?

BOOST_ASSERT as KIWAKU_ASSERT no longer works

Now that you have started using iostream constructs in invocations of KIWAKU_ASSERT this fails to compile when BOOST_ASSERT is used
https://user-images.githubusercontent.com/340735/222702614-350a6e2f-67aa-438f-97a5-dc62f82c2084.png

The first fix is to remove BOOST_ASSERT support and let us/others provide our own KIWAKU_ASSERT implementation. The core issue for us is the use of iostreams (yuck :D) and also a different 'abort' point (more than one place to put a breakpoint and/or stacktrace printouts for CI environments).

[BUG] Table constructor needs to call the element constructor

auto cells = table{of_size(size*size), as };
Doesn't call cell's constructor.
-> Its value is then 0 0 0

[BUG] axis assignment

auto ba{ kwk::height[ 3 ] };
ba = 2;

error : use of overloaded operator '=' is ambiguous (with operand types 'axis_<rbr::literals::str{"height", 7}, decltype(v)>' (aka 'axis_<rbr::literals::str{"height", 7}, int>') and 'int')
kiwaku\include\kwk\detail\raberu.hpp(397,20): message : candidate function [with Type = int]
kiwaku\include\kwk\settings\axis.hpp(24,10): message : candidate function (the implicit move assignment operator)
kiwaku\include\kwk\settings\axis.hpp(24,10): message : candidate function (the implicit copy assignment operator)

[FEATURE] Initializer list should be a valid way to initialize a kwk::table (problem with range_source const)

[FEATURE] Policy for default dynamic axis values

(in shape default construction): your current logic/behaviour:
//! @brief Constructs a default @ref kwk::shape equals to [1 1 ... 0]
does not 'work' for me - in my wrapper I have to override it by resetting everything to zero.

I know this is another oh crap facepalm moment but 🤷 :/

[FEATURE] Support heterogeneous shape

Like of_size(short, char, fixed<N>, int) etc.

Complete support for kwk::table

Where have the sources for the owning container/array.hpp gone? 🤔

Remaining tasks:

table basic tests
move/copy tests

Easier shape construction

Construction syntax for partially static shapes is IMO really cumbersome - the axis placeholder syntax is cool and useful certainly for some situations, in others (most ones in our use cases) however it would be easier if we could have the 'old school less cool eigen' syntax ;D +i'd expect this syntax to be faster to compile ;D
i.e. so that one can simply pass values to the constructor using plain C syntax - and the library will simply assert that the values passed for fixed/static axis match the static values...
A companion version that takes only as many arguments as there are dynamic dimensions (and assigns them in order) would also be useful...
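A sketch of the two construction styles requested above, using a made-up `demo_shape` type in which a fixed extent of 0 marks a dynamic axis (an assumption of this example, not kiwaku's actual interface).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template<std::size_t... Fixed>   // 0 marks a dynamic axis in this sketch
struct demo_shape
{
  static constexpr std::size_t rank = sizeof...(Fixed);
  static constexpr std::array<std::size_t, rank> fixed_extents{Fixed...};

  std::array<std::size_t, rank> extents;

  // "Plain C" style: every extent is passed; fixed ones are only checked.
  template<typename... V>
  static demo_shape from_all(V... values)
  {
    static_assert(sizeof...(V) == rank, "one value per axis");
    demo_shape s{{static_cast<std::size_t>(values)...}};
    for (std::size_t i = 0; i < rank; ++i)
      assert(fixed_extents[i] == 0 || s.extents[i] == fixed_extents[i]);
    return s;
  }

  // Companion: only the dynamic extents are passed and assigned in order.
  template<typename... V>
  static demo_shape from_dynamic(V... values)
  {
    constexpr std::size_t dynamic_count =
        ((Fixed == 0 ? std::size_t{1} : std::size_t{0}) + ... + std::size_t{0});
    static_assert(sizeof...(V) == dynamic_count, "one value per dynamic axis");
    std::array<std::size_t, sizeof...(V)> dyn{static_cast<std::size_t>(values)...};
    demo_shape s{fixed_extents};
    for (std::size_t i = 0, j = 0; i < rank; ++i)
      if (fixed_extents[i] == 0) s.extents[i] = dyn[j++];
    return s;
  }
};
```

Both factories produce the same object when given consistent values; the first one trades a run-time assertion for a call site that reads like an ordinary list of dimensions.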

[FEATURE] Make _ and as behave as pseudo integer greedy types

To simplify handling of _ and as in shape code, we need to extend operator support and add a bunch of basic functions like min, max, etc.

[BUG] Compliance with incomplete standard library

Apple Clang is missing some standard library features, and probably others as well.

Tiled/blocked layout support

Support sort-of-embedded nD-arrays ('blocks') in mD-arrays where n <= m (e.g. tile iteration, transparent indexing/coordinate translation...).
A special case of this would be the case of 2D arrays or matrices which are laid out in tiles in memory (n==2 and m==2) as is frequently the case in intermediate results (e.g. in matrix multiplication).
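For the 2D special case, the coordinate translation can be sketched as follows. `tiled_index` is a hypothetical helper that assumes square tiles stored one after another in row-major tile order, with row-major elements inside each tile and a column count that is a multiple of the tile size.

```cpp
#include <cassert>
#include <cstddef>

// Linear offset of element (row, col) in a matrix stored as square tiles.
// Assumes `cols` is a multiple of `tile`.
constexpr std::size_t tiled_index(std::size_t row, std::size_t col,
                                  std::size_t cols, std::size_t tile)
{
  std::size_t const tiles_per_row = cols / tile;
  std::size_t const tile_id       = (row / tile) * tiles_per_row + (col / tile);
  std::size_t const in_tile       = (row % tile) * tile + (col % tile);
  return tile_id * tile * tile + in_tile;
}
```

In a 4x4 matrix with 2x2 tiles, element (0, 2) is the first element of the second tile and therefore lands at offset 4 instead of the offset 2 it would have in a plain row-major layout.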

[FEATURE] shape and stride should verify is_trivial

Make a test for that and apply @psiha changes to ensure