The portable-simd intrinsics simd_bitmask and ...

What should SIMD bitmasks look like?

RalfJung commented on July 21, 2024

What should SIMD bitmasks look like?

Comments (6)

programmerjake commented on July 21, 2024

if we get generic-sized integers uint<N> (like C's unsigned _BitInt(N)) somewhat soon, I would just use those.

> Without knowing those, here's a completely naive proposal: the array must be big enough to contain at least as many bits as the vector has elements (but that's just a lower bound, arbitrarily bigger arrays are allowed), and then vector elements are mapped to bits in the array as follows: the vector element i is represented in array element i / 8, in the bit i % 8, where bits are indexed from most significant to least significant.

The standard format that LLVM uses on little-endian (and x86 and a few other arches too) is that bits are counted from the LSB end to the MSB end, not MSB to LSB. The idea is if you have some integer type uint<N> then:
(uint::<N>::from_le_bytes(the_bytes) >> i) & 1 != 0 is true if and only if element i of the corresponding mask is true.

I strongly think that we should just use that format everywhere if we don't want an endian-dependent format and generic-sized integers aren't ready yet.
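
As a rough illustration of the two bit orders under discussion, here is a minimal sketch in plain Rust. The function names pack_lsb_first and pack_msb_first are hypothetical, and u16 stands in for the not-yet-existing uint<12>; none of this is the actual intrinsic implementation.

```rust
/// LSB-first: element i goes to byte i / 8, bit i % 8,
/// counting bits from the least significant end.
fn pack_lsb_first(mask: &[bool]) -> Vec<u8> {
    let mut bytes = vec![0u8; (mask.len() + 7) / 8];
    for (i, &set) in mask.iter().enumerate() {
        if set {
            bytes[i / 8] |= 1 << (i % 8);
        }
    }
    bytes
}

/// MSB-first: same byte index, but bits are counted from the
/// most significant end of each byte.
fn pack_msb_first(mask: &[bool]) -> Vec<u8> {
    let mut bytes = vec![0u8; (mask.len() + 7) / 8];
    for (i, &set) in mask.iter().enumerate() {
        if set {
            bytes[i / 8] |= 0x80 >> (i % 8);
        }
    }
    bytes
}

fn main() {
    // A 12-element mask with elements 0, 3, 8 and 9 set.
    let mask = [
        true, false, false, true, false, false, false, false,
        true, true, false, false,
    ];

    let bytes = pack_lsb_first(&mask);
    // u16 is a stand-in for a hypothetical uint<12>.
    let value = u16::from_le_bytes([bytes[0], bytes[1]]);
    for (i, &set) in mask.iter().enumerate() {
        // The shift-and-test rule holds for every element.
        assert_eq!((value >> i) & 1 != 0, set);
    }

    // The MSB-first packing of the same mask gives different bytes.
    assert_ne!(pack_msb_first(&mask), bytes);
}
```

With the LSB-first layout the shift-and-test rule works on the little-endian integer directly, which is the property argued for above.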

RalfJung commented on July 21, 2024

> if we get generic-sized integers uint<N> (like C's unsigned _BitInt(N)) somewhat soon, I would just use those.

I don't know of any initiative working on them. What is the current status?

programmerjake commented on July 21, 2024

> > if we get generic-sized integers uint<N> (like C's unsigned _BitInt(N)) somewhat soon, I would just use those.
>
> I don't know of any initiative working on them. What is the current status?

There's a postponed RFC, rust-lang/rfcs#2581, and people have recently been asking whether it's been long enough to restart it. IIRC, 3-4 different people have been talking about it in the last month or two (mostly on Zulip or other random corners of the Rust project).

RalfJung commented on July 21, 2024

> I strongly think that we should just use that format everywhere if we don't want an endian-dependent format and generic-sized integers aren't ready yet.

I don't have an opinion either way -- this sounds perfectly reasonable to me. We'd then say:
the vector element i is represented in array element i / 8, in the bit i % 8, where bits are indexed from least significant to most significant.

I think the LLVM IR for big-endian would then be something like

do the bitcast from <N x i1> to iN
reverse the bits
zero-extend to match the size of the array
reverse the bytes (strangely LLVM bswap only works for types with an even number of bytes, so if the array has e.g. length 3 we'd have to use some other encoding...)
transmute to array

Is there some particular instruction sequence we want to generate here or would something like that work?
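
The five steps listed above can be modelled in ordinary Rust to check that they produce the LSB-first byte layout. This is only a simulation under stated assumptions: it assumes that on big-endian the bitcast places element 0 in the most significant bit of the iN value, it fixes N = 12 with a 2-byte array, and the function names are hypothetical. It says nothing about which instructions LLVM would actually emit.

```rust
/// Number of mask elements in this sketch (arbitrary choice).
const N: usize = 12;

/// Step 1 (assumed big-endian behavior): bitcast <N x i1> to iN,
/// with element 0 ending up in the most significant of the N bits.
fn bitcast_big_endian(mask: &[bool; N]) -> u16 {
    let mut v = 0u16;
    for (i, &set) in mask.iter().enumerate() {
        if set {
            v |= 1 << (N - 1 - i);
        }
    }
    v
}

fn bitmask_bytes_big_endian(mask: &[bool; N]) -> [u8; 2] {
    let as_int = bitcast_big_endian(mask);
    // Step 2: reverse the bits within the N-bit integer.
    let reversed = as_int.reverse_bits() >> (16 - N);
    // Step 3: zero-extend to the array size (16 bits here).
    let extended: u16 = reversed;
    // Step 4: reverse the bytes.
    let swapped = extended.swap_bytes();
    // Step 5: transmute to an array; to_be_bytes models how a
    // big-endian target lays the integer out in memory.
    swapped.to_be_bytes()
}

fn main() {
    // Elements 0, 3, 8 and 9 are set.
    let mask = [
        true, false, false, true, false, false, false, false,
        true, true, false, false,
    ];
    let bytes = bitmask_bytes_big_endian(&mask);
    // Element i is found in byte i / 8, bit i % 8, counted from the LSB.
    for (i, &set) in mask.iter().enumerate() {
        assert_eq!((bytes[i / 8] >> (i % 8)) & 1 != 0, set);
    }
}
```

In this model the byte swap followed by big-endian storage is equivalent to storing the bit-reversed value in little-endian byte order, which is why the result matches the proposed format.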

RalfJung commented on July 21, 2024

FWIW I am also entirely open to the idea that the current behavior is already what we want (including on big-endian). But the fact that portable-simd stopped using the array-based variant entirely is an indication that something is not optimal. I have no idea what, as I don't really know the design space here. I see my role as that of an advisor with a t-opsem viewpoint.

The reason the current semantics seem odd is that Miri currently has exactly 4 places where endianness matters:

loading integers
storing integers
simd_bitmask
simd_select_bitmask

So, the intrinsics are certainly somewhat striking. But maybe that's expected for converting between arrays of bits and a more compact representation; I don't have any intuition for what to expect here.

programmerjake commented on July 21, 2024

> FWIW I am also entirely open to the idea that the current behavior is already what we want (including on big-endian). But the fact that portable-simd stopped using the array-based variant entirely is an indication that something is not optimal.

We stopped for two reasons. First, because generic const exprs aren't working that well, the output byte array has to have the same length as the input mask, despite being 8x overkill. This would be solved by uint<N>, since that N specifies a bit count rather than a byte count. Second, those intrinsics are currently plain broken for non-power-of-two lengths on at least aarch64, probably due to a combination of rustc and LLVM bugs.
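
To make the "8x overkill" concrete, here is a minimal sketch on stable Rust. The function bitmask_padded is hypothetical and is not portable-simd's API; it only shows the shape of the problem. A return type such as [u8; (N + 7) / 8] would need the unstable generic_const_exprs feature, so the stable fallback is an N-byte array of which only the first (N + 7) / 8 bytes carry information.

```rust
/// Hypothetical stand-in for an array-returning bitmask function.
/// The output has N bytes because stable Rust cannot express
/// `[u8; (N + 7) / 8]` in a signature that is generic over N.
fn bitmask_padded<const N: usize>(mask: [bool; N]) -> [u8; N] {
    let mut out = [0u8; N];
    for (i, &set) in mask.iter().enumerate() {
        if set {
            // LSB-first layout: byte i / 8, bit i % 8.
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

fn main() {
    // A 16-element mask with elements 0, 2, 3 and 8 set.
    let mut mask = [false; 16];
    for i in [0, 2, 3, 8] {
        mask[i] = true;
    }

    let out = bitmask_padded(mask);
    // 16 bytes are returned, but only the first 2 are meaningful.
    assert_eq!(out.len(), 16);
    assert!(out[2..].iter().all(|&b| b == 0));
}
```

A bit-counted integer type would avoid the padding entirely, since its size parameter would already be the number of mask elements.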
