Comments (6)
Thanks @trofi, I agree we should not require a particular outcome for something that would be UB in scalar code.
We'll update the docs to say that behavior is implementation-defined for values outside the destination range.
from highway.
On second thought, we have a PromoteInRangeTo which has the implementation-defined behavior. Our PromoteTo is at least trying to do the right thing for out-of-range inputs. Taking a closer look at what's going on.
I posted on the GCC bug; it seems the compiler is treating the conversion of 2^63 as UB, though the result is actually well-defined on x86. Thus our fix-afterwards strategy will not work. It would be nice if the compiler changed, but we have to live with the current behavior for the next few years, so I am changing our code to compare/min beforehand.
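To illustrate the "compare/min beforehand" idea, here is a minimal scalar sketch. It is not Highway's actual code: the function name ClampedF32ToI64 and the choice of returning 0 for NaN are assumptions made only for this example. The point is that the range check happens before the cast, so the compiler never sees an out-of-range conversion it could treat as UB.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Highway): saturating float -> int64
// conversion that checks the range BEFORE converting.
int64_t ClampedF32ToI64(float v) {
  // 2^63 is exactly representable as a float.
  constexpr float kTwoPow63 = 9223372036854775808.0f;
  if (v != v) return 0;  // NaN: arbitrary choice for this sketch
  if (v >= kTwoPow63) return std::numeric_limits<int64_t>::max();
  if (v < -kTwoPow63) return std::numeric_limits<int64_t>::min();
  // v is now within [-2^63, 2^63), so the cast is well-defined.
  return static_cast<int64_t>(v);
}
```

A fix-afterwards approach would instead convert first and patch the result, which is exactly what an optimizer is allowed to break once it assumes the conversion was in range.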
I'm still seeing failures of HwyConvertTestGroup/HwyConvertTest.TestAllF2IPromoteTo/SSE2 on gcc-15 against the master branch:
[ RUN ] HwyConvertTestGroup/HwyConvertTest.TestAllF2IPromoteTo/SSE2
i64x2 expect [0+ ->]:
0x7fffffffffffffff,0x7fffffffffffffff,
i64x2 actual [0+ ->]:
0x7fffffff00000000,0x7fffffff00000000,
Abort at convert_test.cc:41: SSE2, i64x2 lane 0 mismatch: expected '0x7fffffffffffffff', got '0x7fffffff00000000'.
Zeros look a bit suspicious.
It looks like the failing promotion bit is this code:
// Generic for all vector lengths on SSE2/SSSE3/SSE4/AVX2
template <class D, HWY_IF_I64_D(D)>
HWY_API VFromD<D> PromoteTo(D di64, VFromD<Rebind<float, D>> v) {
  const Rebind<int32_t, decltype(di64)> di32;
  const RebindToFloat<decltype(di32)> df32;
  const RebindToUnsigned<decltype(di32)> du32;
  const Repartition<uint8_t, decltype(du32)> du32_as_du8;

  const auto exponent_adj = BitCast(
      du32,
      Min(SaturatedSub(BitCast(du32_as_du8, ShiftRight<23>(BitCast(du32, v))),
                       BitCast(du32_as_du8, Set(du32, uint32_t{157}))),
          BitCast(du32_as_du8, Set(du32, uint32_t{32}))));
  const auto adj_v =
      BitCast(df32, BitCast(du32, v) - ShiftLeft<23>(exponent_adj));

  const auto f32_to_i32_result = ConvertTo(di32, adj_v);
  const auto lo64_or_mask = PromoteTo(
      di64,
      BitCast(du32, VecFromMask(di32, Eq(f32_to_i32_result,
                                         Set(di32, LimitsMax<int32_t>())))));

  return Or(PromoteTo(di64, BitCast(di32, f32_to_i32_result))
                << PromoteTo(di64, exponent_adj),
            lo64_or_mask);
}
I haven't wrapped my head around it yet: does this code already do what the AVX3 equivalent does with return IfThenElse(overflow, Set(di64, LimitsMax<int64_t>()), PromoteInRangeTo(di64, v)); or does it need similar handling as well?
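For anyone else trying to follow the snippet, below is a scalar sketch of how I read it. This is my interpretation, not Highway code: the names SaturatingF32ToI32 and ScalarF32ToI64 are made up for the example, and the helper stands in for what ConvertTo(di32, ...) is expected to produce (saturation to INT32_MAX / INT32_MIN). The idea appears to be: reduce the exponent by up to 32 so the value fits in int32, convert, shift back in 64 bits, and OR in the low 32 bits when the int32 conversion saturated.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical stand-in for a saturating f32 -> i32 conversion.
// The range check comes before the cast, so there is no UB here.
int32_t SaturatingF32ToI32(float f) {
  if (f != f) return 0;  // NaN: arbitrary choice for this sketch
  if (f >= 2147483648.0f) return std::numeric_limits<int32_t>::max();
  if (f < -2147483648.0f) return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(f);
}

// Scalar model of the exponent-adjust trick (one lane).
int64_t ScalarF32ToI64(float v) {
  uint32_t bits;
  std::memcpy(&bits, &v, sizeof(bits));

  // min(saturated_sub(biased_exponent, 157), 32); 157 is the biased
  // exponent of 2^30.
  const uint32_t biased_exp = (bits >> 23) & 0xFFu;
  const uint32_t exponent_adj =
      std::min<uint32_t>(biased_exp > 157u ? biased_exp - 157u : 0u, 32u);

  // Scale the input down by 2^exponent_adj via the exponent field.
  const uint32_t adj_bits = bits - (exponent_adj << 23);
  float adj_v;
  std::memcpy(&adj_v, &adj_bits, sizeof(adj_v));

  const int32_t r32 = SaturatingF32ToI32(adj_v);

  // If the i32 conversion saturated to INT32_MAX, fill the low 32 bits
  // so the final result becomes INT64_MAX.
  const uint64_t lo64_or_mask =
      (r32 == std::numeric_limits<int32_t>::max()) ? 0xFFFFFFFFull : 0ull;

  // Sign-extend to 64 bits, then undo the scaling with a left shift.
  const uint64_t shifted =
      static_cast<uint64_t>(static_cast<int64_t>(r32)) << exponent_adj;
  return static_cast<int64_t>(shifted | lo64_or_mask);
}
```

Under this reading, an input of 2^63 gives r32 == INT32_MAX and exponent_adj == 32, so the shifted value is 0x7fffffff00000000 and the mask supplies the low 0xffffffff. The failing output above has exactly those low bits missing, which would be consistent with the Eq(..., LimitsMax<int32_t>()) comparison no longer matching after the compiler's treatment of the conversion.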
Oh, I see that const auto f32_to_i32_result = ConvertTo(di32, adj_v); is expected to do masking. I'll spend some time understanding why gcc-15 compiles it incorrectly.
FYI @johnplatts has a workaround in #2189, entirely avoiding the UB via inline assembly :) Although it is regrettable that this is necessary, I think it is a safe solution.