
Comments (6)

jan-wassenberg commented on June 3, 2024

Thanks @trofi, I agree we should not require a particular outcome for something that would be UB in scalar code.
We'll update the docs to say that behavior is implementation-defined for values outside the destination range.

from highway.

jan-wassenberg commented on June 3, 2024

On second thought, we have a PromoteInRangeTo which has the implementation-defined behavior. Our PromoteTo is at least trying to do the right thing for out-of-range inputs. Taking a closer look at what's going on.


jan-wassenberg commented on June 3, 2024

I posted on the GCC bug; it seems the compiler is treating 2^63 as UB, though it is actually well-defined on x86. Thus our fix-afterwards strategy will not work. It would be nice if the compiler changed, but we have to live with the current behavior for the next few years, so I am changing our code to compare/min beforehand.
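A minimal scalar sketch of the "check beforehand" idea, for illustration only: the range is tested before the cast, so the cast itself never sees an out-of-range value and the compiler has no UB to reason about. SaturatingF32ToI64 is a hypothetical helper name, not a Highway function, and returning 0 for NaN is an arbitrary choice made for this sketch.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Highway): saturating float -> int64_t.
// All range checks happen before the cast, so the cast is always in range.
inline int64_t SaturatingF32ToI64(float f) {
  // 2^63 is exactly representable as a float.
  constexpr float kTwo63 = 9223372036854775808.0f;
  if (f != f) return 0;  // NaN: arbitrary choice for this sketch
  if (f >= kTwo63) return std::numeric_limits<int64_t>::max();
  if (f <= -kTwo63) return std::numeric_limits<int64_t>::min();
  return static_cast<int64_t>(f);  // in range: truncates toward zero
}
```

The "fix afterwards" alternative would perform the cast first and then patch the result, which is only valid if the compiler leaves the out-of-range cast alone.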


trofi commented on June 3, 2024

I'm still seeing failures of HwyConvertTestGroup/HwyConvertTest.TestAllF2IPromoteTo/SSE2 on gcc-15 against the master branch:

[ RUN      ] HwyConvertTestGroup/HwyConvertTest.TestAllF2IPromoteTo/SSE2


i64x2 expect [0+ ->]:
  0x7fffffffffffffff,0x7fffffffffffffff,
i64x2 actual [0+ ->]:
  0x7fffffff00000000,0x7fffffff00000000,
Abort at convert_test.cc:41: SSE2, i64x2 lane 0 mismatch: expected '0x7fffffffffffffff', got '0x7fffffff00000000'.

Zeros look a bit suspicious.

It looks like the failing promotion bit is this code:

// Generic for all vector lengths on SSE2/SSSE3/SSE4/AVX2
template <class D, HWY_IF_I64_D(D)>
HWY_API VFromD<D> PromoteTo(D di64, VFromD<Rebind<float, D>> v) {
  const Rebind<int32_t, decltype(di64)> di32;
  const RebindToFloat<decltype(di32)> df32;
  const RebindToUnsigned<decltype(di32)> du32;
  const Repartition<uint8_t, decltype(du32)> du32_as_du8;

  const auto exponent_adj = BitCast(
      du32,
      Min(SaturatedSub(BitCast(du32_as_du8, ShiftRight<23>(BitCast(du32, v))),
                       BitCast(du32_as_du8, Set(du32, uint32_t{157}))),
          BitCast(du32_as_du8, Set(du32, uint32_t{32}))));
  const auto adj_v =
      BitCast(df32, BitCast(du32, v) - ShiftLeft<23>(exponent_adj));

  const auto f32_to_i32_result = ConvertTo(di32, adj_v);
  const auto lo64_or_mask = PromoteTo(
      di64,
      BitCast(du32, VecFromMask(di32, Eq(f32_to_i32_result,
                                         Set(di32, LimitsMax<int32_t>())))));

  return Or(PromoteTo(di64, BitCast(di32, f32_to_i32_result))
                << PromoteTo(di64, exponent_adj),
            lo64_or_mask);
}

I have not wrapped my head around it yet: does this code already implement what the AVX3 equivalent does with return IfThenElse(overflow, Set(di64, LimitsMax<int64_t>()), PromoteInRangeTo(di64, v));? Or does it need similar handling as well?


trofi commented on June 3, 2024

Oh, I see that const auto f32_to_i32_result = ConvertTo(di32, adj_v); is expected to do masking. I'll spend some time understanding why gcc-15 compiles it incorrectly.
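To make the data flow of the quoted PromoteTo easier to follow, here is a scalar emulation of its steps as I read them. It is a sketch, not Highway code: SatF32ToI32 and EmulatedPromoteF32ToI64 are hypothetical names, and SatF32ToI32 stands in for ConvertTo under the assumption that ConvertTo saturates out-of-range inputs. The constant 157 is the float exponent bias (127) plus 30, so any input with a larger biased exponent is scaled down by up to 2^32 before the 32-bit conversion and shifted back afterwards.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical stand-in for ConvertTo(di32, v); assumes saturating behavior.
inline int32_t SatF32ToI32(float f) {
  if (f != f) return 0;  // NaN: arbitrary choice for this sketch
  if (f >= 2147483648.0f) return std::numeric_limits<int32_t>::max();
  if (f <= -2147483648.0f) return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(f);
}

// Hypothetical scalar emulation of the quoted f32 -> i64 PromoteTo.
inline int64_t EmulatedPromoteF32ToI64(float v) {
  uint32_t bits;
  std::memcpy(&bits, &v, sizeof(bits));
  const uint32_t biased_exp = (bits >> 23) & 0xFFu;
  // exponent_adj = clamp(biased_exp - 157, 0, 32)
  const uint32_t exponent_adj =
      std::min<uint32_t>(biased_exp > 157u ? biased_exp - 157u : 0u, 32u);
  // Subtracting from the exponent field divides the value by 2^exponent_adj.
  const uint32_t adj_bits = bits - (exponent_adj << 23);
  float adj_v;
  std::memcpy(&adj_v, &adj_bits, sizeof(adj_v));
  const int32_t i32 = SatF32ToI32(adj_v);
  // INT32_MAX is not representable as a float, so it only appears when the
  // 32-bit conversion saturated; in that case fill the low 32 bits with ones.
  const uint64_t lo64_or_mask =
      (i32 == std::numeric_limits<int32_t>::max()) ? 0xFFFFFFFFull : 0ull;
  const uint64_t shifted =
      static_cast<uint64_t>(static_cast<int64_t>(i32)) << exponent_adj;
  return static_cast<int64_t>(shifted | lo64_or_mask);
}
```

In this sketch, an input of 2^63 gives 0x7fffffff00000000 before the final Or and 0x7fffffffffffffff after it. The first value matches the one in the failing test output above, which is consistent with the mask term going missing; that is an observation about the sketch, not a confirmed diagnosis.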


jan-wassenberg commented on June 3, 2024

FYI @johnplatts has a workaround in #2189 that avoids the UB entirely via inline assembly :) Although it is regrettable that this is necessary, I think it is a safe solution.
