Git Product home page Git Product logo

Comments (10)

saem avatar saem commented on May 17, 2024 1

For those of us peripherally following along, it seems there are two lessons here:

  1. literal typing/conversion is broken, we should track width and detect narrowing
  2. VM math needs tighter specification, see below

The direction with VM math operations in the compiler/VM, especially floating point need to be agnostic of the compiler host platform.

Instead they should work as per some sort of canonical understanding of the target back-end or a single agreed upon understanding.

That way when I compile code on some funky CPU, all the compile time evaluation should be identical to some more popular CPU/mArchitecture for the same code and parameters.

This especially matters for floating point numbers, very wide types that are rarely natively supported (128bit integers), or targets that treat don't have real int support (JS).


That sounds about right?

from nimskull.

alaviss avatar alaviss commented on May 17, 2024 1

Right now the only defined rules in Nim are:

  • intX <-> intY is range-checked.
  • intX -> uintY is not range-checked and is a bit-reinterpretation iirc.
  • uintX -> intY is range-checked.
  • floatX -> intY is undefined if the floatX with the part after the . dropped cannot be represented by intY (inherited from C).
  • intX -> floatX produces the nearest integer representable by the target floating point type (inherited from C). The resulting float might be larger or smaller than intX itself.

I think the right thing to do here is to make floatX -> intY defined by enforcing range check for it across runtime and compile time.

from nimskull.

krux02 avatar krux02 commented on May 17, 2024
diff --git a/compiler/int128.nim b/compiler/int128.nim
index ee88be97c..efe890d13 100644
--- a/compiler/int128.nim
+++ b/compiler/int128.nim
@@ -3,7 +3,7 @@
 ## hold all from ``low(BiggestInt)`` to ``high(BiggestUInt)``, This
 ## type is for that purpose.
 
-from math import trunc
+from math import trunc, pow
 
 type
   Int128* = object
@@ -541,7 +541,14 @@ proc ldexp(x: float64, exp: cint): float64 {.importc: "ldexp", header: "<math.h>
 
 template bitor(a,b,c: Int128): Int128 = bitor(bitor(a,b), c)
 
+proc inInt128Range*(arg: float64): bool =
+  const lowerBound = -pow(2'f64, 127) 
+  const upperBound = +pow(2'f64, 127) # 2^127-1 can't be represented with float64 precision
+  lowerBound <= arg and arg < upperBound
+
 proc toInt128*(arg: float64): Int128 =
+  assert(inInt128Range(arg), "out of range")
+
   let isNegative = arg < 0
   let v0 = ldexp(abs(arg), -100)
   let w0 = uint64(trunc(v0))
diff --git a/compiler/semexprs.nim b/compiler/semexprs.nim
index e3defe03a..9e82902e8 100644
--- a/compiler/semexprs.nim
+++ b/compiler/semexprs.nim
@@ -156,10 +156,11 @@ proc checkConvertible(c: PContext, targetTyp: PType, src: PNode): TConvStatus =
       if src.kind in nkCharLit..nkUInt64Lit and
           src.getInt notin firstOrd(c.config, targetTyp)..lastOrd(c.config, targetTyp):
         result = convNotInRange
-      elif src.kind in nkFloatLit..nkFloat64Lit and
-          (classify(src.floatVal) in {fcNan, fcNegInf, fcInf} or
-            src.floatVal.int64 notin firstOrd(c.config, targetTyp)..lastOrd(c.config, targetTyp)):
-        result = convNotInRange
+      elif src.kind in nkFloatLit..nkFloat64Lit:
+        if not src.floatVal.inInt128Range:
+          result = convNotInRange
+        elif src.floatVal.toInt128 notin firstOrd(c.config, targetTyp)..lastOrd(c.config, targetTyp):
+          result = convNotInRange
     elif targetBaseTyp.kind in tyFloat..tyFloat64:
       if src.kind in nkFloatLit..nkFloat64Lit and
           not floatRangeCheck(src.floatVal, targetTyp):
diff --git a/todo.org b/todo.org
index 51576fa2a..28a152335 100644

from nimskull.

krux02 avatar krux02 commented on May 17, 2024

here is more inconsistency:

proc foo1(arg: float64): string =
  let x = int64(arg)
  $x

proc foo2(arg: float64): string {.compileTime.} =
  let x = int64(arg)
  $x

echo foo1(1e100)
echo foo2(1e100)
echo int64(1e100)

output:

-9223372036854775808
-9223372036854775808
0

from nimskull.

zacharycarter avatar zacharycarter commented on May 17, 2024

I'm currently working on replacing the compiler's int128 implementation with a signed and unsigned int128 implementation. I know we want to revisit the design of the compiler overall, but for now I think providing a better implementation will make these casts easier and understandable and also simpler to spec. Once I have finished the implementations, and the existing tests are passing I will add new ones for the float64 to int128.

from nimskull.

krux02 avatar krux02 commented on May 17, 2024

@zacharycarter I've written the int128 type and of course I am biased here, but I don't think the root cause of this bug here is the int128 type or its implementation. I've worked a lot on this type, I know how it works, and I've written a lot of integer handling with different sizes in the VM. There is certainly some shared knowledge between this type and the VM type. Reimplementing int128 to a signed int128 type (sign + absolute value) would remove all of it and therefore I don't like the idea.

So I would really like to unstand your problems with the int128 type as it is.

Yes there are bugs, but they are to my knowledge not because of the int128 type, they are because of a problematic specification of Nim and developers pulling the language in different directions.

My suggestion for this fix would be the following:

  • Make damn sure that compile time and runtime behaves exactly the same.

This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime, at compiletime (VM) and that can be implemented in semfold as well. Yes this means less error detection at semfold.

from nimskull.

alaviss avatar alaviss commented on May 17, 2024

This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime, at compiletime (VM) and that can be implemented in semfold as well.

Note that this is not guaranteed. Nim does not have float range check so the behavior varies wildly between architectures. We should look into defining runtime checks for float conversions as well.

from nimskull.

zacharycarter avatar zacharycarter commented on May 17, 2024

@zacharycarter I've written the int128 type and of course I am biased here, but I don't think the root cause of this bug here is the int128 type or its implementation. I've worked a lot on this type, I know how it works, and I've written a lot of integer handling with different sizes in the VM. There is certainly some shared knowledge between this type and the VM type. Reimplementing int128 to a signed int128 type (sign + absolute value) would remove all of it and therefore I don't like the idea.

So I would really like to unstand your problems with the int128 type as it is.

@krux02 - thanks for the reply my dude! I can't say I necessarily have a problem with the implementation, I just don't understand parts of it, and since I needed some frame of reference to test my findings against, I thought the best way would be to port some implementation I knew that was correct and that I could test against.

Yes there are bugs, but they are to my knowledge not because of the int128 type, they are because of a problematic specification of Nim and developers pulling the language in different directions.

@krux02 - good to know! I can stop wasting my time here then xD

My suggestion for this fix would be the following:

  • Make damn sure that compile time and runtime behaves exactly the same.

This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime, at compiletime (VM) and that can be implemented in semfold as well. Yes this means less error detection at semfold.

Okay, I see - thanks for the explanation. I don't know anything about the compiler backend really other than it's a compiler and I have uni textbook knowledge of them. I've barely done anything with Nim's, but I'll look into it.

This means int64(1e100) would become -9223372036854775808, because that is what it is at runtime, at compiletime (VM) and that can be implemented in semfold as well.

Note that this is not guaranteed. Nim does not have float range check so the behavior varies wildly between architectures. We should look into defining runtime checks for float conversions as well.

Okay first I will check out what Arne is suggesting regarding semfold and then I'll look into this.

from nimskull.

zacharycarter avatar zacharycarter commented on May 17, 2024

That sounds about right?

It does but I'm still not sure what to do here in order to make any progress. I think I'm in slightly over my head 😅

I might try to find something simpler to work on - lately I haven't had much time to work on anything and I don't want to commit to something I barely understand to begin with.

from nimskull.

krux02 avatar krux02 commented on May 17, 2024

Okay, I see - thanks for the explanation. I don't know anything about the compiler backend really other than it's a compiler and I have uni textbook knowledge of them. I've barely done anything with Nim's, but I'll look into it.

Well I can give you a brief overview of how Nim works here. There are three ways to compute an expression in Nim:

  1. compile to C/C++/JS and execute at runtime
  2. compile to byecode and execute at compile time
  3. semfold

I think option 1 and two are pretty much clear. Not in detail, but generally enough for this conversation. Semfold though is a different thing. It is just a compiler pass that does ast substitutions for simple expressions that can be computed at compile time. This type of code execution is usually implemented in compilers (such as C) as an optimization pass, so that (1+5) becomes a literal 6. I think c++ template metaprogramming is based on something like this. But in Nim, semfold is completely redundant to the VM (and the semfold equivalent in the backend compiler). Except from the part where semfold does error generation for out of range values.

Currently I don't know what to do or if my suggestion is still a good suggestion. After all, maybe it is good if the convesion of one googol (yes 1e100 has a name) to int64 would raise an error at runtime and VM as well.

from nimskull.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.