libxprec's Introduction

Small double-double library

Emulates quadruple precision with a pair of doubles. This roughly doubles the mantissa bits (and thus squares the precision of double). The range is almost the same as double, with a larger area of denormalized numbers.

The rough cost in floating-point operations (flops) and the relative error, in multiples of u² ≈ 1.23e-32 (where u = 2⁻⁵³ is the round-off error of double, i.e. half the machine epsilon), are as follows:

(op)          (op) double   error     (op) DDouble   error
add_small     3 flops       2u²       17 flops       3u²
+ -           10 flops      2u²       20 flops       3u²
*             6 flops       2u²       9 flops        4u²
/             10 flops      3u²       31 flops       6u²
reciprocal    14 flops      2.3u²     22 flops       2.3u²

The error bounds are tight analytical bounds [1], except in the case of double-double division, where the analytical bound is 10u² but the largest observed error is 6u². We report the largest observed error here [2].

Usage

Simple example:

#include <iostream>
#include <xprec/ddouble.h>

int main()
{
  xprec::DDouble x = 1.0;                // emulated quad precision
  x = (4 - x) / (x + 6);                 // arithmetic operators work
  std::cout << x << std::endl;           // output to full precision
  std::cout << x.hi() << std::endl;      // output truncated to double
  std::cout << exp(x) << std::endl;      // higher-precision exp
}

Installation

libxprec has no mandatory dependencies other than a C++11-compliant compiler.

mkdir build
cd build
cmake .. [EXTRA_CMAKE_FLAGS_GO_HERE]
make
./test/tests      # requires -DBUILD_TESTING=ON
make install      # set -DCMAKE_INSTALL_PREFIX to customize install dir

Useful CMake flags:

  • -DBUILD_TESTING=ON: builds unit tests. You need to have the GNU MPFR library installed for this to work.

  • -DCMAKE_CXX_FLAGS=-mfma: the double-double arithmetic in libxprec is much faster when using the fused multiply-add (FMA) instruction, which should be available on most modern CPUs. We recommend adding this flag unless you require portable binaries.

  • -DCMAKE_INSTALL_PREFIX=/path/to/usr: sets the base directory below which to install include files and the shared object.

Header-only mode

libxprec can also be used in header-only mode, which does not require installation. For this, simply drop the full libxprec directory into your project and use the following header:

#include "libxprec/include/xprec/ddouble-header-only.h"

Please note that this will likely lead to considerably longer compile times.

Import in other CMake projects

In order to use the library in CMake projects, we recommend using FetchContent:

include(FetchContent)
FetchContent_Declare(XPrec
    GIT_REPOSITORY https://github.com/tuwien-cms/libxprec
    GIT_TAG v0.5.0
    )
FetchContent_MakeAvailable(XPrec)

You should then be able to link against the XPrec::xprec target.

License and Copying

Copyright (C) 2023 Markus Wallerberger and others.

Released under the MIT license (see LICENSE for details).

Footnotes

  1. J.-M. Muller and L. Rideau, ACM Trans. Math. Softw. 48, 1, 9 (2022)

  2. M. Joldes, et al., ACM Trans. Math. Softw. 44, 1-27 (2018)

libxprec's People

Contributors

mwallerb, nonbasketless


libxprec's Issues

Inverse circular functions are inaccurate for large arguments

In general, the trigonometric functions and their inverses are quite accurate (though we are not squeezing the last ounce of accuracy yet).

The exception is asin and acos for values close to plus/minus one and atan for large arguments. I suppose the Taylor expansion is not really working there anymore, but one would have to investigate.

Header only?

Any chance libxprec could be made header-only, considering how delightfully small it is?

I'd be more than happy to take that on myself and share/push/PR or whatever.

Proper output of ddouble numbers

Right now operator<< simply prints the hi and lo part. Ideally, we want to change that so that we can simply print the number.

No automatic tests on Windows

Right now, we don't build or run the unit tests on Windows as part of the CI.

The problem is that I don't know how to install MPFR under Windows. Looking for someone with more Win foo.

PowerOfTwo::PowerOfTwo not const in MSVC

Offending line:

error C3615: constexpr function 'PowerOfTwo::PowerOfTwo' cannot result in a constant expression

For my purposes, commenting it out works. I bet the following would work instead, but I haven't tested it beyond building:

explicit constexpr PowerOfTwo(int n) : _x((double)((uint64_t)1 << n)) { }
