Git Product home page Git Product logo

primecount's Introduction

primecount

Build status Build status Github Releases C API Documentation C++ API Documentation

primecount is a command-line program and C/C++ library that counts the number of primes ≤ x (maximum 1031) using highly optimized implementations of the combinatorial prime counting algorithms.

primecount includes implementations of all important combinatorial prime counting algorithms known up to this date all of which have been parallelized using OpenMP. primecount contains the first ever open source implementations of the Deleglise-Rivat algorithm and Xavier Gourdon's algorithm (that works). primecount also features a novel load balancer that is shared amongst all implementations and that scales up to hundreds of CPU cores. primecount has already been used to compute several prime counting function world records.

Installation

The primecount command-line program is available in a few package managers. For doing development with libprimecount you may need to install libprimecount-dev or libprimecount-devel.

Windows: winget install primecount
macOS: brew install primecount
Arch Linux: sudo pacman -S primecount
Debian/Ubuntu: sudo apt install primecount
Fedora: sudo dnf install primecount
FreeBSD: pkg install primecount
openSUSE: sudo zypper install primecount

Build instructions

You need to have installed a C++ compiler and CMake. Ideally primecount should be compiled using GCC or Clang as these compilers support both OpenMP (multi-threading library) and 128-bit integers.

cmake .
cmake --build . --parallel
sudo cmake --install .
sudo ldconfig

Usage examples

# Count the primes ≤ 10^14
primecount 1e14

# Print progress and status information during computation
primecount 1e20 --status

# Count primes using Meissel's algorithm
primecount 2**32 --meissel

# Find the 10^14th prime using 4 threads
primecount 1e14 --nth-prime --threads=4 --time

Command-line options

Usage: primecount x [options]
Count the number of primes less than or equal to x (<= 10^31).

Options:

  -d, --deleglise-rivat    Count primes using the Deleglise-Rivat algorithm
  -g, --gourdon            Count primes using Xavier Gourdon's algorithm.
                           This is the default algorithm.
  -l, --legendre           Count primes using Legendre's formula
      --lehmer             Count primes using Lehmer's formula
      --lmo                Count primes using Lagarias-Miller-Odlyzko
  -m, --meissel            Count primes using Meissel's formula
      --Li                 Eulerian logarithmic integral function
      --Li-inverse         Approximate the nth prime using Li^-1(x)
  -n, --nth-prime          Calculate the nth prime
  -p, --primesieve         Count primes using the sieve of Eratosthenes
      --phi <X> <A>        phi(x, a) counts the numbers <= x that are not
                           divisible by any of the first a primes
  -R, --RiemannR           Approximate pi(x) using the Riemann R function
      --RiemannR-inverse   Approximate the nth prime using R^-1(x)
  -s, --status[=NUM]       Show computation progress 1%, 2%, 3%, ...
                           Set digits after decimal point: -s1 prints 99.9%
      --test               Run various correctness tests and exit
      --time               Print the time elapsed in seconds
  -t, --threads=NUM        Set the number of threads, 1 <= NUM <= CPU cores.
                           By default primecount uses all available CPU cores.
  -v, --version            Print version and license information
  -h, --help               Print this help menu
Advanced options
Advanced options for the Deleglise-Rivat algorithm:

  -a, --alpha=NUM          Set tuning factor: y = x^(1/3) * alpha
      --P2                 Compute the 2nd partial sieve function
      --S1                 Compute the ordinary leaves
      --S2-trivial         Compute the trivial special leaves
      --S2-easy            Compute the easy special leaves
      --S2-hard            Compute the hard special leaves

Advanced options for Xavier Gourdon's algorithm:

      --alpha-y=NUM        Set tuning factor: y = x^(1/3) * alpha_y
      --alpha-z=NUM        Set tuning factor: z = y * alpha_z
      --AC                 Compute the A + C formulas
      --B                  Compute the B formula
      --D                  Compute the D formula
      --Phi0               Compute the Phi0 formula
      --Sigma              Compute the 7 Sigma formulas

Benchmarks

x Prime Count Legendre Meissel Lagarias
Miller
Odlyzko
Deleglise
Rivat
Gourdon
1010 455,052,511 0.01s 0.01s 0.01s 0.01s 0.00s
1011 4,118,054,813 0.01s 0.01s 0.01s 0.01s 0.01s
1012 37,607,912,018 0.03s 0.02s 0.02s 0.01s 0.01s
1013 346,065,536,839 0.09s 0.06s 0.03s 0.02s 0.03s
1014 3,204,941,750,802 0.44s 0.20s 0.08s 0.08s 0.04s
1015 29,844,570,422,669 2.33s 0.89s 0.29s 0.16s 0.11s
1016 279,238,341,033,925 15.49s 5.10s 1.26s 0.58s 0.38s
1017 2,623,557,157,654,233 127.10s 39.39s 5.62s 2.26s 1.34s
1018 24,739,954,287,740,860 1,071.14s 366.93s 27.19s 9.96s 5.35s
1019 234,057,667,276,344,607 NaN NaN NaN 40.93s 20.16s
1020 2,220,819,602,560,918,840 NaN NaN NaN 167.64s 81.98s
1021 21,127,269,486,018,731,928 NaN NaN NaN 706.70s 353.01s
1022 201,467,286,689,315,906,290 NaN NaN NaN 3,012.10s 1,350.47s

The benchmarks above were run on an AMD 7R32 CPU (from 2020) with 16 cores/32 threads clocked at 3.30GHz. Note that Jan Büthe mentions in [11] that he computed $\pi(10^{25})$ in 40,000 CPU core hours using the analytic prime counting function algorithm. Büthe also mentions that by using additional zeros of the zeta function the runtime could have potentially been reduced to 4,000 CPU core hours. However using primecount and Xavier Gourdon's algorithm $\pi(10^{25})$ can be computed in only 460 CPU core hours on an AMD Ryzen 3950X CPU!

Algorithms

Legendre's Formula $\pi(x)=\pi(\sqrt{x})+\phi(x,\pi(\sqrt{x}))-1$
Meissel's Formula $\pi(x)=\pi(\sqrt[3]{x})+\phi(x,\pi(\sqrt[3]{x}))-\mathrm{P_2}(x,\pi(\sqrt[3]{x}))-1$
Lehmer's Formula $\pi(x)=\pi(\sqrt[4]{x})+\phi(x,\pi(\sqrt[4]{x}))-\mathrm{P_2}(x,\pi(\sqrt[4]{x}))-\mathrm{P_3}(x,\pi(\sqrt[4]{x}))-1$
LMO Formula $\pi(x)=\pi(\sqrt[3]{x})+\mathrm{S_1}(x,\pi(\sqrt[3]{x}))+\mathrm{S_2}(x,\pi(\sqrt[3]{x}))-\mathrm{P_2}(x,\pi(\sqrt[3]{x}))-1$

Up until the early 19th century the most efficient known method for counting primes was the sieve of Eratosthenes which has a running time of $O(x\log{\log{x}})$ operations. The first improvement to this bound was Legendre's formula (1830) which uses the inclusion-exclusion principle to calculate the number of primes below x without enumerating the individual primes. Legendre's formula has a running time of $O(x)$ operations and uses $O(\sqrt{x}/\log{x})$ space. In 1870 E. D. F. Meissel improved Legendre's formula by setting $a=\pi(\sqrt[3]{x})$ and by adding the correction term $\mathrm{P_2}(x,a)$, Meissel's formula has a running time of $O(x/\log^3{x})$ operations and uses $O(\sqrt[3]{x})$ space. In 1959 D. H. Lehmer extended Meissel's formula and slightly improved the running time to $O(x/\log^4{x})$ operations and $O(x^{\frac{3}{8}})$ space. In 1985 J. C. Lagarias, V. S. Miller and A. M. Odlyzko published a new algorithm based on Meissel's formula which has a lower runtime complexity of $O(x^{\frac{2}{3}}/\log{x})$ operations and which uses only $O(\sqrt[3]{x}\ \log^2{x})$ space.

primecount's Legendre, Meissel and Lehmer implementations are based on Hans Riesel's book [5], its Lagarias-Miller-Odlyzko and Deleglise-Rivat implementations are based on Tomás Oliveira's paper [9] and the implementation of Xavier Gourdon's algorithm is based on Xavier Gourdon's paper [7]. primecount's implementation of the so-called hard special leaves is different from the algorithms that have been described in any of the combinatorial prime counting papers so far. Instead of using a binary indexed tree for counting which is very cache inefficient primecount uses a linear counter array in combination with the POPCNT instruction which is more cache efficient and much faster. The Hard-Special-Leaves.md document contains more information. primecount's easy special leaf implementation and its partial sieve function implementation also contain significant improvements.

Fast nth prime calculation

The most efficient known method for calculating the nth prime is a combination of the prime counting function and a prime sieve. The idea is to closely approximate the nth prime e.g. using the inverse logarithmic integral $\mathrm{Li}^{-1}(n)$ or the inverse Riemann R function $\mathrm{R}^{-1}(n)$ and then count the primes up to this guess using the prime counting function. Once this is done one starts sieving (e.g. using the segmented sieve of Eratosthenes) from there on until one finds the actual nth prime. The author has implemented primecount::nth_prime(n) this way (option: --nth-prime), it finds the nth prime in $O(x^{\frac{2}{3}}/\log^2{x})$ operations using $O(\sqrt{x})$ space.

C API

Include the <primecount.h> header to use primecount's C API. All functions that are part of primecount's C API return -1 in case an error occurs and print the corresponding error message to the standard error stream.

#include <primecount.h>
#include <stdio.h>

int main()
{
    int64_t pix = primecount_pi(1000);
    printf("primes <= 1000: %ld\n", pix);

    return 0;
}

C++ API

Include the <primecount.hpp> header to use primecount's C++ API. All functions that are part of primecount's C++ API throw a primecount_error exception (which is derived from std::exception) in case an error occurs.

#include <primecount.hpp>
#include <iostream>

int main()
{
    int64_t pix = primecount::pi(1000);
    std::cout << "primes <= 1000: " << pix << std::endl;

    return 0;
}

Bindings for other languages

primesieve natively supports C and C++ and has bindings available for:

Common Lisp: cl-primecount
Julia: primecount_jll.jl
Lua: lua-primecount
Haskell: primecount-haskell
Python: primecountpy
Python: primecount-python
Rust: primecount-rs

Many thanks to the developers of these bindings!

primecount's People

Contributors

jgmbenoit avatar kimwalisch avatar mkoeppe avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

primecount's Issues

Formulas for segmenting Gourdon's A formula

Xavier Gourdon's algorithm requires looking up PrimePi(x) values up to x^(1/2). Since a PrimePi(x) lookup table of size x^(1/2) is too large we need to segment the PrimePi(x) lookup to a size of y (which is about x^(1/3) * log(x)).

For any segment bounded by [low, high[ here are the formulas to compute the minimum and maximum values of the 1st prime used in the algorithm:

min_index_1st_prime = pi[x_star] + 1
max_index_1st_prime = pi[min(sqrt(x / low), x^(1/3))]

For any segment bounded by [low, high[ here are the formulas to compute the minimum and maximum values of the 2nd prime used in the algorithm:

1st_prime = primes[min_index_1st_prime]
max_index_2nd_prime = pi[min(x / (low * primes[b]), sqrt(x / primes[b]))]

Enhance code sections in README.md

Instead of:

$ ./configure
$ make
$ sudo make install

Use:

./configure
make
sudo make install

The rationale of this change is to make copy - pasting more convenient.

primecount-backup prints incorrect time elapsed

primecount-backup uses Gourdon's algorithm only above 1e7. If one first computes e.g. pi(1e10) using Gourdon's algorithm and then e.g. pi(100) primecount-backup will print the time elapsed of the previous pi(1e10) computation.

need to set LD_LIBRARY_PATH for default install on Debian 11

Not 100% sure why, but I cannot run primecount installed with the default procedure into /usr/local/ on Debian 11.
Specifically, I see

$ ldd /usr/local/bin/primecount
        linux-vdso.so.1 (0x00007ffc34ff6000)
        libprimecount.so.7 => not found
        libprimesieve.so.9 => not found
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f4428ba7000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f4428a63000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f4428a49000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4428884000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f4428d9d000)

Does one need to set some cmake options (for buidling shared libraries/exacutables with rpath?) while building in such a case?

(Normally speaking one ought to be able to produce an executable knowing location of each dylib - I know that cmake goes its own strange way there, something to do with https://gitlab.kitware.com/cmake/community/-/wikis/doc/cmake/RPATH-handling - but I don't get details).

Reason for nth prime limit?

Hey! Thanks for making primecount, it's amazingly fast. Just curious about one thing--is there any particular reason as to why 216289611853439384 was chosen as a limit?

EDIT: Whoops, meant to comment as JoshuaRLi, not my work account, no biggie though :P

brew install on macOS installs primecount 7.0 and no library

Please update the brew recipe, I have got the following error on M1 Pro Max, Monterey 12.1b3

primecount -n 1000
dyld[11851]: Library not loaded: @rpath/libprimecount.7.dylib
Referenced from: /opt/homebrew/Cellar/primecount/7.0/bin/primecount
Reason: tried: '/usr/local/lib/libprimecount.7.dylib' (no such file), '/usr/lib/libprimecount.7.dylib' (no such file)

Multi-threading data races

There are a few non-critical data races in primecount where multiple threads write/read to some variable without using a mutex or lock. I am not aware of any CPUs where this could cause a segmentation fault but it is still best to fix these data races.

$ valgrind --tool=drd ./primecount --test

==23554== Conflicting store by thread 1 at 0x0036c411 size 1
==23554==    at 0x128080: primecount::set_print(bool) (in /home/kim/Desktop/primecount-master/build3/primecount)
==23554==    by 0x11AEDA: primecount::pi_legendre(long, int) (in /home/kim/Desktop/primecount-master/build3/primecount)

==23554== Conflicting store by thread 1 at 0x0036c418 size 4
==23554==    at 0x14DE88: primesieve::set_num_threads(int) (in /home/kim/Desktop/primecount-master/build3/primecount)
==23554==    by 0x11B950: primecount::pi_primesieve(long, int) (in /home/kim/Desktop/primecount-master/build3/primecount)

Multiple threads are allowed to read the same variable without locking. But multiple threads are not allowed to write/read the same variable without locking. Both primecount::set_print(bool) and primesieve::set_num_threads(int) are write accesses without locking which is not allowed and which causes a data race.

Data Races in LoadBalancerS2.cpp

Using the Coderrect Scanner https://coderrect.com/ I have discovered three data race pairs occurring in LoadBalancerS2.cpp.

  1. Line 96 vs itself
 94|{
 95|  LockGuard lockGuard(lock_);
>96|  sum_ += thread.sum;
 97|
 98|  if (is_print_)
  1. Line 107 vs Line 114
 105| update_load_balancing(thread);
 106|
>107| thread.low = low_;
 108| thread.segments = segments_;
 109| thread.segment_size = segment_size_;
 110| thread.sum = 0;
 111| thread.secs = 0;
 112| thread.init_secs = 0;
 113|
>114| low_ += segments_ * segment_size_;
 115| bool is_work = thread.low < sieve_limit_;
 116|
  1. Line 122 vs Line 124
 120|void LoadBalancerS2::update_load_balancing(const ThreadSettings& thread)
 121|{
>122|  if (thread.low > max_low_)
 123|  {
>124|   max_low_ = thread.low;
 125|   segments_ = thread.segments;
 126|

Idea for improvement to LMO/pi_lmo5.cpp

I was playing around with the timing of your LMO implementation;

int64_t segment_size = Sieve::get_segment_size(isqrt(limit));

I noticed that setting segment_size = 2 * segment_size seems to run ~2-5% faster

A couple of random tests suggest that for 1e11 - 1e16 4 * segment_size is the sweet spot.

Looking at the code I suspect that there's a trade off between overhead of starting sieve.cross_off_count and better localized memory access.

Maybe get_segment_size could be augmented with 'suggested_segment_size(limit)which handles finding a nice power of two or similar number aroundsqrt(limit)`.

Before I went further I thought I should seek your opinion in case this has been tested or there are other issues I'm not aware of.

AddressSanitizer: undefined-behavior

Using the LLVM/clang sanitizer I found a minor issue:

src/S2LoadBalancer.cpp:175:49: runtime error: division by zero
SUMMARY: AddressSanitizer: undefined-behavior src/S2LoadBalancer.cpp:175 

add pkg-config config?

To facilitate things for non-cmake setups, such as checking the version, setting include and library paths, it would be great to have primecount generating configuration, a .pc file, and installing it in the appropriate location (often it's subdirectory pkgconfig of the directory where the library is installed).

Enable GCC/Clang function multiversioning for ARM SVE once the compilers support it

We have ARM SVE support in Sieve_count.cpp however by default it is not enabled since the GCC/Clang compilers do not support function multiversioning for ARM yet.

Once GCC/Clang compilers support it, I can enable it using:

  1. Create multiarch_arm_sve.cmake CMake script similar to multiarch_avx512_vbmi2.cmake.
  2. Use multiarch_arm_sve.cmake in CMakeLists.txt.
  3. Add new ENABLE_MULTIARCH_ARM_SVE macro to all CMakeLists.txt files.
  4. Add new ENABLE_MULTIARCH_ARM_SVE macro to Sieve.hpp and Sieve_count.cpp.

Please create discussions

@kimwalisch Hi, I am Jakub Drozd again, sorry for bothering you again, but I would like you to set up discussions, so I wouldn't have to open issues. Could you set up the button, please?

make install fail

with

cmake . -DCMAKE_INSTALL_PREFIX=local -DBUILD_STATIC_LIBS=OFF

I obtain

[primecount-4.2] CMake Error at lib/primesieve/cmake_install.cmake:95 (file):
[primecount-4.2]   file INSTALL cannot find
[primecount-4.2]   "/opt/sage/local/var/tmp/sage/build/primecount-4.2/src/primesieve.pc".
[primecount-4.2] Call Stack (most recent call first):
[primecount-4.2]   cmake_install.cmake:90 (include)

Integer overflow in phi(x, a)

It is a nice improvement, thank you!

I also saw your main loop in src/phi.cpp stops recursing when we know all further answers come from the cache. That's a big help. I further split it to stop looping entirely when xp <= primes[i] (we can just add in the right number of -SIGN values).

Something I noticed earlier:

time ./primecount --phi 1e9 100858536
1

real	0m0.579s
user	0m0.532s
sys	0m0.046s

time ./primecount --phi 1e9 900858536
-850011001

real	0m5.578s
user	0m5.143s
sys	0m0.422s

I think this is both performance and correctness. Both these should take essentially zero time to return 1.

If you can get a reasonable prime count upper bound or nth prime lower bound, you can use it in the main phi(x,a,threads) function before generating all the primes, to quickly decide you can return 1. E.g.
if (a >= prime_count_upper(x)) return 1;

Whether this is a case worth optimizing is up to you. The correctness problem with negative returns seems to be primes[a] being an int32_t so overflowing to negative values. Therefore the if statement after that doesn't get used like it should. Perhaps some limit checks before generating the primes.

Originally posted by @danaj in danaj/Math-Prime-Util#48 (comment)

By 2027 change default setting to WITH_LIBDIVIDE=OFF in CMakeLists.txt

We currently use libdivide because integer division is very slow on most x64 CPUs. However integer division will likely become much faster in near future and then WITH_LIBDIVIDE=OFF is expected to improve primecount's performance. But carefully benchmark some large inputs e.g. 1e21 --AC -s.

GCC-15 compatibility

A few days ago, GCC merged some stricter template checks into the 15.x branch:

https://trofi.github.io/posts/322-gcc-15-template-checking-improvements.html

Gentoo's CI caught a build failure that looks related in https://bugs.gentoo.org/936498. I've copy/pasted some of the compiler output below:

FAILED: CMakeFiles/libprimecount.dir/src/deleglise-rivat/S2_easy_libdivide.cpp.o 
/usr/bin/x86_64-pc-linux-gnu-g++ -DENABLE_MULTIARCH_AVX512_BMI2 -Dlibprimecount_EXPORTS -I/var/tmp/portage/sci-mathematics/primecount-7.13/work/primecount-7.13/include  -O2 -pipe -march=native -fno-diagnostics-color -fPIC -fno-diagnostics-color -Wno-uninitialized -MD -MT CMakeFiles/libprimecount.dir
In file included from /var/tmp/portage/sci-mathematics/primecount-7.13/work/primecount-7.13/src/deleglise-rivat/S2_easy_libdivide.cpp:35:
/var/tmp/portage/sci-mathematics/primecount-7.13/work/primecount-7.13/include/libdivide.h: In member function 'bool libdivide::divider<T, ALGO>::operator==(const libdivide::divider<T, ALGO>&) const':
/var/tmp/portage/sci-mathematics/primecount-7.13/work/primecount-7.13/include/libdivide.h:2018:41: error: 'const class libdivide::divider<T, ALGO>' has no member named 'denom'
 2018 |         return div.denom.magic == other.denom.magic &&
      |                                         ^~~~~

There's still a long way to go before GCC-15 is released, but we're starting early this time so that all the complaints don't flood in on the same day.

Error: pi_gourdon_64(125)=31 instead of 30

$ ./primecount 125 -s -g

=== pi_gourdon_64(x) ===
pi(x) = A - B + C + D + phi0 + Sigma
x = 125
y = 7
z = 7
k = 4
alpha_y = 1.527
alpha_z = 1.000
threads = 8

=== Sigma(x, y) ===
Status: 100%                                      
Sigma = 8
Seconds: 0.000

=== Phi0(x, y, z) ===
Status: 100%                                      
phi0 = 28
Seconds: 0.000

=== B(x, y) ===
Status: 100%                                      
B = 5
Seconds: 0.000

=== AC(x, y) ===
Status: 100%                                      
A + C = 0
Seconds: 0.000

=== D(x, y) ===
Status: 100%                                      
D = 0
Seconds: 0.000

31
Seconds: 0.001
# B is correct
For prime 7 > && prime <= 11:
   sum += pi[125 / prime]

-> pi[125/11]
-> pi[11] = 5

AC(x, y) and D(x, y) are most likely correct as well since they only apply for primes > primes[k] (which is primes[4]=7) && primes <= y.

Add C API

Hi, @kimwalisch.
Can you add C API as in the case of primesieve?
It's easier for making language bindings with C API.
Thank you.

prime_pi of a negative number fails with Bus error

As reported by @remyoudompheng at sagemath/sage#35332, prime_pi crashes on negative input. (I think this is true of the 128 bit variant too.)

$ cat sage_bug_35332.c
#include <primecount.h>
#include <stdio.h>

int main()
{
    int64_t pix = primecount_pi(-1);
    printf("primes <= -1: %ld\n", pix);

    return 0;
}
$ gcc sage_bug_35332.c -lprimecount && ./a.out
Bus error (core dumped)
(exit 135)

describing how to build dynamically linked (lib)primecount without root

Playing with cmake a bit, I found how to set up libprimecount and primecount without root, something that's valuable in a number of settings, e.g. on HPC clusters, other shared machines - or building libprimecount extensions (e.g. Python ones).
Should this be added to docs?

The following is only tested on Linux so far.

  1. set the desired location, e.g. export WDIR=$HOME/tmp
  2. install libprimesieve as follows:
cd primesieve
cmake -DBUILD_STATIC_LIBS=OFF -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=$WDIR .
make -j  install     # no sudo!
  1. install primecount as follows
cd primecount
cmake -DBUILD_STATIC_LIBS=OFF -DBUILD_SHARED_LIBS=ON \
       -DCMAKE_INSTALL_PREFIX=$WDIR \
      -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=TRUE  -DCMAKE_INSTALL_RPATH=$WDIR \
      -DCMAKE_SKIP_BUILD_RPATH=FALSE  -DBUILD_LIBPRIMESIEVE=OFF  \
      -DCMAKE_FIND_ROOT_PATH=$WDIR/lib/cmake  \
      -DCMAKE_FIND_ROOT_PATH_MODE_LIBRARY=ONLY .
make -j install # no sudo!
  1. done, now you can do $WDIR/bin/primecount 1000000 and it will work as it should.

Riemann R approximation is a little off

The output of --Ri gives different numerical results than Wolfram Alpha and python's mpmath. For example,

For powers of 10 starting at 10^0 primecount --Ri gives:

0
4
25
168
1226
9587
78527
664667
5761551
50847455
455050683
4118052494
37607910542
346065531065
3204941731601
29844570495886
279238341360977
2623557157055978
24739954284239494
234057667300228940
2220819602556027016
21127269485932299722
201467286689188773984
1925320391607837270272
18435599767347541916672
176846309399141934432256
1699246750872419992338432
16352460426841662907482112
157589269275973235320553472
1520698109714271829697232896

Using mpmath riemannr I get:

from mpmath import riemannr, mp
mp.dps = 40 # set high deliberately

for i in range(0, 30):
print(riemannr(10**i))

1.0
4.564583141005090239865774695584049809675
25.66163326692418259322679794035569814997
168.3594462811673480649133109867329108466
1226.93121834343310855421625817211837034
9587.431738841973414351612923908229431098
78527.39942912770485887029214095925103488
664667.4475647477679853466998874388326848
5761551.86732016956230886495973466773615
50847455.42772142751394887577256049489582
455050683.3068469244631532415819991388608
4118052494.63140044176104610770875738881
37607910542.22591023474569601742946144013
346065531065.8260271978929257301899631105
3204941731601.689034750500754116280826964
29844570495886.92737822228672779202881061
279238341360977.1872302539272988649936199
2623557157055978.003875460015662127309679
24739954284239494.40252165144480328003146
234057667300228940.2346566885611062528803
2220819602556027015.401217592243516061431
21127269485932299723.73386404400837512661
201467286689188773625.1590118748480817499
1925320391607837268776.080252873251190086
18435599767347541878146.80335901928241809
176846309399141934626965.8309690195779633
1699246750872419991992147.2218584438395
16352460426841662910939464.57821529157634
157589269275973235652219770.5690766206166
1520698109714271830281953370.160723380532

Ignoring the few first values, these differ in the following ways,

a) The values are much the same until 2220819602556027015.4 which ends with a 6 in the primecount version.
b) From there the values start to drift apart. For example the final value 1520698109714271830281953370.16 is 584720474.16 far from the primecount version.

I checked wolfram alpha and it gives results very close to mpmath and hence quite far from primecount.

results.txt: incorrect seconds

The primecount-backup version prints the correct elapsed seconds to the screen after stop & resume:

$ ./primecount 1e19 --S2_hard -s

=== S2_hard(x, y) ===
Computation of the hard special leaves
x = 10000000000000000000
y = 66422883
z = 150550526390
c = 6
alpha = 30.831
threads = 8

Status: 93%^C
$ ./primecount 1e19 --S2_hard -s

=== S2_hard(x, y) ===
Computation of the hard special leaves
x = 10000000000000000000
y = 66422883
z = 150550526390
c = 6
alpha = 30.831
threads = 8

Resume from primecount.backup
Status: 100%                                      
S2_hard = 254486506300390351
Seconds: 86.761

But in the results.txt file the seconds are not the total seconds, it guess it is the seconds since the resume:

primecount 1e19 --S2_hard -s
Result: 254486506300390351
Threads: 8
Seconds: 26.758
Date: 2017-07-12 16:36:47

Legendre O(x)

Please forgive me since I'm not great at figuring out these formulas, but
your table says that pi_legendre solves 10^11 in .03 seconds, but the comments in pi_legendre say:

/// Count the number of primes <= x using Legendre's formula.
/// Run time: O(x)
/// Memory usage: O(x^(1/2))

I may not be interpreting O() correctly, but I would assume O(x) means x operations. If I write a program that just iterates to 10^11, it takes a long time, many seconds. Is this because I am misinterpreting how O() works or is it because my program is single threaded and slower than the computer used to produce the results in the table?

Thanks in advance!

Will

Formulas for segmenting Gourdon's C formula

uint64_t max_b = pi[x_star];

// For primes[b] > sqrt(z):

// derived from high
uint64_t x_div_y = x / y;
uint64_t min_1st_prime1 = min(x_div_y / high, primes[pi_x_star]);
uint64_t min_index_1st_prime1 = pi[min_1st_prime1] + 1;

// derived from low
uint64_t min_1st_prime2 = min(isqrt(low), primes[pi_x_star]);
uint64_t min_index_1st_prime2 = pi[min_1st_prime2] + 1;

// Just like in the A formula the result of the formula above may be
// off by 1. Below is a workaround to fix this issue. I am currently
// unsure if the result could ever be off by more than 1.
uint64_t prime2 = primes[min_index_1st_prime2];
uint64_t m2 = max(prime2, x / ipow<T>(prime2, 3));
if (m2 < primes.back() && x / (prime2 * primes[pi[m2] + 1]) < low)
    min_index_1st_prime2 += 1;

// Another option would be to add a check when computing the
// second prime inside the actual algorithm
if (x / low < primes[b] * primes[min_m + 1])
    continue;

// derived from m > (x / prime^3)
uint64_t min_1st_prime3 = min(iroot<3>(x_div_y), primes[pi[x_star]]);
uint64_t min_index_1st_prime3 = pi[min_1st_prime3] + 1;

// derived from m > (z / prime)
uint64_t min_index_1st_prime4 = pi[isqrt(z)] + 1
uint64_t min_index_1st_prime = max(min_index_1st_prime1, ..2, ..3, ..4)

TODO: test with c < PhiTiny::get_c(y)

While working on primesum I tested this briefly and the results were not correct! So it is possible that there is a bug.

// Current code
int64_t c = PhiTiny::get_c(y);

// Test with e.g.
int64_t c = 2;

Primecount fails at runtime with GCC-12 & MinGW/MSYS2 (Windows 10)

This seems to be an OpenMP issue of GCC-12 & MinGW/MSYS2. I have debugged the issue and found that the error occurs on exit of an omp parallel section in P2.cpp:

  std::cout << "P2_OpenMP()5" << std::endl;

  #pragma omp parallel num_threads(threads) reduction(+:sum)
  {
  std::cout << "P2_OpenMP()6" << std::endl;

    int64_t low, high;
    while (loadBalancer.get_work(low, high))
      sum += P2_thread(x, y, low, high);

  std::cout << "P2_OpenMP()7" << std::endl;
  }

  std::cout << "P2_OpenMP()8" << std::endl;
cd primecount/build
cmake .. -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON
./primecount.exe 20000 --P2

...
P2_OpenMP()5
P2_OpenMP()6
P2_OpenMP()7

Since P2_OpenMP()8 is not printed the exit clearly happens when the OpenMP parallel section finishes, so this seems to be a MinGW OpenMP issue. Note that primesieve's test suite completes successfully using the lastest GCC-12 & MinGW/MSYS2, hence this does not seem to be a libprimesieve issue. I have also tried running an older primecount release i.e. v6.4 from 2020 and it causes the same issue, hence the issue is not caused by a bug that I have recently introduced.

xavier gourdon's paper

I can't seem to find reference 7 anywhere on the internet. Does anyone have a link I can go to or a pdf I can read to get a better understanding of the algorithm?

Potential bug

In P2.cpp we have the following comment:

 // \sum_{i=a+1}^{b} pi(x / primes[i])

But then I use the following code at line 78:

// a = pi(y)
int64_t start = (int64_t) max(x / z + 1, y);

This code has been used since 2013 and I have never seen an erroneous result but I think the condition is an off by 1 error and it should be fixed by:

int64_t start = (int64_t) max(x / z + 1, y + 1);

It is possible the I can even simplify the code to:

int64_t start = (int64_t) (x / z + 1);

Build on windows using mingw fail

cmake .

-- The CXX compiler identification is GNU 8.1.0
-- Check for working CXX compiler: W:/xx_zd/x86_64-8.1.0-release-win32-sjlj-rt_v6-rev0/mingw64/bin/g++.exe
CMake Error: Generator: execution of make failed. Make command was: nmake -f Makefile cmTC_e23fb/fast &&
-- Check for working CXX compiler: W:/xx_zd/x86_64-8.1.0-release-win32-sjlj-rt_v6-rev0/mingw64/bin/g++.exe - broken
CMake Error at W:/python388/Lib/site-packages/cmake/data/share/cmake-3.20/Modules/CMakeTestCXXCompiler.cmake:59 (message):
  The C++ compiler

    "W:/xx_zd/x86_64-8.1.0-release-win32-sjlj-rt_v6-rev0/mingw64/bin/g++.exe"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: W:/cl/primecount/CMakeFiles/CMakeTmp

    Run Build Command(s):nmake -f Makefile cmTC_e23fb/fast 
    Generator: execution of make failed. Make command was: nmake -f Makefile cmTC_e23fb/fast &&




  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:2 (project)


-- Configuring incomplete, errors occurred!
See also "W:/cl/primecount/CMakeFiles/CMakeOutput.log".
See also "W:/cl/primecount/CMakeFiles/CMakeError.log".

I try this command:
cmake . -G "MingGW Makefiles"

CMake Error: Could not create named generator MingGW Makefiles

Generators
  Visual Studio 16 2019        = Generates Visual Studio 2019 project files.
                                 Use -A option to specify architecture.
  Visual Studio 15 2017 [arch] = Generates Visual Studio 2017 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 14 2015 [arch] = Generates Visual Studio 2015 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 12 2013 [arch] = Generates Visual Studio 2013 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 11 2012 [arch] = Generates Visual Studio 2012 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 10 2010 [arch] = Generates Visual Studio 2010 project files.
                                 Optional [arch] can be "Win64" or "IA64".
  Visual Studio 9 2008 [arch]  = Generates Visual Studio 2008 project files.
                                 Optional [arch] can be "Win64" or "IA64".
  Borland Makefiles            = Generates Borland makefiles.
* NMake Makefiles              = Generates NMake makefiles.
  NMake Makefiles JOM          = Generates JOM makefiles.
  MSYS Makefiles               = Generates MSYS makefiles.
  MinGW Makefiles              = Generates a make file for use with
                                 mingw32-make.
  Green Hills MULTI            = Generates Green Hills MULTI files
                                 (experimental, work-in-progress).
  Unix Makefiles               = Generates standard UNIX makefiles.
  Ninja                        = Generates build.ninja files.
  Ninja Multi-Config           = Generates build-<Config>.ninja files.
  Watcom WMake                 = Generates Watcom WMake makefiles.
  CodeBlocks - MinGW Makefiles = Generates CodeBlocks project files.
  CodeBlocks - NMake Makefiles = Generates CodeBlocks project files.
  CodeBlocks - NMake Makefiles JOM
                               = Generates CodeBlocks project files.
  CodeBlocks - Ninja           = Generates CodeBlocks project files.
  CodeBlocks - Unix Makefiles  = Generates CodeBlocks project files.
  CodeLite - MinGW Makefiles   = Generates CodeLite project files.
  CodeLite - NMake Makefiles   = Generates CodeLite project files.
  CodeLite - Ninja             = Generates CodeLite project files.
  CodeLite - Unix Makefiles    = Generates CodeLite project files.
  Eclipse CDT4 - NMake Makefiles
                               = Generates Eclipse CDT 4.0 project files.
  Eclipse CDT4 - MinGW Makefiles
                               = Generates Eclipse CDT 4.0 project files.
  Eclipse CDT4 - Ninja         = Generates Eclipse CDT 4.0 project files.
  Eclipse CDT4 - Unix Makefiles= Generates Eclipse CDT 4.0 project files.
  Kate - MinGW Makefiles       = Generates Kate project files.
  Kate - NMake Makefiles       = Generates Kate project files.
  Kate - Ninja                 = Generates Kate project files.
  Kate - Unix Makefiles        = Generates Kate project files.
  Sublime Text 2 - MinGW Makefiles
                               = Generates Sublime Text 2 project files.
  Sublime Text 2 - NMake Makefiles
                               = Generates Sublime Text 2 project files.
  Sublime Text 2 - Ninja       = Generates Sublime Text 2 project files.
  Sublime Text 2 - Unix Makefiles
                               = Generates Sublime Text 2 project files.

with this cmd message.
Some error with setting of c/c++ compiler?

Test results.txt on Windows

I have written a basename() function which needs to be tested on Windows using both MSVC and GCC:

string basename(string path)
{
  while (path.back() == '/' ||
         path.back() == '\\')
  {
    path.pop_back();
  }

  size_t pos = path.find_last_of("/\\");

  if (pos != string::npos)
    return path.substr(pos + 1);

  return path;
}

The basename() function should produce entries like the one below (primecount without any slashes or backslashes) in results.txt:

primecount 1e17 -s --S2_trivial

build requires cmake >= 3.7

Building under Ubuntu 16.04 with cmake version 3.5.1 yielded the error:

   if given arguments:

     "CMAKE_VERSION" "VERSION_GREATER_EQUAL" "3.9"

   Unknown arguments specified

According to this thread:

Dav1dde/glad#134

VERSION_GREATER_EQUAL is only supported in versions 3.7 or later.

Suggest altering the minimum required version of cmake to cmake_minimum_required(VERSION 3.7).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.