This project depends on BLAS, which underlies LAPACK. Currently, the only LAPACK routine used in cpgfunction is dgesv_ (though more routines will be used in the future as specific sections of the code are transitioned from multi-threading to vectorized linear algebra). pygfunction makes use of the same LAPACK routine, _gesv.
At first, the Python matrix solve in np.linalg.solve() was much faster than what I was making use of in C++. The root cause turned out to be the use of the reference Netlib BLAS library. Based on timing comparisons, it is possible that the BLAS implementation NumPy links against queries the chip-set at runtime and dispatches to kernels optimized for the host CPU. On my Intel i7, I was able to match the speed of np.linalg.solve() by making use of the OpenBLAS library.
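To see which BLAS/LAPACK implementation a given NumPy installation is linked against, and to get a rough timing for the dense solve that corresponds to the dgesv_ call, something like the following can be used. This is a minimal sketch; the matrix size and timing approach are illustrative only, not the benchmark I actually ran.

```python
import time

import numpy as np

# Report which BLAS/LAPACK implementation NumPy is linked against
# (output varies by installation: OpenBLAS, MKL, or reference BLAS).
np.show_config()

# Rough timing of a dense solve; np.linalg.solve calls LAPACK's _gesv.
n = 1000
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)
elapsed = time.perf_counter() - t0
print(f"solved a {n}x{n} system in {elapsed:.3f} s")

# Sanity check that the solution actually satisfies A x = b.
assert np.allclose(A @ x, b)
```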
I was reminded of this point when our UGRA Timothy West began computing g-functions on the cluster. He is building on the cluster using CMake, which currently links against the Netlib BLAS library. The submission scripts Timothy submitted were failing due to time limits. The reason jobs on the cluster can fail based on run-time is discussed in the following section.
This section describes the creation of submission scripts for use on the cluster. The path to a directory of configurations is supplied. A wide range of timings has been recorded previously, and a function has been created that takes the number of sources and returns the estimated run time. Then, based on those estimates, the ideal subset-sums are found for a given input number of days N. Typically, I will request each submission "bucket" to be one day less than the wall-time limit I am submitting to on the cluster, so that I know the jobs will not run out of time.
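The bucketing step above can be sketched as follows. This is a minimal illustration, not the actual submission script: `pack_jobs`, the runtime list, and the 48-hour limit are hypothetical, and a greedy first-fit-decreasing heuristic stands in for the exact subset-sum selection (the runtime-vs-number-of-sources model is assumed to have already produced the estimates).

```python
def pack_jobs(est_hours, limit_hours):
    """Pack estimated job runtimes into submission buckets using
    first-fit decreasing, so every bucket fits the wall-time limit."""
    buckets = []  # each bucket: {"total": hours used, "jobs": job indices}
    order = sorted(range(len(est_hours)), key=lambda i: est_hours[i], reverse=True)
    for i in order:
        for b in buckets:
            if b["total"] + est_hours[i] <= limit_hours:
                b["total"] += est_hours[i]
                b["jobs"].append(i)
                break
        else:  # no existing bucket has room; start a new one
            buckets.append({"total": est_hours[i], "jobs": [i]})
    return buckets

# Example: five configurations with estimated runtimes (hours),
# packed against a 48-hour budget (one day under a 3-day queue limit).
est = [30.0, 20.0, 20.0, 10.0, 5.0]
buckets = pack_jobs(est, 48.0)
for b in buckets:
    print(b["total"], b["jobs"])
```

Each bucket then becomes one submission, with its total estimated runtime guaranteed to stay under the requested wall time.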
A possible outcome of this issue is an automatic selection of the fastest BLAS library, though that may not end up being the case. The following libraries will be tested, and the fastest will be requested in CMakeLists.txt:
- Netlib BLAS
- OpenBLAS
- ATLAS
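Requesting a specific implementation in CMakeLists.txt could look roughly like this, using CMake's standard FindBLAS module. The target name `cpgfunction` is an assumption about this project's CMakeLists.txt; the `BLA_VENDOR` values shown (OpenBLAS, ATLAS, and Generic for the Netlib reference library) are the module's documented vendor names.

```cmake
# Ask CMake's FindBLAS module for a specific implementation.
# Swap the vendor once the fastest library has been identified:
#   OpenBLAS, ATLAS, or Generic (Netlib reference BLAS).
set(BLA_VENDOR OpenBLAS)
find_package(BLAS REQUIRED)
find_package(LAPACK REQUIRED)

# Link the chosen BLAS/LAPACK into the library target
# (target name "cpgfunction" is hypothetical here).
target_link_libraries(cpgfunction PRIVATE ${BLAS_LIBRARIES} ${LAPACK_LIBRARIES})
```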