scikits-sparse seem to be leaking memory at an alarming rate :-( After some valgrind'

Memory leak about scikit-sparse HOT 28 CLOSED

scikit-sparse commented on July 30, 2024

Memory leak

from scikit-sparse.

Comments (28)

njsmith commented on July 30, 2024

Hi Antony,
I haven't used this code in years, and it sounds like you do... any objection to my just giving you write access to fix things? :-)

anntzer commented on July 30, 2024

Heh, I guess I got what I asked for :-)

njsmith commented on July 30, 2024

Well, I would love to make it work better, I just absolutely don't have the time, so :-) Gave you access, have fun :-)

anntzer commented on July 30, 2024

Should be fixed by 88392d7.

Pinging all contributors, @jluttine @rainwoodman @jsalvatier @chiffa @pf4d @kforeman @pv @yurivict; please let me know if you have any patches you're interested in getting merged before I bug @njsmith to make a 0.3 release.

jluttine commented on July 30, 2024

I don't have any. Actually, I haven't used this package for a few years now, but maybe someday again. Great package anyway, and nice job @anntzer 👍

chiffa commented on July 30, 2024

@anntzer : do you have your own branch with this commit merged so that I can pull it and stress-test it with my application?

anntzer commented on July 30, 2024

@chiffa Try the pre-0.3 branch.

rainwoodman commented on July 30, 2024

Hi Antony,

I haven't used this package for a few years either. But glancing through
the code, it is not so clear to me why a pool is necessary.

Could you add a few lines of comments in the code to explain the memory
management model?

Thanks for the work!

On Tue, Feb 9, 2016 at 9:32 AM, Antony Lee wrote:

> @chiffa Try the pre-0.3 branch.

anntzer commented on July 30, 2024

Indeed, there was a simpler solution: allocate the matrix structs on the stack.

In the previous solution I was basically too lazy to figure out what the required lifetime of each object was, so I bound all of them to the Factor object (everything gets GC'd/free()'d when the Factor is GC'd). But it seems that a tighter coupling is possible (which is basically what @njsmith used to do).

PS: I reset master to @njsmith's latest version; I'll make a properly rebased PR once we can agree on everything.
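
The two lifetime models described above can be sketched roughly as follows. This is an illustrative pure-Python analogy, not the actual Cython patch: `Handle`, `Owner`, and `scoped` are hypothetical names, and the `freed` list merely stands in for calls to free().

```python
import gc
import weakref
from contextlib import contextmanager

freed = []  # records release order; stands in for calls to free()


class Handle:
    """Hypothetical stand-in for a natively allocated struct."""

    def __init__(self, name):
        self.name = name

    def release(self):
        freed.append(self.name)


class Owner:
    """Pool style: every handle lives until the owner is collected."""

    def __init__(self):
        self._pool = []
        # The finalizer holds the pool list, not `self`, so it does not
        # keep the owner alive.
        weakref.finalize(self, Owner._drain, self._pool)

    def allocate(self, name):
        handle = Handle(name)
        self._pool.append(handle)
        return handle

    @staticmethod
    def _drain(pool):
        for handle in pool:
            handle.release()
        pool.clear()


@contextmanager
def scoped(name):
    """Scoped style: the handle is released as soon as the block exits."""
    handle = Handle(name)
    try:
        yield handle
    finally:
        handle.release()


owner = Owner()
owner.allocate("pooled")
with scoped("temporary"):
    pass          # "temporary" is released when this block exits
del owner         # "pooled" is released only once the owner goes away
gc.collect()
```

The pool is simple to reason about but keeps every allocation alive for as long as the owner exists; the scoped form frees each temporary as soon as it is no longer needed, which is closer in spirit to allocating the structs on the stack.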

rainwoodman commented on July 30, 2024

Thanks for the clarification. This is indeed quicker than figuring it all out. Nevertheless, some explanation in the code would avoid the "ends cut off" scenario (cf. http://selfdefinedleadership.com/blog/?p=158 ) :)

anntzer commented on July 30, 2024

I squashed the memory pool out of the history, so now the relevant patch is fairly trivial (795e599).

anntzer commented on July 30, 2024

@chiffa Did you have any chance to check I didn't completely break your code?

chiffa commented on July 30, 2024

@anntzer : failing to pip-install it for now. Ubuntu 14.04, scipy and numpy through Anaconda Python.

ccd@BSNO7397-Ubuntu:~$ pip install -U git+https://github.com/scikit-sparse/scikit-sparse.git@pre-0.3
Collecting git+https://github.com/scikit-sparse/scikit-sparse.git@pre-0.3
  Cloning https://github.com/scikit-sparse/scikit-sparse.git (to pre-0.3) to /tmp/pip-QCiLno-build
Requirement already up-to-date: numpy in ./anaconda2/lib/python2.7/site-packages (from scikit-sparse==0.2+dev)
Requirement already up-to-date: scipy in ./anaconda2/lib/python2.7/site-packages (from scikit-sparse==0.2+dev)
Installing collected packages: scikit-sparse
  Running setup.py install for scikit-sparse ... error
    Complete output from command /home/ccd/anaconda2/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-QCiLno-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-t_nk2E-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-2.7
    creating build/lib.linux-x86_64-2.7/sksparse
    copying sksparse/test_cholmod.py -> build/lib.linux-x86_64-2.7/sksparse
    copying sksparse/__init__.py -> build/lib.linux-x86_64-2.7/sksparse
    creating build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/illc1850.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/illc1033.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/illc1850_rhs1.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/well1850.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/well1850_rhs1.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/well1033.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/illc1033_rhs1.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    copying sksparse/test_data/well1033_rhs1.mtx.gz -> build/lib.linux-x86_64-2.7/sksparse/test_data
    running build_ext
    building 'sksparse.cholmod' extension
    creating build/temp.linux-x86_64-2.7
    creating build/temp.linux-x86_64-2.7/sksparse
    gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/include/suitesparse -I/home/ccd/anaconda2/include/python2.7 -c sksparse/cholmod.c -o build/temp.linux-x86_64-2.7/sksparse/cholmod.o
    sksparse/cholmod.c:259:31: fatal error: numpy/arrayobject.h: No such file or directory
     #include "numpy/arrayobject.h"
                                   ^
    compilation terminated.
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/home/ccd/anaconda2/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-QCiLno-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-t_nk2E-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-QCiLno-build

This error does not appear on the 0.2 branch:

 pip install -U git+https://github.com/scikit-sparse/scikit-sparse.git
Collecting git+https://github.com/scikit-sparse/scikit-sparse.git
  Cloning https://github.com/scikit-sparse/scikit-sparse.git to /tmp/pip-HRJvNO-build
Requirement already up-to-date: numpy in ./anaconda2/lib/python2.7/site-packages (from scikits.sparse==0.2+dev)
Requirement already up-to-date: scipy in ./anaconda2/lib/python2.7/site-packages (from scikits.sparse==0.2+dev)
Installing collected packages: scikits.sparse
  Found existing installation: scikits.sparse 0.2
    Uninstalling scikits.sparse-0.2:
      Successfully uninstalled scikits.sparse-0.2
  Running setup.py install for scikits.sparse ... done
Successfully installed scikits.sparse-0.2+dev

anntzer commented on July 30, 2024

I messed up a bit with how the include path was passed to gcc... try again? (Note: I rewrote the history so you'll need to reset --hard.)

chiffa commented on July 30, 2024

I am using the pre-0.3 head right away, so it should be ok. I will try it out tomorrow once I get my hands on a memory-bound instance.

chiffa commented on July 30, 2024

Can you benchmark the CPU usage on large matrices (>10 000x10 000)? Right now the CPU usage pattern and the time it takes to compute are significantly worse than with the previous version. I am using a different machine, and it is possible that the difference is due to the difference in CPU/bus. I will check that in more detail when the computation on the other machine finishes.

anntzer commented on July 30, 2024

Do you have a test case? Thanks.

chiffa commented on July 30, 2024

Yes, but it will take a while to set up.

anntzer commented on July 30, 2024

No hurry.

chiffa commented on July 30, 2024

https://github.com/chiffa/BioFlow => follow the instructions to set up the databases, then build the matrix for humans. By that time I will provide you with gists to execute to reproduce the analysis pipeline I am currently running.

anntzer commented on July 30, 2024

This is... big. Any chance you can run the workflow while hooking calls to cholmod to save the matrices (say, as pickles) as they are analyzed and put them on Dropbox?
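
One way such a hook could look, as a minimal sketch: wrap the factorization function so that its first argument is pickled to disk before every call. `dump_first_arg` and `fake_cholesky` are hypothetical names introduced here for illustration; the stand-in function only exists so that the sketch runs without the real library installed.

```python
import functools
import os
import pickle
import tempfile


def dump_first_arg(func, dump_dir):
    """Wrap `func` so its first positional argument is pickled before each call."""
    counter = 0

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal counter
        path = os.path.join(dump_dir, f"{func.__name__}_{counter:04d}.pkl")
        with open(path, "wb") as fh:
            pickle.dump(args[0], fh, protocol=pickle.HIGHEST_PROTOCOL)
        counter += 1
        return func(*args, **kwargs)

    return wrapper


# Stand-in for the real factorization routine, so the sketch runs anywhere.
def fake_cholesky(matrix, beta=0.0):
    return len(matrix)


dump_dir = tempfile.mkdtemp()
hooked = dump_first_arg(fake_cholesky, dump_dir)
result = hooked([[2.0, 0.0], [0.0, 3.0]], 1e-10)
dumped = sorted(os.listdir(dump_dir))
```

In a real run one would presumably rebind the library function to the wrapped version before starting the pipeline. That only takes effect if the pipeline looks the function up through the module at call time rather than holding its own reference to it.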

chiffa commented on July 30, 2024

Sure. I'll extract a couple of them for you tomorrow.

chiffa commented on July 30, 2024

You can find the matrix dumped here: https://www.dropbox.com/s/ydwcbbghz7u4b12/debug_for_cholesky.dmp?dl=0. I re-installed the 0.2 version and am currently running the same pipeline to see how the machine will perform. I will report tomorrow. I apply Cholesky to that matrix with a fudge of 1e-10.

Best,

anntzer commented on July 30, 2024

0.2 and pre-0.3 seem to perform the same for me for the cholesky step:

In [6]: %time sksparse.cholmod.cholesky(m, 1e-10)
CPU times: user 2.58 s, sys: 1.14 s, total: 3.72 s
Wall time: 1.11 s
Out[6]: <sksparse.cholmod.Factor at 0x7fdfbf1be690>

In [7]: %time scikits.sparse.cholmod.cholesky(m, 1e-10)
CPU times: user 2.52 s, sys: 1.28 s, total: 3.79 s
Wall time: 1.12 s
Out[7]: <scikits.sparse.cholmod.Factor at 0x7fdfbf1be4d0>

However, I guess you should also give me the RHS so that I actually solve the system.
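
Outside IPython, a comparison like the one above can be approximated with the standard library. The following is a minimal sketch; `time_call` is a hypothetical helper, and `sum` is only a placeholder workload. CPU time is summed over all threads of the process, so a CPU total larger than the wall time, as in the timings above, is consistent with the work being spread over several threads.

```python
import time


def time_call(func, *args, **kwargs):
    """Return (cpu_seconds, wall_seconds, result) for one call of `func`."""
    cpu_start = time.process_time()   # user + system CPU time of the whole process
    wall_start = time.perf_counter()  # elapsed real time
    result = func(*args, **kwargs)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return cpu, wall, result


cpu, wall, total = time_call(sum, range(1_000_000))
print(f"CPU {cpu:.3f} s, wall {wall:.3f} s")
```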

chiffa commented on July 30, 2024

I finished verifying the performance issue on my side: it seems to be due to the machine and not the Cholesky decomposition. I did however see a pretty significant RAM usage improvement on the pre-0.3 branch and something that looked like a leak closing (RAM usage after a night of execution was 2x lower on the pre-0.3 branch compared to 0.2).

I will go ahead and install the pre-0.3 on all my machines, thank you very much for the patch!
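
A quicker check than an overnight run could be sketched as follows, assuming a Unix system (the `resource` module is not available on Windows). `peak_rss_growth` and `workload` are hypothetical names; the workload here is a placeholder that allocates and drops a buffer, where a real check would call the factorization in a loop.

```python
import resource


def peak_rss_growth(func, repeats):
    """Run `func` `repeats` times; return the growth of the peak resident set size.

    Units follow ru_maxrss: kilobytes on Linux, bytes on macOS.
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(repeats):
        func()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before


# Placeholder workload: allocates a buffer and drops it again on each call.
def workload():
    buf = bytearray(1_000_000)
    return len(buf)


growth = peak_rss_growth(workload, 200)
print("peak RSS growth:", growth)
```

For a workload that frees what it allocates, the growth should level off as `repeats` increases; growth that keeps scaling with the number of calls would point to a leak.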

chiffa commented on July 30, 2024

After some examination, the main reason for the performance drop was that I was comparing processors whose architecture launch dates were about 6 years apart, and in the meantime linear algebra and threading implementations seem to have made quite a lot of progress.

anntzer commented on July 30, 2024

scikit-sparse 0.3 has been released on PyPI. And there was a great joy etc. (but please check that I haven't messed up something as that's my first release on PyPI ever)
I'll probably leave the old scikits.sparse as it is for now (modulo a small note regarding its deprecation).

chiffa commented on July 30, 2024

Good job, thanks for your contribution!
