Comments (14)
This is a known problem which has been affecting some Travis builds. Just to confirm:
- If you're retaining a reference to each new DenseInstances created inside the loop, then that might cause the problem.
- If you're waiting for garbage collection to deallocate the DenseInstances, this might also cause the problem, as it currently needs three cycles (one to deallocate DenseInstances, another to call the finaliser on EdfMap, and I think another to actually unmap the memory).
- We might have to introduce a Deallocate method on DenseInstances, or a finalizer, to actually ensure the memory gets unmapped.
- Additionally, because EdfMap manages lots of pages outside Go's garbage collector (the EdfMap structure is actually pretty small), this may mean that garbage collection doesn't run often enough to release all of the memory each time.
Additionally, try cherry-picking commit 8e20799 from #69. Also try reducing EDF_SIZE. Hope those things help.
from golearn.
Hi @Sentimentron, thanks for the response. I believe your first bullet point does not apply, but your second does. Here's the code:
package ensemble

import (
	"fmt"
	base "github.com/sjwhitworth/golearn/base"
	eval "github.com/sjwhitworth/golearn/evaluation"
	filters "github.com/sjwhitworth/golearn/filters"
	"testing"
)

func TestRandomForest1(testEnv *testing.T) {
	for i := 0; i < 10; i++ {
		inst, err := base.ParseCSVToInstances("../examples/datasets/iris_headers.csv", true)
		if err != nil {
			panic(err)
		}
		filt := filters.NewChiMergeFilter(inst, 0.90)
		for _, a := range base.NonClassFloatAttributes(inst) {
			filt.AddAttribute(a)
		}
		filt.Train()
		instf := base.NewLazilyFilteredInstances(inst, filt)
		trainData, testData := base.InstancesTrainTestSplit(instf, 0.60)
		rf := NewRandomForest(10, 3)
		rf.Fit(trainData)
		predictions := rf.Predict(testData)
		fmt.Println(predictions)
		confusionMat := eval.GetConfusionMatrix(testData, predictions)
		fmt.Println(confusionMat)
		fmt.Println(eval.GetSummary(confusionMat))
	}
}
I don't know the implementation details well enough to comment on the third and fourth bullet, but I do think it's the library's responsibility to ensure memory is freed quickly and reliably.
Here's a target use-case: I want to be able to run the RandomForest classifier on the iris dataset so that it achieves a reasonable level of accuracy, and I want to see that it can achieve that accuracy in a reasonable amount of time. To test that it performs reasonably accurately, reasonably fast, I want to write a benchmark test that asserts the overall accuracy is greater than some threshold, and the average execution time is less than some threshold, when running the code over several iterations.
On the same dataset, the knn-classifier consistently achieves about 95% accuracy. For random forest, in order to get above even 70% accuracy, I had to bump the forest size to something like 50 (it's currently at 10 in the test). However, at that forest size, I can only iterate about 3 times before I get the memory panic. I'd say a minimal target would be that the Random Forest prediction can be run on the iris dataset 10 times, with an accuracy consistently over 70%, and an average runtime of at most 0.5s, without any memory panics.
I'll experiment with cherry-picking that commit and reducing EDF_SIZE tomorrow to see if the above target can be reached without changing or adding anything to the edf implementation.
Sounds like a challenge. I'll experiment with adding a finalizer to
DenseInstances and tweaking the allocation tonight.
OK: so I've spent lots of time looking at stack traces and I know what the problem is. By default, each call to NewDenseInstances (e.g. every time GeneratePredictionVector is called in a tree) maps in 128 MB (EDF_SIZE), but there are only a few hundred bytes of tracking structures allocated to keep track of that memory. Because there's no memory pressure on Go's working set, it doesn't run garbage collection often enough to call the finalizers which unmap that memory, so the virtual memory allocated just balloons until everything falls over. The solution, I suspect, is to change the implementation of EdfAnonMap so that it allocates byte slices using make from Go's working set.
OK, so #75 changes the backing of EdfAnonMap to use make; my quick survey of top indicates that the working set and VMem remain stable. See if this fixes the problem, and if not, let me know.
@Sentimentron I bet that took a while to find! Out of interest, is the main reason for using mmap to analyze data greater than the available memory?
@Amit-PivotalLabs if you want to solve this in the short term, given the above comments, setting the overcommit=1 on your AWS instance should do the trick. By default it is disabled.
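For reference, the overcommit setting mentioned above is standard Linux kernel tuning, not anything golearn-specific; on most distributions it can be enabled like this:

```shell
# Allow the kernel to overcommit virtual memory, so large, sparsely-used
# mmap regions can be created even when they exceed physical RAM + swap.
sudo sysctl -w vm.overcommit_memory=1

# Persist the setting across reboots:
echo 'vm.overcommit_memory = 1' | sudo tee -a /etc/sysctl.conf
```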
@mish15 Basically yes, but I still need to do some more work to support that.
I overcommit on my Linode, but with a gig of RAM it only gets to about 400 GB or so of overcommit before it falls over. That's not quite enough, so I think this is the way forward for the time being.
Reducing EDF_SIZE to 8 MB "worked" in that I was able to run the code in this comment with the size of the random forest increased from 10 to 50. However, it's still just a band-aid, because increasing the forest size further would eventually cause the memory panics again.
@Sentimentron I'll give your PR a shot and see if it's a more stable solution.
@Sentimentron makes perfect sense. Might be worth noting the overcommit setting in the install docs, or falling back to in-memory allocation if overcommit is not set.
400 GB virtual from 1 GB is pretty amazing really, I'm surprised you got that far! :) Out of interest, why the need for more than 400 GB?
@Sentimentron Thanks! It appears #75 is a stable solution, I can increase the number of iterations and the size of the forest quite a bit and I don't see any memory panics.
@Sentimentron, @mish15 To be clear about my purpose, I'm not personally trying to use this library at the moment to do any machine learning. I'm actually trying to write a benchmarking framework/test suite for the golearn library itself, so I'm not looking for workarounds.
Great. @sjwhitworth ready to merge #75 if you have no objections.
Looks good to me.
@Amit-PivotalLabs no problem and nice work!
#75 has been merged, closing issue.