Comments (7)
We have fixed the issues with multi:softprob and n_gpus. If you request more GPUs than are available, you should now see an error message saying that the number of available GPUs is smaller than n_gpus. Please update thundergbm to the latest version. If the problems persist, feel free to let us know.
Regarding the data set size, we cannot reproduce the problem. Could you please provide more information about your data set or, even better, share the data set here directly?
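For readers who want to run a similar check on their own side before constructing the classifier, here is a minimal sketch. It is not ThunderGBM's internal check: it assumes nvidia-smi is on the PATH and counts the "GPU n: ..." lines printed by nvidia-smi -L. The helper names (count_gpus_from_listing, available_gpus, check_n_gpus) are made up for illustration.

```python
import subprocess


def count_gpus_from_listing(listing):
    """Count devices in the text printed by `nvidia-smi -L`.

    Each physical GPU is listed on its own line starting with "GPU <index>:".
    """
    return sum(1 for line in listing.splitlines() if line.strip().startswith("GPU "))


def available_gpus():
    """Return the number of GPUs nvidia-smi reports, or 0 if it cannot be run."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return 0
    return count_gpus_from_listing(result.stdout)


def check_n_gpus(requested, available):
    """Hypothetical helper: fail early if more GPUs are requested than exist."""
    if requested > available:
        raise ValueError(
            "n_gpus=%d but only %d GPU(s) are available" % (requested, available)
        )
    return requested
```

Calling check_n_gpus(num_gpus, available_gpus()) before creating the classifier would surface the mismatch as an ordinary Python exception.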
from thundergbm.
Thanks for the feedback. We will work on it and get back to you, once the problem is fixed. Please stay tuned.
Thank you very much for the quick reply and for fixing the bug.
The probability values can now be obtained by setting the parameter:
objective='multi:softprob'
However, when I set n_gpus to more than 1, ThunderGBM still shuts down.
Moreover, even with n_gpus=1, it shuts down when the amount of training data is larger, with the error:
[error == cudaSuccess] out of memory.
Finally, when I del the variable "model", my GPU memory is not released. How can I release it in my code?
Thanks again.
Here is my code and data:
from __future__ import division
import gc
import random
import warnings
import numpy as np
import pandas as pd  # needed for pd.read_csv below
import thundergbm
from scipy import sparse
from scipy.sparse import csr_matrix, hstack, vstack
from sklearn.model_selection import train_test_split
warnings.filterwarnings('ignore')
# Load the labels and the sparse training matrix, keeping the first 5000 features
label = pd.read_csv("label.csv", header=None)
csr_trainData = sparse.load_npz('csr_trainData13100.npz')
csr_trainData = csr_trainData[:, :5000]
csr_trainData.shape
# Hold out 20% of the rows for validation
trainData, valData, trainLabel, valLabel = train_test_split(csr_trainData, label.iloc[:, 1], test_size=0.2, random_state=0)
clf = thundergbm.TGBMClassifier(bagging=1, lambda_tgbm=1, learning_rate=0.07, min_child_weight=1.2, n_gpus=1, verbose=0,
                                n_parallel_trees=40, gamma=0.2, depth=7, n_trees=4000, tree_method='hist', objective='multi:softprob')
clf.fit(trainData, trainLabel)
clf.score(valData, valLabel)
predict = clf.predict(valData)
predict
# Delete the classifier and force garbage collection (GPU memory is still not released)
del clf
gc.collect()
Label and data:
https://pan.baidu.com/s/1rssIuuL3icYHsNnlWfHWew
Extraction code: 0gux
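On the question of releasing GPU memory: one general workaround, not specific to ThunderGBM and not confirmed by the maintainers in this thread, is to run training in a separate Python process, because the driver reclaims a process's GPU allocations when that process exits. The sketch below assumes the training code can be moved into a standalone script. run_python_isolated and TRAIN_SOURCE are hypothetical names, and the placeholder source only prints a message.

```python
import subprocess
import sys


def run_python_isolated(source, timeout=None):
    """Run `source` in a fresh Python interpreter and return its stdout.

    Any GPU memory the child process allocates is freed when the child exits,
    regardless of whether the library releases it on `del`.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return result.stdout.strip()


# Placeholder for the real training script: the child would build the
# classifier, fit it, and print (or save to disk) whatever the parent needs.
TRAIN_SOURCE = "print('training finished')"

if __name__ == "__main__":
    print(run_python_isolated(TRAIN_SOURCE))
```

The trade-off is that results have to cross the process boundary, for example through stdout or a file on disk, since the fitted object lives only in the child.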
The code runs fine on our machine. What OS, GPU, and CUDA version are you using?
Ubuntu 18.04
NVIDIA:
NVIDIA-SMI 390.67 Driver Version: 390.67
CUDA:
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
Can you fill up the GPU memory (the Memory-Usage column in nvidia-smi) on your machine?
It performs well at small scale, but it breaks with large-scale training data.
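To see how close a run gets to the memory limit, the Memory-Usage figures can be read programmatically. This is a minimal sketch that assumes nvidia-smi is on the PATH and that it reports numeric values in MiB for memory.used and memory.total. parse_memory_csv and gpu_memory_usage are illustrative names, not part of any library.

```python
import subprocess

# Ask nvidia-smi for one CSV line per GPU: "<used MiB>, <total MiB>"
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse lines like "1024, 12196" into (used_mib, total_mib) tuples."""
    usage = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        used, total = (int(field) for field in line.split(","))
        usage.append((used, total))
    return usage


def gpu_memory_usage():
    """Return [(used_mib, total_mib), ...], or [] if nvidia-smi cannot be run."""
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_memory_csv(result.stdout)
```

Polling gpu_memory_usage() from a second terminal or thread while training runs would show whether usage climbs toward the total before the out of memory error appears.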
Thanks for your work, @zeyiwen.
Like @VoyagerIII, I have run into a similar error. When I run my code with 1 GPU, there are no problems at all. But if I set n_gpus to 2 or 3, I get an "illegal memory access was encountered" error. My computer does have 3 GPUs.
It seems to occur at predict time: fitting completes successfully. I checked this by stopping the code after fitting and before predicting.
This is the code:
import numpy as np
import sys
from thundergbm import TGBMClassifier
from sklearn import datasets as dts
from sklearn.model_selection import train_test_split
# Overall parameters
train_ratio = 0.75
random_state = 123457
limit = None
num_classes = 10
num_estimators = 10
num_parallel_trees = 100
objective = 'multi:softmax'
max_depth = 6
# Number of GPUs
num_gpus = 3
# Load the digits dataset
digits = dts.load_digits()
X = digits.data
y = digits.target
# Create 0.75/0.25 train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=(1 - train_ratio),
                                                    train_size=train_ratio,
                                                    random_state=random_state,
                                                    shuffle=True,
                                                    stratify=None)
# Classifier
clf = TGBMClassifier(objective=objective,
                     n_trees=num_estimators,
                     n_parallel_trees=num_parallel_trees,
                     n_gpus=num_gpus,
                     depth=max_depth,
                     num_class=num_classes,
                     tree_method='auto')
# Fitting
clf.fit(X_train, y_train)
# sys.exit(0)  # uncomment to stop after fitting and before predicting
# Predicting
y_pred = clf.predict(X_test)
# Score: fraction of test samples predicted correctly
print("Score: %10.5f" % (np.count_nonzero(np.equal(y_pred, y_test)) / y_test.shape[0]))
Ubuntu 18.04.4 LTS
NVIDIA-SMI 396.54 3 TITAN Xp GPUs
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
Thanks.
Hi @fjgmoya, the "illegal memory access was encountered" error during prediction on multiple GPUs has been fixed. Please reinstall ThunderGBM and give it a try. Thank you!