Comments (8)
catboost version: 1.26.4
Please specify a correct version. CatBoost does not have version '1.26.4' yet.
from catboost.
Apologies, it's 1.2.3
from catboost.
At the moment I just catch CatBoostError and continue with a different study.
If you can provide a set of hyperparameters on which this error occurs, that would be helpful. A fully reproducible code example would be even more helpful.
from catboost.
The hyperparameters for the last 5 occurrences are as follows:
'learning_rate': 0.024307262528096778, 'depth': 3, 'l2_leaf_reg': 6.906966305187461, 'boosting_type': 'Plain', 'iterations': 1538, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 211, 'steps': 1
'learning_rate': 0.096895104759761, 'depth': 4, 'l2_leaf_reg': 9.694786305854311, 'boosting_type': 'Plain', 'iterations': 2113, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 250, 'steps': 3
'learning_rate': 0.05149527087891092, 'depth': 4, 'l2_leaf_reg': 8.187545785164538, 'boosting_type': 'Plain', 'iterations': 2495, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 256, 'steps': 10
'learning_rate': 0.096895104759761, 'depth': 4, 'l2_leaf_reg': 9.694786305854311, 'boosting_type': 'Plain', 'iterations': 2113, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 250, 'steps': 3
'learning_rate': 0.05149527087891092, 'depth': 4, 'l2_leaf_reg': 8.187545785164538, 'boosting_type': 'Plain', 'iterations': 2495, 'task_type': 'GPU', 'devices': '0' + 'num_features_to_select': 256, 'steps': 10
I'll see if I can produce a concise case that doesn't leak too much data. The odd thing is that it usually occurs randomly within 10-20 iterations.
from catboost.
'num_features_to_select'
Are you calling select_features?
from catboost.
I am. The call looks like this:
try:
    summary = model.select_features(
        train_pool,
        features_for_select=X_train.columns.values,
        num_features_to_select=catboost_select_features_params[
            NUM_FEATURES_TO_SELECT_KEY
        ],
        steps=catboost_select_features_params[STEPS_KEY],
        algorithm=EFeaturesSelectionAlgorithm.RecursiveByShapValues,
        shap_calc_type=EShapCalcType.Regular,
        train_final_model=True,
        logging_level="Verbose",
    )
except CatBoostError as e:
    print(f"CatBoostError: {e}")
    return None, [], {}
from catboost.
I have a reproducible test case:
With the following code:
from catboost import CatBoostRegressor  # type: ignore
from catboost import EFeaturesSelectionAlgorithm, EShapCalcType, Pool
import pandas as pd

params = {
    'catboost': {
        'learning_rate': 0.024517856609649665,
        'depth': 4,
        'l2_leaf_reg': 3.279624422858039,
        'boosting_type': 'Ordered',
        'iterations': 720,
    },
    'catboost_select_features': {
        'num_features_to_select': 0.11037514116430513,
        'steps': 7,
    },
    'task_type': 'GPU',
    'devices': '0',
}

X = pd.read_parquet("X.parquet")
y = pd.read_parquet("y.parquet").squeeze()
W = pd.read_parquet("W.parquet").squeeze()

model = CatBoostRegressor(**params["catboost"])
train_pool = Pool(data=X, label=y, weight=W)
catboost_select_features_params = params["catboost_select_features"]

summary = model.select_features(
    train_pool,
    features_for_select=X.columns.values,
    num_features_to_select=max(
        1,
        int(
            catboost_select_features_params["num_features_to_select"]
            * len(X.columns.values)
        ),
    ),
    steps=catboost_select_features_params["steps"],
    algorithm=EFeaturesSelectionAlgorithm.RecursiveByShapValues,
    shap_calc_type=EShapCalcType.Regular,
    train_final_model=True,
)
Happens on Linux and Mac.
from catboost.
I experimented with this: if I modify the weights as follows so that 0.0 never appears, I no longer see the issue:
W = W.clip(lower=0.00000001)
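For anyone who wants to check whether their own weights contain the values this workaround targets, here is a minimal dependency-free sketch. The helper names count_nonpositive_weights and clip_weights are hypothetical (they are not part of CatBoost or pandas), and the sample weights are made up. The thread does not confirm that zero weights are the root cause, so treat this only as a diagnostic aid.

```python
from typing import Iterable, List


def count_nonpositive_weights(weights: Iterable[float]) -> int:
    """Return how many weights are zero or negative (hypothetical helper)."""
    return sum(1 for w in weights if w <= 0.0)


def clip_weights(weights: Iterable[float], lower: float = 1e-8) -> List[float]:
    """Raise every weight below `lower` up to `lower`.

    Mirrors the pandas call from the comment above, W.clip(lower=...),
    for a plain Python sequence.
    """
    return [max(w, lower) for w in weights]


# Made-up weights for illustration only; two of them are exactly 0.0.
weights = [0.5, 0.0, 1.2, 0.0, 3.0]
print(count_nonpositive_weights(weights))                 # count before clipping
print(count_nonpositive_weights(clip_weights(weights)))   # count after clipping
```

With a pandas Series, the same count could be obtained with (W <= 0).sum() before building the Pool.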
from catboost.