abhishekkrthakur / autoxgb
XGBoost + Optuna
License: Apache License 2.0
Hey Abhishek, great work setting up this really useful library; it certainly makes implementing XGBoost much simpler. I ran a mini project that evaluates AutoXGB against standard XGBoost. Happy to hear your thoughts. The writeup can be found here: https://towardsdatascience.com/autoxgb-for-financial-fraud-detection-f88f30d4734a?sk=13bbbe9761698db8d4c0ffef661db916
The AutoXGB study does not launch when one of the targets is missing.
Is there any workaround?
executed code:
>>> axgb = AutoXGB(
... train_filename="X_train.csv",
... output="output",
... test_filename="X_valid.csv",
... task="classification",
... idx=None,
... targets=["label"],
... features=['feat1', 'feat2', 'feat3', 'feat4', 'feat5'],
... categorical_features=None,
... use_gpu=False,
... num_folds=5,
... seed=42,
... num_trials=100,
... time_limit=360,
... fast=False,
... )
error log:
2023-04-21 04:37:30.385 | INFO | autoxgb.autoxgb:_process_data:149 - Reading training data
2023-04-21 04:37:30.727 | INFO | autoxgb.utils:reduce_memory_usage:48 - Mem. usage decreased to 5.07 Mb (80.9% reduction)
2023-04-21 04:37:30.732 | INFO | autoxgb.autoxgb:_determine_problem_type:140 - Problem type: multi_class_classification
2023-04-21 04:37:30.851 | INFO | autoxgb.utils:reduce_memory_usage:48 - Mem. usage decreased to 1.87 Mb (82.4% reduction)
2023-04-21 04:37:30.851 | INFO | autoxgb.autoxgb:_create_folds:58 - Creating folds
2023-04-21 04:37:30.868 | INFO | autoxgb.autoxgb:_process_data:170 - Encoding target(s)
2023-04-21 04:37:30.875 | INFO | autoxgb.autoxgb:_process_data:195 - Found 0 categorical features.
2023-04-21 04:37:31.084 | INFO | autoxgb.autoxgb:_process_data:236 - Model config: train_filename='X_train.csv' test_filename='X_valid.csv' idx='id' targets=['label'] problem_type=<ProblemType.multi_class_classification: 2> output='output' features=['feat1', 'feat2', 'feat3', 'feat4', 'feat5'] num_folds=5 use_gpu=False seed=42 categorical_features=[] num_trials=100 time_limit=360 fast=False
2023-04-21 04:37:31.084 | INFO | autoxgb.autoxgb:_process_data:237 - Saving model config
2023-04-21 04:37:31.085 | INFO | autoxgb.autoxgb:_process_data:241 - Saving encoders
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/workspace/models/venv-autoxgb/lib/python3.8/site-packages/autoxgb/autoxgb.py", line 247, in train
best_params = train_model(self.model_config)
File "/workspace/models/venv-autoxgb/lib/python3.8/site-packages/autoxgb/utils.py", line 207, in train_model
study = optuna.create_study(
File "/workspace/models/venv-autoxgb/lib/python3.8/site-packages/optuna/study/study.py", line 1136, in create_study
storage = storages.get_storage(storage)
File "/workspace/models/venv-autoxgb/lib/python3.8/site-packages/optuna/storages/__init__.py", line 31, in get_storage
return _CachedStorage(RDBStorage(storage))
File "/workspace/models/venv-autoxgb/lib/python3.8/site-packages/optuna/storages/_rdb/storage.py", line 187, in __init__
self._version_manager.check_table_schema_compatibility()
File "/workspace/models/venv-autoxgb/lib/python3.8/site-packages/optuna/storages/_rdb/storage.py", line 1310, in check_table_schema_compatibility
current_version = self.get_current_version()
File "/workspace/models/venv-autoxgb/lib/python3.8/site-packages/optuna/storages/_rdb/storage.py", line 1337, in get_current_version
assert version is not None
AssertionError
pip freeze:
alembic==1.10.3
anyio==3.6.2
asgiref==3.6.0
attrs==23.1.0
autopage==0.5.1
autoxgb==0.2.2
click==8.1.3
cliff==4.2.0
cmaes==0.9.1
cmd2==2.4.3
colorlog==6.7.0
fastapi==0.70.0
greenlet==2.0.2
h11==0.14.0
idna==3.4
importlib-metadata==6.5.0
importlib-resources==5.12.0
joblib==1.1.0
loguru==0.5.3
Mako==1.2.4
MarkupSafe==2.1.2
numpy==1.21.3
optuna==2.10.0
packaging==23.1
pandas==1.3.4
pbr==5.11.1
prettytable==3.7.0
pyarrow==6.0.0
pydantic==1.8.2
pyperclip==1.8.2
python-dateutil==2.8.2
pytz==2023.3
PyYAML==6.0
scikit-learn==1.0.1
scipy==1.10.1
six==1.16.0
sniffio==1.3.0
SQLAlchemy==2.0.9
starlette==0.16.0
stevedore==5.0.0
threadpoolctl==3.1.0
tqdm==4.65.0
typing_extensions==4.5.0
uvicorn==0.15.0
wcwidth==0.2.6
xgboost==1.5.0
zipp==3.15.0
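The traceback above ends inside Optuna's RDB storage setup (`check_table_schema_compatibility`), before any trial runs, so the missing target may not be the trigger. The freeze pairs optuna==2.10.0 with SQLAlchemy==2.0.9; one plausible cause, stated here as an assumption rather than a confirmed diagnosis, is that this Optuna release expects a SQLAlchemy 1.x API. The sketch below only reports the installed versions and flags that pairing. The helper names are hypothetical.

```python
from importlib import metadata


def major(version: str) -> int:
    """Return the leading integer of a version string such as '2.0.9'."""
    return int(version.split(".")[0])


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def suspicious_pairing(optuna_version, sqlalchemy_version) -> bool:
    """Flag Optuna 2.x or older next to SQLAlchemy 2.x (assumed incompatible)."""
    if optuna_version is None or sqlalchemy_version is None:
        return False
    return major(optuna_version) <= 2 and major(sqlalchemy_version) >= 2


if __name__ == "__main__":
    opt = installed_version("optuna")
    sqla = installed_version("SQLAlchemy")
    print(f"optuna={opt} SQLAlchemy={sqla}")
    if suspicious_pairing(opt, sqla):
        print('Consider trying: pip install "SQLAlchemy<2.0"')
```

If the pairing is flagged, pinning SQLAlchemy below 2.0 in a fresh virtual environment is one thing to try before looking at the data itself.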
Hi,
As per the subject, I am getting this error when running locally:
2021-11-01 15:45:04.651 | INFO | autoxgb.autoxgb:__post_init__:42 - Output directory: output3
2021-11-01 15:45:04.652 | WARNING | autoxgb.autoxgb:__post_init__:49 - No id column specified. Will default to `id`.
2021-11-01 15:45:04.653 | INFO | autoxgb.autoxgb:_process_data:149 - Reading training data
2021-11-01 15:45:04.885 | INFO | autoxgb.utils:reduce_memory_usage:48 - Mem. usage decreased to 2.19 Mb (76.0% reduction)
2021-11-01 15:45:04.891 | INFO | autoxgb.autoxgb:_determine_problem_type:140 - Problem type: multi_class_classification
2021-11-01 15:45:04.892 | INFO | autoxgb.autoxgb:_create_folds:58 - Creating folds
2021-11-01 15:45:04.922 | INFO | autoxgb.autoxgb:_process_data:170 - Encoding target(s)
2021-11-01 15:45:04.931 | INFO | autoxgb.autoxgb:_process_data:195 - Found 0 categorical features.
2021-11-01 15:45:05.054 | INFO | autoxgb.autoxgb:_process_data:236 - Model config: train_filename='train.csv' test_filename=None idx='id' targets=['label'] problem_type=<ProblemType.multi_class_classification: 2> output='output3' features=['x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'y1', 'z1', 'z2', 'z3', 'z4'] num_folds=5 use_gpu=False seed=42 categorical_features=[] num_trials=100 time_limit=360 fast=False
2021-11-01 15:45:05.054 | INFO | autoxgb.autoxgb:_process_data:237 - Saving model config
2021-11-01 15:45:05.055 | INFO | autoxgb.autoxgb:_process_data:241 - Saving encoders
[I 2021-11-01 15:45:05,230] A new study created in RDB with name: autoxgb
[W 2021-11-01 15:45:05,339] Trial 0 failed because of the following error: AttributeError('dlsym(0x7fd108ca6760, XGDMatrixCreateFromDense): symbol not found')
Traceback (most recent call last):
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/autoxgb/utils.py", line 172, in optimize
model.fit(
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/core.py", line 506, in inner_f
return f(**kwargs)
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/sklearn.py", line 1231, in fit
train_dmatrix, evals = _wrap_evaluation_matrices(
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/sklearn.py", line 286, in _wrap_evaluation_matrices
train_dmatrix = create_dmatrix(
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/sklearn.py", line 1245, in <lambda>
create_dmatrix=lambda **kwargs: DMatrix(nthread=self.n_jobs, **kwargs),
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/core.py", line 506, in inner_f
return f(**kwargs)
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/core.py", line 616, in __init__
handle, feature_names, feature_types = dispatch_data_backend(
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/data.py", line 707, in dispatch_data_backend
return _from_pandas_df(data, enable_categorical, missing, threads,
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/data.py", line 299, in _from_pandas_df
return _from_numpy_array(data, missing, nthread, feature_names,
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/data.py", line 179, in _from_numpy_array
_LIB.XGDMatrixCreateFromDense(
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
func = self.__getitem__(name)
File "/Users/A124661/opt/anaconda3/envs/deep_py38/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: dlsym(0x7fd108ca6760, XGDMatrixCreateFromDense): symbol not found
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/var/folders/pp/ym01m3sx0hg3my_gzpsdl8680000gp/T/ipykernel_728/1462055845.py in <module>
16 fast=fast,
17 )
---> 18 axgb.train()
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/autoxgb/autoxgb.py in train(self)
245 def train(self):
246 self._process_data()
--> 247 best_params = train_model(self.model_config)
248 logger.info("Training complete")
249 self.predict(best_params)
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/autoxgb/utils.py in train_model(model_config)
211 load_if_exists=True,
212 )
--> 213 study.optimize(optimize_func, n_trials=model_config.num_trials, timeout=model_config.time_limit)
214 return study.best_params
215
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/optuna/study/study.py in optimize(self, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
398 )
399
--> 400 _optimize(
401 study=self,
402 func=func,
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/optuna/study/_optimize.py in _optimize(study, func, n_trials, timeout, n_jobs, catch, callbacks, gc_after_trial, show_progress_bar)
64 try:
65 if n_jobs == 1:
---> 66 _optimize_sequential(
67 study,
68 func,
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/optuna/study/_optimize.py in _optimize_sequential(study, func, n_trials, timeout, catch, callbacks, gc_after_trial, reseed_sampler_rng, time_start, progress_bar)
161
162 try:
--> 163 trial = _run_trial(study, func, catch)
164 except Exception:
165 raise
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/optuna/study/_optimize.py in _run_trial(study, func, catch)
262
263 if state == TrialState.FAIL and func_err is not None and not isinstance(func_err, catch):
--> 264 raise func_err
265 return trial
266
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/optuna/study/_optimize.py in _run_trial(study, func, catch)
211
212 try:
--> 213 value_or_values = func(trial)
214 except exceptions.TrialPruned as e:
215 # TODO(mamu): Handle multi-objective cases.
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/autoxgb/utils.py in optimize(trial, xgb_model, use_predict_proba, eval_metric, model_config)
170
171 else:
--> 172 model.fit(
173 xtrain,
174 ytrain,
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/core.py in inner_f(*args, **kwargs)
504 for k, arg in zip(sig.parameters, args):
505 kwargs[k] = arg
--> 506 return f(**kwargs)
507
508 return inner_f
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/sklearn.py in fit(self, X, y, sample_weight, base_margin, eval_set, eval_metric, early_stopping_rounds, verbose, xgb_model, sample_weight_eval_set, base_margin_eval_set, feature_weights, callbacks)
1229
1230 model, feval, params = self._configure_fit(xgb_model, eval_metric, params)
-> 1231 train_dmatrix, evals = _wrap_evaluation_matrices(
1232 missing=self.missing,
1233 X=X,
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/sklearn.py in _wrap_evaluation_matrices(missing, X, y, group, qid, sample_weight, base_margin, feature_weights, eval_set, sample_weight_eval_set, base_margin_eval_set, eval_group, eval_qid, create_dmatrix, enable_categorical, label_transform)
284
285 """
--> 286 train_dmatrix = create_dmatrix(
287 data=X,
288 label=label_transform(y),
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/sklearn.py in <lambda>(**kwargs)
1243 eval_group=None,
1244 eval_qid=None,
-> 1245 create_dmatrix=lambda **kwargs: DMatrix(nthread=self.n_jobs, **kwargs),
1246 enable_categorical=self.enable_categorical,
1247 label_transform=label_transform,
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/core.py in inner_f(*args, **kwargs)
504 for k, arg in zip(sig.parameters, args):
505 kwargs[k] = arg
--> 506 return f(**kwargs)
507
508 return inner_f
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/core.py in __init__(self, data, label, weight, base_margin, missing, silent, feature_names, feature_types, nthread, group, qid, label_lower_bound, label_upper_bound, feature_weights, enable_categorical)
614 return
615
--> 616 handle, feature_names, feature_types = dispatch_data_backend(
617 data,
618 missing=self.missing,
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/data.py in dispatch_data_backend(data, missing, threads, feature_names, feature_types, enable_categorical)
705 return _from_tuple(data, missing, threads, feature_names, feature_types)
706 if _is_pandas_df(data):
--> 707 return _from_pandas_df(data, enable_categorical, missing, threads,
708 feature_names, feature_types)
709 if _is_pandas_series(data):
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/data.py in _from_pandas_df(data, enable_categorical, missing, nthread, feature_names, feature_types)
297 data, feature_names, feature_types = _transform_pandas_df(
298 data, enable_categorical, feature_names, feature_types)
--> 299 return _from_numpy_array(data, missing, nthread, feature_names,
300 feature_types)
301
~/opt/anaconda3/envs/deep_py38/lib/python3.8/site-packages/xgboost/data.py in _from_numpy_array(data, missing, nthread, feature_names, feature_types)
177 config = bytes(json.dumps(args), "utf-8")
178 _check_call(
--> 179 _LIB.XGDMatrixCreateFromDense(
180 _array_interface(data),
181 config,
~/opt/anaconda3/envs/deep_py38/lib/python3.8/ctypes/__init__.py in __getattr__(self, name)
384 if name.startswith('__') and name.endswith('__'):
385 raise AttributeError(name)
--> 386 func = self.__getitem__(name)
387 setattr(self, name, func)
388 return func
~/opt/anaconda3/envs/deep_py38/lib/python3.8/ctypes/__init__.py in __getitem__(self, name_or_ordinal)
389
390 def __getitem__(self, name_or_ordinal):
--> 391 func = self._FuncPtr((name_or_ordinal, self))
392 if not isinstance(name_or_ordinal, int):
393 func.__name__ = name_or_ordinal
AttributeError: dlsym(0x7fd108ca6760, XGDMatrixCreateFromDense): symbol not found
Hi, I want to see the parameters of the trained model after training is complete. Can anyone help me with this?
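One way to inspect the tuned parameters, sketched under assumptions: the logs elsewhere on this page show that training creates an Optuna study named `autoxgb` in an RDB storage and that `train_model` returns `study.best_params`. Assuming the storage is a SQLite file inside the output directory (the `params.db` filename below is a guess, not confirmed here), the study could be reloaded with `optuna.load_study`. Both function names are hypothetical.

```python
def storage_url(output_dir: str, db_name: str = "params.db") -> str:
    """Build a SQLite URL for the study database (db_name is an assumption)."""
    return f"sqlite:///{output_dir.rstrip('/')}/{db_name}"


def load_best_params(output_dir: str, study_name: str = "autoxgb") -> dict:
    """Reload the Optuna study and return its best hyperparameters."""
    import optuna  # imported lazily so storage_url works without optuna

    study = optuna.load_study(
        study_name=study_name,
        storage=storage_url(output_dir),
    )
    return study.best_params


# Example call, assuming training wrote its files to "output":
# print(load_best_params("output"))
```

If the database file has a different name, listing the output directory should show which file to point `db_name` at.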
I am trying to run:
autoxgb predict --model_path output/ --test_filename test_file.csv --out_filename tmp.csv
where test_file.csv is:
id,L0_n,L0_r,L0_w,L0_s,L0_freq,L0_L,L0_Q
700,2.25,67,15,2.1,2.25,1.406883,17.5144
701,5.75,69,22,2.1,2.25,14.00953,14.61921
I get the following error:
File "/home/neo/wellth-wrk/env/lib/python3.8/site-packages/autoxgb/predict.py", line 85, in _predict_df
final_preds = pd.DataFrame(final_preds, columns=self.model_config.target_cols)
AttributeError: 'ModelConfig' object has no attribute 'target_cols'
@abhishekkrthakur
Thanks for your great repo. I have written the following brief post to introduce it:
AutoXGB: XGBoost + Optuna
Best
autoxgb serve --model_path outputs/mll --host 0.0.0.0 --debug
I don't see an option for setting the port.
Hi,
I tried autoxgb, but it very quickly returns this error:
id
TypeError Traceback (most recent call last)
in ()
37 fast=fast,
38 )
---> 39 axgb.train()
10 frames
/usr/local/lib/python3.7/dist-packages/autoxgb/autoxgb.py in train(self)
244
245 def train(self):
--> 246 self._process_data()
247 best_params = train_model(self.model_config)
248 logger.info("Training complete")
/usr/local/lib/python3.7/dist-packages/autoxgb/autoxgb.py in _process_data(self)
148 def _process_data(self):
149 logger.info("Reading training data")
--> 150 train_df = pd.read_csv(self.train_filename)
151 train_df = reduce_memory_usage(train_df)
152 problem_type = self._determine_problem_type(train_df)
/usr/local/lib/python3.7/dist-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
309 stacklevel=stacklevel,
310 )
--> 311 return func(*args, **kwargs)
312
313 return wrapper
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers/readers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
584 kwds.update(kwds_defaults)
585
--> 586 return _read(filepath_or_buffer, kwds)
587
588
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers/readers.py in _read(filepath_or_buffer, kwds)
480
481 # Create the parser.
--> 482 parser = TextFileReader(filepath_or_buffer, **kwds)
483
484 if chunksize or iterator:
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers/readers.py in __init__(self, f, engine, **kwds)
809 self.options["has_index_names"] = kwds["has_index_names"]
810
--> 811 self._engine = self._make_engine(self.engine)
812
813 def close(self):
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers/readers.py in _make_engine(self, engine)
1038 )
1039 # error: Too many arguments for "ParserBase"
-> 1040 return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
1041
1042 def _failover_to_python(self):
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers/c_parser_wrapper.py in __init__(self, src, **kwds)
49
50 # open handles
---> 51 self._open_handles(src, kwds)
52 assert self.handles is not None
53
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers/base_parser.py in _open_handles(self, src, kwds)
227 memory_map=kwds.get("memory_map", False),
228 storage_options=kwds.get("storage_options", None),
--> 229 errors=kwds.get("encoding_errors", "strict"),
230 )
231
/usr/local/lib/python3.7/dist-packages/pandas/io/common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
583
584 # read_csv does not know whether the buffer is opened in binary/text mode
--> 585 if _is_binary_mode(path_or_buf, mode) and "b" not in mode:
586 mode += "b"
587
/usr/local/lib/python3.7/dist-packages/pandas/io/common.py in _is_binary_mode(handle, mode)
960 # classes that expect bytes
961 binary_classes = (BufferedIOBase, RawIOBase)
--> 962 return isinstance(handle, binary_classes) or "b" in getattr(handle, "mode", mode)
TypeError: argument of type 'method' is not iterable
I don't know why this happens. Thank you all for your help.
My code:
from autoxgb import AutoXGB
train_filename = df.iloc[:round(df.shape[0]*.8)]
output = "output2"
test_filename = None
task = None
idx = df.index
targets = ["Goal"]
features = None
categorical_features = None
use_gpu = False
num_folds = 5
seed = 42
num_trials = 100
time_limit = 360
fast = False
axgb = AutoXGB(
train_filename=train_filename,
output=output,
test_filename=test_filename,
task=task,
idx=idx,
targets=targets,
features=features,
categorical_features=categorical_features,
use_gpu=use_gpu,
num_folds=num_folds,
seed=seed,
num_trials=num_trials,
time_limit=time_limit,
fast=fast,
)
axgb.train()
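The traceback above passes `self.train_filename` straight to `pd.read_csv`, so `train_filename` apparently needs to be a path to a CSV file, whereas the code here hands it a DataFrame slice (and hands `idx` an index object, where the logs elsewhere on this page show a column name such as `id`). A minimal sketch of one way around this, using only the standard library and assuming the data also exists as a CSV file: write the first 80% of the rows to a new file and pass that path instead. The function and file names are hypothetical.

```python
import csv


def write_head_fraction(src_path: str, dst_path: str, fraction: float = 0.8) -> int:
    """Copy the header plus the first `fraction` of data rows to dst_path.

    Returns the number of data rows written.
    """
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    header, data = rows[0], rows[1:]
    keep = int(len(data) * fraction)
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(data[:keep])
    return keep
```

The resulting path, for example `train_filename="train_split.csv"`, is then what goes into `AutoXGB(...)`, with `idx` set to a column name or `None`. With a DataFrame already in memory, `df.iloc[:n].to_csv("train_split.csv", index=False)` achieves the same thing.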
I am getting:
local variable 'test_pred_temp' referenced before assignment
https://www.kaggle.com/somesh88/playground-feb
The data and code notebook are linked above. Can anyone please help me with this?
Hi,
Thanks for building a very useful package. I have two simple questions:
How come only the following params are tuned:
{'colsample_bytree': 0.18270180565544739,
'early_stopping_rounds': 401,
'learning_rate': 0.013529250923369278,
'max_depth': 6,
'n_estimators': 20000,
'reg_alpha': 0.0019387086612090178,
'reg_lambda': 5.879563892375361e-08,
'subsample': 0.8925701729066172}
What about gamma and other XGBoost parameters? Are they assumed to take their default values?
How do I access the best model from the output directory? I plugged the best params above into my own XGBoost model but did not get the same result that AutoXGB reported. Is there a way to access these models, or just the best model, in the output directory so I can run it on any data and see the results?
Thank you so much; any help will be greatly appreciated.
P.S. Are there any docs on how to use the output files? There is a lot of useful info there, but I don't know how to access it smartly.
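On reproducing the result: the dictionary above mixes constructor settings with `early_stopping_rounds`, which the `fit` signature in the xgboost 1.5.0 traceback on this page takes as a fit-time argument alongside `eval_set`. A rough sketch, assuming a held-out validation split is available and that a classifier is wanted; parameters absent from the dictionary (such as `gamma`) are simply left at whatever XGBoost defaults to. The helper names are hypothetical, and this is not claimed to match AutoXGB's internal fold setup, so scores may still differ.

```python
def split_fit_params(best_params: dict):
    """Separate early_stopping_rounds from the constructor parameters."""
    ctor_params = dict(best_params)
    early_stopping = ctor_params.pop("early_stopping_rounds", None)
    return ctor_params, early_stopping


def fit_with_best_params(best_params, x_train, y_train, x_valid, y_valid):
    """Fit an XGBClassifier in the xgboost 1.5 style (assumed API version)."""
    import xgboost as xgb  # lazy import; requires xgboost to be installed

    ctor_params, early_stopping = split_fit_params(best_params)
    model = xgb.XGBClassifier(**ctor_params)
    model.fit(
        x_train,
        y_train,
        eval_set=[(x_valid, y_valid)],  # early stopping needs a validation set
        early_stopping_rounds=early_stopping,
        verbose=False,
    )
    return model
```

Without a validation set and early stopping, `n_estimators=20000` would train every round, which is one likely reason a plain refit gives a different score.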
On some of my Linux machines, I installed it with pip:
pip3 install autoxgb
but simply running the import below fails:
import autoxgb
ModuleNotFoundError: No module named 'autoxgb'
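A frequent cause of this pattern, offered as a guess since the report gives no further detail, is that `pip3` belongs to a different Python installation than the interpreter doing the import. The sketch below prints which interpreter is running and whether it can locate a module; `is_importable` is a hypothetical helper.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the running interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("autoxgb importable:", is_importable("autoxgb"))
```

If the module is not found, installing through the same interpreter, that is `python3 -m pip install autoxgb` run with whichever `python3` prints that path, keeps pip and the import in sync.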
executed code
# Now its time to train the model!
axgb = AutoXGB(
train_filename=train_filename,
output=output,
test_filename=test_filename,
task=task,
idx=idx,
targets=targets,
features=features,
categorical_features=categorical_features,
use_gpu=use_gpu,
num_folds=num_folds,
seed=seed,
num_trials=num_trials,
time_limit=time_limit,
fast=fast,
)
axgb.train()
log:
2023-08-04 14:31:55.744 | INFO | autoxgb.autoxgb:__post_init__:42 - Output directory: output3
2023-08-04 14:31:55.744 | INFO | autoxgb.autoxgb:_process_data:149 - Reading training data
2023-08-04 14:31:55.765 | INFO | autoxgb.utils:reduce_memory_usage:48 - Mem. usage decreased to 0.79 Mb (37.5% reduction)
2023-08-04 14:31:55.767 | INFO | autoxgb.autoxgb:_determine_problem_type:140 - Problem type: multi_label_classification
2023-08-04 14:31:55.767 | INFO | autoxgb.autoxgb:_create_folds:58 - Creating folds
2023-08-04 14:31:55.772 | INFO | autoxgb.autoxgb:_process_data:195 - Found 18 categorical features.
2023-08-04 14:31:55.772 | INFO | autoxgb.autoxgb:_process_data:198 - Encoding categorical features
2023-08-04 14:31:55.924 | INFO | autoxgb.autoxgb:_process_data:236 - Model config: train_filename='/data_16t/hongziwen/autoxgb-main/data_samples/multi_label_classification.csv' test_filename=None idx='id' targets=['service_a', 'service_b'] problem_type=<ProblemType.multi_label_classification: 3> output='output3' features=['release', 'n_0047', 'n_0050', 'n_0052', 'n_0061', 'n_0067', 'n_0075', 'n_0078', 'n_0091', 'n_0108', 'n_0109', 'o_0176', 'o_0264', 'c_0466', 'c_0500', 'c_0638', 'c_0699', 'c_0738', 'c_0761', 'c_0770', 'c_0838', 'c_0870', 'c_0980', 'c_1145', 'c_1158', 'c_1189', 'c_1223', 'c_1227', 'c_1244', 'c_1259'] num_folds=5 use_gpu=True seed=42 categorical_features=['release', 'c_0466', 'c_0500', 'c_0638', 'c_0699', 'c_0738', 'c_0761', 'c_0770', 'c_0838', 'c_0870', 'c_0980', 'c_1145', 'c_1158', 'c_1189', 'c_1223', 'c_1227', 'c_1244', 'c_1259'] num_trials=100 time_limit=360 fast=False
2023-08-04 14:31:55.924 | INFO | autoxgb.autoxgb:_process_data:237 - Saving model config
2023-08-04 14:31:55.925 | INFO | autoxgb.autoxgb:_process_data:241 - Saving encoders
error
Exception has occurred: AssertionError
exception: no description
File "/data_16t//autoxgb-main/examples/multi_label_classification.py", line 39, in <module>
axgb.train()
AssertionError:
I reproduced it by following the README.md file.
I am getting an error while using the TPS November data in the Kaggle conda environment (my GPU is on):
https://www.kaggle.com/yogeshkalauni/tps-nov-21-auto-xgboost-error
I am getting an error while using pip install in a Kaggle kernel.
Collecting autoxgb
Downloading autoxgb-0.2.1-py3-none-any.whl (20 kB)
Collecting scikit-learn==1.0.1
Downloading scikit_learn-1.0.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (23.2 MB)
|████████████████████████████████| 23.2 MB 1.3 MB/s eta 0:00:01
Requirement already satisfied: optuna==2.10.0 in /opt/conda/lib/python3.7/site-packages (from autoxgb) (2.10.0)
Collecting pyarrow==6.0.0
Downloading pyarrow-6.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25.5 MB)
|████████████████████████████████| 25.5 MB 43.9 MB/s eta 0:00:01
Requirement already satisfied: pydantic==1.8.2 in /opt/conda/lib/python3.7/site-packages (from autoxgb) (1.8.2)
Collecting loguru==0.5.3
Downloading loguru-0.5.3-py3-none-any.whl (57 kB)
|████████████████████████████████| 57 kB 4.9 MB/s eta 0:00:01
Collecting xgboost==1.5.0
Downloading xgboost-1.5.0-py3-none-manylinux2014_x86_64.whl (173.5 MB)
|████████████████████████████████| 173.5 MB 66 kB/s eta 0:00:01
Collecting pandas==1.3.4
Downloading pandas-1.3.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.3 MB)
|████████████████████████████████| 11.3 MB 46.0 MB/s eta 0:00:01
Requirement already satisfied: fastapi==0.70.0 in /opt/conda/lib/python3.7/site-packages (from autoxgb) (0.70.0)
Requirement already satisfied: uvicorn==0.15.0 in /opt/conda/lib/python3.7/site-packages (from autoxgb) (0.15.0)
Collecting numpy==1.21.3
Downloading numpy-1.21.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
|████████████████████████████████| 15.7 MB 39.9 MB/s eta 0:00:01
Collecting joblib==1.1.0
Downloading joblib-1.1.0-py2.py3-none-any.whl (306 kB)
|████████████████████████████████| 306 kB 39.9 MB/s eta 0:00:01
Requirement already satisfied: starlette==0.16.0 in /opt/conda/lib/python3.7/site-packages (from fastapi==0.70.0->autoxgb) (0.16.0)
Requirement already satisfied: scipy!=1.4.0 in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (1.7.1)
Requirement already satisfied: cliff in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (3.9.0)
Requirement already satisfied: colorlog in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (6.5.0)
Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (21.0)
Requirement already satisfied: tqdm in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (4.62.3)
Requirement already satisfied: alembic in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (1.7.4)
Requirement already satisfied: cmaes>=0.8.2 in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (0.8.2)
Requirement already satisfied: sqlalchemy>=1.1.0 in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (1.4.25)
Requirement already satisfied: PyYAML in /opt/conda/lib/python3.7/site-packages (from optuna==2.10.0->autoxgb) (5.4.1)
Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.7/site-packages (from pandas==1.3.4->autoxgb) (2.8.0)
Requirement already satisfied: pytz>=2017.3 in /opt/conda/lib/python3.7/site-packages (from pandas==1.3.4->autoxgb) (2021.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/lib/python3.7/site-packages (from pydantic==1.8.2->autoxgb) (3.10.0.2)
Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from scikit-learn==1.0.1->autoxgb) (2.2.0)
Requirement already satisfied: anyio<4,>=3.0.0 in /opt/conda/lib/python3.7/site-packages (from starlette==0.16.0->fastapi==0.70.0->autoxgb) (3.3.0)
Requirement already satisfied: click>=7.0 in /opt/conda/lib/python3.7/site-packages (from uvicorn==0.15.0->autoxgb) (8.0.1)
Requirement already satisfied: asgiref>=3.4.0 in /opt/conda/lib/python3.7/site-packages (from uvicorn==0.15.0->autoxgb) (3.4.1)
Requirement already satisfied: h11>=0.8 in /opt/conda/lib/python3.7/site-packages (from uvicorn==0.15.0->autoxgb) (0.12.0)
Requirement already satisfied: sniffio>=1.1 in /opt/conda/lib/python3.7/site-packages (from anyio<4,>=3.0.0->starlette==0.16.0->fastapi==0.70.0->autoxgb) (1.2.0)
Requirement already satisfied: idna>=2.8 in /opt/conda/lib/python3.7/site-packages (from anyio<4,>=3.0.0->starlette==0.16.0->fastapi==0.70.0->autoxgb) (2.10)
Requirement already satisfied: importlib-metadata in /opt/conda/lib/python3.7/site-packages (from click>=7.0->uvicorn==0.15.0->autoxgb) (4.8.1)
Requirement already satisfied: pyparsing>=2.0.2 in /opt/conda/lib/python3.7/site-packages (from packaging>=20.0->optuna==2.10.0->autoxgb) (2.4.7)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas==1.3.4->autoxgb) (1.16.0)
Requirement already satisfied: greenlet!=0.4.17 in /opt/conda/lib/python3.7/site-packages (from sqlalchemy>=1.1.0->optuna==2.10.0->autoxgb) (1.1.1)
Requirement already satisfied: Mako in /opt/conda/lib/python3.7/site-packages (from alembic->optuna==2.10.0->autoxgb) (1.1.5)
Requirement already satisfied: importlib-resources in /opt/conda/lib/python3.7/site-packages (from alembic->optuna==2.10.0->autoxgb) (5.2.2)
Requirement already satisfied: PrettyTable>=0.7.2 in /opt/conda/lib/python3.7/site-packages (from cliff->optuna==2.10.0->autoxgb) (2.2.0)
Requirement already satisfied: cmd2>=1.0.0 in /opt/conda/lib/python3.7/site-packages (from cliff->optuna==2.10.0->autoxgb) (2.2.0)
Requirement already satisfied: autopage>=0.4.0 in /opt/conda/lib/python3.7/site-packages (from cliff->optuna==2.10.0->autoxgb) (0.4.0)
Requirement already satisfied: stevedore>=2.0.1 in /opt/conda/lib/python3.7/site-packages (from cliff->optuna==2.10.0->autoxgb) (3.4.0)
Requirement already satisfied: pbr!=2.1.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from cliff->optuna==2.10.0->autoxgb) (5.6.0)
Requirement already satisfied: colorama>=0.3.7 in /opt/conda/lib/python3.7/site-packages (from cmd2>=1.0.0->cliff->optuna==2.10.0->autoxgb) (0.4.4)
Requirement already satisfied: attrs>=16.3.0 in /opt/conda/lib/python3.7/site-packages (from cmd2>=1.0.0->cliff->optuna==2.10.0->autoxgb) (21.2.0)
Requirement already satisfied: pyperclip>=1.6 in /opt/conda/lib/python3.7/site-packages (from cmd2>=1.0.0->cliff->optuna==2.10.0->autoxgb) (1.8.2)
Requirement already satisfied: wcwidth>=0.1.7 in /opt/conda/lib/python3.7/site-packages (from cmd2>=1.0.0->cliff->optuna==2.10.0->autoxgb) (0.2.5)
Requirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata->click>=7.0->uvicorn==0.15.0->autoxgb) (3.5.0)
Requirement already satisfied: MarkupSafe>=0.9.2 in /opt/conda/lib/python3.7/site-packages (from Mako->alembic->optuna==2.10.0->autoxgb) (2.0.1)
Installing collected packages: numpy, joblib, xgboost, scikit-learn, pyarrow, pandas, loguru, autoxgb
Attempting uninstall: numpy
Found existing installation: numpy 1.19.5
Uninstalling numpy-1.19.5:
Successfully uninstalled numpy-1.19.5
Attempting uninstall: joblib
Found existing installation: joblib 1.0.1
Uninstalling joblib-1.0.1:
Successfully uninstalled joblib-1.0.1
Attempting uninstall: xgboost
Found existing installation: xgboost 1.4.2
Uninstalling xgboost-1.4.2:
Successfully uninstalled xgboost-1.4.2
Attempting uninstall: scikit-learn
Found existing installation: scikit-learn 0.23.2
Uninstalling scikit-learn-0.23.2:
Successfully uninstalled scikit-learn-0.23.2
Attempting uninstall: pyarrow
Found existing installation: pyarrow 5.0.0
Uninstalling pyarrow-5.0.0:
Successfully uninstalled pyarrow-5.0.0
Attempting uninstall: pandas
Found existing installation: pandas 1.3.3
Uninstalling pandas-1.3.3:
Successfully uninstalled pandas-1.3.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-io 0.18.0 requires tensorflow-io-gcs-filesystem==0.18.0, which is not installed.
explainable-ai-sdk 1.3.2 requires xai-image-widget, which is not installed.
dask-cudf 21.8.3 requires cupy-cuda114, which is not installed.
cudf 21.8.3 requires cupy-cuda110, which is not installed.
beatrix-jupyterlab 3.1.1 requires google-cloud-bigquery-storage, which is not installed.
yellowbrick 1.3.post1 requires numpy<1.20,>=1.16.0, but you have numpy 1.21.3 which is incompatible.
tfx-bsl 1.3.0 requires absl-py<0.13,>=0.9, but you have absl-py 0.14.0 which is incompatible.
tfx-bsl 1.3.0 requires numpy<1.20,>=1.16, but you have numpy 1.21.3 which is incompatible.
tfx-bsl 1.3.0 requires pyarrow<3,>=1, but you have pyarrow 6.0.0 which is incompatible.
tensorflow 2.6.0 requires numpy~=1.19.2, but you have numpy 1.21.3 which is incompatible.
tensorflow 2.6.0 requires six~=1.15.0, but you have six 1.16.0 which is incompatible.
tensorflow 2.6.0 requires typing-extensions~=3.7.4, but you have typing-extensions 3.10.0.2 which is incompatible.
tensorflow-transform 1.3.0 requires absl-py<0.13,>=0.9, but you have absl-py 0.14.0 which is incompatible.
tensorflow-transform 1.3.0 requires numpy<1.20,>=1.16, but you have numpy 1.21.3 which is incompatible.
tensorflow-transform 1.3.0 requires pyarrow<3,>=1, but you have pyarrow 6.0.0 which is incompatible.
tensorflow-io 0.18.0 requires tensorflow<2.6.0,>=2.5.0, but you have tensorflow 2.6.0 which is incompatible.
pdpbox 0.2.1 requires matplotlib==3.1.1, but you have matplotlib 3.4.3 which is incompatible.
numba 0.54.0 requires numpy<1.21,>=1.17, but you have numpy 1.21.3 which is incompatible.
matrixprofile 1.1.10 requires protobuf==3.11.2, but you have protobuf 3.18.1 which is incompatible.
hypertools 0.7.0 requires scikit-learn!=0.22,<0.24,>=0.19.1, but you have scikit-learn 1.0.1 which is incompatible.
dask-cudf 21.8.3 requires dask<=2021.07.1,>=2021.6.0, but you have dask 2021.9.1 which is incompatible.
dask-cudf 21.8.3 requires pandas<1.3.0dev0,>=1.0, but you have pandas 1.3.4 which is incompatible.
cudf 21.8.3 requires pandas<1.3.0dev0,>=1.0, but you have pandas 1.3.4 which is incompatible.
apache-beam 2.32.0 requires dill<0.3.2,>=0.3.1.1, but you have dill 0.3.4 which is incompatible.
apache-beam 2.32.0 requires numpy<1.21.0,>=1.14.3, but you have numpy 1.21.3 which is incompatible.
apache-beam 2.32.0 requires pyarrow<5.0.0,>=0.15.1, but you have pyarrow 6.0.0 which is incompatible.
apache-beam 2.32.0 requires typing-extensions<3.8.0,>=3.7.0, but you have typing-extensions 3.10.0.2 which is incompatible.
Successfully installed autoxgb-0.2.1 joblib-1.1.0 loguru-0.5.3 numpy-1.21.3 pandas-1.3.4 pyarrow-6.0.0 scikit-learn-1.0.1 xgboost-1.5.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
from autoxgb import AutoXGB
# required parameters:
train_filename = "../input/tabular-playground-series-nov-2021/train.csv"
output = "outputt"
# optional parameters
test_filename = '../input/tabular-playground-series-nov-2021/test.csv'
task = 'classification'
idx = None
targets = ["target"]
features = None
categorical_features = None
use_gpu = True
num_folds = 5
seed = 42
num_trials = 100
time_limit = 7*60*60
fast = False
# Now it's time to train the model!
axgb = AutoXGB(
    train_filename=train_filename,
    output=output,
    test_filename=test_filename,
    task=task,
    idx=idx,
    targets=targets,
    features=features,
    categorical_features=categorical_features,
    use_gpu=use_gpu,
    num_folds=num_folds,
    seed=seed,
    num_trials=num_trials,
    time_limit=time_limit,
    fast=fast,
)
axgb.train()
2021-11-01 07:03:06.106 | INFO | autoxgb.autoxgb:__post_init__:42 - Output directory: outputt
2021-11-01 07:03:06.108 | WARNING | autoxgb.autoxgb:__post_init__:49 - No id column specified. Will default to `id`.
2021-11-01 07:03:06.110 | INFO | autoxgb.autoxgb:_process_data:149 - Reading training data
2021-11-01 07:03:22.502 | INFO | autoxgb.utils:reduce_memory_usage:50 - Mem. usage decreased to 117.30 Mb (74.9% reduction)
2021-11-01 07:03:22.583 | INFO | autoxgb.autoxgb:_determine_problem_type:140 - Problem type: binary_classification
2021-11-01 07:03:38.131 | INFO | autoxgb.utils:reduce_memory_usage:50 - Mem. usage decreased to 105.06 Mb (74.8% reduction)
2021-11-01 07:03:38.132 | INFO | autoxgb.autoxgb:_create_folds:58 - Creating folds
2021-11-01 07:03:38.248 | INFO | autoxgb.autoxgb:_process_data:170 - Encoding target(s)
2021-11-01 07:03:38.282 | INFO | autoxgb.autoxgb:_process_data:195 - Found 0 categorical features.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_38/3565386527.py in <module>
37 fast=fast,
38 )
---> 39 axgb.train()
/opt/conda/lib/python3.7/site-packages/autoxgb/autoxgb.py in train(self)
244
245 def train(self):
--> 246 self._process_data()
247 best_params = train_model(self.model_config)
248 logger.info("Training complete")
/opt/conda/lib/python3.7/site-packages/autoxgb/autoxgb.py in _process_data(self)
210 test_fold[categorical_features] = ord_encoder.transform(test_fold[categorical_features].values)
211 categorical_encoders[fold] = ord_encoder
--> 212 fold_train.to_feather(os.path.join(self.output, f"train_fold_{fold}.feather"))
213 fold_valid.to_feather(os.path.join(self.output, f"valid_fold_{fold}.feather"))
214 if self.test_filename is not None:
/opt/conda/lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
205 else:
206 kwargs[new_arg_name] = new_arg_value
--> 207 return func(*args, **kwargs)
208
209 return cast(F, wrapper)
/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in to_feather(self, path, **kwargs)
2517 from pandas.io.feather_format import to_feather
2518
-> 2519 to_feather(self, path, **kwargs)
2520
2521 @doc(
/opt/conda/lib/python3.7/site-packages/pandas/io/feather_format.py in to_feather(df, path, storage_options, **kwargs)
44 """
45 import_optional_dependency("pyarrow")
---> 46 from pyarrow import feather
47
48 if not isinstance(df, DataFrame):
/opt/conda/lib/python3.7/site-packages/pyarrow/feather.py in <module>
23 concat_tables, schema)
24 import pyarrow.lib as ext
---> 25 from pyarrow import _feather
26 from pyarrow._feather import FeatherError # noqa: F401
27 from pyarrow.vendored.version import Version
/opt/conda/lib/python3.7/site-packages/pyarrow/_feather.pyx in init pyarrow._feather()
AttributeError: module 'pyarrow.lib' has no attribute 'MonthDayNanoIntervalArray'
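One plausible cause, not confirmed in this thread, is a stale import: the pip log above shows pyarrow 5.0.0 being replaced by 6.0.0 during the session, so a copy of `pyarrow.lib` loaded before the upgrade may be paired with the newer `_feather` extension. Restarting the kernel after installation is the usual remedy in that situation. The sketch below compares the version of each already-imported module with the version pip reports as installed. The helper name `find_stale_modules` is hypothetical, and `importlib.metadata` assumes Python 3.8 or newer (the traceback above comes from Python 3.7).

```python
import sys
from importlib import metadata


def find_stale_modules(package_names):
    """Return {name: (loaded_version, installed_version)} for mismatches.

    Only packages that are already imported in this process AND installed
    are checked; everything else is skipped silently.
    """
    stale = {}
    for name in package_names:
        module = sys.modules.get(name)
        if module is None:
            continue  # not imported yet, so nothing can be stale
        loaded = getattr(module, "__version__", None)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # no installed distribution under this name
        if loaded is not None and loaded != installed:
            stale[name] = (loaded, installed)
    return stale


# Assumes the import name and the distribution name are identical,
# which holds for these three packages but not for e.g. scikit-learn.
print(find_stale_modules(["pyarrow", "pandas", "numpy"]))
```

An empty dictionary means no mismatch was detected; any entry suggests the kernel should be restarted before calling `axgb.train()` again.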
Issue Description:
Hello.
I have discovered a performance degradation in the read_csv function of pandas version 1.3.4 when handling CSV files with a large number of columns. The problem increases the loading time from a few seconds in the previous version 1.2.5 to several minutes, a difference of almost 60x. I found some discussions on GitHub related to this issue, including #44106 and #44192.
I found that autoxgb/src/autoxgb/predict.py and autoxgb/src/autoxgb/autoxgb.py both use the affected API.
Steps to Reproduce:
I have created a small reproducible example to better illustrate this issue.
# v1.3.4
import os
import pandas
import numpy
import timeit

def generate_sample():
    # Create a wide CSV (5 rows x 100000 feature columns) once.
    if not os.path.exists("test_small.csv.gz"):
        nb_col = 100000
        nb_row = 5
        feature_list = {'sample': ['s_' + str(i+1) for i in range(nb_row)]}
        for i in range(nb_col):
            feature_list.update({'feature_' + str(i+1): list(numpy.random.uniform(low=0, high=10, size=nb_row))})
        df = pandas.DataFrame(feature_list)
        df.to_csv("test_small.csv.gz", index=False, float_format="%.6f")

def load_csv_file():
    # Read the header first so that every column gets an explicit dtype.
    col_names = pandas.read_csv("test_small.csv.gz", low_memory=False, nrows=1).columns
    types_dict = {col: numpy.float32 for col in col_names}
    types_dict.update({'sample': str})
    feature_df = pandas.read_csv("test_small.csv.gz", index_col="sample", na_filter=False, dtype=types_dict, low_memory=False)
    print("loaded dataframe shape:", feature_df.shape)

generate_sample()
timeit.timeit(load_csv_file, number=1)
# results
loaded dataframe shape: (5, 100000)
120.37690759263933
# v1.3.5
import os
import pandas
import numpy
import timeit

def generate_sample():
    # Create a wide CSV (5 rows x 100000 feature columns) once.
    if not os.path.exists("test_small.csv.gz"):
        nb_col = 100000
        nb_row = 5
        feature_list = {'sample': ['s_' + str(i+1) for i in range(nb_row)]}
        for i in range(nb_col):
            feature_list.update({'feature_' + str(i+1): list(numpy.random.uniform(low=0, high=10, size=nb_row))})
        df = pandas.DataFrame(feature_list)
        df.to_csv("test_small.csv.gz", index=False, float_format="%.6f")

def load_csv_file():
    # Read the header first so that every column gets an explicit dtype.
    col_names = pandas.read_csv("test_small.csv.gz", low_memory=False, nrows=1).columns
    types_dict = {col: numpy.float32 for col in col_names}
    types_dict.update({'sample': str})
    feature_df = pandas.read_csv("test_small.csv.gz", index_col="sample", na_filter=False, dtype=types_dict, low_memory=False)
    print("loaded dataframe shape:", feature_df.shape)

generate_sample()
timeit.timeit(load_csv_file, number=1)
# results
loaded dataframe shape: (5, 100000)
2.8567268839105964
Suggestion
I would recommend upgrading to pandas >= 1.3.5 or exploring other ways to speed up CSV loading.
Any other workarounds or solutions would be greatly appreciated.
Thank you!
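As a rough illustration of the suggestion above, a project could check the installed pandas version against the 1.3.5 threshold named in this issue before reading very wide CSV files. This is only a sketch: the helper names `version_tuple` and `meets_minimum` are hypothetical, and the threshold is taken from the report rather than verified independently.

```python
import re


def version_tuple(version):
    """Turn a version string such as '1.3.4' into a tuple like (1, 3, 4).

    Only the leading digits of the first three components are used, so
    suffixes such as 'rc1' or 'dev0' are ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version, minimum="1.3.5"):
    """Return True if `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


print(meets_minimum("1.3.4"))  # False
print(meets_minimum("1.3.5"))  # True
```

In practice the string passed in would be `pandas.__version__`; it is hard-coded here so that the example runs without pandas installed.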
xgboost.plot_importance() has been quite handy to plot important features. Is there a way to do that?
Thanks!
Dear autoxgb Developers,
I am reaching out to report an installation issue encountered with the autoxgb package within the Kaggle notebook environment. During the installation process via pip, the operation fails due to a Cython compilation error related to the sklearn.ensemble._hist_gradient_boosting.splitting.pyx module.
Cython.Compiler.Errors.CompileError: sklearn/ensemble/_hist_gradient_boosting/splitting.pyx
...
error: metadata-generation-failed
I have been working on the TPS competition on Kaggle and tried AutoXGB. I set the time limit to 3600*4.
But the training didn't stop at 4 hours. It is now at 6.5 hours and still going. Am I doing anything wrong?
P.S. The first trial took 4 hours to complete.
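A possible explanation, assuming AutoXGB hands `time_limit` to Optuna as the study timeout (not confirmed in this thread): Optuna checks the timeout between trials and does not interrupt a trial that is already running. If the first trial finishes just under the limit, a second trial starts and runs to completion. The simulation below illustrates that behaviour with made-up durations in hours; `simulate_study` is a hypothetical helper, not part of AutoXGB or Optuna.

```python
def simulate_study(trial_durations, time_limit):
    """Simulate a tuner that checks the time limit only between trials.

    Returns the number of completed trials and the total elapsed time.
    """
    elapsed = 0.0
    completed = 0
    for duration in trial_durations:
        if elapsed >= time_limit:
            break  # limit reached, so no new trial is started
        elapsed += duration  # a started trial always runs to the end
        completed += 1
    return completed, elapsed


# Hypothetical numbers: each trial takes 3.9 hours, the limit is 4 hours.
completed, elapsed = simulate_study([3.9, 3.9, 3.9], time_limit=4.0)
print(completed, elapsed)  # 2 7.8
```

Under this assumption the run ends only after the second trial, well past the 4-hour limit. Lowering the cost of a single trial (for example fewer folds or `fast=True`) would make the limit take effect closer to the requested time.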