
kaggler's Introduction


Kaggler

Kaggler is a Python package for lightweight online machine learning algorithms and utility functions for ETL and data analysis. It is distributed under the MIT License.

Its online learning algorithms are inspired by Kaggle user tinrtgu's code. It uses a sparse input format that handles large sparse data efficiently. The core code is optimized for speed with Cython.

Installation

Dependencies

The required Python packages are listed in requirements.txt:

  • cython
  • h5py
  • hyperopt
  • lightgbm
  • ml_metrics
  • numpy/scipy
  • pandas
  • scikit-learn

Using pip

The Python package is available on PyPI for installation with pip:

pip install -U Kaggler

If installation fails because it cannot find MurmurHash3.h, please add . to LD_LIBRARY_PATH as described here.

From source code

If you want to install it from source code:

python setup.py build_ext --inplace
python setup.py install

Feature Engineering

One-Hot, Label, Target, Frequency, and Embedding Encoders for Categorical Features

import pandas as pd
from kaggler.preprocessing import OneHotEncoder, LabelEncoder, TargetEncoder, FrequencyEncoder, EmbeddingEncoder

trn = pd.read_csv('train.csv')
target_col = trn.columns[-1]
cat_cols = [col for col in trn.columns if trn[col].dtype == 'object']

ohe = OneHotEncoder(min_obs=100)  # group all categories with fewer than 100 occurrences
lbe = LabelEncoder(min_obs=100)   # group all categories with fewer than 100 occurrences
te = TargetEncoder()              # replace each category with the average target value of the category
fe = FrequencyEncoder()           # replace each category with its frequency
ee = EmbeddingEncoder()           # map each category to a vector of real numbers

X_ohe = ohe.fit_transform(trn[cat_cols])	    # X_ohe is a scipy sparse matrix
trn[cat_cols] = lbe.fit_transform(trn[cat_cols])
trn[cat_cols] = te.fit_transform(trn[cat_cols], trn[target_col])   # target encoding needs the target column
trn[cat_cols] = fe.fit_transform(trn[cat_cols])
X_ee = ee.fit_transform(trn[cat_cols], trn[target_col])          # X_ee is a numpy matrix

tst = pd.read_csv('test.csv')
X_ohe = ohe.transform(tst[cat_cols])
tst[cat_cols] = lbe.transform(tst[cat_cols])
tst[cat_cols] = te.transform(tst[cat_cols])
tst[cat_cols] = fe.transform(tst[cat_cols])
X_ee = ee.transform(tst[cat_cols])
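
The encoded columns can be fed directly into a scikit-learn estimator. A minimal sketch, assuming a classification target; LogisticRegression and the 80/20 split are illustrative choices, not part of Kaggler:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hold out 20% of the training rows to sanity-check the encoded features.
X_trn, X_val, y_trn, y_val = train_test_split(
    trn[cat_cols], trn[target_col], test_size=.2, random_state=42)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_trn, y_trn)            # label/target/frequency-encoded features
print(clf.score(X_val, y_val))   # mean accuracy on the hold-out rows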

Denoising AutoEncoder (DAE)

For a reference on DAE, please check out Vincent et al. (2010), "Stacked Denoising Autoencoders".

import pandas as pd
from kaggler.preprocessing import DAE, SDAE

trn = pd.read_csv('train.csv')
tst = pd.read_csv('test.csv')
target_col = trn.columns[-1]
cat_cols = [col for col in trn.columns if trn[col].dtype == 'object']
num_cols = [col for col in trn.columns if col not in cat_cols + [target_col]]

# Default DAE with only the swapping noise and a single encoder/decoder pair.
dae = DAE(cat_cols=cat_cols, num_cols=num_cols, n_encoding=128)
X = dae.fit_transform(pd.concat([trn, tst], axis=0))    # encode the input features into 128-dimensional vectors

# Stacked DAE with Gaussian noise, swapping noise, and zero masking in 3 encoder/decoder pairs.
sdae = DAE(cat_cols=cat_cols, num_cols=num_cols, n_encoding=128, n_layer=3,
           noise_std=.05, swap_prob=.2, mask_prob=.1)
X = sdae.fit_transform(pd.concat([trn, tst], axis=0))

# Supervised DAE with Gaussian noise, swapping noise, and zero masking, using 3 encoders in the encoder/decoder pair.
sdae = SDAE(cat_cols=cat_cols, num_cols=num_cols, n_encoding=128, n_encoder=3,
            noise_std=.05, swap_prob=.2, mask_prob=.1)
X = sdae.fit_transform(trn, trn[target_col])
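
Since the unsupervised encoders above are fit on the concatenation of train and test, the resulting encoding matrix can be split back by row position and used as features for any downstream model. A minimal sketch; the downstream LogisticRegression and the binary target are illustrative assumptions, not part of Kaggler:

from sklearn.linear_model import LogisticRegression

# Encodings for the train and test rows, fit on the concatenated frame as above.
X_all = dae.fit_transform(pd.concat([trn, tst], axis=0))
X_trn_enc, X_tst_enc = X_all[:len(trn)], X_all[len(trn):]

clf = LogisticRegression(max_iter=1000)     # illustrative downstream model
clf.fit(X_trn_enc, trn[target_col])
p_tst = clf.predict_proba(X_tst_enc)[:, 1]  # assumes a binary target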

AutoML

Feature Selection & Hyperparameter Tuning

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from kaggler.metrics import auc
from kaggler.model import AutoLGB


RANDOM_SEED = 42
N_OBS = 10000
N_FEATURE = 100
N_IMP_FEATURE = 20

X, y = make_classification(n_samples=N_OBS,
                            n_features=N_FEATURE,
                            n_informative=N_IMP_FEATURE,
                            random_state=RANDOM_SEED)
X = pd.DataFrame(X, columns=['x{}'.format(i) for i in range(X.shape[1])])
y = pd.Series(y)

X_trn, X_tst, y_trn, y_tst = train_test_split(X, y,
                                                test_size=.2,
                                                random_state=RANDOM_SEED)

model = AutoLGB(objective='binary', metric='auc')
model.tune(X_trn, y_trn)
model.fit(X_trn, y_trn)
p = model.predict(X_tst)
print('AUC: {:.4f}'.format(auc(y_tst, p)))
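
After tune(), the selected features and tuned hyperparameters should be exposed on the model object (kaggler/model/automl.py assigns self.features and self.params internally, as visible in the tracebacks quoted in the issues below). Treat the attribute names here as an assumption and check the source if they differ:

# Assumed attribute names; verify against kaggler/model/automl.py.
print(model.features)   # features kept by the feature-selection step
print(model.params)     # tuned LightGBM hyperparameters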

Ensemble

Netflix Blending

import numpy as np
from kaggler.ensemble import netflix
from kaggler.metrics import rmse

# Load the predictions of input models for ensemble
p1 = np.loadtxt('model1_prediction.txt')
p2 = np.loadtxt('model2_prediction.txt')
p3 = np.loadtxt('model3_prediction.txt')

# Calculate RMSEs of model predictions and all-zero prediction.
# At a competition, RMSEs (or RMLSEs) of submissions can be used.
y = np.loadtxt('target.txt')
e0 = rmse(y, np.zeros_like(y))
e1 = rmse(y, p1)
e2 = rmse(y, p2)
e3 = rmse(y, p3)

p, w = netflix([e1, e2, e3], [p1, p2, p3], e0, l=0.0001) # l is an optional regularization parameter.
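
The returned w holds the blending weights found on the validation predictions, so the same linear combination can be applied to the test-set predictions of the same models. A sketch, assuming the blend is a plain weighted sum and using hypothetical file names:

# Apply the blending weights to the corresponding test-set predictions.
p1_tst = np.loadtxt('model1_prediction_tst.txt')   # hypothetical file names
p2_tst = np.loadtxt('model2_prediction_tst.txt')
p3_tst = np.loadtxt('model3_prediction_tst.txt')

p_tst = w[0] * p1_tst + w[1] * p2_tst + w[2] * p3_tst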

Algorithms

The algorithms currently available are as follows:

Online learning algorithms

  • Stochastic Gradient Descent (SGD)
  • Follow-the-Regularized-Leader (FTRL)
  • Factorization Machine (FM)
  • Neural Networks (NN) - with a single (NN) or two (NN_H2) ReLU hidden layers
  • Decision Tree

Batch learning algorithm

  • Neural Networks (NN) - with a single hidden layer and L-BFGS optimization

Examples

from kaggler.online_model import SGD, FTRL, FM, NN

# SGD
clf = SGD(a=.01,                # learning rate
          l1=1e-6,              # L1 regularization parameter
          l2=1e-6,              # L2 regularization parameter
          n=2**20,              # number of hashed features
          epoch=10,             # number of epochs
          interaction=True)     # use feature interaction or not

# FTRL
clf = FTRL(a=.1,                # alpha in the per-coordinate rate
           b=1,                 # beta in the per-coordinate rate
           l1=1.,               # L1 regularization parameter
           l2=1.,               # L2 regularization parameter
           n=2**20,             # number of hashed features
           epoch=1,             # number of epochs
           interaction=True)    # use feature interaction or not

# FM
clf = FM(n=1e5,                 # number of features
         epoch=100,             # number of epochs
         dim=4,                 # size of factors for interactions
         a=.01)                 # learning rate

# NN
clf = NN(n=1e5,                 # number of features
         epoch=10,              # number of epochs
         h=16,                  # number of hidden units
         a=.1,                  # learning rate
         l2=1e-6)               # L2 regularization parameter

# online training and prediction directly with a libsvm file
for x, y in clf.read_sparse('train.sparse'):
    p = clf.predict_one(x)      # predict for an input
    clf.update_one(x, p - y)    # update the model using the prediction error

for x, _ in clf.read_sparse('test.sparse'):
    p = clf.predict_one(x)

# online training and prediction with a scipy sparse matrix
from kaggler.data_io import load_data

X, y = load_data('train.sps')

clf.fit(X, y)
p = clf.predict(X)
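
The in-sample predictions above can be scored with the metrics module used earlier, e.g. kaggler.metrics.auc for a classification target (shown only as a quick sanity check):

from kaggler.metrics import auc

print('AUC: {:.4f}'.format(auc(y, p)))   # in-sample AUC of the online model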

Data I/O

Kaggler supports CSV (.csv), LibSVM (.sps), and HDF5 (.h5) file formats:

# CSV format: target,feature1,feature2,...
1,1,0,0,1,0.5
0,0,1,0,0,5

# LibSVM format: target feature-index1:feature-value1 feature-index2:feature-value2
1 1:1 4:1 5:0.5
0 2:1 5:1

# HDF5
- issparse: binary flag indicating whether it stores sparse data or not.
- target: stores a target variable as a numpy.array
- shape: available only if issparse == 1. shape of scipy.sparse.csr_matrix
- indices: available only if issparse == 1. indices of scipy.sparse.csr_matrix
- indptr: available only if issparse == 1. indptr of scipy.sparse.csr_matrix
- data: dense feature matrix if issparse == 0 else data of scipy.sparse.csr_matrix

from kaggler.data_io import load_data, save_data

X, y = load_data('train.csv')   # use the first column as the target variable
X, y = load_data('train.h5')    # load the feature matrix and target vector from an HDF5 file
X, y = load_data('train.sps')   # load the feature matrix and target vector from a LibSVM file

save_data(X, y, 'train.csv')
save_data(X, y, 'train.h5')
save_data(X, y, 'train.sps')

Documentation

Package documentation is available here.

kaggler's People

Contributors

clarkgrubb, dickmao, fullstart, jeongyoonlee, ppstacy, stegben, yejiming


kaggler's Issues

DAE/SDAE's `transform` changes the input dataframe

From the comment by Bruce Harold at Kaggle:

Thank you for sharing both your utility code and this example. I tried using your SDAE with logistic regression and it helped significantly. I look forward to trying some of your other tools as well.

But I found one thing that surprised me: two different runs of the transform function on the same array yielded two different results. It turns out that this method is changing the data passed into it.

Here is an example:

trans_test_1 = sdae.transform(tr2_X)
print(tr2_X.iloc[0:3, 72:], '\n\n')

trans_test_2 = sdae.transform(tr2_X)
print(tr2_X.iloc[0:3, 72:])

Output

   feature_72  feature_73  feature_74
2           0           7           6
3           4          13           6
4           0           7           6 

   feature_72  feature_73  feature_74
2           0           9           8
3           4          15           8
4           0           9           8

I assume that this is an oversight and that you would like to change this behavior.
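
Until the behavior is changed, one defensive workaround is to pass a copy so the original frame stays intact (a sketch, assuming the mutation is confined to the DataFrame passed to transform):

# Pass a copy so repeated transform() calls see the same, unmodified input.
trans_test_1 = sdae.transform(tr2_X.copy())
trans_test_2 = sdae.transform(tr2_X.copy())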

pip install on ubuntu 16.04

Running

pip install kaggler

on Ubuntu 16.04, one gets:

creating build/temp.linux-x86_64-2.7/kaggler/online_model/murmurhash
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I. -I/usr/include/python2.7 -c kaggler/online_model/ftrl.c -o build/temp.linux-x86_64-2.7/kaggler/online_model/ftrl.o -O3
In file included from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ndarraytypes.h:1788:0,
                 from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ndarrayobject.h:18,
                 from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from kaggler/online_model/ftrl.c:275:
/usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
 #warning "Using deprecated NumPy API, disable it by " \
  ^
kaggler/online_model/ftrl.c:277:36: fatal error: murmurhash/MurmurHash3.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

And I have found no way to install MurmurHash3.h on Ubuntu 16.04.

Columns and DataType Not Explicitly Set on line 13 of test_ohe.py

Hello!

I found an AI-Specific Code smell in your project.
The smell is called: Columns and DataType Not Explicitly Set

You can find more information about it in this paper: https://dl.acm.org/doi/abs/10.1145/3522664.3528620.

According to the paper, the smell is described as follows:

Problem: If the columns are not selected explicitly, it is not easy for developers to know what to expect in the downstream data schema. If the datatype is not set explicitly, it may silently continue to the next step even though the input is unexpected, which may cause errors later. The same applies to other data-importing scenarios.
Solution: It is recommended to set the columns and DataType explicitly in data processing.
Impact: Readability

Example:

### Pandas Column Selection
import pandas as pd
df = pd.read_csv('data.csv')
+ df = df[['col1', 'col2', 'col3']]

### Pandas Set DataType
import pandas as pd
- df = pd.read_csv('data.csv')
+ df = pd.read_csv('data.csv', dtype={'col1': 'str', 'col2': 'int', 'col3': 'float'})

You can find the code related to this smell in this link:

import cProfile

import numpy as np
import pandas as pd
from kaggler.preprocessing import OneHotEncoder

N_OBS = int(1e6)
N_FEATURE = 10
N_CATEGORY = 1000


def test():
    df = pd.DataFrame(
        np.random.randint(0, N_CATEGORY, size=(N_OBS, N_FEATURE)),
        columns=["c{}".format(x) for x in range(N_FEATURE)],
    )
    profiler = cProfile.Profile(subcalls=True, builtins=True, timeunit=0.001)
    ohe = OneHotEncoder(min_obs=100)
    profiler.enable()
    ohe.fit(df)
    X_new = ohe.transform(df)
    profiler.disable()
    profiler.print_stats()

I also found instances of this smell in other files, such as:

File: https://github.com/jeongyoonlee/Kaggler/blob/master/kaggler/metrics/plot.py#L84-L94 Line: 89
File: https://github.com/jeongyoonlee/Kaggler/blob/master/kaggler/metrics/plot.py#L87-L97 Line: 92
File: https://github.com/jeongyoonlee/Kaggler/blob/master/kaggler/metrics/plot.py#L98-L108 Line: 103
File: https://github.com/jeongyoonlee/Kaggler/blob/master/kaggler/model/automl.py#L205-L215 Line: 210
File: https://github.com/jeongyoonlee/Kaggler/blob/master/kaggler/preprocessing/categorical.py#L303-L313 Line: 308

I hope this information is helpful!

LabelEncoder Usage

Hi,
The following piece of code throws an error. Why?

import pandas as pd
from kaggler.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit_transform(pd.Series([1, 1, 1, 2, 2, 2, 3, 3, 3]))

Error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
c:\Users\semic\Desktop\dsi19-oct\main.py in <module>
      1 le = LabelEncoder()
----> 2 le.fit_transform(pd.Series([1,1,1,2,2,2,3,3,3]))

~\Anaconda3\lib\site-packages\kaggler\preprocessing\categorical.py in fit_transform(self, X, y)
    121         """
    122 
--> 123         self.label_encoders = [None] * X.shape[1]
    124         self.label_maxes = [None] * X.shape[1]
    125 

IndexError: tuple index out of range
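
The traceback indexes X.shape[1], which suggests fit_transform expects a 2-D input (a DataFrame) rather than a 1-D Series. A possible workaround, assuming that is the cause:

import pandas as pd
from kaggler.preprocessing import LabelEncoder

le = LabelEncoder()
# Wrap the Series in a single-column DataFrame so X.shape[1] is defined.
encoded = le.fit_transform(pd.Series([1, 1, 1, 2, 2, 2, 3, 3, 3]).to_frame())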

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

When I run AutoLGB with objective="regression" and metric="neg_mean_absolute_error", I get a ValueError: For early stopping, at least one dataset and eval metric is required for evaluation.
Here is the complete stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-4052be1dbbca> in <module>
      3 model = AutoLGB(metric="neg_mean_absolute_error", 
      4                 objective="regression")
----> 5 model.tune(X_train, y_train)
      6 model.fit(X_train, y_train)

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in tune(self, X, y)
    114             self.features = self.select_features(X_s,
    115                                                  y_s,
--> 116                                                  n_eval=self.n_fs)
    117             logger.info('selecting {} out of {} features'.format(
    118                 len(self.features), X.shape[1])

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in select_features(self, X, y, n_eval)
    164             random_cols.append(random_col)
    165 
--> 166         _, trials = self.optimize_hyperparam(X.values, y.values, n_eval=n_eval)
    167 
    168         feature_importances = self._get_feature_importance(

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in optimize_hyperparam(self, X, y, test_size, n_eval)
    258         best = hyperopt.fmin(fn=objective, space=self.space, trials=trials,
    259                              algo=tpe.suggest, max_evals=n_eval, verbose=1,
--> 260                              rstate=self.random_state)
    261 
    262         hyperparams = space_eval(self.space, best)

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in fmin(fn, space, algo, max_evals, trials, rstate, allow_trials_fmin, pass_expr_memo_ctrl, catch_eval_exceptions, verbose, return_argmin, points_to_evaluate, max_queue_len, show_progressbar)
    387             catch_eval_exceptions=catch_eval_exceptions,
    388             return_argmin=return_argmin,
--> 389             show_progressbar=show_progressbar,
    390         )
    391 

/opt/conda/lib/python3.6/site-packages/hyperopt/base.py in fmin(self, fn, space, algo, max_evals, max_queue_len, rstate, verbose, pass_expr_memo_ctrl, catch_eval_exceptions, return_argmin, show_progressbar)
    641             catch_eval_exceptions=catch_eval_exceptions,
    642             return_argmin=return_argmin,
--> 643             show_progressbar=show_progressbar)
    644 
    645 

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in fmin(fn, space, algo, max_evals, trials, rstate, allow_trials_fmin, pass_expr_memo_ctrl, catch_eval_exceptions, verbose, return_argmin, points_to_evaluate, max_queue_len, show_progressbar)
    406                     show_progressbar=show_progressbar)
    407     rval.catch_eval_exceptions = catch_eval_exceptions
--> 408     rval.exhaust()
    409     if return_argmin:
    410         return trials.argmin

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in exhaust(self)
    260     def exhaust(self):
    261         n_done = len(self.trials)
--> 262         self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
    263         self.trials.refresh()
    264         return self

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in run(self, N, block_until_done)
    225                     else:
    226                         # -- loop over trials and do the jobs directly
--> 227                         self.serial_evaluate()
    228 
    229                     try:

/opt/conda/lib/python3.6/site-packages/hyperopt/fmin.py in serial_evaluate(self, N)
    139                 ctrl = base.Ctrl(self.trials, current_trial=trial)
    140                 try:
--> 141                     result = self.domain.evaluate(spec, ctrl)
    142                 except Exception as e:
    143                     logger.info('job exception: %s' % str(e))

/opt/conda/lib/python3.6/site-packages/hyperopt/base.py in evaluate(self, config, ctrl, attach_attachments)
    846                 memo=memo,
    847                 print_node_on_error=self.rec_eval_print_node_on_error)
--> 848             rval = self.fn(pyll_rval)
    849 
    850         if isinstance(rval, (float, int, np.number)):

/opt/conda/lib/python3.6/site-packages/kaggler/model/automl.py in objective(hyperparams)
    248                               valid_data,
    249                               early_stopping_rounds=self.n_stop,
--> 250                               verbose_eval=0)
    251 
    252             score = (model.best_score["valid_0"][self.params["metric"]] *

/opt/conda/lib/python3.6/site-packages/lightgbm/engine.py in train(params, train_set, num_boost_round, valid_sets, valid_names, fobj, feval, init_model, feature_name, categorical_feature, early_stopping_rounds, evals_result, verbose_eval, learning_rates, keep_training_booster, callbacks)
    231                                         begin_iteration=init_iteration,
    232                                         end_iteration=init_iteration + num_boost_round,
--> 233                                         evaluation_result_list=evaluation_result_list))
    234         except callback.EarlyStopException as earlyStopException:
    235             booster.best_iteration = earlyStopException.best_iteration + 1

/opt/conda/lib/python3.6/site-packages/lightgbm/callback.py in _callback(env)
    209     def _callback(env):
    210         if not cmp_op:
--> 211             _init(env)
    212         if not enabled[0]:
    213             return

/opt/conda/lib/python3.6/site-packages/lightgbm/callback.py in _init(env)
    190             return
    191         if not env.evaluation_result_list:
--> 192             raise ValueError('For early stopping, '
    193                              'at least one dataset and eval metric is required for evaluation')
    194 

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

The pandas version is 0.23.4.
The lightgbm version is 2.2.3.
Might the error be due to the lightgbm version?
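
One possible explanation (a guess, not confirmed): "neg_mean_absolute_error" is a scikit-learn scorer name rather than a LightGBM metric, so LightGBM ends up with no recognized eval metric and refuses to early-stop. If that is the cause, a LightGBM-native metric name might avoid the error:

# Hypothetical workaround: use a metric name LightGBM itself understands ('l1' is MAE).
model = AutoLGB(objective='regression', metric='l1')
model.tune(X_train, y_train)
model.fit(X_train, y_train)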

Development

Is this package still in development? I found many errors while implementing DAE and SDAE.

Errors may have originated from an input operation.
Input Source operations connected to node decoder_model/CabinType_emb/embedding_lookup:
 decoder_model/CabinType_emb/embedding_lookup/1202 (defined at /opt/conda/lib/python3.7/contextlib.py:112)

Function call stack:
train_function

NN has a mistake

dl/dz = dl/dy * dy/dz = dl/dy * w1

dl_dz = dl_dy * self.w1[j]

I think it should be

dl_dz = dl_dy * self.w1[j] * (0 or 1)

Am I right?
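
In other words, the suggestion is to gate the gradient by the ReLU derivative, which is 1 where the hidden unit's pre-activation is positive and 0 otherwise. A hypothetical sketch of the proposed change, not the library's actual code:

def backprop_through_relu(dl_dy, w1_j, z_j):
    # dl/dz_j = dl/dy * w1_j * ReLU'(z_j), where ReLU'(z_j) is 1 if z_j > 0 else 0.
    return dl_dy * w1_j * (1.0 if z_j > 0 else 0.0)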

Set embedding layer to n_uniq + 1

Hi @jeongyoonlee, I am getting the following error when using EmbeddingEncoder():

InvalidArgumentError: indices[389,0] = 3 is not in [0, 3) [[{{node prior_rider_segment_emb/embedding_lookup}}]]

I might be wrong, but I think this happens because the indices start from 0. Could we set the embedding layer's input dimension to n_uniq + 1 so that it handles the out-of-bound index error?
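
The error message points at an embedding_lookup with input dimension 3 receiving index 3, so widening the embedding input dimension by one would cover that index. A sketch of the proposed change, assuming a Keras-style Embedding layer; the variable names and sizes are illustrative, not Kaggler's internals:

from tensorflow.keras.layers import Embedding

n_uniq = 3    # number of unique categories seen during fit (from the error above)
emb_dim = 8   # illustrative embedding size
# input_dim = n_uniq + 1 leaves room for an unseen or out-of-range index.
emb = Embedding(input_dim=n_uniq + 1, output_dim=emb_dim)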

DAE References and Performance

Hi @jeongyoonlee, I saw you added the DAE in the recent release! I didn't find a lot of references for DAE, so I'm wondering if you could share a bit more. Additionally, for the probability of adding swap noise to features, how do we decide what value to use? Is there any rule of thumb to follow?

I assume DAE will perform better on certain datasets with noise in the features, so do you by any chance have some examples to share, potentially comparing its performance with the other feature engineering methods we have in the package?

Thanks a lot!!

macos pip install failure

Within a Python 3.6 environment (managed by conda), when I run pip install Kaggler, pip install -U Kaggler, or pip install --no-cache-dir Kaggler, I get the following error message (see below).
I also ran pip install -U cython beforehand to update Cython, but the same error message occurs.

(python3.6) mike-yung$ pip install -no-cache-dir Kaggler

Usage:
  pip install [options] <requirement specifier> [package-index-options] ...
  pip install [options] -r <requirements file> [package-index-options] ...
  pip install [options] [-e] <vcs project url> ...
  pip install [options] [-e] <local project path> ...
  pip install [options] <archive url/path> ...

no such option: -n
(python3.6) mike-yung-C02WC0F4HTDG:ltvent mike.yung$ pip install --no-cache-dir Kaggler
Looking in indexes: https://yoober7:****@pypi.uberinternal.com/index, https://pypi.python.org/simple
Collecting Kaggler
  Downloading https://pypi.uberinternal.com/packages/af/98/25d2c773369ba56b2e70e584f5ab4ab1ed1708df6ec8dcc153d77f03607e/Kaggler-0.6.9.tar.gz (812kB)
    100% |████████████████████████████████| 819kB 14.3MB/s
Requirement already satisfied: setuptools>=41.0.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (41.0.1)
Requirement already satisfied: cython in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (0.29.7)
Requirement already satisfied: h5py in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (2.9.0)
Requirement already satisfied: ml_metrics in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (0.1.4)
Requirement already satisfied: numpy in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (1.16.2)
Requirement already satisfied: pandas in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (0.24.2)
Requirement already satisfied: matplotlib in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (2.2.4)
Requirement already satisfied: scipy>=0.14.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (1.2.1)
Requirement already satisfied: scikit-learn>=0.15.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (0.20.3)
Requirement already satisfied: statsmodels>=0.5.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (0.9.0)
Requirement already satisfied: kaggle in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (1.5.3)
Requirement already satisfied: tensorflow in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (1.13.1)
Requirement already satisfied: keras in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from Kaggler) (2.2.4)
Requirement already satisfied: six in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from h5py->Kaggler) (1.12.0)
Requirement already satisfied: pytz>=2011k in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from pandas->Kaggler) (2018.9)
Requirement already satisfied: python-dateutil>=2.5.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from pandas->Kaggler) (2.8.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from matplotlib->Kaggler) (2.3.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from matplotlib->Kaggler) (1.0.1)
Requirement already satisfied: cycler>=0.10 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from matplotlib->Kaggler) (0.10.0)
Requirement already satisfied: requests in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from kaggle->Kaggler) (2.21.0)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from kaggle->Kaggler) (1.24.1)
Requirement already satisfied: python-slugify in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from kaggle->Kaggler) (3.0.2)
Requirement already satisfied: certifi in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from kaggle->Kaggler) (2019.3.9)
Requirement already satisfied: tqdm in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from kaggle->Kaggler) (4.32.1)
Requirement already satisfied: wheel>=0.26 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (0.33.1)
Requirement already satisfied: tensorboard<1.14.0,>=1.13.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (1.13.1)
Requirement already satisfied: gast>=0.2.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (0.2.2)
Requirement already satisfied: tensorflow-estimator<1.14.0rc0,>=1.13.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (1.13.0)
Requirement already satisfied: astor>=0.6.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (0.7.1)
Requirement already satisfied: termcolor>=1.1.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (1.1.0)
Requirement already satisfied: keras-preprocessing>=1.0.5 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (1.0.9)
Requirement already satisfied: protobuf>=3.6.1 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (3.7.1)
Requirement already satisfied: grpcio>=1.8.6 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (1.20.1)
Requirement already satisfied: absl-py>=0.1.6 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (0.7.1)
Requirement already satisfied: keras-applications>=1.0.6 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow->Kaggler) (1.0.7)
Requirement already satisfied: pyyaml in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from keras->Kaggler) (5.1)
Requirement already satisfied: idna<2.9,>=2.5 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from requests->kaggle->Kaggler) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from requests->kaggle->Kaggler) (3.0.4)
Requirement already satisfied: text-unidecode==1.2 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from python-slugify->kaggle->Kaggler) (1.2)
Requirement already satisfied: werkzeug>=0.11.15 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorboard<1.14.0,>=1.13.0->tensorflow->Kaggler) (0.14.1)
Requirement already satisfied: markdown>=2.6.8 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorboard<1.14.0,>=1.13.0->tensorflow->Kaggler) (3.1)
Requirement already satisfied: mock>=2.0.0 in /anaconda2/envs/python3.6/lib/python3.6/site-packages (from tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow->Kaggler) (3.0.5)
Installing collected packages: Kaggler
  Running setup.py install for Kaggler ... error
    Complete output from command /anaconda2/envs/python3.6/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/1y/btgkmt992l94_1d37rkvhc380000gn/T/pip-install-ttr3it94/Kaggler/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/1y/btgkmt992l94_1d37rkvhc380000gn/T/pip-record-g7a_hyv1/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build/lib.macosx-10.7-x86_64-3.6
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler
    copying kaggler/data_io.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler
    copying kaggler/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler
    copying kaggler/const.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler/feature_selection
    copying kaggler/feature_selection/feature_selection.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/feature_selection
    copying kaggler/feature_selection/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/feature_selection
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler/ensemble
    copying kaggler/ensemble/linear.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/ensemble
    copying kaggler/ensemble/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/ensemble
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler/model
    copying kaggler/model/nn.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/model
    copying kaggler/model/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/model
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler/metrics
    copying kaggler/metrics/regression.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/metrics
    copying kaggler/metrics/classification.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/metrics
    copying kaggler/metrics/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/metrics
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler/online_model
    copying kaggler/online_model/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/online_model
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler/preprocessing
    copying kaggler/preprocessing/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/preprocessing
    copying kaggler/preprocessing/data.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/preprocessing
    creating build/lib.macosx-10.7-x86_64-3.6/kaggler/test
    copying kaggler/test/test_sgd.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/test
    copying kaggler/test/test_ftrl.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/test
    copying kaggler/test/test_lbe.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/test
    copying kaggler/test/__init__.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/test
    copying kaggler/test/test_ohe.py -> build/lib.macosx-10.7-x86_64-3.6/kaggler/test
    running build_ext
    skipping 'kaggler/online_model/ftrl.c' Cython extension (up-to-date)
    building 'kaggler.online_model.ftrl' extension
    creating build/temp.macosx-10.7-x86_64-3.6
    creating build/temp.macosx-10.7-x86_64-3.6/kaggler
    creating build/temp.macosx-10.7-x86_64-3.6/kaggler/online_model
    creating build/temp.macosx-10.7-x86_64-3.6/kaggler/online_model/murmurhash
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/anaconda2/envs/python3.6/include -arch x86_64 -I/anaconda2/envs/python3.6/include -arch x86_64 -I. -I/anaconda2/envs/python3.6/include/python3.6m -I/anaconda2/envs/python3.6/lib/python3.6/site-packages/numpy/core/include -c kaggler/online_model/ftrl.c -o build/temp.macosx-10.7-x86_64-3.6/kaggler/online_model/ftrl.o -O3
    In file included from kaggler/online_model/ftrl.c:594:
    In file included from /anaconda2/envs/python3.6/lib/python3.6/site-packages/numpy/core/include/numpy/arrayobject.h:4:
    In file included from /anaconda2/envs/python3.6/lib/python3.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:12:
    In file included from /anaconda2/envs/python3.6/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1824:
    /anaconda2/envs/python3.6/lib/python3.6/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: "Using deprecated NumPy API, disable it with "          "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
    #warning "Using deprecated NumPy API, disable it with " \
     ^
    1 warning generated.
    gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/anaconda2/envs/python3.6/include -arch x86_64 -I/anaconda2/envs/python3.6/include -arch x86_64 -I. -I/anaconda2/envs/python3.6/include/python3.6m -I/anaconda2/envs/python3.6/lib/python3.6/site-packages/numpy/core/include -c kaggler/online_model/murmurhash/MurmurHash3.cpp -o build/temp.macosx-10.7-x86_64-3.6/kaggler/online_model/murmurhash/MurmurHash3.o -O3
    warning: include path for stdlibc++ headers not found; pass '-stdlib=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
    1 warning generated.
    g++ -bundle -undefined dynamic_lookup -L/anaconda2/envs/python3.6/lib -arch x86_64 -L/anaconda2/envs/python3.6/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.7-x86_64-3.6/kaggler/online_model/ftrl.o build/temp.macosx-10.7-x86_64-3.6/kaggler/online_model/murmurhash/MurmurHash3.o -o build/lib.macosx-10.7-x86_64-3.6/kaggler/online_model/ftrl.cpython-36m-darwin.so
    clang: warning: libstdc++ is deprecated; move to libc++ with a minimum deployment target of OS X 10.9 [-Wdeprecated]
    ld: library not found for -lstdc++
    clang: error: linker command failed with exit code 1 (use -v to see invocation)
    error: command 'g++' failed with exit status 1

    ----------------------------------------
Command "/anaconda2/envs/python3.6/bin/python -u -c "import setuptools, tokenize;__file__='/private/var/folders/1y/btgkmt992l94_1d37rkvhc380000gn/T/pip-install-ttr3it94/Kaggler/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/var/folders/1y/btgkmt992l94_1d37rkvhc380000gn/T/pip-record-g7a_hyv1/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/1y/btgkmt992l94_1d37rkvhc380000gn/T/pip-install-ttr3it94/Kaggler/

ERROR: Command errored out with exit status 1

Hello,

I'm trying to install Kaggler with pip on macOS (pip install -U Kaggler), but I get the errors below.

Could anyone help me fix it? Thanks.

`Building wheels for collected packages: Kaggler
Building wheel for Kaggler (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /usr/local/opt/python/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/setup.py'"'"'; file='"'"'/private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-wheel-ah9hp8b7 --python-tag cp37
cwd: /private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/
Complete output (43 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-10.14-x86_64-3.7
creating build/lib.macosx-10.14-x86_64-3.7/kaggler
copying kaggler/data_io.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler
copying kaggler/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler
copying kaggler/const.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/feature_selection
copying kaggler/feature_selection/feature_selection.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/feature_selection
copying kaggler/feature_selection/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/feature_selection
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/ensemble
copying kaggler/ensemble/linear.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/ensemble
copying kaggler/ensemble/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/ensemble
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/model
copying kaggler/model/nn.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/model
copying kaggler/model/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/model
copying kaggler/model/automl.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/model
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
copying kaggler/metrics/regression.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
copying kaggler/metrics/classification.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
copying kaggler/metrics/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/online_model
copying kaggler/online_model/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/online_model
copying kaggler/online_model/classification_tree.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/online_model
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/preprocessing
copying kaggler/preprocessing/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/preprocessing
copying kaggler/preprocessing/data.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/preprocessing
running build_ext
skipping 'kaggler/online_model/ftrl.c' Cython extension (up-to-date)
building 'kaggler.online_model.ftrl' extension
creating build/temp.macosx-10.14-x86_64-3.7
creating build/temp.macosx-10.14-x86_64-3.7/kaggler
creating build/temp.macosx-10.14-x86_64-3.7/kaggler/online_model
creating build/temp.macosx-10.14-x86_64-3.7/kaggler/online_model/murmurhash
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -I. -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/include/python3.7m -I/usr/local/lib/python3.7/site-packages/numpy/core/include -c kaggler/online_model/ftrl.c -o build/temp.macosx-10.14-x86_64-3.7/kaggler/online_model/ftrl.o -O3 -mmacosx-version-min=10.9
In file included from kaggler/online_model/ftrl.c:4:
/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/include/python3.7m/Python.h:25:10: fatal error: 'stdio.h' file not found
#include <stdio.h>
^~~~~~~~~
1 error generated.
error: command 'clang' failed with exit status 1

ERROR: Failed building wheel for Kaggler
Running setup.py clean for Kaggler
Failed to build Kaggler
Installing collected packages: Kaggler
Running setup.py install for Kaggler ... error
ERROR: Command errored out with exit status 1:
command: /usr/local/opt/python/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/setup.py'"'"'; file='"'"'/private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-record-qohfjlru/install-record.txt --single-version-externally-managed --compile
cwd: /private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/
Complete output (43 lines):
running install
running build
running build_py
creating build
creating build/lib.macosx-10.14-x86_64-3.7
creating build/lib.macosx-10.14-x86_64-3.7/kaggler
copying kaggler/data_io.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler
copying kaggler/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler
copying kaggler/const.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/feature_selection
copying kaggler/feature_selection/feature_selection.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/feature_selection
copying kaggler/feature_selection/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/feature_selection
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/ensemble
copying kaggler/ensemble/linear.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/ensemble
copying kaggler/ensemble/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/ensemble
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/model
copying kaggler/model/nn.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/model
copying kaggler/model/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/model
copying kaggler/model/automl.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/model
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
copying kaggler/metrics/regression.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
copying kaggler/metrics/classification.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
copying kaggler/metrics/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/metrics
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/online_model
copying kaggler/online_model/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/online_model
copying kaggler/online_model/classification_tree.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/online_model
creating build/lib.macosx-10.14-x86_64-3.7/kaggler/preprocessing
copying kaggler/preprocessing/__init__.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/preprocessing
copying kaggler/preprocessing/data.py -> build/lib.macosx-10.14-x86_64-3.7/kaggler/preprocessing
running build_ext
skipping 'kaggler/online_model/ftrl.c' Cython extension (up-to-date)
building 'kaggler.online_model.ftrl' extension
creating build/temp.macosx-10.14-x86_64-3.7
creating build/temp.macosx-10.14-x86_64-3.7/kaggler
creating build/temp.macosx-10.14-x86_64-3.7/kaggler/online_model
creating build/temp.macosx-10.14-x86_64-3.7/kaggler/online_model/murmurhash
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -I. -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/include/python3.7m -I/usr/local/lib/python3.7/site-packages/numpy/core/include -c kaggler/online_model/ftrl.c -o build/temp.macosx-10.14-x86_64-3.7/kaggler/online_model/ftrl.o -O3 -mmacosx-version-min=10.9
In file included from kaggler/online_model/ftrl.c:4:
/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/include/python3.7m/Python.h:25:10: fatal error: 'stdio.h' file not found
#include <stdio.h>
^~~~~~~~~
1 error generated.
error: command 'clang' failed with exit status 1
----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/opt/python/bin/python3.7 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/setup.py'"'"'; file='"'"'/private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-install-f8todfi2/Kaggler/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /private/var/folders/_g/2rrxbrnj1xb_h5k70xgtdss00000gn/T/pip-record-qohfjlru/install-record.txt --single-version-externally-managed --compile Check the logs for full command output.`

API interface modified

After some modifications, I forgot to make update_one and predict_one robust to other use cases. The interfaces should be updated.

AutoLGB tune example is not working

Hello, I tried to run AutoLGB in a Kaggle notebook. I installed it with pip:

!pip install kaggler

Kaggler version: 0.9.13

After that, I ran the following code snippet and got an error; the same error happened with another dataset.

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from kaggler.metrics import auc
from kaggler.model import AutoLGB


RANDOM_SEED = 42
N_OBS = 10000
N_FEATURE = 100
N_IMP_FEATURE = 20

X, y = make_classification(n_samples=N_OBS,
                            n_features=N_FEATURE,
                            n_informative=N_IMP_FEATURE,
                            random_state=RANDOM_SEED)
X = pd.DataFrame(X, columns=['x{}'.format(i) for i in range(X.shape[1])])
y = pd.Series(y)

X_trn, X_tst, y_trn, y_tst = train_test_split(X, y,
                                                test_size=.2,
                                                random_state=RANDOM_SEED)

model = AutoLGB(objective='binary', metric='auc')
model.tune(X_trn, y_trn)
model.fit(X_trn, y_trn)
p = model.predict(X_tst)
print('AUC: {:.4f}'.format(auc(y_tst, p)))


How to save FTRL status

If I train FTRL with Kaggler, how can I save the model state for the next time I use it? I do not want to train it again every time.
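
I'm not aware of a dedicated save/load API in Kaggler for the online models; if the FTRL object supports pickling (not verified here, since it is a Cython extension type), a generic approach would be:

import pickle

# clf is a trained FTRL model as in the examples above.
# Works only if the FTRL extension type supports pickling (unverified).
with open('ftrl_model.pkl', 'wb') as f:
    pickle.dump(clf, f)

with open('ftrl_model.pkl', 'rb') as f:
    clf = pickle.load(f)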

ERROR: Could not build wheels for kaggler

After "pip install kaggler", got a "Failed building wheel for kaggler" error.

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for kaggler
Running setup.py clean for kaggler
Failed to build kaggler
ERROR: Could not build wheels for kaggler, which is required to install pyproject.toml-based projects
