
nyaggle's People

Contributors

daikikatsuragawa, harupy, kajyuuen, momijiame, nyanp, ryanrussell, tenajima, wakame1367, yuta100101


nyaggle's Issues

Organize the libraries installed via requirements.txt

I think the libraries needed only for testing should be extracted from https://github.com/nyanp/nyaggle/blob/master/requirements-dev.txt.
For example, pytest is required only when running tests,
so it would be better to create a new requirements-test.txt and move such libraries there.

Flatten logging parameters to improve readability

In run_experiment, model_params and fit_params are stored as strings to distinguish them from other parameters, which makes it difficult to compare parameters across experiments.

It seems better to flatten the dictionary as in the PR below.
mlflow/mlflow#1863

Before

{
    "fit_params": "{ early_stopping_rounds: 100 }",
    "model_params": "{ max_depth: 3, objective: \"binary\" }",
    "algorithm_type": "lgbm"
}

After

{
    "fit_params.early_stopping_rounds": 100,
    "model_params.max_depth": 3,
    "model_params.objective": "binary",
    "algorithm_type": "lgbm"
}
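
A minimal sketch of the flattening, assuming a plain nested dict and the dot-separated key convention shown above (the helper name flatten_params is hypothetical):

from typing import Any, Dict

def flatten_params(params: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    # Flatten a nested parameter dict into dot-separated keys.
    flat = {}
    for key, value in params.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_params(value, full_key))
        else:
            flat[full_key] = value
    return flat

params = {
    "fit_params": {"early_stopping_rounds": 100},
    "model_params": {"max_depth": 3, "objective": "binary"},
    "algorithm_type": "lgbm",
}
print(flatten_params(params))
# {'fit_params.early_stopping_rounds': 100, 'model_params.max_depth': 3,
#  'model_params.objective': 'binary', 'algorithm_type': 'lgbm'}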

Drop Python 3.5 support temporarily

nyaggle needs to drop Python 3.5 support because the latest version of xgboost is incompatible with Python 3.5 (it uses f-strings). I guess the change was not intended on their side, but we need to drop Python 3.5 from our CI to avoid continuously failing tests.

cross_validate doesn't work with LightGBM v4.0.0

Thanks for publishing such a useful tool!

A few days ago, LightGBM's new version 4.0.0 was released.
In this release, the early_stopping_rounds argument of fit() was removed.

As a result, functions that use cross_validate(), such as run_experiment, don't work.
(There may be other functions that don't work; I haven't investigated yet.)

Of course, there is no problem with version 3.3.5 and earlier.

pytest log
(nyaggle) yuta100101:~/nyaggle(master =)$ pytest tests/validation/test_cross_validate.py::test_cv_lgbm
========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.9.17, pytest-7.4.0, pluggy-1.2.0
rootdir: /home/yuta100101/practice/nyaggle
collected 1 item                                                                                                                                                                                         

tests/validation/test_cross_validate.py F                                                                                                                                                          [100%]

================================================================================================ FAILURES ================================================================================================
______________________________________________________________________________________________ test_cv_lgbm ______________________________________________________________________________________________

    def test_cv_lgbm():
        X, y = make_classification(n_samples=1024, n_features=20, class_sep=0.98, random_state=0)
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
    
        models = [LGBMClassifier(n_estimators=300) for _ in range(5)]
    
>       pred_oof, pred_test, scores, importance = cross_validate(models, X_train, y_train, X_test, cv=5,
                                                                 eval_func=roc_auc_score,
                                                                 fit_params={'early_stopping_rounds': 200})

tests/validation/test_cross_validate.py:52: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

estimator = [LGBMClassifier(n_estimators=300), LGBMClassifier(n_estimators=300), LGBMClassifier(n_estimators=300), LGBMClassifier(n_estimators=300), LGBMClassifier(n_estimators=300)]
X_train =            0         1         2         3         4         5         6         7         8   ...        11        12... ... -0.109782 -0.412230  1.707714 -0.240937 -0.276747  0.481276 -0.278111  1.304773 -0.139538

[512 rows x 20 columns]
y = 0      0
1      0
2      0
3      1
4      0
      ..
507    0
508    1
509    0
510    1
511    0
Name: target, Length: 512, dtype: int64
X_test =            0         1         2         3         4         5         6         7         8   ...        11        12... ... -2.598922 -0.351561  0.233836 -1.873634 -1.089221  0.373956 -0.520939 -0.489945  2.452996

[512 rows x 20 columns]
cv = KFold(n_splits=5, random_state=0, shuffle=True), groups = None, eval_func = <function roc_auc_score at 0x7fe910196ee0>, logger = <Logger nyaggle.validation.cross_validate (WARNING)>
on_each_fold = None, fit_params = {'early_stopping_rounds': 200}, importance_type = 'gain', early_stopping = True, type_of_target = 'binary'

    def cross_validate(estimator: Union[BaseEstimator, List[BaseEstimator]],
                       X_train: Union[pd.DataFrame, np.ndarray], y: Union[pd.Series, np.ndarray],
                       X_test: Union[pd.DataFrame, np.ndarray] = None,
                       cv: Optional[Union[int, Iterable, BaseCrossValidator]] = None,
                       groups: Optional[pd.Series] = None,
                       eval_func: Optional[Callable] = None, logger: Optional[Logger] = None,
                       on_each_fold: Optional[Callable[[int, BaseEstimator, pd.DataFrame, pd.Series], None]] = None,
                       fit_params: Optional[Union[Dict[str, Any], Callable]] = None,
                       importance_type: str = 'gain',
                       early_stopping: bool = True,
                       type_of_target: str = 'auto') -> CVResult:
        """
        Evaluate metrics by cross-validation. It also records out-of-fold prediction and test prediction.
    
        Args:
            estimator:
                The object to be used in cross-validation. For list inputs, ``estimator[i]`` is trained on i-th fold.
            X_train:
                Training data
            y:
                Target
            X_test:
                Test data (Optional). If specified, prediction on the test data is performed using ensemble of models.
            cv:
                int, cross-validation generator or an iterable which determines the cross-validation splitting strategy.
    
                - None, to use the default ``KFold(5, random_state=0, shuffle=True)``,
                - integer, to specify the number of folds in a ``(Stratified)KFold``,
                - CV splitter (the instance of ``BaseCrossValidator``),
                - An iterable yielding (train, test) splits as arrays of indices.
            groups:
                Group labels for the samples. Only used in conjunction with a “Group” cv instance (e.g., ``GroupKFold``).
            eval_func:
                Function used for logging and returning scores
            logger:
                logger
            on_each_fold:
                called for each fold with (idx_fold, model, X_fold, y_fold)
            fit_params:
                Parameters passed to the fit method of the estimator
            importance_type:
                The type of feature importance to be used to calculate result.
                Used only in ``LGBMClassifier`` and ``LGBMRegressor``.
            early_stopping:
                If ``True``, ``eval_set`` will be added to ``fit_params`` for each fold.
                ``early_stopping_rounds = 100`` will also be appended to fit_params if it does not already have one.
            type_of_target:
                The type of target variable. If ``auto``, type is inferred by ``sklearn.utils.multiclass.type_of_target``.
                Otherwise, ``binary``, ``continuous``, or ``multiclass`` are supported.
        Returns:
            Namedtuple with following members
    
            * oof_prediction (numpy array, shape (len(X_train),)):
            The predicted value on out-of-fold validation data.
        * test_prediction (numpy array, shape (len(X_test),)):
                The predicted value on test data. ``None`` if X_test is ``None``.
            * scores (list of float, shape (nfolds+1,)):
                ``scores[i]`` denotes validation score in i-th fold.
                ``scores[-1]`` is the overall score. `None` if eval is not specified.
            * importance (list of pandas DataFrame, shape (nfolds,)):
                ``importance[i]`` denotes feature importance in i-th fold model.
                If the estimator is not GBDT, empty array is returned.
    
        Example:
            >>> from sklearn.datasets import make_regression
            >>> from sklearn.linear_model import Ridge
            >>> from sklearn.metrics import mean_squared_error
            >>> from nyaggle.validation import cross_validate
    
            >>> X, y = make_regression(n_samples=8)
            >>> model = Ridge(alpha=1.0)
            >>> pred_oof, pred_test, scores, _ = \
            >>>     cross_validate(model,
            >>>                    X_train=X[:3, :],
            >>>                    y=y[:3],
            >>>                    X_test=X[3:, :],
            >>>                    cv=3,
            >>>                    eval_func=mean_squared_error)
            >>> print(pred_oof)
            [-101.1123267 ,   26.79300693,   17.72635528]
            >>> print(pred_test)
            [-10.65095894 -12.18909059 -23.09906427 -17.68360714 -20.08218267]
            >>> print(scores)
            [71912.80290003832, 15236.680239881942, 15472.822033121925, 34207.43505768073]
        """
        cv = check_cv(cv, y)
        n_output_cols = 1
        if type_of_target == 'auto':
            type_of_target = multiclass.type_of_target(y)
        if type_of_target == 'multiclass':
            n_output_cols = y.nunique(dropna=True)
    
        if isinstance(estimator, list):
            assert len(estimator) == cv.get_n_splits(), "Number of estimators should be same to nfolds."
    
        X_train = convert_input(X_train)
        y = convert_input_vector(y, X_train.index)
        if X_test is not None:
            X_test = convert_input(X_test)
    
        if not isinstance(estimator, list):
            estimator = [estimator] * cv.get_n_splits()
    
        assert len(estimator) == cv.get_n_splits()
    
        if logger is None:
            logger = getLogger(__name__)
    
        def _predict(model: BaseEstimator, x: pd.DataFrame, _type_of_target: str):
            if _type_of_target in ('binary', 'multiclass'):
                if hasattr(model, "predict_proba"):
                    proba = model.predict_proba(x)
                elif hasattr(model, "decision_function"):
                    warnings.warn('Since {} does not have predict_proba method, '
                                  'decision_function is used for the prediction instead.'.format(type(model)))
                    proba = model.decision_function(x)
                else:
                    raise RuntimeError('Estimator in classification problem should have '
                                       'either predict_proba or decision_function')
                if proba.ndim == 1:
                    return proba
                else:
                    return proba[:, 1] if proba.shape[1] == 2 else proba
            else:
                return model.predict(x)
    
        oof = np.zeros((len(X_train), n_output_cols)) if n_output_cols > 1 else np.zeros(len(X_train))
        evaluated = np.full(len(X_train), False)
        test = None
        if X_test is not None:
            test = np.zeros((len(X_test), n_output_cols)) if n_output_cols > 1 else np.zeros(len(X_test))
    
        scores = []
        eta_all = []
        importance = []
    
        for n, (train_idx, valid_idx) in enumerate(cv.split(X_train, y, groups)):
            start_time = time.time()
    
            train_x, train_y = X_train.iloc[train_idx], y.iloc[train_idx]
            valid_x, valid_y = X_train.iloc[valid_idx], y.iloc[valid_idx]
    
            if fit_params is None:
                fit_params_fold = {}
            elif callable(fit_params):
                fit_params_fold = fit_params(n, train_idx, valid_idx)
            else:
                fit_params_fold = copy.copy(fit_params)
    
            if is_gbdt_instance(estimator[n], ('lgbm', 'cat', 'xgb')):
                if early_stopping:
                    if 'eval_set' not in fit_params_fold:
                        fit_params_fold['eval_set'] = [(valid_x, valid_y)]
                    if 'early_stopping_rounds' not in fit_params_fold:
                        fit_params_fold['early_stopping_rounds'] = 100
    
>               estimator[n].fit(train_x, train_y, **fit_params_fold)
E               TypeError: fit() got an unexpected keyword argument 'early_stopping_rounds'

nyaggle/validation/cross_validate.py:177: TypeError
======================================================================================== short test summary info =========================================================================================
FAILED tests/validation/test_cross_validate.py::test_cv_lgbm - TypeError: fit() got an unexpected keyword argument 'early_stopping_rounds'
=========================================================================================== 1 failed in 1.90s ============================================================================================
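
For reference, LightGBM 4.x configures early stopping through callbacks rather than fit() keyword arguments. A minimal sketch of the replacement call (outside nyaggle, under that assumption):

import lightgbm as lgb
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1024, n_features=20, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.5, random_state=0)

model = LGBMClassifier(n_estimators=300)
# LightGBM >= 4.0: pass early stopping as a callback; the
# fit(..., early_stopping_rounds=...) keyword was removed.
model.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    callbacks=[lgb.early_stopping(stopping_rounds=200)],
)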


Time series split

TimeSeriesSplit in sklearn does not fit most Kaggle datasets (it assumes fixed time intervals). We need a scikit-learn-compatible, practical CV splitter for general time-series problems.
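
A minimal sketch of what such a splitter could look like, assuming explicit datetime boundaries per fold (SimpleTimeSeriesSplit and its constructor arguments are hypothetical, not nyaggle's actual API):

import numpy as np
import pandas as pd

class SimpleTimeSeriesSplit:
    # Yield train/test indices from explicit per-fold datetime boundaries.
    def __init__(self, times: pd.Series, folds):
        # folds: list of ((train_start, train_end), (test_start, test_end))
        self.times = pd.to_datetime(times)
        self.folds = folds

    def split(self, X=None, y=None, groups=None):
        for (tr_s, tr_e), (te_s, te_e) in self.folds:
            train_idx = np.where((self.times >= tr_s) & (self.times < tr_e))[0]
            test_idx = np.where((self.times >= te_s) & (self.times < te_e))[0]
            yield train_idx, test_idx

    def get_n_splits(self, X=None, y=None, groups=None):
        return len(self.folds)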

experiment_gbdt raises errors with long parameters and mlflow

mlflow raises an error if the length of a logged key/value exceeds 250 characters. If the GBDT parameters or cat_columns are long, experiment_gbdt will raise an exception.

Possible options:

  • catch and ignore all errors from mlflow
  • truncate logged parameters automatically (see the sketch below)
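
A minimal sketch of the truncation option, assuming the 250-character limit applies to the stringified value (truncate_for_mlflow is a hypothetical helper):

MAX_PARAM_LENGTH = 250  # mlflow's limit on logged parameter values

def truncate_for_mlflow(params: dict, limit: int = MAX_PARAM_LENGTH) -> dict:
    # Stringify values and truncate anything longer than the limit.
    return {key: str(value)[:limit] for key, value in params.items()}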

Change default parameters of TargetEncoder

nyaggle's TargetEncoder is basically a KFold version of category_encoders, with the same interface.
category_encoders changed its default parameters in scikit-learn-contrib/category_encoders#327, so it would be better to apply the same change to nyaggle. (This change also causes some CI tests to fail.)

Personally, I don't think the new default parameters are always good, but as long as nyaggle's target encoder is a thin wrapper around category_encoders, I think interface consistency should be the priority.
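
Until the defaults are aligned, a user can pin the affected parameters explicitly so behavior does not shift with the category_encoders version. A sketch, assuming nyaggle's wrapper forwards min_samples_leaf and smoothing to category_encoders:

from sklearn.model_selection import KFold
from nyaggle.feature.category_encoder.target_encoder import TargetEncoder

# Pin the parameters explicitly instead of relying on library defaults
# (assumes these keyword arguments are forwarded to category_encoders).
encoder = TargetEncoder(
    cols=["category_column"],
    min_samples_leaf=1,
    smoothing=1.0,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)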

How do I record scores calculated using the target variable before transformation? (e.g. log transformation)

The code below records the score of the target variable transformed by np.log1p.

train, test = load_dataset()
target_col = "y"
submit = make_sample_submission(test, target_col)
target = train[target_col]
target = target.map(np.log1p)
train.drop(columns=[target_col], inplace=True)

lightgbm_params = {
    "metric": "rmse",
    "objective": "regression",
    "max_depth": 5,
    "num_leaves": 24,
    "learning_rate": 0.007,
    "n_estimators": 30000,
    "min_child_samples": 80,
    "subsample": 0.8,
    "colsample_bytree": 1,
    "reg_alpha": 0,
    "reg_lambda": 0,
}

fit_params = {
    "early_stopping_rounds": 100,
    "verbose": 5000,
}

kf = KFold(n_splits=4)
lgb_result = run_experiment(lightgbm_params,
                            X_train=train,
                            y=target,
                            X_test=test,
                            eval_func=rmse,
                            cv=kf,
                            fit_params=fit_params,
                            logging_directory='resources/logs/lightgbm/{time}',
                            sample_submission=submit)
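
One possible answer is to invert the transformation inside eval_func, so the logged score is computed on the original scale. A sketch (rmse_on_original_scale is a hypothetical helper; it assumes run_experiment passes the log-transformed y_true/y_pred to eval_func):

import numpy as np
from sklearn.metrics import mean_squared_error

def rmse_on_original_scale(y_true, y_pred):
    # Undo np.log1p before scoring, so the score is on the raw target.
    return np.sqrt(mean_squared_error(np.expm1(y_true), np.expm1(y_pred)))

Passing eval_func=rmse_on_original_scale to run_experiment would then record the back-transformed score instead of the log-scale one.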

Support lgb.cv and xgb.cv for cross-validation

Unlike the current implementation of CV in nyaggle, the models trained by lgb.cv and xgb.cv have an equal number of trees in all folds.

Since these "balanced" models may work better when the amount of data is small, we sometimes want to extract the trained models from lgb.cv or xgb.cv and use them on test data.

So it would be useful to have the option to use these cv functions in nyaggle's run_experiment and cross_validate as well.

ref:
https://blog.amedama.jp/entry/lightgbm-cv-model
https://blog.amedama.jp/entry/xgboost-cv-model
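
For reference, a minimal sketch of extracting the per-fold models from lgb.cv (assumes a LightGBM version that supports return_cvbooster, i.e. 3.0 or later):

import lightgbm as lgb
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1024, n_features=20, random_state=0)
train_set = lgb.Dataset(X, label=y)

result = lgb.cv(
    {"objective": "binary"},
    train_set,
    num_boost_round=100,
    nfold=5,
    return_cvbooster=True,  # keep the per-fold boosters
)
cvbooster = result["cvbooster"]
# Average the per-fold predictions; every booster has the same number of trees.
pred = np.mean(cvbooster.predict(X), axis=0)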

ValueError: Supported types are: <class 'str'> or typing.Callable. Got <class 'numpy._ArrayFunctionDispatcher'> instead.

Overview

Environment

Python 3.9.10 (tags/v3.9.10:f2f3f53, Jan 17 2022, 15:14:21) [MSC v.1929 64 bit (AMD64)] on win32

Error message

Running pytest produced the following error message.

> pytest
========================================================================= short test summary info ========================================================================= 
FAILED tests/feature/test_groupby.py::test_return_type_by_aggregation - ValueError: Supported types are: <class 'str'> or typing.Callable. Got <class 'numpy._ArrayFunctionDispatcher'> instead.
FAILED tests/feature/nlp/test_bert.py::test_bert_jp - requests.exceptions.ConnectionError: HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out.    
======================================================== 2 failed, 106 passed, 1084 warnings in 332.27s (0:05:32) ========================================================= 
pytest FAILURES message

_____________________________________________________________________ test_return_type_by_aggregation _____________________________________________________________________

iris_dataframe = ( sl sw pl pw species
0 5.1 3.5 1.4 0.2 0.0
1 4.9 3.0 1.4 0.2 0.0
2 4.7 3.2 1.3... 3.4 5.4 2.3 2.0
149 5.9 3.0 5.1 1.8 2.0

[150 rows x 5 columns], 'species', ['sl', 'sw', 'pl', 'pw'])

def test_return_type_by_aggregation(iris_dataframe):
    df, group_key, group_values = iris_dataframe
    agg_methods = ["max", np.sum, custom_function]
  new_df, new_cols = aggregation(df, group_key, group_values,
                                   agg_methods)

tests\feature\test_groupby.py:27:


input_df = sl sw pl pw species
0 5.1 3.5 1.4 0.2 0.0
1 4.9 3.0 1.4 0.2 0.0
2 4.7 3.2 1.3 ... 6.5 3.0 5.2 2.0 2.0
148 6.2 3.4 5.4 2.3 2.0
149 5.9 3.0 5.1 1.8 2.0

[150 rows x 5 columns]
group_key = 'species', group_values = ['sl', 'sw', 'pl', 'pw']
agg_methods = ['max', <function sum at 0x000002226B6C2CF0>, <function custom_function at 0x000002221318C280>]

def aggregation(
        input_df: pd.DataFrame,
        group_key: str,
        group_values: List[str],
        agg_methods: List[Union[str, FunctionType]],
) -> Tuple[pd.DataFrame, List[str]]:
    """
    Aggregate values after grouping table rows by a given key.

    Args:
        input_df:
            Input data frame.
        group_key:
            Used to determine the groups for the groupby.
        group_values:
            Used to aggregate values for the groupby.
        agg_methods:
            List of functions or function names, e.g. ['mean', 'max', 'min', numpy.mean].
            Do not use a lambda function, because its name attribute is always '<lambda>' and cannot generate unique column names.
    Returns:
        Tuple of output dataframe and new column names.
    """
    new_df = input_df.copy()

    new_cols = []
    for agg_method in agg_methods:
        if _is_lambda_function(agg_method):
            raise ValueError('Not supported lambda function.')
        elif isinstance(agg_method, str):
            pass
        elif isinstance(agg_method, FunctionType):
            pass
        else:
          raise ValueError('Supported types are: {} or {}.'
                             ' Got {} instead.'.format(str, Callable, type(agg_method)))

E ValueError: Supported types are: <class 'str'> or typing.Callable. Got <class 'numpy._ArrayFunctionDispatcher'> instead.

nyaggle\feature\groupby.py:89: ValueError

Cause of the error

In the test code, numpy.sum is passed as one of the agg_methods arguments to aggregation.

def test_return_type_by_aggregation(iris_dataframe):
    df, group_key, group_values = iris_dataframe
    agg_methods = ["max", np.sum, custom_function]
    new_df, new_cols = aggregation(df, group_key, group_values,
                                   agg_methods)
    assert isinstance(new_df, pd.DataFrame)
    assert isinstance(new_cols, list)

Only the following three kinds of values are handled by the agg_methods argument of aggregation:

  • <class 'str'>
  • <class 'function'>
  • lambda

Since numpy.sum's class is <class 'numpy._ArrayFunctionDispatcher'>, it is rejected by the if statement.

for agg_method in agg_methods:
    if _is_lambda_function(agg_method):
        raise ValueError('Not supported lambda function.')
    elif isinstance(agg_method, str):
        pass
    elif isinstance(agg_method, FunctionType):
        pass
    else:
        raise ValueError('Supported types are: {} or {}.'
                         ' Got {} instead.'.format(str, Callable, type(agg_method)))

Proposed fix

#105
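
One possible fix (a sketch, not necessarily what #105 does) is to accept any non-lambda callable instead of testing for FunctionType, since recent numpy versions wrap functions such as numpy.sum in numpy._ArrayFunctionDispatcher:

for agg_method in agg_methods:
    if _is_lambda_function(agg_method):
        raise ValueError('Not supported lambda function.')
    elif isinstance(agg_method, str) or callable(agg_method):
        # callable() also accepts numpy._ArrayFunctionDispatcher objects
        # such as numpy.sum, which are not instances of FunctionType.
        pass
    else:
        raise ValueError('Supported types are: {} or {}.'
                         ' Got {} instead.'.format(str, Callable, type(agg_method)))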

TargetEncoder can't be converted to a string (AttributeError)

Abstract

An AttributeError occurs when a TargetEncoder object is passed to the str() function.
This behavior could become a problem when debugging, etc.

How to reproduce

The steps to reproduce are as follows.

>>> from nyaggle.feature.category_encoder.target_encoder import TargetEncoder
>>> encoder = TargetEncoder()
>>> str(encoder)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/amedama/.virtualenvs/py310/lib/python3.10/site-packages/sklearn/base.py", line 279, in __repr__
    repr_ = pp.pformat(self)
  File "/usr/local/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pprint.py", line 157, in pformat
    self._format(object, sio, 0, 0, {}, 0)
  File "/usr/local/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pprint.py", line 174, in _format
    rep = self._repr(object, context, level)
  File "/usr/local/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pprint.py", line 454, in _repr
    repr, readable, recursive = self.format(object, context.copy(),
  File "/Users/amedama/.virtualenvs/py310/lib/python3.10/site-packages/sklearn/utils/_pprint.py", line 189, in format
    return _safe_repr(
  File "/Users/amedama/.virtualenvs/py310/lib/python3.10/site-packages/sklearn/utils/_pprint.py", line 440, in _safe_repr
    params = _changed_params(object)
  File "/Users/amedama/.virtualenvs/py310/lib/python3.10/site-packages/sklearn/utils/_pprint.py", line 93, in _changed_params
    params = estimator.get_params(deep=False)
  File "/Users/amedama/.virtualenvs/py310/lib/python3.10/site-packages/sklearn/base.py", line 211, in get_params
    value = getattr(self, key)
AttributeError: 'TargetEncoder' object has no attribute 'cols'

TargetEncoder takes a 'cols' parameter but doesn't save it as an attribute
(it is forwarded to category_encoders.TargetEncoder internally).
However, scikit-learn's BaseEstimator#get_params() expects every __init__() parameter to be saved as an attribute,
so accessing 'cols' raises an AttributeError.
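
For context, a minimal sketch of the scikit-learn convention that get_params() relies on (the toy MyEncoder class is hypothetical):

from sklearn.base import BaseEstimator

class MyEncoder(BaseEstimator):
    def __init__(self, cols=None):
        # scikit-learn requires every __init__ argument to be stored under
        # the same attribute name; get_params() reads it back via
        # getattr(self, 'cols'), which is what fails in nyaggle's encoder.
        self.cols = cols

print(str(MyEncoder(cols=['a'])))  # works: MyEncoder(cols=['a'])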

Environment

$ python -V                                   
Python 3.10.9
$ pip list | egrep -i "(nyaggle|scikit-learn)"
nyaggle            0.1.5
scikit-learn       1.2.0

Proposal about PR merge permission: adding Collaborators

Hi, this is @wakame1367.

Currently, the permission to merge PRs seems to be granted only to the repository creator. If @nyanp is the only person with merge permission, I think the burden on @nyanp is too large. As a solution, granting that permission to contributors as well would reduce the burden and improve development efficiency.

I am also concerned that some PRs have been left open for a long time.

For example, the following PR:
#96

So here is my proposal: how about adding trusted contributors as Collaborators of the repository? Contributors added as Collaborators gain additional permissions, including the permission to merge PRs.

The following describes the concrete steps for adding a Collaborator, for reference.

To give another user permission to merge Pull Requests (PRs) on GitHub, you need to add that user as a "Collaborator" or "Team Member" of the repository. The steps are as follows:

Adding a Collaborator

  1. First, navigate to the repository on GitHub.
  2. Select the "Settings" tab on the repository's main page.
  3. Select "Manage access" in the left-hand menu.
  4. Click "Invite a collaborator".
  5. Enter the GitHub username of the user you want to add and click "Add collaborator to [repository name]".
  6. Once the user accepts the invitation, they gain permission to review and merge PRs.

https://docs.github.com/ja/account-and-profile/setting-up-and-managing-your-personal-account-on-github/managing-personal-account-settings/permission-levels-for-a-personal-account-repository

Add hyperparameter zoo

Collecting hyperparameters used in top Kaggle solutions and providing an API to use them seems like a good starting point for a new competition.
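
A sketch of what such an API could look like (the names ZOO and get_hyperparameters are entirely hypothetical, not an existing nyaggle API):

# Hypothetical hyperparameter zoo: named presets collected from top solutions.
ZOO = {
    "lgbm/binary/baseline": {
        "objective": "binary",
        "max_depth": 8,
        "learning_rate": 0.01,
        "n_estimators": 10000,
    },
}

def get_hyperparameters(name: str) -> dict:
    # Return a copy of a named preset so callers can modify it safely.
    return dict(ZOO[name])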

Can I resume training from a fold?

I often use preemptible instances.
So, I want to resume training if the instance is shut down.
Can nyaggle resume training? If not, is nyaggle planning to implement it?
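
As far as I know this is not built in; a workaround sketch is to checkpoint each fold's model yourself and skip completed folds on restart (all names below are illustrative):

import os
import joblib

def fit_fold_with_checkpoint(model, fold, train_x, train_y, ckpt_dir="checkpoints"):
    # Fit one fold, reusing the saved model if the fold already finished.
    os.makedirs(ckpt_dir, exist_ok=True)
    path = os.path.join(ckpt_dir, f"fold_{fold}.joblib")
    if os.path.exists(path):
        return joblib.load(path)  # fold finished before the shutdown
    model.fit(train_x, train_y)
    joblib.dump(model, path)
    return model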

Support stacking

Support simple stacked generalization from multiple experiments.
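
A minimal sketch of the idea, assuming each base experiment exposes its out-of-fold and test predictions (e.g. the oof_prediction and test_prediction members of nyaggle's CVResult):

import numpy as np
from sklearn.linear_model import LogisticRegression

def stack(oof_preds, test_preds, y):
    # Use per-model OOF predictions as meta-features for a second-level model.
    meta_train = np.column_stack(oof_preds)
    meta_test = np.column_stack(test_preds)
    meta_model = LogisticRegression()
    meta_model.fit(meta_train, y)
    return meta_model.predict_proba(meta_test)[:, 1]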

Fix flake8 warnings

A number of flake8 errors were reported during the CI check.
While not fatal, it is desirable to fix them.

Proposal: create a CONTRIBUTING.md summarizing contribution guidelines

I think we should write a CONTRIBUTING.md specifying, for PRs and issues to this repository, things such as the format in which bug reports should be submitted and how to set up a test environment when you find a bug and open a PR.

I think the contents would include items like the following:

  • Environment setup: a guide for new contributors on setting up the development environment, including installing dependencies, setting up the project locally, and running the tests.
  • Bug reports: how to report a bug when you find one, noting that detailed reports with concrete reproduction steps, expected results, and actual results are helpful.
  • Proposing new features: how to propose new features or improvements, emphasizing the importance of providing concrete use cases and details of the proposal.
  • Submitting pull requests: how to create and submit a PR that fixes an existing issue, including making sure all tests pass before submitting and providing a detailed description of the changes.
  • Code style: details of any coding conventions or style guides to follow.
  • Community guidelines: a code of conduct that project participants should follow.

In short: establish guidelines for repository contributors.
