
toad's People

Contributors

kevin-meng, onefless, padfoot-zhou, qianweishuo, secbone, wolaituodiban

toad's Issues

Processing raises an error at runtime

image
image
I followed the demo, but I don't know why it raises AttributeError: 'TimePartition' object has no attribute 'apply'. I imported the following two lines:
from toad.preprocessing.partition import *
from toad.preprocessing.process import Processing

Time data in numeric type as X axis in plot

data

query_times_30d_pt  applymonth  sum  count  badrate
0.[-inf ~ 1.0)      201908      182  3121   0.058315
0.[-inf ~ 1.0)      201909      78   1234   0.063209
1.[1.0 ~ 2.0)       201908      175  2781   0.062927
1.[1.0 ~ 2.0)       201909      79   1053   0.075024
2.[2.0 ~ inf)       201908      328  3428   0.095683
2.[2.0 ~ inf)       201909      129  1262   0.102219

chart

image
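
The issue gives only a title and a table, so the following is a guess at the concern: if applymonth is stored as an integer such as 201908, a plotting routine may treat it as a continuous number on the x axis. A minimal sketch, using only pandas and the values from the table above, converts the column to text or to real dates before plotting. Whether toad's plotting functions then behave as wanted is not confirmed here.

```python
import pandas as pd

# A few rows taken from the table in this issue
df = pd.DataFrame({
    "applymonth": [201908, 201909, 201908, 201909],
    "badrate": [0.058315, 0.063209, 0.062927, 0.075024],
})

# Option 1: treat the month as a category label
as_text = df["applymonth"].astype(str)

# Option 2: parse YYYYMM into real timestamps (first day of each month)
as_date = pd.to_datetime(as_text, format="%Y%m")

print(as_text.tolist())
print(as_date.dt.strftime("%Y-%m").tolist())
```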

Instantiating a Transformer class does not initialize class parameters

combiner = toad.transform.Combiner()
combiner.fit(......)
print(combiner.export())

{'a':[1,2,3,4,5]}

combiner2 = toad.transform.Combiner()
print(combiner2.export())

{'a':[1,2,3,4,5]}

combiner2 is a new instance that was given no input, yet because of inheritance from the parent class it shares part of its data with combiner.
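
One common way to get this behavior in Python is a mutable attribute defined on the class rather than on the instance. The sketch below illustrates that pitfall with hypothetical classes (SharedBase and IsolatedBase are made up for illustration and are not toad's code); whether this is the actual cause in toad is an assumption.

```python
class SharedBase:
    # Class attribute: a single dict shared by every instance and subclass
    rules = {}

    def fit(self, name, splits):
        self.rules[name] = splits  # mutates the shared dict in place
        return self

    def export(self):
        return self.rules


class IsolatedBase:
    def __init__(self):
        # Instance attribute: each object gets its own dict
        self.rules = {}

    def fit(self, name, splits):
        self.rules[name] = splits
        return self

    def export(self):
        return self.rules


first = SharedBase().fit("a", [1, 2, 3, 4, 5])
second = SharedBase()
print(second.export())  # the rule fitted on `first` shows up here too

third = IsolatedBase().fit("a", [1, 2, 3, 4, 5])
fourth = IsolatedBase()
print(fourth.export())  # empty: no state leaked from `third`
```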

from toad.plot import badrate_plot, proportion_plot raises an error on import

from toad.plot import badrate_plot, proportion_plot
Traceback (most recent call last):
File "D:\program\Anaconda\lib\site-packages\numpy\core\function_base.py", line 117, in linspace
num = operator.index(num)
TypeError: 'float' object cannot be interpreted as an integer

Updating numpy did not resolve it:
>>> numpy.__version__
'1.18.0'

Gini in quality

Gini returns bad values: it returns 42-43 for all variables (checked on both binned and non-binned values).
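
For comparison, here is a minimal reference sketch of Gini impurity and its conditional form, assuming that is what "Gini" means here (the issue does not say, and this is not toad's implementation). For a binary target the impurity lies between 0 and 0.5, so a value of 42 to 43 could not be an unscaled impurity.

```python
import numpy as np

def gini_impurity(target):
    """Gini impurity 1 - sum(p_k^2) over the class frequencies p_k."""
    _, counts = np.unique(np.asarray(target), return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

def gini_conditional(feature, target):
    """Impurity of the target within each feature value, weighted by group size."""
    feature = np.asarray(feature)
    target = np.asarray(target)
    total = 0.0
    for value in np.unique(feature):
        mask = feature == value
        total += mask.mean() * gini_impurity(target[mask])
    return float(total)

y = [0, 0, 1, 1]
print(gini_impurity(y))                   # balanced binary target
print(gini_conditional([0, 0, 1, 1], y))  # feature separates the classes perfectly
print(gini_conditional([0, 1, 0, 1], y))  # feature carries no information
```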

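Binning question

from toad.plot import bin_plot

# c is assumed to be an already fitted Combiner; train_selected is the training DataFrame
col_lst = train_selected.columns.values[:-1].tolist()
for col in col_lst:
    bin_plot(c.transform(train_selected[[col, 'is_bad']], labels=True), x=col, target='is_bad')

This message appears when plotting the bins: "No handles with labels found to put in legend." It does not affect the plot itself.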

combiner n_bins question

With n_bins = 5 set, the result has 12 bins.

combiner = toad.transform.Combiner()
combiner.fit(dev_slct2, dev_slct2['target'], method='chi', min_samples=0.05, n_bins=5,
             exclude=ex_lis)
bins = combiner.export()
bins['age']

[22, 23, 25, 26, 27, 28, 31, 33, 35, 38, 43]
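
To make the count concrete: if the exported list is read as split points (an assumption about the export format), 11 split points delimit 12 intervals. The sketch below checks this with numpy only, using a made-up range of ages.

```python
import numpy as np

# Split points reported in this issue
splits = [22, 23, 25, 26, 27, 28, 31, 33, 35, 38, 43]

# Hypothetical integer ages covering the whole range
ages = np.arange(18, 61)

# np.digitize returns 0 for values below the first split
# and len(splits) for values at or above the last one
bin_index = np.digitize(ages, splits)

print(len(splits), "split points ->", len(np.unique(bin_index)), "bins")
```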

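Three questions

1. When toad.quality() computes a feature's IV, which binning is it based on? If I want to bin according to specified requirements, compute WOE, and then compute the feature's IV, how do I do that?
2. Building on the previous step, after binning each feature, can the feature values be replaced with bin numbers rather than only with WOE values?
3. When binning with toad, do categorical and continuous variables need to be binned separately? In particular, how should a categorical variable with very many categories, such as the province on an ID card, be distinguished from continuous variables?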
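
As a library-independent illustration of questions 1 and 2, the sketch below bins a feature on user-chosen edges with pandas, replaces values with bin numbers, and computes WOE and IV from those bins. The data, the edges, and the helper name woe_iv are made up for illustration. The WOE sign convention used here (log of bad share over good share) is one of two common conventions and may differ from toad's.

```python
import numpy as np
import pandas as pd

def woe_iv(bins, target):
    """Per-bin WOE and total IV for a binary target where 1 means bad."""
    df = pd.DataFrame({"bin": np.asarray(bins), "y": np.asarray(target)})
    grouped = df.groupby("bin")["y"].agg(bad="sum", total="count")
    grouped["good"] = grouped["total"] - grouped["bad"]
    bad_dist = grouped["bad"] / grouped["bad"].sum()
    good_dist = grouped["good"] / grouped["good"].sum()
    # Assumes every bin contains both good and bad samples (no zero counts)
    woe = np.log(bad_dist / good_dist)
    iv = float(((bad_dist - good_dist) * woe).sum())
    return woe, iv

ages = pd.Series([21, 24, 24, 29, 30, 36, 41, 45, 52, 58])
is_bad = pd.Series([1, 1, 0, 1, 0, 0, 1, 0, 0, 0])

# User-specified bin edges; labels=False yields integer bin numbers
edges = [-np.inf, 25, 40, np.inf]
bin_no = pd.cut(ages, bins=edges, right=False, labels=False)

woe, iv = woe_iv(bin_no, is_bad)
print(bin_no.tolist())
print(woe.round(4).to_dict(), round(iv, 4))
```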

from toad.plot import bin_plot raises an error

from toad.plot import bin_plot
Traceback (most recent call last):
File "C:\miniconda\lib\site-packages\numpy\core\function_base.py", line 117, in linspace
num = operator.index(num)
TypeError: 'float' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "C:\miniconda\lib\site-packages\toad\plot.py", line 6, in
from .tadpole import tadpole
File "C:\miniconda\lib\site-packages\toad\tadpole\__init__.py", line 5, in
from .base import Tadpole
File "C:\miniconda\lib\site-packages\toad\tadpole\base.py", line 2, in
from .utils import (
File "C:\miniconda\lib\site-packages\toad\tadpole\utils.py", line 16, in
HEATMAP_CMAP = sns.diverging_palette(240, 10, as_cmap = True)
File "C:\miniconda\lib\site-packages\seaborn\palettes.py", line 744, in diverging_palette
neg = palfunc((h_neg, s, l), 128 - (sep / 2), reverse=True, input="husl")
File "C:\miniconda\lib\site-packages\seaborn\palettes.py", line 641, in light_palette
return blend_palette(colors, n_colors, as_cmap)
File "C:\miniconda\lib\site-packages\seaborn\palettes.py", line 777, in blend_palette
pal = _ColorPalette(pal(np.linspace(0, 1, n_colors)))
File "<array_function internals>", line 6, in linspace
File "C:\miniconda\lib\site-packages\numpy\core\function_base.py", line 121, in linspace
.format(type(num)))
TypeError: object of type <class 'float'> cannot be safely interpreted as an integer.

numpy and cpython are both the latest versions.
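
The traceback shows seaborn passing `128 - (sep / 2)`, which is a float, as the sample count to np.linspace. The sketch below reproduces that situation with numpy alone and shows that an integer count works. The value sep = 10 is an assumption for illustration, and this is not a patch to seaborn or toad.

```python
import numpy as np

sep = 10                   # assumed value, for illustration only
n_colors = 128 - (sep / 2)  # true division makes this a float: 123.0

try:
    np.linspace(0, 1, n_colors)
    print("this NumPy version accepted a float count")
except TypeError as exc:
    print("TypeError:", exc)

# Casting the count to int avoids the error
samples = np.linspace(0, 1, int(n_colors))
print(len(samples))
```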

import toad raises ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

toad and its dependencies are installed correctly:
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: toad in /home/mcq/anaconda3/lib/python3.8/site-packages (0.0.63)
Requirement already satisfied: numpy<1.20,>=1.18.0 in /home/mcq/anaconda3/lib/python3.8/site-packages (from toad) (1.19.5)
Requirement already satisfied: scikit-learn>=0.21 in /home/mcq/anaconda3/lib/python3.8/site-packages (from toad) (0.23.1)
Requirement already satisfied: joblib>=0.12 in /home/mcq/anaconda3/lib/python3.8/site-packages (from toad) (0.16.0)
Requirement already satisfied: pandas in /home/mcq/anaconda3/lib/python3.8/site-packages (from toad) (1.0.5)
Requirement already satisfied: seaborn>=0.10.0 in /home/mcq/anaconda3/lib/python3.8/site-packages (from toad) (0.10.1)
Requirement already satisfied: scipy in /home/mcq/anaconda3/lib/python3.8/site-packages (from toad) (1.5.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/mcq/anaconda3/lib/python3.8/site-packages (from scikit-learn>=0.21->toad) (2.1.0)
Requirement already satisfied: python-dateutil>=2.6.1 in /home/mcq/anaconda3/lib/python3.8/site-packages (from pandas->toad) (2.8.1)
Requirement already satisfied: pytz>=2017.2 in /home/mcq/anaconda3/lib/python3.8/site-packages (from pandas->toad) (2020.1)
Requirement already satisfied: matplotlib>=2.1.2 in /home/mcq/anaconda3/lib/python3.8/site-packages (from seaborn>=0.10.0->toad) (3.2.2)
Requirement already satisfied: six>=1.5 in /home/mcq/anaconda3/lib/python3.8/site-packages (from python-dateutil>=2.6.1->pandas->toad) (1.15.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/mcq/anaconda3/lib/python3.8/site-packages (from matplotlib>=2.1.2->seaborn>=0.10.0->toad) (1.2.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/mcq/anaconda3/lib/python3.8/site-packages (from matplotlib>=2.1.2->seaborn>=0.10.0->toad) (2.4.7)
Requirement already satisfied: cycler>=0.10 in /home/mcq/anaconda3/lib/python3.8/site-packages (from matplotlib>=2.1.2->seaborn>=0.10.0->toad) (0.10.0)

The error occurs on import:

import toad

ValueError Traceback (most recent call last)
in
----> 1 import toad

~/anaconda3/lib/python3.8/site-packages/toad/__init__.py in
----> 1 from .merge import merge, DTMerge, ChiMerge, StepMerge, QuantileMerge, KMeansMerge
2 from .detector import detect
3 from .metrics import KS, KS_bucket, F1
4 from .stats import quality, IV, VIF, WOE, entropy, entropy_cond, gini, gini_cond
5 from .selection import select

~/anaconda3/lib/python3.8/site-packages/toad/merge.pyx in init toad.merge()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
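
This message usually indicates that a compiled extension was built against a different numpy version than the one installed, although the issue does not confirm the cause. As a first diagnostic step, a small helper (try_import is a hypothetical name, using only the standard library) can report which modules import cleanly and which versions they expose:

```python
import importlib

def try_import(name):
    """Import a module by name and report the outcome instead of raising."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return False, f"not importable: {exc}"
    except ValueError as exc:
        # The "numpy.ndarray size changed" message surfaces as a ValueError
        return False, f"binary mismatch suspected: {exc}"
    return True, getattr(module, "__version__", "unknown version")

for name in ("numpy", "toad"):
    print(name, try_import(name))
```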

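Multithreading in binning

I read the source code, and it seems that binning does not use multiple threads. Could the maintainers confirm whether multithreading is enabled by default? Otherwise binning methods such as chi-square will be very slow.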
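
Because each column is binned independently, the work can in principle be spread across workers on the caller's side. The sketch below shows the pattern with the standard library's ThreadPoolExecutor and a stand-in quantile binning routine; quantile_splits and fit_columns_parallel are hypothetical helpers, not toad functions, and any real speedup depends on how much of the binning code releases the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd

def quantile_splits(values, n_bins=5):
    """Stand-in binning routine: interior quantile cut points of one column."""
    qs = np.linspace(0, 1, n_bins + 1)[1:-1]
    return np.unique(np.quantile(values, qs)).tolist()

def fit_columns_parallel(frame, n_bins=5, max_workers=4):
    """Bin every column of the frame, one task per column."""
    cols = list(frame.columns)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(
            lambda c: quantile_splits(frame[c].to_numpy(), n_bins), cols
        )
        # Consume the lazy iterator before the pool shuts down
        return dict(zip(cols, results))

rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["a", "b", "c"])
splits = fit_columns_parallel(frame)
print({col: len(points) for col, points in splits.items()})
```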

Error on import

Hello, I get an error when importing the toad library. How should I resolve it?
微信图片_20200615141545

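Binning question

Suggestion: when binning, let the user designate missing values as a separate special bin that does not take part in the binning.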
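
The requested behavior can be approximated outside the library. A minimal sketch with pandas, where bin_with_missing, the edges, and the bin number -1 are all illustrative choices rather than toad features: non-missing values are binned on the given edges, and missing values are routed to their own bin.

```python
import numpy as np
import pandas as pd

def bin_with_missing(values, edges, missing_bin=-1):
    """Bin non-missing values on the given edges; send NaN to a dedicated bin."""
    series = pd.Series(values, dtype="float64")
    codes = pd.cut(series, bins=edges, right=False, labels=False)
    # pd.cut leaves NaN inputs as NaN, so they never influence the other bins
    return codes.fillna(missing_bin).astype(int)

ages = [22, np.nan, 31, 45, np.nan]
edges = [-np.inf, 25, 40, np.inf]
binned = bin_with_missing(ages, edges)
print(binned.tolist())
```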
