
Comments (6)

liudragonfly commented on May 24, 2024

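@gutouyu xgboost does not support categorical features, so they have to be preprocessed before training. Depending on the form of the feature, choose one-hot encoding (unordered) or label encoding (ordered).
When a categorical feature has very many distinct values, its one-hot encoding becomes very sparse and may perform poorly. In that case you can train an embedding vector for the category with a neural network, or encode it some other way.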

from tgboost.

liudragonfly commented on May 24, 2024

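In other words, the input must be numeric. For example, given a day-of-week feature with the value "Wednesday", xgboost does not accept that non-numeric value directly. Whether you encode it as 0010000 (unordered) or as 3 (ordered, with the seven days of the week numbered 1 to 7) is something you have to settle by running experiments. Either way, after encoding the value is no longer the non-numeric "Wednesday" but a number.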
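To make the two encodings concrete, here is a minimal sketch in plain Python. The helper names `label_encode` and `one_hot_encode` are hypothetical and not part of xgboost or tgboost; the Monday = 1 through Sunday = 7 numbering is the one assumed in the comment above.

```python
# Hypothetical helpers for illustration only; not an xgboost or tgboost API.
# Assumes the week runs Monday = 1 ... Sunday = 7, as in the comment above.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def label_encode(day):
    """Ordered encoding: map a weekday name to an integer from 1 to 7."""
    return WEEKDAYS.index(day) + 1


def one_hot_encode(day):
    """Unordered encoding: a 7-element 0/1 vector containing a single 1."""
    return [1 if d == day else 0 for d in WEEKDAYS]


print(label_encode("Wednesday"))    # 3
print(one_hot_encode("Wednesday"))  # [0, 0, 1, 0, 0, 0, 0]
```

Either result is numeric and can be passed to the model; which one works better still has to be checked experimentally.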


gutouyu commented on May 24, 2024

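@liudragonfly Does "xgboost does not support categorical features" mean that such features must be encoded before they can be fed in? My understanding was that a library supports them if it can handle them automatically according to whether they are ordered or unordered, and otherwise it does not...

Any guidance would be much appreciated.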


gutouyu commented on May 24, 2024

@liudragonfly Got it, thank you very much.


wepe commented on May 24, 2024

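The implementation on the tgboost-python branch does not handle categorical features: it treats every input as a numeric feature, so users have to preprocess categorical features themselves. The implementation on the master branch does support categorical features:

Handle categorical features: TGBoost orders the categories of a feature by their statistic (Gradient_sum / Hessian_sum) on each tree node, then conducts split finding as for a numeric feature.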
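The idea can be sketched roughly as follows. This is not TGBoost's actual code: the function name `best_categorical_split`, the `reg_lambda` parameter, and the gain formula (the usual second-order score G^2 / (H + lambda)) are assumptions made for illustration, and the sketch assumes every category has a positive Hessian sum.

```python
from collections import defaultdict


def best_categorical_split(categories, grads, hessians, reg_lambda=1.0):
    """Order categories by sum(grad) / sum(hess) on this node, then scan
    split points over that order as one would for a numeric feature.

    Illustrative sketch only; assumes each category's Hessian sum is > 0.
    """
    g_sum = defaultdict(float)
    h_sum = defaultdict(float)
    for c, g, h in zip(categories, grads, hessians):
        g_sum[c] += g
        h_sum[c] += h

    # Sort the categories by their Gradient_sum / Hessian_sum statistic.
    ordered = sorted(g_sum, key=lambda c: g_sum[c] / h_sum[c])
    g_total = sum(g_sum.values())
    h_total = sum(h_sum.values())

    def score(g, h):
        # Assumed second-order structure score, not taken from TGBoost.
        return g * g / (h + reg_lambda)

    best_gain, best_left = 0.0, None
    g_left = h_left = 0.0
    # Each prefix of the ordered categories is a candidate left child.
    for i, c in enumerate(ordered[:-1]):
        g_left += g_sum[c]
        h_left += h_sum[c]
        gain = (score(g_left, h_left)
                + score(g_total - g_left, h_total - h_left)
                - score(g_total, h_total))
        if gain > best_gain:
            best_gain, best_left = gain, set(ordered[: i + 1])
    return best_left, best_gain


# Toy data: per-sample gradients and Hessians for a three-category feature.
cats = ["a", "b", "c", "a", "b", "c"]
grads = [-1.0, 2.0, -1.2, -0.8, 2.2, -1.0]
hess = [1.0] * 6
left, gain = best_categorical_split(cats, grads, hess)
print(left, gain)
```

Sorting first means only k - 1 candidate splits need to be scanned for k categories, rather than every possible subset.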


gutouyu commented on May 24, 2024

Thank you very much. Does xgboost do it the same way? @wepe

