xgboost's Introduction

xgboost

Installation

$ npm install ml-xgboost

Example

import IrisDataset from 'ml-dataset-iris';

require('ml-xgboost').then(XGBoost => {
    var booster = new XGBoost({
        booster: 'gbtree',
        objective: 'multi:softmax',
        max_depth: 5,
        eta: 0.1,
        min_child_weight: 1,
        subsample: 0.5,
        colsample_bytree: 1,
        silent: 1,
        iterations: 200
    });

    var dataset = IrisDataset.getNumbers();
    var trueLabels = IrisDataset.getClasses().map(
        (elem) => IrisDataset.getDistinctClasses().indexOf(elem)
    );

    booster.train(dataset, trueLabels);
    var predictDataset = dataset; // placeholder: replace with the samples you want to predict
    var predictions = booster.predict(predictDataset);

    // you can save your model in this way
    var modelString = JSON.stringify(booster); // string
    // or
    var model = booster.toJSON(); // object

    // and load it
    var anotherBooster = XGBoost.load(model); // model is an object, not a string

    // don't forget to free your model once you no longer need it
    booster.free();
});
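
The label-encoding step in the example above (class names mapped to integer indices) can be sketched on its own. This is a minimal illustration, not part of ml-xgboost: `encodeLabels` is a hypothetical helper, and it numbers classes in order of first appearance, which may differ from the order returned by `getDistinctClasses()`.

```javascript
// Hypothetical helper (not part of ml-xgboost): turn class names into
// zero-based integer labels, numbered in order of first appearance.
function encodeLabels(classes) {
    const distinct = [];
    const indexByClass = new Map();
    const labels = classes.map((name) => {
        if (!indexByClass.has(name)) {
            indexByClass.set(name, distinct.length);
            distinct.push(name);
        }
        return indexByClass.get(name);
    });
    return { labels, distinct };
}

// Made-up class names, for illustration only.
const { labels, distinct } = encodeLabels(['setosa', 'virginica', 'setosa', 'versicolor']);
console.log(labels);   // [ 0, 1, 0, 2 ]
console.log(distinct); // [ 'setosa', 'virginica', 'versicolor' ]
```

Keeping `distinct` around makes it possible to translate a predicted index back to its class name with `distinct[index]`.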

Development

  • You should have emscripten sdk-1.37.22 installed on your computer and be able to use emcc and em++.
  • Download the repo: git clone --recursive https://github.com/mljs/xgboost
  • Run npm run build or make at the root directory.

XGBoost library files changed

  • dmlc-core/include/dmlc/base.h, line 45

  • rabit/include/dmlc/base.h, line 45

    #if (!defined(DMLC_LOG_STACK_TRACE) && defined(__GNUC__) && !defined(__MINGW32__))
    #define DMLC_LOG_STACK_TRACE 1
    #undef DMLC_LOG_STACK_TRACE
    #endif

    Note: this is to avoid compilation issues with the execinfo.h library that is not needed in the JS library

  • If you get the following error:

    ./xgboost/include/xgboost/c_api.h:29:9: error: unknown type name 'uint64_t'

    add this include at the beginning of that file, after the first define:

    #include <stdint.h>

License

© Contributors, 2016. Licensed under an Apache-2 license.

xgboost's Issues

Missing xgboost/lib/libxgboost.so

Following the Development section of the README, if I run

$ git clone --recursive https://github.com/mljs/xgboost
$ cd xgboost
$ npm install                     # (it would be nice to mention this step in the Development section too)
$ npm run build

I get the following error:

...
...
a - build/tree/updater_skmaker.o
a - build/tree/updater_sync.o
ERROR:root:dmlc-core/libdmlc.a: No such file or directory ("dmlc-core/libdmlc.a" was expected to be an input file, based on the commandline arguments provided)
ERROR:root:dmlc-core/libdmlc.a: No such file or directory ("dmlc-core/libdmlc.a" was expected to be an input file, based on the commandline arguments provided)
make[1]: *** [lib/libxgboost.dylib] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [xgboost] Error 1
fatal error: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ar: fatal error in /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib
make[1]: *** [lib/libxgboost.a] Error 1
mkdir -p dist/wasm;
em++ -O3 -Wall -fPIC --memory-init-file 0 -std=c++11 -Ixgboost/dmlc-core/include -Ixgboost/rabit/include -Ixgboost/include js-interfaces.cpp xgboost/lib/libxgboost.so -o dist/wasm/xgboost.js --pre-js src/wasmPreJS.js -s WASM=1 -s "BINARYEN_METHOD='native-wasm'" -s ALLOW_MEMORY_GROWTH=1 -s EXPORTED_FUNCTIONS="['_create_model', '_set_param', '_train_full_model', '_predict_one', '_free_memory_model', '_save_model', '_get_file_content', '_load_model', '_prediction_size']"
ERROR:root:xgboost/lib/libxgboost.so: No such file or directory ("xgboost/lib/libxgboost.so" was expected to be an input file, based on the commandline arguments provided)
make: *** [build] Error 1
npm ERR! code ELIFECYCLE
npm ERR! errno 2
npm ERR! [email protected] build: `rimraf dist && make clean && make`
npm ERR! Exit status 2
npm ERR!
npm ERR! Failed at the [email protected] build script.
...
...

The important line appears to be:

ERROR:root:xgboost/lib/libxgboost.so: No such file or directory ("xgboost/lib/libxgboost.so" was expected to be an input file, based on the commandline arguments provided)

When I checked the repo, the lib directory was missing. It would be good to mention https://github.com/dmlc/xgboost/blob/master/doc/build.md in the README.

Loading a Python trained XGBoost model

I am attempting to load a model that was trained in Python and exported as a JSON dump.

Relevant portion of the Python code that exports to JSON:

# Python snippet
# The model was trained using the Python sklearn API of XGBoost
# (https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/sklearn.py).
import json
import os
...
...
mdl_json = mdl.get_booster().get_dump(dump_format='json')
with open(os.path.join(PROJECT_PATH, 'model.json'), 'w') as handle:
    handle.write(json.dumps(mdl_json))

Attempt to load this in Javascript

/* Javascript snippet */

require('ml-xgboost')
    .then(XGBoost => {
        // and load it
        XGBoost.loadFromModel('./model.json');
    }).catch((error) => {
        console.error(error);
    });

yields

[12:08:29] dmlc-core/include/dmlc/./logging.h:300: 
[12:08:29] src/learner.cc:299: Check failed: fi->Read(&name_obj_[0], len) == len 
(73196083 vs. 1851546400) BoostLearner: wrong model format
5356640 - Exception catching is disabled, this exception cannot be caught. 
Compile with -s DISABLE_EXCEPTION_CATCHING=0 or DISABLE_EXCEPTION_CATCHING=2 to catch.

Is this scenario covered in the loadFromModel method?

Feature importance

After fitting a model in XGBoost, it's possible to get importance scores for features. Is it possible to get those scores in JS?
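
As a rough illustration of what one such score measures, the sketch below counts how often each feature is used as a split across all trees, which corresponds to the "weight" importance type in XGBoost. It is not an ml-xgboost API. It assumes you already have the trees as parsed JSON objects in the layout of XGBoost's JSON text dump (internal nodes with `split` and `children`, leaves with `leaf`), and `splitCounts` and `exampleTrees` are hypothetical names with made-up values.

```javascript
// Count how often each feature is used as a split across all trees.
// `trees` is an array of parsed tree objects: internal nodes carry
// `split` and `children`, leaf nodes carry `leaf`.
function splitCounts(trees) {
    const counts = {};
    const stack = [...trees];
    while (stack.length > 0) {
        const node = stack.pop();
        if (node.split !== undefined) {
            counts[node.split] = (counts[node.split] || 0) + 1;
        }
        if (Array.isArray(node.children)) {
            stack.push(...node.children);
        }
    }
    return counts;
}

// Tiny hand-written dump with made-up values, for illustration only.
const exampleTrees = [
    {
        nodeid: 0, split: 'f2', split_condition: 2.45, yes: 1, no: 2, missing: 1,
        children: [
            { nodeid: 1, leaf: 0.43 },
            {
                nodeid: 2, split: 'f3', split_condition: 1.75, yes: 3, no: 4, missing: 3,
                children: [{ nodeid: 3, leaf: -0.2 }, { nodeid: 4, leaf: 0.1 }]
            }
        ]
    },
    {
        nodeid: 0, split: 'f2', split_condition: 4.95, yes: 1, no: 2, missing: 1,
        children: [{ nodeid: 1, leaf: 0.05 }, { nodeid: 2, leaf: -0.3 }]
    }
];

console.log(splitCounts(exampleTrees)); // { f2: 2, f3: 1 }
```

If the dump was written from Python as a JSON-encoded list of strings, each element would need its own `JSON.parse` before being passed to `splitCounts`.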

wasm-ld: error: 'atomics' feature is disallowed by build/learner.o

After cloning the repo and running npm run build, emscripten throws these errors:

wasm-ld: error: 'atomics' feature is disallowed by build/learner.o, so --shared-memory must not be used
wasm-ld: error: 'atomics' feature must be used in order to use shared memory
wasm-ld: error: 'bulk-memory' feature must be used in order to use shared memory
Makefile:164: recipe for target 'lib/libxgboost.so' failed
make[1]: *** [lib/libxgboost.so] Error 1
make[1]: *** Waiting for unfinished jobs....
wasm-ld: error: 'atomics' feature is disallowed by build/cli_main.o, so --shared-memory must not be used
wasm-ld: error: 'atomics' feature must be used in order to use shared memory
wasm-ld: error: 'bulk-memory' feature must be used in order to use shared memory

Emcc version: 1.39.10

Probably related to this discussion: https://groups.google.com/forum/#!topic/emscripten-discuss/tdjKEcKXXC8

I tried removing -pthread from this line of the native xgboost Makefile:
export LDFLAGS= -pthread -lm $(ADD_LDFLAGS) $(DMLC_LDFLAGS) $(PLUGIN_LDFLAGS)

After that the wasm output was produced without errors, but there were lots of warnings like these:

wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=22 expected=23
wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=23 expected=24
wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=25 expected=26
wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=122 expected=123
wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=122 expected=123
wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=122 expected=123
wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=126 expected=127
wasm-ld: warning: unexpected existing value for R_WASM_TABLE_INDEX_REL_SLEB: existing=128 expected=129

I am not sure whether this is the right way to deal with the error.

Does this library load in browser (using webpack)?

I have been trying to load this library in the browser using webpack.

The first error I see is:

ERROR in ./node_modules/ml-xgboost/dist/wasm/xgboost.wasm
Module not found: Error: Can't resolve 'env' in '/workspace/360/node_modules/ml-xgboost/dist/wasm'
 @ ./node_modules/ml-xgboost/dist/wasm/xgboost.wasm
 @ ./node_modules/ml-xgboost/src/index.js

ERROR in chunk main [entry]
bundle.js
Sync WebAssembly compilation is not yet implemented

After adding the following rule in webpack.config.js,

{
    test: /\.wasm$/,
    loader: 'wasm-loader',
    type: 'javascript/auto'
},

this succeeds.

However, in the browser it fails to resolve the promise in ../node_modules/ml-xgboost/src/index.js:

Uncaught TypeError: Cannot read property 'then' of undefined

The dynamic loading of the library is not working. Have you had any success in getting the library to load in the browser?

Upstream Version

Could you please add the version of XGBoost you're using to the README.md?

Also, could you please update to the latest version, which was released 8 days ago?

Support for NAN numbers?

I want to use this for a project, but I need the option to pass NaN values.

This is the test I'm running:

require('ml-xgboost').then(XGBoost => {
    var booster = new XGBoost({
        booster: 'gbtree',
        objective: 'reg:linear',
        max_depth: 3,
        eta: 0.1,
        min_child_weight: 1,
        subsample: 0.8,
        colsample_bytree: 1,
        silent: 1,
        iterations: 80
    });

    var dataset = [
        [1, 0, 0, 0, 0, 0, 0],
        [7, 20, 60, 60, 0, 0, 1],
        [8, 60, 70, 40, 0, 0, 2],
        [9, 40, 110, 40, 10, 0, 1],
        [11, 120, 90, 80, 30, 0, 4],
        [14, 40, 130, 90, 60, 0, 2],
        [15, 40, 110, 110, 70, 10, 3],
        [19, 0, 40, 90, 110, 40, 3],
        [21, 0, 0, 80, 70, 60, 3]
    ];

    var diary = new Array(dataset.length);
    var predictions = new Array(dataset.length);

    for (var i = 0; i < dataset.length; ++i) {
        diary[i] = dataset[i].slice(1, 6);
        predictions[i] = [dataset[i][6]];
    }

    booster.train(diary, predictions);
    var predictDataset = [[0, 40, 90, 110, 40]];
    var predictions = booster.predict(predictDataset);

    var model = JSON.stringify(booster);

    console.log(predictions); // outputs [ 2.9955782890319824 ] which is expected
}).catch(error => console.log(error));

So far everything works as expected.

But when I add a NaN in the dataset, I get this error:

[13:48:52] dmlc-core/include/dmlc/./logging.h:300: [13:48:52] src/c_api/c_api.cc:366: Check failed: nan_missing There are NAN in the matrix, however, you did not set missing=NAN 5357128 - Exception catching is disabled, this exception cannot be caught. Compile with -s DISABLE_EXCEPTION_CATCHING=0 or DISABLE_EXCEPTION_CATCHING=2 to catch.

I have been looking at the source code but can't find a way to set missing=NAN.

Is this possible? And if not, would it be possible to add that feature?
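
One possible stopgap, sketched here under the assumption that the library rejects NaN as the error above suggests, is to fill the gaps before calling train. `imputeColumnMeans` is a hypothetical helper, not part of ml-xgboost, and mean imputation is not equivalent to XGBoost's native handling of missing values, so the resulting model may differ.

```javascript
// Hypothetical pre-processing helper (not part of ml-xgboost): replace each
// NaN cell with the mean of the non-NaN values in the same column.
function imputeColumnMeans(matrix) {
    if (matrix.length === 0) {
        return [];
    }
    const nCols = matrix[0].length;
    const means = new Array(nCols).fill(0);
    for (let j = 0; j < nCols; j++) {
        let sum = 0;
        let count = 0;
        for (const row of matrix) {
            if (!Number.isNaN(row[j])) {
                sum += row[j];
                count++;
            }
        }
        // A column with no valid values falls back to 0.
        means[j] = count > 0 ? sum / count : 0;
    }
    // Return a new matrix; the input is left untouched.
    return matrix.map((row) => row.map((v, j) => (Number.isNaN(v) ? means[j] : v)));
}

// Made-up values, for illustration only.
const withGaps = [
    [0, 40, NaN],
    [20, NaN, 60],
    [40, 80, 90]
];
console.log(imputeColumnMeans(withGaps));
// [ [ 0, 40, 75 ], [ 20, 60, 60 ], [ 40, 80, 90 ] ]
```

The same means would have to be reused for any rows passed to predict, so that training and prediction data are filled consistently.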
