JPMML-SparkML-XGBoost

JPMML-SparkML plugin for converting XGBoost4J-Spark models to PMML.

Prerequisites

Installation

Enter the project root directory and build using Apache Maven:

mvn clean install

The build installs the JPMML-SparkML-XGBoost library into the local Maven repository under the coordinates org.jpmml:jpmml-sparkml-xgboost:1.0-SNAPSHOT.

Usage

The JPMML-SparkML-XGBoost library extends the JPMML-SparkML library with support for ml.dmlc.xgboost4j.scala.spark.XGBoostClassificationModel and ml.dmlc.xgboost4j.scala.spark.XGBoostRegressionModel prediction model classes.

Launch the Spark shell; use the --packages command-line option to include XGBoost4J-Spark, JPMML-SparkML and JPMML-XGBoost runtime dependencies, and the --jars command-line option to include the JPMML-SparkML-XGBoost runtime dependency:

spark-shell --packages ml.dmlc:xgboost4j-spark:0.90,org.jpmml:jpmml-sparkml:1.5.7,org.jpmml:jpmml-xgboost:1.3.15 --jars target/jpmml-sparkml-xgboost-1.0-SNAPSHOT.jar
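Once the shell is up, it can be worth confirming that the dependencies actually landed on the classpath before fitting anything. The following is a minimal sketch of such a check, not part of the library: the class names are the ones mentioned in this README, and isLoadable is a hypothetical helper introduced only for illustration. It does not verify the JPMML-SparkML-XGBoost plugin JAR itself.

```scala
// Class names as they appear in this README
val requiredClasses = Seq(
  "ml.dmlc.xgboost4j.scala.spark.XGBoostClassificationModel",
  "ml.dmlc.xgboost4j.scala.spark.XGBoostRegressionModel",
  "org.jpmml.sparkml.PMMLBuilder"
)

// Hypothetical helper: true if the named class can be loaded from the current classpath
def isLoadable(className: String): Boolean =
  try {
    Class.forName(className)
    true
  } catch {
    // Raised when the class, or something it links against, is absent
    case _: ClassNotFoundException | _: LinkageError => false
  }

// Report the status of every required class
requiredClasses.foreach { name =>
  val status = if (isLoadable(name)) "found" else "MISSING"
  println(s"$status: $name")
}
```

A "MISSING" line usually suggests that the corresponding --packages coordinate was mistyped or could not be resolved.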

Fitting and exporting an example pipeline model:

import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.RFormula
import org.jpmml.sparkml.PMMLBuilder

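// Load the Iris dataset; the header row supplies the column names
val df = spark.read.option("header", "true").option("inferSchema", "true").csv("Iris.csv")

// Use Species as the label and all remaining columns as features
val formula = new RFormula().setFormula("Species ~ .")
// Multi-class objective for the three Iris species
var classifier = new XGBoostClassifier(Map("objective" -> "multi:softmax", "num_class" -> 3))
// Train for 11 boosting rounds
classifier = classifier.set(classifier.numRound, 11)

val pipeline = new Pipeline().setStages(Array(formula, classifier))
val pipelineModel = pipeline.fit(df)

// Convert the fitted pipeline to a PMML document held in memory
val pmmlBytes = new PMMLBuilder(df.schema, pipelineModel).buildByteArray()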
println(new String(pmmlBytes, "UTF-8"))
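
Printing is handy for a quick look, but the PMML document is typically saved for a scoring engine to pick up. The following is a minimal sketch that writes the byte array to disk using only the JDK. The savePmml helper and the XGBoostIris.pmml file name are hypothetical, and the stand-in sampleBytes value exists only so that the snippet runs on its own; in the Spark shell, pass pmmlBytes instead.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper (not part of any JPMML library): writes PMML bytes to the given path
def savePmml(bytes: Array[Byte], path: String): Path = {
  val target = Paths.get(path).toAbsolutePath
  // Create missing parent directories so that the write does not fail on a fresh path
  Option(target.getParent).foreach(dir => Files.createDirectories(dir))
  // Creates the file, or overwrites it if it already exists
  Files.write(target, bytes)
}

// Stand-in content; replace with pmmlBytes from the example above
val sampleBytes = "<PMML/>".getBytes(StandardCharsets.UTF_8)
val saved = savePmml(sampleBytes, "XGBoostIris.pmml")
println(s"Wrote ${Files.size(saved)} bytes to $saved")
```

Writing the raw bytes, rather than a decoded string, keeps the file's encoding identical to what the builder produced.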

License

JPMML-SparkML-XGBoost is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.

If you would like to use JPMML-SparkML-XGBoost in a proprietary software project, then it is possible to enter into a licensing agreement which makes JPMML-SparkML-XGBoost available under the terms and conditions of the BSD 3-Clause License instead.

Additional information

JPMML-SparkML-XGBoost is developed and maintained by Openscoring Ltd, Estonia.

Interested in using Java PMML API software in your company? Please contact [email protected]
