
bigml-java's Introduction

BigML.io Java bindings

BigML (https://bigml.com) makes machine learning easy by taking care of the details required to add data-driven decisions and predictive power to your company. Unlike other machine learning services, BigML creates beautiful predictive models (https://bigml.com/gallery/models) that can be easily understood and interacted with.

These BigML Java bindings allow you to interact with BigML.io (https://bigml.io/), the API for BigML. You can use them to easily create, retrieve, list, update, and delete BigML resources (i.e., sources, datasets, models, and predictions). For additional information, see the full documentation for the Java bindings on Read the Docs (http://bigml-java.readthedocs.org).

This module is licensed under the Apache License, Version 2.0.

See the full change history here.

Support

Please report problems and bugs to the BigML.io-Java issue tracker.

You can send us an email at BigML email support.

You can join us in the Campfire chatroom.

Integrating Maven

Add the following dependency to your project's pom.xml file:

<dependency>
    <groupId>org.bigml</groupId>
    <artifactId>bigml-binding</artifactId>
    <version>2.0.0</version>
</dependency>

Add the following lines to your project's pom.xml file if you want to use the SNAPSHOT versions of the library:

<repositories>
    <repository>
        <id>osshr-snapshots</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
        <snapshots><enabled>true</enabled></snapshots>
        <releases><enabled>false</enabled></releases>
    </repository>
</repositories>

Requirements

The binding.properties file is where you set up your BigML credentials: BIGML_USERNAME and BIGML_API_KEY. They can be overridden by passing the values as JVM system properties with -D.
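The override order described above can be illustrated with a minimal sketch. It is not the bindings' actual loading code: the class name CredentialLookup, the resolve helper, and the assumption that binding.properties sits at the classpath root are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class CredentialLookup {

    /**
     * Returns the value for key, preferring a JVM system property
     * (set with -Dkey=value) over the value loaded from the properties file.
     */
    static String resolve(String key, Properties fileProps) {
        String override = System.getProperty(key);
        return (override != null) ? override : fileProps.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        Properties fileProps = new Properties();
        // Assumption: binding.properties is available at the classpath root.
        try (InputStream in = CredentialLookup.class
                .getResourceAsStream("/binding.properties")) {
            if (in != null) {
                fileProps.load(in);
            }
        }
        System.out.println("BIGML_USERNAME = "
                + resolve("BIGML_USERNAME", fileProps));
    }
}
```

Under these assumptions, starting the JVM with -DBIGML_USERNAME=alice makes resolve return alice even when the properties file holds a different value.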

The project uses Maven as project manager.

Running the Tests

A Cucumber test suite is available. To run it, execute:

$ mvn test

or, if you want to debug the tests:

$ mvn -Dmaven.surefire.debug="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000 -Xnoagent -Djava.compiler=NONE" test

Examples

The samples directory contains a Maven project named BigML-Sample-Client that can be imported. It shows some basic examples of how to use the bindings to create resources in BigML. See the corresponding readme for details.

Generated JAR file of the bindings

Since version 1.1 the name of the JAR file is bigml-binding.

bigml-java's People

Contributors

ashenfad, eschultink, jaor, javinp, jimshur, mmerce, osroca, sdesimone, xalperte


bigml-java's Issues

Removing seed as a connection attribute

Currently the connection object has a seed property that contains a string used in random procedures such as sampling, model configuration, etc. Usually, an algorithm can use several seeds, and they are not tied to the connection, so I think it's best not to include it as a connection attribute.

LocalCluster

  • Fix bugs
  • Add utility methods: closestInCluster, sortedCentroids
  • Add support for optype items

improve library

Hello,

The library is too basic for Java usage. It completely lacks strong typing: everything returned is a JSONObject. It shouldn't be like that; that's JavaScript style, not Java.

You should model every request and response and return a typed object for every call, with getters, setters, lists, and so on. The developer shouldn't have to write jsonObject.get("objects"), loop over the result, etc.

All of that should be delegated to the library, using something like Jackson's ObjectMapper to convert the JSON into a POJO.

Error in LocalDeepnet fillArray()

Hi!
When trying to make a prediction with a regression LocalDeepnet, the fillArray() method throws a NullPointerException.
The Deepnet was built from a dataset with almost 2000 fields, but the final Deepnet uses only 1000. When fillArray() iterates over the full list of fields, expanding items and looking for missing values, it reaches a field that is not in input_fields. The lookup returns null, and the code then casts it to Number and calls intValue() on it.

int missingCount = ((Number) Utils.getJSONObject(
fields, fieldId + ".summary.missing_count")).intValue();

Maybe I don't yet fully understand the purpose of the function, but I believe fillArray() should only iterate over input_fields, since that is the list where it should look for missing values. I'm not sure whether the full list of fields is needed later; if it is, there might be an issue with the structure of the model data.
The dataset has no missing values and every input is categorical, but the objective field is numeric.
The Deepnet was created through OptiML on the web platform and is being consumed with the bindings.
Thank you in advance! I'm available to provide any more information about my implementation.
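One possible defensive variant of the lookup is sketched below. It does not use the bindings' Utils class: the fields are modeled as plain nested Maps, and the class name MissingCountSketch and the helper missingCount are hypothetical. Whether returning 0 for an absent field is the right behavior, rather than skipping the field entirely, depends on what fillArray() is meant to do.

```java
import java.util.HashMap;
import java.util.Map;

public class MissingCountSketch {

    /**
     * Returns the missing_count stored in a field's summary, or 0 when the
     * field, its summary, or the count itself is absent from the map.
     */
    @SuppressWarnings("unchecked")
    static int missingCount(Map<String, Object> fields, String fieldId) {
        Object field = fields.get(fieldId);
        if (!(field instanceof Map)) {
            return 0; // the field is not among the ones the model knows about
        }
        Object summary = ((Map<String, Object>) field).get("summary");
        if (!(summary instanceof Map)) {
            return 0;
        }
        Object count = ((Map<String, Object>) summary).get("missing_count");
        return (count instanceof Number) ? ((Number) count).intValue() : 0;
    }

    public static void main(String[] args) {
        Map<String, Object> summary = new HashMap<>();
        summary.put("missing_count", 3L);
        Map<String, Object> field = new HashMap<>();
        field.put("summary", summary);
        Map<String, Object> fields = new HashMap<>();
        fields.put("000001", field);

        System.out.println(missingCount(fields, "000001")); // known field
        System.out.println(missingCount(fields, "0007ff")); // unknown field, no exception
    }
}
```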

Error in LocalCluster

java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
at org.bigml.binding.LocalCluster.<init>(LocalCluster.java:138)

this.criticalValue = (Integer)cluster.get("critical_value");
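A common way to avoid this kind of ClassCastException is to cast to Number and convert explicitly, because JSON parsers may hand back either Integer or Long for the same field. The sketch below models the cluster as a plain Map; the class name CriticalValueSketch and the helper readInt are hypothetical and not part of the bindings.

```java
import java.util.HashMap;
import java.util.Map;

public class CriticalValueSketch {

    /**
     * Reads a numeric entry as an Integer, whether the parser produced an
     * Integer or a Long. Returns null when the entry is absent or not numeric.
     */
    static Integer readInt(Map<String, Object> json, String key) {
        Object value = json.get(key);
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> cluster = new HashMap<>();
        cluster.put("critical_value", 5L); // stored as a Long, as in the report
        System.out.println(readInt(cluster, "critical_value")); // prints 5
    }
}
```

If critical_value can hold fractional values, doubleValue() would be the safer conversion.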

Documentation errors

There are a couple of documentation problems:

  • In the comments of org.bigml.binding.LocalPredictiveModel there's an example that says:
  BigMLClient bigmlClient = new BigMLClient();

It should be BigMLClient.getInstance(), I guess.

  • In the docs, the example that calls getInstance with two string arguments is wrong:
   BigMLClient api = BigMLClient.getInstance("foo", "xxxxxxxxxxx");

That does not create a client using a username and an API key; it does something else involving a seed and a storage. The correct invocation would be:

   BigMLClient api = BigMLClient.getInstance("foo", "xxxxxxxxxxx", false);

I actually think having a getInstance taking two strings that are not
username/apikey is very confusing (as demonstrated by the fact that
even our documentation gets it wrong), and I would change the interface. But
if you don't agree, at the very least the docs should be updated.

(I'm assigning the bug to @leonhwang just because he's the last committer; please reassign as needed.)

Change ModelFields to use static logger

All the local models implement the Serializable interface; however, they also extend the ModelFields class, which contains a non-static Logger variable. For the local models to be serialisable, the "LOGGER" variable in ModelFields needs to be made static.
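The effect can be reproduced in isolation with the sketch below. It uses java.util.logging.Logger only to stay self-contained; the logging library used by ModelFields may be different, and the class names here are hypothetical. Java serialization skips static fields, so moving the logger to a static field removes it from the serialized state.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.logging.Logger;

public class StaticLoggerSketch {

    /** Instance logger: part of the object's state, and Logger is not Serializable. */
    static class WithInstanceLogger implements Serializable {
        private static final long serialVersionUID = 1L;
        final Logger logger = Logger.getLogger("sketch");
    }

    /** Static logger: static fields are ignored by Java serialization. */
    static class WithStaticLogger implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Logger LOGGER = Logger.getLogger("sketch");
    }

    /** Returns true if the object can be written with an ObjectOutputStream. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new WithInstanceLogger())); // prints false
        System.out.println(canSerialize(new WithStaticLogger()));   // prints true
    }
}
```

Marking the field transient would also let serialization succeed, but the logger would then be null after deserialization unless it is restored explicitly.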

Incompatibility with ensembles/models with more than 1000 fields

Hi,
There's an incompatibility with models and ensembles (I don't know whether deepnets or other kinds of models are affected): if the ensemble or any of the models that compose it has more than 1000 fields, the fields beyond the first 1000 are ignored. Then, when the MultiModel is created and tries to construct every LocalPredictiveModel, it finds that there are missing fields and throws an Exception with the message "Some fields are missing to generate a local model. Please, provide a model with the complete list of fields.".

Every ensemble and model has a property called "fields_meta" that indicates the total number of fields and the count and offset of the current request. The API supports passing these parameters, so pagination can be used to retrieve the full list of fields.
I suppose this needs a fairly extensive change, as the resources are all downloaded through the same method without checking what is being retrieved. It should check the kind of resource and the "fields_meta" pagination data, and make consecutive requests until the complete list of fields is retrieved.

https://bigml.com/api/ensembles?id=filtering-and-paginating-fields-from-an-ensemble
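The pagination loop suggested above could look roughly like the sketch below. It makes no real HTTP calls: PageFetcher stands in for a request that accepts an offset and a limit, Page mirrors the total reported in "fields_meta", and fakeServer simulates a resource with a given number of fields. All of these names, and the shape of the page object, are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldPaginationSketch {

    /** One page of fields plus the total number of fields reported by the server. */
    static class Page {
        final Map<String, Object> fields;
        final int total;

        Page(Map<String, Object> fields, int total) {
            this.fields = fields;
            this.total = total;
        }
    }

    /** Hypothetical stand-in for an API request that takes an offset and a limit. */
    interface PageFetcher {
        Page fetch(int offset, int limit);
    }

    /** Requests consecutive pages until every field has been collected. */
    static Map<String, Object> fetchAllFields(PageFetcher fetcher, int limit) {
        Map<String, Object> all = new LinkedHashMap<>();
        int offset = 0;
        while (true) {
            Page page = fetcher.fetch(offset, limit);
            all.putAll(page.fields);
            offset += page.fields.size();
            // Stop when the total is reached, or if the server returns nothing.
            if (page.fields.isEmpty() || offset >= page.total) {
                return all;
            }
        }
    }

    /** Simulates a server holding total fields, returned in pages. */
    static PageFetcher fakeServer(int total) {
        return (offset, limit) -> {
            Map<String, Object> page = new LinkedHashMap<>();
            for (int i = offset; i < Math.min(offset + limit, total); i++) {
                page.put(String.format("%06x", i), "field-" + i);
            }
            return new Page(page, total);
        };
    }

    public static void main(String[] args) {
        // Three requests of at most 1000 fields each cover 2500 fields.
        System.out.println(fetchAllFields(fakeServer(2500), 1000).size()); // prints 2500
    }
}
```

The empty-page check guards against looping forever if the reported total is larger than what the server actually returns.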

Call stack for the case of an ensemble with 107 models and 3877 max columns (the same as the max used fields, although the size of ensemble.importance indicates the actual number of used fields; I don't know whether using that number could affect the construction):

java.lang.Exception: Some fields are missing to generate a local model. Please, provide a model with the complete list of fields.

at org.bigml.binding.LocalPredictiveModel.<init>(LocalPredictiveModel.java:129)
at org.bigml.binding.LocalPredictiveModel.<init>(LocalPredictiveModel.java:100)
at org.bigml.binding.MultiModel.<init>(MultiModel.java:92)
at org.bigml.binding.LocalEnsemble.init(LocalEnsemble.java:295)
at org.bigml.binding.LocalEnsemble.<init>(LocalEnsemble.java:154)
