Nak

Authors:

Introduction

Nak is a library for machine learning and related tasks, with a focus on providing an easy-to-use API for some standard algorithms. It is derived from the OpenNLP Maxent package, and the intent is to evolve it into a Scala library with further capabilities. It will be developed in particular with the natural language processing library Chalk in mind. Use of Breeze is likely in Nak's future.

Like Chalk, the name Nak comes from one of Jason's son's stuffed elephants. (He really likes elephants.)

What's inside

The latest stable release of Nak is 1.1.0. It includes:

  • The classification code from the OpenNLP Maxent package, slightly reorganized.
  • The k-means clustering code from Scalabha.

Using Nak

In SBT:

libraryDependencies += "org.scalanlp" % "nak" % "1.1.0"

In Maven:

<dependency>
   <groupId>org.scalanlp</groupId>
   <artifactId>nak</artifactId>
   <version>1.1.0</version>
</dependency>

Note that the group ID has changed from com.jasonbaldridge (in v1.0) to org.scalanlp.

Note: There is one dependency that won't get pulled along: pca_transform-0.7.2.jar in the lib directory is not available on any repository, so you'll need to add that to your classpath by hand if (and only if) you want to be able to use PCA transformations for input to k-means.
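One way to add the jar by hand is sketched below. It assumes that your own project is built with SBT and that "the lib directory" means lib/ in a Nak source checkout. SBT treats any jar in a project's lib directory as an unmanaged dependency, so copying the file there should be enough. The helper name add_pca_jar and the example paths are hypothetical; only the jar name comes from the note above.

```shell
# Hypothetical helper (not part of Nak): copy the PCA jar from a Nak checkout
# into the lib/ directory of another SBT project.
#   $1 = path to the Nak checkout
#   $2 = path to your own SBT project
add_pca_jar() {
  jar="$1/lib/pca_transform-0.7.2.jar"
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  # Create lib/ if the project does not have one yet, then copy the jar in.
  mkdir -p "$2/lib" && cp "$jar" "$2/lib/"
}

# Example call (both paths are placeholders):
# add_pca_jar "$NAK_DIR" "$HOME/my-project"
```

If you are not using SBT, the same jar can instead be added to the classpath with whatever mechanism your build tool provides.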

There is no dedicated documentation for Nak as yet, but you can see some use of the k-means clustering code in homework three for Jason's Applied NLP course. Future homeworks will cover classification and more, using Nak.

Requirements

Configuring your environment variables

The easiest thing to do is to set the environment variables JAVA_HOME and NAK_DIR to the relevant locations on your system. Set JAVA_HOME to match the top level directory containing the Java installation you want to use.

Next, add the directory NAK_DIR/bin to your path. For example, you can set the path in your .bashrc file as follows:

export PATH=$PATH:$NAK_DIR/bin

Once you have taken care of these three things, you should be able to build and use Nak.
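The three settings above can be collected into one .bashrc fragment, followed by a quick sanity check. This is a minimal sketch: both fallback paths are placeholders for wherever Java and Nak live on your system, and check_nak_env is a hypothetical helper, not something shipped with Nak.

```shell
# Keep any value that is already set; otherwise fall back to a placeholder path.
# Replace the placeholders with the real locations on your system.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
export NAK_DIR="${NAK_DIR:-$HOME/nak}"
export PATH="$PATH:$NAK_DIR/bin"

# Hypothetical helper (not part of Nak): print a message for each setting
# that is missing and return non-zero if any of them is.
check_nak_env() {
  rc=0
  [ -n "$JAVA_HOME" ] || { echo "JAVA_HOME is not set"; rc=1; }
  [ -n "$NAK_DIR" ] || { echo "NAK_DIR is not set"; rc=1; }
  case ":$PATH:" in
    *":$NAK_DIR/bin:"*) ;;
    *) echo "NAK_DIR/bin is not on PATH"; rc=1 ;;
  esac
  return $rc
}
```

Running check_nak_env in a new shell prints nothing and returns zero when all three settings are in place.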

Building the system from source

Nak uses SBT (Simple Build Tool) with a standard directory structure. To build Nak, type (in the $NAK_DIR directory):

$ ./build update compile

This will compile the source files and put the resulting class files in ./target/classes. If this is your first time running it, you will see messages about Scala being downloaded -- this is fine and expected. Once that is over, the Nak code will be compiled.

To try out other build targets, do:

$ ./build

This will drop you into the SBT interface. To see the actions that are possible, hit the TAB key. (In general, you can do auto-completion on any command prefix in SBT, hurrah!)

To make sure all the tests pass, do:

$ ./build test

Documentation for SBT is at http://www.scala-sbt.org/

Note: if you already have SBT installed on your system, you can also just call it directly with "sbt" in $NAK_DIR.

Questions or suggestions?

Email Jason Baldridge: [email protected]

Or, create an issue: https://github.com/scalanlp/nak/issues
