rilangid

Language identification using Random Indexing

This is the result of a project course at Uppsala University, in which I tested language identification using Random Indexing models.

A number of different candidate models were tested, and are to be described below. The best one was the ShortestPath model, which produced an F-score of 99.43% given the training data.

Requirements for running

The experiments are built upon PyDSM, a library made for exploring distributional semantic models. Make sure you have this installed first.

You also need to have Expy installed. It is used to save your experiment results and to calculate the result measures.

Everything is tested on Python 3.4.

Running the experiments

When you are sure you have the necessary libraries installed, you should be able to reproduce the results by typing:

python evaluate.py config.shortestpath

This uses the configuration found in config/shortestpath.py when running the experiment:

{'dimensionality': '2000',
 'directed': 'True',
 'num_indices': '8',
 'ordered': 'False',
 'rimodel': "<class 'models.ShortestPath'>",
 'test_path': '/Users/jimmy/dev/projects/rilangid/resources/test/reproduce/',
 'train': 'True',
 'window_size': '(100, 100)'}
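To give a feel for what dimensionality and num_indices usually mean in Random Indexing, here is a minimal sketch that assumes the common formulation: each symbol gets a sparse index vector with a few randomly placed +1 and -1 entries, and a context vector is built by summing index vectors. This is not PyDSM's API and it does not reproduce the ShortestPath model; the function names make_index_vector and add_into are hypothetical and exist only for illustration.

```python
import random


def make_index_vector(dimensionality=2000, num_indices=8, seed=None):
    """Return a sparse index vector as a {position: +1 or -1} dict.

    Assumption: num_indices is the number of nonzero entries, split
    evenly between +1 and -1, as in the usual Random Indexing setup.
    """
    rng = random.Random(seed)
    positions = rng.sample(range(dimensionality), num_indices)
    signs = [1] * (num_indices // 2) + [-1] * (num_indices - num_indices // 2)
    rng.shuffle(signs)
    return dict(zip(positions, signs))


def add_into(context_vector, index_vector):
    """Accumulate a sparse index vector into a dense context vector."""
    for position, sign in index_vector.items():
        context_vector[position] += sign


# Hypothetical toy input: one index vector per character.
index_vectors = {ch: make_index_vector(seed=ord(ch)) for ch in "abc"}

# Sum the index vectors of every character seen in a short string.
context = [0] * 2000
for ch in "abcab":
    add_into(context, index_vectors[ch])
```

With these defaults every index vector has 8 nonzero entries out of 2000, so vectors for different symbols are nearly orthogonal while staying cheap to store and add.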

After a long while, you should get a result similar to this:

Precision: 0.9968449289229714
Recall: 0.991852487135506
F-score: 0.9943254868891508
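As a sanity check on the figures above, the sketch below computes the standard F1 score (the harmonic mean of precision and recall) from the reported precision and recall. How Expy actually aggregates its measures is not stated here, so treat the formula as an assumption; the function name f_score is hypothetical.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the output above.
precision = 0.9968449289229714
recall = 0.991852487135506

print(round(f_score(precision, recall), 4))  # prints 0.9943
```

Rounded to four decimals this agrees with the reported F-score. The two values differ slightly in later decimals, which may come down to how the scores are averaged.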
