Git Product home page Git Product logo

silcc's Introduction

SiLCC

Synopsis

SiLCC aims to provide a learning NLP based relevancy filter/tagger/classifier that is applied to the various information sources that Swift aggregates (RSS, twitter, SMS).

This software is released under the GNU General Public License.

Change Log

  • v0.1.0: Initial stable release of the software.

Installation

To set up a silcc server instance:

  1. Create a Python virtualenv

    $ virtualenv ~/SiLCC-env

  2. In the this distribution run:

    $ python setup.py develop

This should install all needed libraries in your virtual environment.

Make sure you have MySQLdb installed:

$ easy_install mysql-python

You will need the Python development libraries as well as the MySQL client development libraries installed for this to work. On Ubuntu you need the libmysqlclient-dev and python-dev packages:

$ sudo aptitude install libmysqlclient-dev python-dev

Make sure you have the correct version of SQLAlchemy installed:

$ easy_install -U SQLAlchemy==0.6.0
  1. You also need to install nltk. Download it from the following location:

    $ wget http://nltk.googlecode.com/files/nltk-2.0b8.zip $ unzip nltk-2.0b8.zip $ cd nltk-2.0b8

Now make sure you have activated the virtualenv created in step 1.

$ source ~/SiLCC-env/bin/activate

If you are sure its activated, install PyYAML.

$ easy_install pyyaml

Issue the following to install nltk into your virtualenv.

$ python setup.py install
  1. After installing nltk you need to download one of the Treebank Corpus for the part-of-speech tagger to work:

    $ python -c "import nltk;nltk.download('maxent_treebank_pos_tagger')"

  2. Install numpy. On Debian-based distributions the easiest is to install the package:

    $ aptitude install python-numpy

  3. Create the database. If using MySQL:

    $ mysql -u root -p mysql> create database silcc default charset utf8; mysql> grant all on silcc.* to silcc@localhost identified by 'password';

Make sure the .ini file contains a DB URI to the above database.

(Please see instructions below on using SQLAlchemy migrate scripts to generate the db schema)

  1. To start the server:

    $ paster serve --daemon development.ini

Look for any errors in paster.log

You can also run it interactively by excluding --daemon:

$ paster serve development.ini

Placing your SiLCC Instance DB under version control

You may want to place your database under version control so that you can easily upgrade the schema, should it evolve over time.

To do this you need to first create an empty database, and then place it under version control.

  1. First install sqlalchemy-migrate:

    $ easy_install -U sqlalchemy-migrate==0.5.4

  2. now place your database under version control with:

    $ python db_repository/manage.py version_control mysql://silcc:password@localhost:3306/silcc

This will create the migrate_version table in your database and set the initial version to 0.

After that you may run any schema upgrade scripts including the first one which will create the necessary tables.

Most data in the SiLLC distribution can be recreated using scripts and data dumps provided.

  1. To make upgrading your local instance easier its best to create a manage.py scripts as follows:

    $ migrate manage manage.py --repository=db_repository --url=mysql://silcc:password@localhost:3306/silcc

Since the manage.py will be specific to your local instance it is not included in the SiLCC repo.

  1. Now run a db update to generate the initial schema:

    $ python manage.py upgrade

  2. The Web Service API needs to have one or more valid keys in the apikey table. For testing purposes add a key with a valid_domains value of * (valid from all domains)

    $ mysql -u root -p mysql> insert into apikey set keystr = 'AAAABBBB', valid_domains = '*';

Deploying on Rackspace

  1. Create an instance of Ubuntu 10.04 LTS (Lucid Lynx).
  2. Open up a SSH terminal to your new instance.
  3. Run one of the two deployment scripts, depending on the type of installation. Each time you are prompted for a password, this will be the MySQL root password of your choosing. It should be safe to leave this blank.
    • Development: curl -L https://raw.github.com/ushahidi/silcc/master/deploy/ubuntu-lucid-development.sh | bash
    • Production: curl -L https://raw.github.com/ushahidi/silcc/master/deploy/ubuntu-lucid-production.sh | bash
  4. Open your browser and point it to: http://ip.or.hostname.of.your.new.instance:5002/

Libraries

For your convenience, the following language-specific API "proxy" libraries are available:

Support

See Also

silcc's People

Contributors

thakadu avatar miclovich avatar mrmatthewgriffiths avatar timclicks avatar

Stargazers

Angus H. avatar RK Aranas avatar Priyadarshi Lahiri avatar Christian G. Warden avatar Martin Sykora avatar Hertzel Karbasi avatar Soe avatar Nishith Rastogi avatar John Brennan avatar Edmar Ferreira avatar  avatar  avatar  avatar Henry Addo avatar Lionel D. Hummel avatar  avatar Steve Winton avatar Dave Spector avatar Sascha Brossmann avatar Bruno Melo avatar

Watchers

Emmanuel Kala avatar  avatar Daniel Odongo avatar James Cloos avatar Heather Leson avatar Shadrock Roberts avatar  avatar  avatar Chris avatar Shariffintech avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.