gogo_lemmatizer's Introduction

Lemmatizer from AOT group (http://lemmatizer.org/) with *just* autotools added.
... and patched to compile under CygWin. [--Pav]


INSTALLATION:
=============

# If there is no ./configure file (e.g. in a fresh git checkout), generate it first:
$ aclocal
$ automake --add-missing --force
$ autoconf -f -i -v


GNU / Linux:
-----------

$ ./configure
$ make
$ sudo make install

lemmatizer will be installed to /usr/local/lemmatizer, with an additional symlink at /usr/lemmatizer.


CygWin:
------

To compile statically linked executables and library objects on CygWin, you
will need libpcre's shared objects for static linking.

To compile them, run your CygWin setup-x86.exe, or setup-x86_64.exe to install
the necessary packages.  Make sure you have the 'Src?' column ticked for
'libpcre1'.  Also, make sure to install autoconf, cygport, g++, zlib-devel,
libbz2-devel.

Once your packages are installed:

$ cd /usr/src/pcre-*.src # Depending on the pcre version number.
$ cygport pcre.cygport prep
$ cygport pcre.cygport compile

After the libpcre build successfully completes, go back to your gogo_lemmatizer
directory.

$ ./configure --with-pcre-objs=/usr/src/pcre-*.src/pcre-*.x86_64/build/.libs/ # Depending on the pcre version number.

$ make

Then, you should be able to do:

# make install

Have a look at the Run-time Configuration section on how to make sure a binary
can find its dictionary files, and DLLs.


EXAMPLES:
=========

You can find usage examples in the examples directory.
 - examples/c/firstform.c - example of using the pure C lemmatizer (a rather poor interface)
 - examples/cpp/lemmatize.cpp - example of using LemInterface.

To build the examples, you may use their Makefiles.
Building these examples requires that lemmatizer is _already installed_.

 - examples/cpp/lemmatize2 - Modification of lemmatize.cpp to process words
   from a file stream, and output JSON.

This example does not require that lemmatizer is installed, or that you have
`pkg-config`.  However, you need to set the 'RML' environment variable, or run
from an appropriate working directory, and make sure the linker can find your
shared libraries.  See Run-time Configuration.


Run-time Configuration
======================

The lemmatizer library will search for the <Dicts> directory to load lemma
dictionaries (compiled during the build process).  You can set the 'RML'
environment variable to the lemmatizer path that contains the <Dicts> directory
to be able to run an executable linked against the lemmatizer library.  If
'RML' is not set, a hard-coded list of directory paths is tried (see
`getRMLDirectory()` in <Source/common/utilit.cpp>):

	./lemmatizer
	/usr/local/lemmatizer
	/usr/lemmatizer
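
The lookup order above can be sketched in shell, for example to check which
directory a binary is likely to pick up before running it.  This is only an
illustration of the documented order, not the library's own code: the function
name `find_rml_dir` is hypothetical, and testing for a `Dicts` subdirectory is
an assumption about what makes a fallback candidate usable.

```shell
# Hypothetical helper: print the directory the lemmatizer would likely use.
# Mirrors the documented order: $RML first, then the hard-coded fallbacks.
find_rml_dir() {
    if [ -n "${RML:-}" ]; then
        printf '%s\n' "$RML"
        return 0
    fi
    for dir in ./lemmatizer /usr/local/lemmatizer /usr/lemmatizer; do
        # Assumption: a usable candidate contains a Dicts subdirectory.
        if [ -d "$dir/Dicts" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}
```

For instance, `find_rml_dir || echo "no dictionary directory found" >&2` gives
a quick sanity check before launching an executable.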


Shared Libraries
----------------

Don't forget to make sure your dynamic linker can find the lemmatizer shared
libraries.  If you did not install them in a directory searched by your OS's
linker, make sure you add a path to the contents of the <Bin> directory to
LD_LIBRARY_PATH on Unix and Linux systems, and to your PATH on Windows systems.
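
On Unix and Linux, the following sketch prepends a directory to
LD_LIBRARY_PATH without duplicating it.  The helper name `add_lib_dir` is
hypothetical, and the path in the usage line is only an assumed location of
the <Bin> directory; substitute the one from your own build or install.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH exactly once.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Usage (assumed path, adjust to where your <Bin> directory actually is):
#   add_lib_dir /usr/local/lemmatizer/Bin
```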


Running CygWin-Compiled Binaries Outside CygWin
-----------------------------------------------

The binaries produced by CygWin's GCC will depend on a number of CygWin DLLs to
run.  You can add your CygWin's <bin> directory to your PATH environment
variable to find them, or ship them in the same directory as your executables.

You may need at least the following (version numbers may vary):

	cyggcc_s-seh-1.dll cygiconv-2.dll cygpcre-1.dll cygstdc++-6.dll
	cygwin1.dll
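
If you choose to ship the DLLs next to your executables, a small helper along
these lines can collect them.  It is a sketch only: `bundle_dlls` is a
hypothetical name, the source directory is assumed to be your CygWin <bin>
directory, and the DLL names must match what your binaries actually need.

```shell
# Hypothetical helper: copy the named DLLs from SRC into DEST.
# Usage: bundle_dlls SRC DEST name.dll [name.dll ...]
# Returns non-zero if any requested DLL was not found in SRC.
bundle_dlls() {
    src=$1; dest=$2; shift 2
    missing=0
    mkdir -p "$dest" || return 1
    for dll in "$@"; do
        if [ -f "$src/$dll" ]; then
            cp "$src/$dll" "$dest/" || return 1
        else
            echo "missing: $dll" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (assumed paths):
#   bundle_dlls /usr/bin ./dist cygwin1.dll cygpcre-1.dll
```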


DOCUMENTATION:
==============
Here are links to the installation notes:
1. Morphology: Docs/Morph_UNIX.txt
2. Syntax: Docs/Syntax_UNIX.txt
3. Concordance: Docs/DDC_UNIX.txt
