php-stemmer's Introduction

php-stemmer

This stem extension for PHP provides stemming capability for a variety of languages using Dr. M.F. Porter's Snowball API.

It has a much simpler API than the stem extension found in pecl.

Usage Example

<?php
    echo stemword('cats', 'english', 'UTF_8'), PHP_EOL;      # cat
    echo stemword('stemming', 'english', 'UTF_8'), PHP_EOL;  # stem
?>

Install

The stemmer PHP extension can be installed following the instructions about building PHP extensions using phpize as described in the PHP manual.

To build this extension, you need the PHP development tools installed. On Ubuntu/Debian you can use apt-get install php5-dev (newer releases name the package after their PHP version, for example php-dev).

The phpize command is used to prepare the build environment for a PHP extension.

In the following sample, the sources for the extension are in a directory named php-stemmer:

 # git clone https://github.com/hthetiot/php-stemmer.git
 # cd php-stemmer
 # phpize
 # ./configure
 # make -C libstemmer_c
 # make
 # [sudo] make install

Edit your php.ini file and add the line extension=stemmer.so

About libstemmer_c

The stemmer PHP extension uses a modified version of libstemmer_c.

It has replaced the default Dutch stemming algorithm with the much better Kraaij-Pohlmann Dutch stemming algorithm. The modified version of this lib can be downloaded from mysqludf.com.

Original Source

This version is a fork of php-stemmer hosted on Google Code, originally made by Javeline B.V. and available here: http://code.google.com/p/php-stemmer/

Licence

New BSD License

See License file for details

php-stemmer's People

Contributors

hthetiot, ferrastas, arielallon, gaetan-petit, crocodile2u

php-stemmer's Issues

This stemmer is CaSe SeNsItIvE

Just a quick note for those who do not know, because it doesn't appear to be mentioned in the quick start guide: this stemmer is case sensitive! I almost went nuts until I realized all the examples were lower case. This should be fixed. Make sure you force your words to all lower case before stemming.
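
One way to apply that advice from the command line is sketched below. It assumes the extension is loaded in the PHP CLI and reuses the stemword() call from the usage example. The helper names to_lower and stem_lower are made up for this illustration and are not part of php-stemmer.

```shell
#!/bin/sh
# Hypothetical helpers (not part of php-stemmer).

# Lowercase a word. Depending on the tr implementation, only ASCII
# letters may be converted; non-ASCII text needs a Unicode-aware step.
to_lower() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Stem a word after lowercasing it. Assumes the stemmer extension is
# loaded in the PHP CLI.
stem_lower() {
    php -r 'echo stemword($argv[1], "english", "UTF_8"), PHP_EOL;' -- "$(to_lower "$1")"
}

# Only call PHP when both the CLI and the extension are available.
if command -v php >/dev/null 2>&1 && php -m | grep -qi 'stemmer'; then
    stem_lower 'Cats'
fi
```

Inside PHP itself the same idea is simply to lowercase the word before handing it to stemword().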

New release required

The last release is v0.8.1, dated 2014. Can you publish a new release that includes the latest changes for PHP 7 support?

Not PHP CLI Compatible

This script will not work in PHP scripts executed via the command line. I get an undefined function error.
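
The issue does not say why the function is undefined. One common cause, offered here only as a guess, is that the CLI reads a different php.ini than the web server, so the extension=stemmer.so line never reaches it. The sketch below checks for that. check_stemmer_cli is a made-up helper name, and the check assumes the module shows up as "stemmer" in the output of php -m.

```shell
#!/bin/sh
# Hypothetical helper: report whether the PHP CLI has the stemmer
# extension loaded, and which ini files the CLI reads.
check_stemmer_cli() {
    if ! command -v php >/dev/null 2>&1; then
        echo "php CLI not found"
        return 2
    fi
    # Lists the php.ini (and additional ini files) used by the CLI.
    php --ini
    # Assumption: the module is listed under the name "stemmer".
    if php -m | grep -qi 'stemmer'; then
        echo "stemmer is loaded in the CLI"
    else
        echo "stemmer is NOT loaded in the CLI"
        return 1
    fi
}

check_stemmer_cli || true
```

If the extension is missing, adding the extension=stemmer.so line to the ini file reported by php --ini is the first thing to try.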

Compiling on AMD64 fails

I had no problems using php-stemmer on an x86 system, but compilation on an amd64 system fails.

What steps will reproduce the problem?

  1. extract fresh tarball of php-stemmer 0.7.0
  2. follow install instructions:
    phpize
    ./configure
    cd libstemmer_c
    make
    cd ..
    make
    -> see error:
    /usr/lib/gcc/x86_64-pc-linux-gnu/4.3.2/../../../../x86_64-pc-linux-gnu/bin/ld:
    /root/php-stemmer-0.7.0/libstemmer_c/libstemmer.a(libstemmer.o): relocation
    R_X86_64_32 against `a local symbol' can not be used when making a shared
    object; recompile with -fPIC
    /root/php-stemmer-0.7.0/libstemmer_c/libstemmer.a: could not read symbols:
    Bad value
    collect2: ld returned 1 exit status
    make: *** [stemmer.la] Error 1

This is on a current Gentoo Linux with kernel 2.6.30-gentoo-r4, gcc 4.3.2,
libtool 1.5.26, php 5.2.10.

Any ideas on how to solve this?

Thanks,

thomas
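
The linker message itself points at the fix: the static library has to be compiled as position-independent code before it can be linked into the shared extension. A minimal sketch follows, assuming the bundled libstemmer_c Makefile has a clean target, honors a CFLAGS override on the command line, and needs -Iinclude to find its headers. The issue confirms none of these, and rebuild_libstemmer_pic is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the bundled static library with -fPIC.
# Assumptions: the Makefile has a "clean" target and honors CFLAGS;
# -Iinclude is kept because overriding CFLAGS would otherwise drop it.
rebuild_libstemmer_pic() {
    src_dir=${1:-libstemmer_c}
    if [ ! -f "$src_dir/Makefile" ]; then
        echo "no Makefile in $src_dir"
        return 1
    fi
    make -C "$src_dir" clean &&
    make -C "$src_dir" CFLAGS="-Iinclude -fPIC"
}

# Usage, from the php-stemmer source directory:
#   rebuild_libstemmer_pic libstemmer_c && make
```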

compilation fails with php 5.4.4

Hi,

Compilation fails on Wheezy.

I used:

cd php-stemmer-0.7
phpize
./configure
cd libstemmer_c
make
cd ..
make

And then I get:

/bin/bash /root/php-stemmer-0.7.0/libtool --mode=compile cc -I. -I/root/php-stemmer-0.7.0 -DPHP_ATOM_INC -I/root/php-stemmer-0.7.0/include -I/root/php-stemmer-0.7.0/main -I/root/php-stemmer-0.7.0 -I/usr/include/php5 -I/usr/include/php5/main -I/usr/include/php5/TSRM -I/usr/include/php5/Zend -I/usr/include/php5/ext -I/usr/include/php5/ext/date/lib -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DHAVE_CONFIG_H -g -O2 -c /root/php-stemmer-0.7.0/stemmer.c -o stemmer.lo
libtool: compile: cc -I. -I/root/php-stemmer-0.7.0 -DPHP_ATOM_INC -I/root/php-stemmer-0.7.0/include -I/root/php-stemmer-0.7.0/main -I/root/php-stemmer-0.7.0 -I/usr/include/php5 -I/usr/include/php5/main -I/usr/include/php5/TSRM -I/usr/include/php5/Zend -I/usr/include/php5/ext -I/usr/include/php5/ext/date/lib -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DHAVE_CONFIG_H -g -O2 -c /root/php-stemmer-0.7.0/stemmer.c -fPIC -DPIC -o .libs/stemmer.o
/root/php-stemmer-0.7.0/stemmer.c:9:1: error: unknown type name 'function_entry'
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: braces around scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: (near initialization for 'stemmer_functions[0]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: initialization makes integer from pointer without a cast [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: (near initialization for 'stemmer_functions[0]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: excess elements in scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: (near initialization for 'stemmer_functions[0]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: excess elements in scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: (near initialization for 'stemmer_functions[0]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: excess elements in scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: (near initialization for 'stemmer_functions[0]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: excess elements in scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:10:5: warning: (near initialization for 'stemmer_functions[0]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: braces around scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: (near initialization for 'stemmer_functions[1]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: initialization makes integer from pointer without a cast [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: (near initialization for 'stemmer_functions[1]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: excess elements in scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: (near initialization for 'stemmer_functions[1]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: excess elements in scalar initializer [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:11:5: warning: (near initialization for 'stemmer_functions[1]') [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:19:5: warning: initialization from incompatible pointer type [enabled by default]
/root/php-stemmer-0.7.0/stemmer.c:19:5: warning: (near initialization for 'stemmer_module_entry.functions') [enabled by default]
make: *** [stemmer.lo] Erreur 1
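
The first error is the one that matters; the warnings follow from it. PHP 5.4 removed the old function_entry type name, and extensions are expected to use zend_function_entry instead. The sketch below rewrites that identifier in a source file. fix_function_entry is a made-up helper name, and it edits the file in place, so keep a backup.

```shell
#!/bin/sh
# Hypothetical helper: replace the bare type name function_entry with
# zend_function_entry, leaving existing zend_function_entry untouched.
fix_function_entry() {
    src=$1
    tmp=$(mktemp) || return 1
    # First expression handles the name at the start of a line, the second
    # handles it after any character that cannot be part of an identifier.
    if sed -e 's/^function_entry/zend_function_entry/' \
           -e 's/\([^_[:alnum:]]\)function_entry/\1zend_function_entry/g' \
           "$src" > "$tmp"; then
        cat "$tmp" > "$src"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage, from the php-stemmer source directory:
#   cp stemmer.c stemmer.c.bak && fix_function_entry stemmer.c && make
```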

php7 compilation fails

/bin/bash /www/yoda/php-stemmer/libtool --mode=compile cc  -I. -I/www/yoda/php-stemmer -DPHP_ATOM_INC -I/www/yoda/php-stemmer/include -I/www/yoda/php-stemmer/main -I/www/yoda/php-stemmer -I/usr/include/php/20151012 -I/usr/include/php/20151012/main -I/usr/include/php/20151012/TSRM -I/usr/include/php/20151012/Zend -I/usr/include/php/20151012/ext -I/usr/include/php/20151012/ext/date/lib  -DHAVE_CONFIG_H  -g -O2   -c /www/yoda/php-stemmer/stemmer.c -o stemmer.lo
libtool: compile:  cc -I. -I/www/yoda/php-stemmer -DPHP_ATOM_INC -I/www/yoda/php-stemmer/include -I/www/yoda/php-stemmer/main -I/www/yoda/php-stemmer -I/usr/include/php/20151012 -I/usr/include/php/20151012/main -I/usr/include/php/20151012/TSRM -I/usr/include/php/20151012/Zend -I/usr/include/php/20151012/ext -I/usr/include/php/20151012/ext/date/lib -DHAVE_CONFIG_H -g -O2 -c /www/yoda/php-stemmer/stemmer.c  -fPIC -DPIC -o .libs/stemmer.o
/www/yoda/php-stemmer/stemmer.c: In function 'zif_stemword':
/www/yoda/php-stemmer/stemmer.c:64:51: warning: passing argument 2 of 'zend_hash_get_current_data_ex' from incompatible pointer type [-Wincompatible-pointer-types]
            zend_hash_get_current_data_ex(arr_hash,(void **)&data, &pointer)==SUCCESS;
                                                   ^
In file included from /usr/include/php/20151012/Zend/zend.h:36:0,
                 from /usr/include/php/20151012/main/php.h:36,
                 from /www/yoda/php-stemmer/stemmer.c:5:
/usr/include/php/20151012/Zend/zend_hash.h:171:30: note: expected 'HashPosition * {aka unsigned int *}' but argument is of type 'void **'
 ZEND_API zval* ZEND_FASTCALL zend_hash_get_current_data_ex(HashTable *ht, HashPosition *pos);
                              ^
/www/yoda/php-stemmer/stemmer.c:64:12: error: too many arguments to function 'zend_hash_get_current_data_ex'
            zend_hash_get_current_data_ex(arr_hash,(void **)&data, &pointer)==SUCCESS;
            ^
In file included from /usr/include/php/20151012/Zend/zend.h:36:0,
                 from /usr/include/php/20151012/main/php.h:36,
                 from /www/yoda/php-stemmer/stemmer.c:5:
/usr/include/php/20151012/Zend/zend_hash.h:171:30: note: declared here
 ZEND_API zval* ZEND_FASTCALL zend_hash_get_current_data_ex(HashTable *ht, HashPosition *pos);
                              ^
/www/yoda/php-stemmer/stemmer.c:68:14: warning: implicit declaration of function 'Z_TYPE_PP' [-Wimplicit-function-declaration]
           if(Z_TYPE_PP(data) == IS_STRING){
              ^
/www/yoda/php-stemmer/stemmer.c:69:48: warning: implicit declaration of function 'Z_STRVAL_PP' [-Wimplicit-function-declaration]
             stemmed = sb_stemmer_stem(stemmer, Z_STRVAL_PP(data), Z_STRLEN_PP(data));
                                                ^
/www/yoda/php-stemmer/stemmer.c:69:67: warning: implicit declaration of function 'Z_STRLEN_PP' [-Wimplicit-function-declaration]
             stemmed = sb_stemmer_stem(stemmer, Z_STRVAL_PP(data), Z_STRLEN_PP(data));
                                                                   ^
/www/yoda/php-stemmer/stemmer.c:69:48: warning: passing argument 2 of 'sb_stemmer_stem' makes pointer from integer without a cast [-Wint-conversion]
             stemmed = sb_stemmer_stem(stemmer, Z_STRVAL_PP(data), Z_STRLEN_PP(data));
                                                ^
In file included from /www/yoda/php-stemmer/stemmer.c:7:0:
/www/yoda/php-stemmer/libstemmer_c/include/libstemmer.h:68:21: note: expected 'const sb_symbol * {aka const unsigned char *}' but argument is of type 'int'
 const sb_symbol *   sb_stemmer_stem(struct sb_stemmer * stemmer,
                     ^
/www/yoda/php-stemmer/stemmer.c:71:11: error: too many arguments to function 'add_next_index_string'
           add_next_index_string(return_value,stemmed,1);
           ^
In file included from /usr/include/php/20151012/main/php.h:40:0,
                 from /www/yoda/php-stemmer/stemmer.c:5:
/usr/include/php/20151012/Zend/zend_API.h:432:14: note: declared here
 ZEND_API int add_next_index_string(zval *arg, const char *str);
              ^
/www/yoda/php-stemmer/stemmer.c:76:55: error: macro "ZVAL_STRING" passed 3 arguments, but takes just 2
       if(stemmed)ZVAL_STRING( return_value, stemmed, 1);
                                                       ^
/www/yoda/php-stemmer/stemmer.c:76:18: error: 'ZVAL_STRING' undeclared (first use in this function)
       if(stemmed)ZVAL_STRING( return_value, stemmed, 1);
                  ^
/www/yoda/php-stemmer/stemmer.c:76:18: note: each undeclared identifier is reported only once for each function it appears in
Makefile:194: recipe for target 'stemmer.lo' failed
make: *** [stemmer.lo] Error 1
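
These errors are consistent with PHP 5-era source being compiled against PHP 7 headers (20151012 in the include path is the PHP 7.0 API number). As the log shows, PHP 7 declares add_next_index_string and ZVAL_STRING with two arguments and no longer defines the Z_*_PP macros. A rough sketch for checking whether a checkout still carries the PHP 5-only macros is below. find_php5_only_macros is a made-up helper name, and it covers only the three macros named in this log.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a C file that use the Z_*_PP
# macros seen in the log above. Returns 0 when at least one is found and
# 1 when none are.
find_php5_only_macros() {
    grep -nE 'Z_(TYPE|STRVAL|STRLEN)_PP[[:space:]]*\(' "$1"
}

# Run the check only when stemmer.c is in the current directory.
if [ -f stemmer.c ]; then
    find_php5_only_macros stemmer.c || echo "no PHP 5-only macros found"
fi
```

Any hits mean the source predates PHP 7 support and needs a newer revision or a port before it will build.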

Too Few Arguments Error

The latest PR seems to break the make -C libstemmer_c step of installation resulting in the following error (with PHP 5.6.17 installed):

/home/vagrant/php-stemmer/stemmer.c: In function ‘zif_stemword’:
/home/vagrant/php-stemmer/stemmer.c:64: error: too few arguments to function ‘zend_hash_get_current_data_ex’
/home/vagrant/php-stemmer/stemmer.c:71: error: too few arguments to function ‘add_next_index_string’
/home/vagrant/php-stemmer/stemmer.c:76:52: error: macro "ZVAL_STRING" requires 3 arguments, but only 2 given
/home/vagrant/php-stemmer/stemmer.c:76: error: ‘ZVAL_STRING’ undeclared (first use in this function)
/home/vagrant/php-stemmer/stemmer.c:76: error: (Each undeclared identifier is reported only once
/home/vagrant/php-stemmer/stemmer.c:76: error: for each function it appears in.)
make: *** [stemmer.lo] Error 1
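
This looks like the mirror image of the PHP 7 failure: the messages suggest source written for the PHP 7 API being built against PHP 5.6 headers. A small sketch that reports which major version the build will target is below, assuming php-config (shipped with the PHP development tools) is on the PATH. php_major is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from a version string
# such as "5.6.17".
php_major() {
    printf '%s\n' "${1%%.*}"
}

# Report the PHP version the extension would be built against.
if command -v php-config >/dev/null 2>&1; then
    major=$(php_major "$(php-config --version)")
    echo "Building against PHP major version: $major"
fi
```

If this reports 5, a checkout that predates the PHP 7 changes may build where the current one does not.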
