
libfm's People

Contributors

andland, chihming, fabiopetroni, henry0312, srendle, thierry-silbermann


libfm's Issues

save_model parameter not found

When I try to save a model trained with SGD, this error is raised:

../bin/libFM -task c -train sampleRatingData.dat.libfm -test sampleRatingData.dat.libfm -dim '1,1,8' -out sample.res -rlog sample.log -method sgd -learn_rate 0.001 -regular '0,0,0.001' -save_model model.fm
----------------------------------------------------------------------------
libFM
  Version: 1.4.2
  Author:  Steffen Rendle, [email protected]
  WWW:     http://www.libfm.org/
This program comes with ABSOLUTELY NO WARRANTY; for details see license.txt.
This is free software, and you are welcome to redistribute it under certain
conditions; for details see license.txt.
----------------------------------------------------------------------------

ERROR: the parameter save_model does not exist

Some people claim an SGD-trained model can be saved with the save_model flag. How can I fix this?

Are the adaptive-regularization weights updated a second time?

Hello! I have recently been reading the libFM source code, and part of the adaptive-regularization (adaptive FM) code is unclear to me; it does not seem to match the underlying theory. In the sgd_theta_step function the weights of iteration t are updated, so why are the weights updated once more in the predict_scaled function when computing the p value for iteration t+1? Any guidance would be appreciated!

Kitty Terminal Emulator

Hi,

I'm using the kitty Terminal Emulator and get this warning when opening pcmanfm:

** (pcmanfm:746350): WARNING **: 17:58:59.842: terminal kitty isn't known, consider report it to LibFM developers

So I just wanted to report this to you

umm... so i reported

This:

** (pcmanfm:14792): WARNING **: 06:47:51.518: terminal alacritty isn't known, consider report it to LibFM developers

So I did...

Log loss should be computed in base e, not 10

In _evaluate_class the log loss is computed in base 10, but it should use base e (the natural log). I.e.:

loglikelihood -= m * log10(pll) + (1 - m) * log10(1 - pll);

should be

loglikelihood -= m * log(pll) + (1 - m) * log(1 - pll);
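As a quick sanity check of the claim (in Python, outside libFM): the two bases differ only by the constant factor ln(10), so a base-10 "log loss" understates the usual base-e value.

```python
import math

# Predicted probability and binary label for a single example.
p, m = 0.8, 1

# Log loss computed in base 10 vs. base e:
ll_base10 = -(m * math.log10(p) + (1 - m) * math.log10(1 - p))
ll_base_e = -(m * math.log(p) + (1 - m) * math.log(1 - p))

# The two differ exactly by the constant factor ln(10) ~ 2.3026,
# so base-10 values are systematically smaller.
```
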

some confusion

Hello, I am new to libFM; it is a great tool.
I used MCMC to train a CTR model and ran into two problems:

  1. The data has 160 million features. When init_V is small, e.g. 0.001 or 0.005, everything seems normal and the AUC is 0.6-0.7, but when I set init_V to 0.1 or 0.5 the results look like 0, 0.3333, 1, ... I hope you can give me some advice.
  2. I saw that "if MCMC saves a model it has to save the parameters at every iteration". Why is saving only the last iteration's parameters not OK? And why is the final y_predict the average over every iteration?

I hope to receive a reply. Thanks!

Bad loss function for classification?

According to the docs:

For binary classification, cases with y > 0 are regarded as the positive class and with y ≤ 0 as the negative class.

But if negative cases carry a target of 0 in the loss function, then nothing is learned from them, because the gradient is always 0:

        } else if (task == 1) {
            grad_loss = target * ( (1.0/(1.0+exp(-target*p))) -  1.0);

It seems like the target needs to be remapped in the classification case, but I don't see anywhere in the code where that happens. (Note that I haven't actually run the code to prove it misbehaves; I'm just reading it, and this didn't make sense to me. Did I miss something?)
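A small sketch (my own, not libFM's code path) of the gradient expression quoted above illustrates the concern: with a raw target of 0 the gradient vanishes, while targets encoded as {-1, +1} give an informative gradient. That libFM remaps classification targets to {-1, +1} at data-load time is an assumption here, worth verifying in the data-loading code.

```python
import math

def grad_loss(target, p):
    # Gradient of the logistic loss, mirroring the quoted SGD line:
    # grad_loss = target * (sigmoid(target * p) - 1)
    return target * (1.0 / (1.0 + math.exp(-target * p)) - 1.0)

zero_grad = grad_loss(0.0, 2.5)   # raw target 0: gradient vanishes
pos_grad = grad_loss(1.0, 2.5)    # positive class, confident prediction: small negative gradient
neg_grad = grad_loss(-1.0, 2.5)   # negative class, wrong prediction: large positive gradient
```
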

Convert Data with mixed datatypes to LibSVM format

I have data with about a million rows and 3 columns. The columns are of 3 different datatypes: NumberOfFollowers is of a numerical datatype, UserName is of a categorical datatype, and Embeddings is of a categorical-set type.

df:

Index  NumberOfFollowers                  UserName                    Embeddings        Target Variable

0        15                                name1                      [0.5 0.3 0.2]       0
1        4                                 name2                      [0.4 0.2 0.4]       1
2        8                                 name3                      [0.5 0.5 0.0]       0
3        10                                name1                      [0.1 0.0 0.9]       0
...      ...                               ....                       ...                 ..

I would like to convert this data into the LibSVM input format.

Desired Output:

0 0:15 4:1 1:0.5 2:0.3 3:0.2
1 0:4 5:1 1:0.4 2:0.2 3:0.4
0 0:8 6:1 1:0.5 2:0.5 3:0.0
0 0:10 4:1 1:0.1 2:0.0 3:0.9
...

The Perl script https://github.com/srendle/libfm/blob/master/scripts/triple_format_to_libfm.pl handles categorical values. But how does one handle a mixture of data types, as also described in this paper: https://www.ismll.uni-hildesheim.de/pub/pdfs/Rendle_et_al2011-Context_Aware.pdf

Can this problem be solved using libfm, or do I have to use external tools? If the latter, are you aware of any external tools that perform this operation on very large data (as I have many columns of mixed data types)?
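For what it's worth, one possible encoding can be sketched in plain Python (this is my own illustration, not part of libfm). The column-to-index layout is an assumption chosen to match the desired output above, except that indices are emitted in ascending order, which is the usual libSVM convention:

```python
# Hypothetical index layout, matching the desired output:
#   0     -> NumberOfFollowers (numeric, value kept as-is)
#   1..3  -> Embeddings (fixed-length real-valued vector)
#   4..   -> UserName (one-hot, one index per distinct name)
rows = [
    (0, 15, "name1", [0.5, 0.3, 0.2]),
    (1, 4,  "name2", [0.4, 0.2, 0.4]),
    (0, 8,  "name3", [0.5, 0.5, 0.0]),
    (0, 10, "name1", [0.1, 0.0, 0.9]),
]

users = {}  # UserName -> one-hot feature index, assigned on first appearance

def to_libsvm(target, followers, user, emb):
    uid = users.setdefault(user, 4 + len(users))
    feats = [(0, followers)] + [(i + 1, v) for i, v in enumerate(emb)] + [(uid, 1)]
    feats.sort()  # ascending feature indices
    return " ".join([str(target)] + [f"{i}:{v:g}" for i, v in feats])

lines = [to_libsvm(*row) for row in rows]
```

For very large data the same idea works in a streaming fashion (one pass assigning one-hot indices, one pass writing lines), so no external tool is strictly required for this step.
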

Train and test performance seem to be calculated differently.

I was testing libFM, and one of my tests involved running libFM with the same train and test dataset:

libFM -task c -train train.libfm -test train.libfm

This seems to work, but the intermediate performance values differ between the train and test sets, even though the data comes from the same file:

...
#Iter= 97   Train=0.530437  Test=0.530998   Test(ll)=0.299652
#Iter= 98   Train=0.528048  Test=0.530657   Test(ll)=0.299651
#Iter= 99   Train=0.52756   Test=0.530803   Test(ll)=0.299649

I would expect that the train and test performance are exactly the same. Is this an indication of a bug? Or do I misunderstand what is being logged here?

Optional Multithreading

Hi,

Wouldn't it be a good idea to make this a multithreaded solution?

The rating prediction for the trainset/testset significantly slows down the training process as the number of ratings grows. This modification would add little to no complexity and would improve performance significantly.

That's just a suggestion.

Thank you,
André

PS: a parameter would be a good idea, so that multithreading can be switched on or off.
PS2: My pipeline currently does everything in multiple threads, but when it calls libFM everything slows down: instead of using 32 cores it uses 1. Unfortunately, I'm not familiar enough with the libFM source code to develop the modification myself.

Strange results on hello world example

Please enlighten me; I tried the simplest possible example:

File train.libfm is set to

1 0:1 1:1

and ran it using

libFM -task r -method mcmc -train train.libfm -test train.libfm -iter 10 -dim '0,0,1' -out output.libfm -save_model model.libfm

Hence, only the pairwise interactions should be used and its dimension is 1. The regression shows a perfect fit (as expected). However, looking at model.libfm gives me

#pairwise interactions Vj,f
0.0139959
0.711416

My expectation is that the first number times the second number (the pairwise interaction of the two features) should be 1 (the target of the regression), but it is always clearly something else. I tried the same trivial example with fastFM and it behaved as expected.

How to implement MCMC

I coded up the MCMC algorithm from the paper "Factorization Machines with libFM" in Python, but it does not work. Is there any detail I can check?

terminal bash isn't known

I am using calibre (calibre 3.3) and started it in a shell.
Once in a while I receive the following error:

** (pcmanfm:26810): WARNING **: terminal bash isn't known, consider report it to LibFM developers
/usr/bin/xdg-open: line 709: : command not found

This is the ls output, so the file is there:
% ls -l /usr/bin/xdg-open
-rwxr-xr-x 1 root root 22746 Jan 20 2017 /usr/bin/xdg-open

and bash is installed:
% which bash
/bin/bash
% ls -l /bin/bash
-rwxr-xr-x 1 root root 725872 Dec 8 2016 /bin/bash

But the default shell is not bash but csh:
% echo $SHELL
/bin/csh

Is there anything else you need to know?

CPU Usage multiple threading

Thank you for providing this open source implementation. When I ran libFM, it uses only one thread (100% of a CPU). Is this the intended behavior or is there way to utilize multiple threads?

Can I find the newest Windows executable version?

Hi, I encountered a problem when compiling the source code on Win10 and can't figure it out. I can use version 1.4.0 if I drop the -save_model argument. Could you please upload the newest version of libFM compiled for Windows, so that I can use libFM's full functionality? Thanks.

Assertion error in Transpose

Hi,

I have prepared a Train.x and Train.y file after which I am trying to transpose the input matrix to obtain Train.xt and during this transpose operation, I am encountering the following error!

Assertion failed: out_cache_col_num > 0, file tools\transpose.cpp, line 125

Any idea what this error is?
Could you suggest what can be done?

Thanks,
Phani

Unary "rating"

Hi!

Is it possible to somehow adapt the algorithm to the case [User, User Features, Movie, Movie Features, Watched=1], where Y (Watched) is always 1 and we have neither another class nor other "marks" (like the classic 1...5 scale)? Watched could be views, clicks, purchases, etc.

If it's not possible, or possible but requiring some additional work (e.g. code modification), it would be nice to include this information in the documentation. If I remember correctly, one of Rendle's articles discusses a tag-recommendation competition where such a code modification was applied.

Thanks, Artem.

Load/save for other models than SGD/ALS

The README of this project: https://github.com/jfloff/pywFM states:

Make sure you are compiling source from libfm repository and at this specific commit, since pywFM needs the save_model. Beware that the installers and source code in libfm.org are both dated before this commit. I know this is extremely hacky, but since a fix was deployed it only allows the save_model option for SGD or ALS. I don't know why exactly, because it was working well before.

It seems weird to me that the author hasn't approached you to find a better solution than this hack, and I'm not familiar enough with the code to suggest a PR that would solve the problem cleanly. Besides, there is currently no explicit explanation of why loading/saving is forbidden for methods other than SGD or ALS, so I don't know where I could fix this.

Therefore, I'm making this issue to see if we could find a better solution than this! :-)

@srendle Could you explain what the problem with other models are, and why this check is in place?

@jfloff Could you tell us why pywFM needs to be able to save/load different kinds of models than SGD and ALS?

Thanks!

Option to Save Predictive Model

It would be great to have an option to save the predictive model after training. This way a trained model could be applied to a number of test sets without having to retrain.

something wrong when i tried to run the demo

I typed "./demo.sh" to run the demo, but it stopped at:

root@ubuntu:/home/ckf/libmf-2.01/demo# ./demo.sh

Real-valued matrix factorization

iter tr_rmse va_rmse obj
0 2.4766 1.3686 3.2943e+04
1 1.1560 1.0859 9.0002e+03
2 0.9011 1.0493 6.3830e+03
3 0.8107 1.0281 5.5837e+03
4 0.7588 1.0191 5.1730e+03
5 0.7173 1.0100 4.8799e+03
6 0.6774 1.0121 4.6092e+03
7 0.6422 1.0095 4.3969e+03
8 0.6030 1.0082 4.1780e+03
9 0.5626 1.0114 3.9763e+03
Yesterday the real-valued matrix factorization completed, but now it stops at binary_matrix.
Can you give me some advice?

feature design and test set values: "Other Movies Rated"

Dear all,

Hoping to get some insight into feature design here and check my understanding is correct, as I am new to FMs.

In the original Factorization Machines paper in 2010, the "Other Movies Rated" feature contains normalised values for all the other movies the user has ever rated.

Let's use the user Alice in the example, and assume the example covers the training set. We see she's rated 3 movies: NH, TI, and SW. Since there are 3 movies, the "Other Movies Rated" columns have values of (0.3, 0.3, 0.3, 0...).

Say in my test set, Alice has rated ST (Star Trek) with a target of 1. In my "Other Movies Rated" columns in the test set, should I use (0.25, 0.25, 0.25, 0.25 ...), with the fourth value updated for Alice's rating of ST? Or should I use (0.3, 0.3, 0.3, 0...), similar to the training set?

Thanks in advance! Apologies if this question has been asked elsewhere, I haven't been able to find a conclusive answer.
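Not an authoritative answer, but the two candidate encodings can be made concrete with a small sketch. The 1/m weighting (each rated movie weighted by one over the number of movies the user has rated) follows the paper's convention; movie names follow the example, and which option is correct is exactly the open question above:

```python
# Global column order for the "Other Movies Rated" block (an assumption).
movies = ["TI", "NH", "SW", "ST"]

def other_movies_rated(rated):
    # Each rated movie gets weight 1/m; unrated movies get 0.
    w = 1.0 / len(rated)
    return [w if m in rated else 0.0 for m in movies]

# Option A: reuse the training-time view of Alice (ST not counted), weights 1/3:
vec_train_view = other_movies_rated({"TI", "NH", "SW"})
# Option B: include the test item ST as well, renormalising to 1/4:
vec_test_view = other_movies_rated({"TI", "NH", "SW", "ST"})
```
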

Factorization Machines Query

I have a very basic query: is a factorization machine designed to work only with binary fields? Do we need to one-hot encode all features? How are real-valued features handled?

Thank you!

Error while using als method

If the method is given as als, the code changes the param_method value to mcmc, since ALS is MCMC without sampling and hyperparameter inference (file: libfm.cpp, line 123).

While saving the model, the code checks that the method is either 'sgd' or 'als'; but since param_method has been changed to 'mcmc', the model file is never saved.
