Vinh 🤓

uOttawa • Computer Science and Mathematics

I do data things

“Sucking at something is the first step to being sorta good at something.”
― Jake the Dog

matrix-factorization's Issues

Numba error: TypingError: Failed in nopython mode pipeline (step: nopython frontend)

Hi! I used this library a few years ago for a recommender system. Running it now with the latest versions of the required libraries fails: numba throws a TypingError that seems to be related to the input array passed when predicting:

TypingError                               Traceback (most recent call last)
/home/dtquandt/repos/letterboxd/model-training/Model training.ipynb Cell 6 line 1
----> 1 model.recommend(user=user, amount=10, bound_ratings=True)

File ~/mambaforge/envs/lb/lib/python3.8/site-packages/matrix_factorization/recommender_base.py:199, in RecommenderBase.recommend(self, user, amount, items_known, include_user, bound_ratings)
    197 # Get rating predictions for given user and all unknown items
    198 items_recommend = pd.DataFrame({"user_id": user, "item_id": items})
--> 199 items_recommend["rating_pred"] = self.predict(
    200     X=items_recommend, bound_ratings=False
    201 )
    203 # Sort and keep top n items
    204 items_recommend.sort_values(by="rating_pred", ascending=False, inplace=True)

File ~/mambaforge/envs/lb/lib/python3.8/site-packages/matrix_factorization/kernel_matrix_factorization.py:148, in KernelMF.predict(self, X, bound_ratings)
    145 X = self._preprocess_data(X=X, type="predict")
    147 # Get predictions
--> 148 predictions, predictions_possible = _predict(
    149     X=X.to_numpy(),
    150     global_mean=self.global_mean,
    151     user_biases=self.user_biases,
    152     item_biases=self.item_biases,
    153     user_features=self.user_features,
    154     item_features=self.item_features,
    155     min_rating=self.min_rating,
    156     max_rating=self.max_rating,
    157     kernel=self.kernel,
    158     gamma=self.gamma,
    159     bound_ratings=bound_ratings,
    160 )
    162 self.predictions_possible = predictions_possible
    163 return predictions

File ~/mambaforge/envs/lb/lib/python3.8/site-packages/numba/core/dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
    464         msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
    465                f"by the following argument(s):\n{args_str}\n")
    466         e.patch_message(msg)
--> 468     error_rewrite(e, 'typing')
    469 except errors.UnsupportedError as e:
    470     # Something unsupported is present in the user code, add help info
    471     error_rewrite(e, 'unsupported_error')

File ~/mambaforge/envs/lb/lib/python3.8/site-packages/numba/core/dispatcher.py:409, in _DispatcherBase._compile_for_args.<locals>.error_rewrite(e, issue_type)
    407     raise e
    408 else:
--> 409     raise e.with_traceback(None)

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
non-precise type array(pyobject, 2d, F)
During: typing of argument at /home/dtquandt/mambaforge/envs/lb/lib/python3.8/site-packages/matrix_factorization/kernel_matrix_factorization.py (448)

File "../../../mambaforge/envs/lb/lib/python3.8/site-packages/matrix_factorization/kernel_matrix_factorization.py", line 448:
def _sgd(
    <source elided>

@nb.njit()

I was able to work around this by downgrading every dependency to the lowest version listed in requirements.txt, but thought it would be good to let you know.
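One possible cause, which is an assumption and not confirmed in this thread: `non-precise type array(pyobject, 2d, F)` is what numba reports when it is handed a NumPy array with dtype `object`, and `DataFrame.to_numpy()` returns such an array whenever a column has object dtype. The sketch below reproduces that condition with pandas alone and casts to a numeric dtype before the array would reach a jitted function. `to_numeric_array` is a hypothetical helper, not part of matrix_factorization.

```python
import numpy as np
import pandas as pd


def to_numeric_array(X: pd.DataFrame) -> np.ndarray:
    """Hypothetical helper: cast every column to float64 before to_numpy().

    Raises ValueError if a column holds values that cannot be converted.
    """
    return X.astype(np.float64).to_numpy()


# Assumed scenario: an index column that ended up with object dtype.
X = pd.DataFrame(
    {
        "user_id": pd.Series([0, 0, 0], dtype=object),
        "item_id": [1, 2, 3],
    }
)

raw = X.to_numpy()           # mixed object/int64 columns collapse to object
fixed = to_numeric_array(X)  # homogeneous float64, a type numba can handle

print(raw.dtype)    # object
print(fixed.dtype)  # float64
```

If the dtype printed for the array passed to `_predict` is already numeric, this is not the cause and the version downgrade remains the known workaround.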

Conserve memory by setting numpy data types and benchmark performance

Fix recommendation order for max ratings

Sort recommendations by score before bounding them by max_rating and min_rating
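A minimal sketch of why the order matters, using made-up scores and a hypothetical rating range of 1 to 5 (none of these values come from the library). Clipping first turns every prediction above `max_rating` into a tie, so the ranking among the best items is lost. Sorting on the raw scores keeps it.

```python
import numpy as np

# Made-up raw predictions for four items; items 0 and 2 exceed max_rating.
raw_scores = np.array([5.8, 4.2, 6.5, 3.9])
min_rating, max_rating = 1.0, 5.0

# Bounding first: items 0 and 2 both become 5.0, and a stable sort
# falls back to their original positions.
clipped = np.clip(raw_scores, min_rating, max_rating)
order_clip_first = np.argsort(-clipped, kind="stable")

# Sorting first: rank on the raw scores, then bound the reported values.
order_sort_first = np.argsort(-raw_scores, kind="stable")
reported = np.clip(raw_scores[order_sort_first], min_rating, max_rating)

print(order_clip_first.tolist())  # [0, 2, 1, 3]
print(order_sort_first.tolist())  # [2, 0, 1, 3]
print(reported.tolist())          # [5.0, 5.0, 4.2, 3.9]
```

Item 2 has the highest raw score, and only the sort-first ordering puts it at the top.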

Update recommendations for existing users

        for user in known_users:
            user_index = self.user_id_map[user]

            # Initialize bias
            self.user_biases[user_index] = 0

            # Initialize latent factors vector
            self.user_features[user_index, :] = np.random.normal(
                self.init_mean, self.init_sd, (1, self.n_factors)
            )

Why do we need to re-initialize the parameters for known users? A known user already has a meaningful row in the user matrix, so why not just continue training from it?

For example, my use case does not cap ratings at 5 (an actual rating can be over 1000).

For user 0:
If item A has a rating of 1000, it should be the top-rated item and get recommended.

I then tried updating the model with item B at a rating of 500. Because the user matrix is re-initialized, the result is weird: item A is not recommended, even though it has a rating of 1000.

If I comment out the code above, the top 10 recommendations include both item A and item B, which, as far as I understand, is the correct result.
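The behaviour being asked for could be sketched as a warm-start option: reset parameters only for users the model has not seen, and leave known users' rows untouched so training continues from them. This is a hypothetical illustration, not the library's code. The function name, the `warm_start` flag, and the index lists are all assumptions.

```python
import numpy as np


def prepare_user_params(
    user_biases,
    user_features,
    known_idx,
    new_idx,
    init_mean=0.0,
    init_sd=0.1,
    warm_start=True,
    rng=None,
):
    """Hypothetical helper: choose which user rows to re-initialize.

    warm_start=True resets only new users. warm_start=False also resets
    known users, which mirrors the loop quoted above.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_factors = user_features.shape[1]
    to_reset = list(new_idx) if warm_start else list(known_idx) + list(new_idx)

    for i in to_reset:
        user_biases[i] = 0.0
        user_features[i, :] = rng.normal(init_mean, init_sd, n_factors)

    return user_biases, user_features


# User 0 is known (already trained), user 1 is new.
biases = np.array([0.5, 9.9])
features = np.ones((2, 3))

biases, features = prepare_user_params(
    biases, features, known_idx=[0], new_idx=[1], rng=np.random.default_rng(0)
)

print(biases[0], features[0])  # user 0 keeps its learned values
print(biases[1])               # user 1 bias is reset to 0.0
```

Whether continuing from the old row is the right default depends on how much the new ratings should override the old ones, which is a design question for the maintainer.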
