PFoodReq

Code & data accompanying the WSDM 2021 paper "Personalized Food Recommendation as Constrained Question Answering over a Large-scale Food Knowledge Graph".

Architecture

PFoodReq architecture.

Prerequisites

This code is written in Python 3. You will need to install a few Python packages in order to run it. We recommend using virtualenv to manage your Python packages and environments. Take the following steps to create a Python virtual environment.

  • If you have not installed virtualenv, install it with pip install virtualenv.
  • Create a virtual environment with virtualenv venv.
  • Activate the virtual environment with source venv/bin/activate.
  • Install the package requirements with pip install -r requirements.txt.
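
Before installing the requirements, it can help to confirm that the environment is actually active. The following is a minimal sketch and not part of this repo; the function name in_virtualenv is hypothetical. It relies on the fact that venv and recent virtualenv releases set sys.base_prefix, while older virtualenv releases set sys.real_prefix.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # venv and recent virtualenv releases point sys.prefix at the environment
    # and keep the system installation in sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; run `source venv/bin/activate` first.")
```

If the check reports no environment, packages installed with pip would go to the system or user site instead of venv.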

You will need to download the partial FoodKG used in our experiments from here and move the recipe_kg folder to the data folder in this repo.

Create a KBQA dataset

  • Go to the data_builder/src folder and run the following command:

     python generate_all_qa.py -recipe ../../data/recipe_kg/recipe_kg.json -o ../../data/kbqa_data/ -out_of_domain_ratio 0.1 -split_ratio 0.6 0.2
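
One plausible reading of -split_ratio 0.6 0.2 is that the two values are the training and development fractions, with the remainder used for testing. This reading is an assumption and is not confirmed by generate_all_qa.py. Under that assumption, the split could be sketched as follows; split_examples and its seed parameter are hypothetical names used only for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_examples(
    examples: Sequence[T],
    split_ratio: Tuple[float, float] = (0.6, 0.2),
    seed: int = 0,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the examples and cut them into train/dev/test parts.

    Assumption: split_ratio holds the train and dev fractions and the
    remaining examples form the test set.
    """
    train_ratio, dev_ratio = split_ratio
    if train_ratio < 0 or dev_ratio < 0 or train_ratio + dev_ratio > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")

    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable

    n_train = int(len(shuffled) * train_ratio)
    n_dev = int(len(shuffled) * dev_ratio)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


if __name__ == "__main__":
    train, dev, test = split_examples(list(range(10)))
    print(len(train), len(dev), len(test))
```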
    

You can also download our generated benchmark from here and move the kbqa_data folder to the data folder in this repo. In the downloaded kbqa_data folder, you will also see additional files used in the KBQA+RecipeSim setting.

Run a KBQA system

Preprocess the data

  • Download the pretrained GloVe word embeddings glove.840B.300d.zip, unzip the archive, and move the extracted glove.840B.300d.txt to the data folder in this repo.

  • Go to the BAMnet/src folder

  • Preprocess data

    • Vectorize data
     python build_all_data.py -data_dir ../../data/kbqa_data/ -kb_path ../../data/recipe_kg/recipe_kg.json -out_dir ../data/kbqa
    

    In the printed output, you will see some data statistics such as vocab_size, num_ent_types, and num_relations. These numbers will be used later when modifying the config file.

    • Fetch the pretrained GloVe vectors for our vocabulary.

      python build_pretrained_w2v.py -emb ../../data/glove.840B.300d.txt -data_dir ../data/kbqa/ -out ../data/kbqa/glove_pretrained_300d_w2v.npy
      
  • Preprocess data w/o kg augmentation

    • Vectorize data
     python build_all_data.py -data_dir ../../data/kbqa_data/ -kb_path ../../data/recipe_kg/recipe_kg.json -out_dir ../data/kbqa_no_ka  --no_kg_augmentation
    
    • Fetch the pretrained GloVe vectors for our vocabulary.
     python build_pretrained_w2v.py -emb ../../data/glove.840B.300d.txt -data_dir ../data/kbqa_no_ka/ -out ../data/kbqa_no_ka/glove_pretrained_300d_w2v.npy
    
  • Preprocess data w/o query expansion

    • Vectorize data
    python build_all_data.py -data_dir ../../data/kbqa_data/ -kb_path ../../data/recipe_kg/recipe_kg.json -out_dir ../data/kbqa_no_qe --no_query_expansion
    
    • Fetch the pretrained GloVe vectors for our vocabulary.
    python build_pretrained_w2v.py -emb ../../data/glove.840B.300d.txt -data_dir ../data/kbqa_no_qe/ -out ../data/kbqa_no_qe/glove_pretrained_300d_w2v.npy
    
  • Preprocess data w/o query expansion & kg augmentation

    • Vectorize data
     python build_all_data.py -data_dir ../../data/kbqa_data/ -kb_path ../../data/recipe_kg/recipe_kg.json -out_dir ../data/kbqa_no_qe_ka --no_query_expansion --no_kg_augmentation
    
    • Fetch the pretrained GloVe vectors for our vocabulary.
    python build_pretrained_w2v.py -emb ../../data/glove.840B.300d.txt -data_dir ../data/kbqa_no_qe_ka/ -out ../data/kbqa_no_qe_ka/glove_pretrained_300d_w2v.npy
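
The four preprocessing variants above differ only in the output folder and in the --no_kg_augmentation and --no_query_expansion flags. The following sketch, which is not part of this repo, builds the same command lines programmatically; the names VARIANTS and preprocessing_commands are hypothetical, while the scripts, paths, and flags are copied from the commands above.

```python
import shlex
from typing import Dict, List

# Output folder name -> extra flags, as listed in the steps above.
VARIANTS: Dict[str, List[str]] = {
    "kbqa": [],
    "kbqa_no_ka": ["--no_kg_augmentation"],
    "kbqa_no_qe": ["--no_query_expansion"],
    "kbqa_no_qe_ka": ["--no_query_expansion", "--no_kg_augmentation"],
}

GLOVE_TXT = "../../data/glove.840B.300d.txt"


def preprocessing_commands(variant: str) -> List[List[str]]:
    """Return the vectorize and GloVe-fetch commands for one variant."""
    out_dir = f"../data/{variant}"
    vectorize = [
        "python", "build_all_data.py",
        "-data_dir", "../../data/kbqa_data/",
        "-kb_path", "../../data/recipe_kg/recipe_kg.json",
        "-out_dir", out_dir,
        *VARIANTS[variant],
    ]
    fetch = [
        "python", "build_pretrained_w2v.py",
        "-emb", GLOVE_TXT,
        "-data_dir", f"{out_dir}/",
        "-out", f"{out_dir}/glove_pretrained_300d_w2v.npy",
    ]
    return [vectorize, fetch]


if __name__ == "__main__":
    # Print the commands; run them from BAMnet/src.
    for name in VARIANTS:
        for cmd in preprocessing_commands(name):
            print(shlex.join(cmd))
```

Each returned list can also be passed to subprocess.run(cmd, check=True) with BAMnet/src as the working directory, so that a failed vectorize step stops the run before the GloVe step.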
    

Train/test a KBQA system

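  • Modify the config file BAMnet/src/config/kbqa.yml to suit your needs. The commands below read config/pfoodreq.yml and config/pfoodreq_similar_recipes.yml, so make sure you edit the file you pass to -config. You can start by changing only the data paths and the dataset statistics (e.g., data_dir, kb_path, pre_word2vec, vocab_size, num_ent_types, and num_relations) and leave the other variables as they are.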

  • Train the KBQA model.

     python train.py -config config/pfoodreq.yml
    
  • Test the KBQA model:

     python run_online.py -config config/pfoodreq.yml
    
  • Train the KBQA+RecipeSim model.

    python train.py -config config/pfoodreq_similar_recipes.yml
    
  • Test the KBQA+RecipeSim model:

    python run_online.py -config config/pfoodreq_similar_recipes.yml
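
Copying the statistics printed by build_all_data.py into the config file by hand is error-prone, so a small helper may be useful. The sketch below is not part of this repo and assumes the config stores these settings as flat, top-level "key: value" lines, which is not confirmed here; apply_overrides is a hypothetical name, and the numbers in the example are placeholders rather than real dataset statistics. A YAML library would be the more robust choice for nested configs.

```python
import re
from typing import Dict


def apply_overrides(config_text: str, overrides: Dict[str, object]) -> str:
    """Replace the values of existing top-level `key: value` lines.

    Note: an inline comment on a replaced line is dropped.
    """
    lines = config_text.splitlines()
    remaining = dict(overrides)
    for i, line in enumerate(lines):
        match = re.match(r"^([A-Za-z_]\w*)\s*:", line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            lines[i] = f"{key}: {remaining.pop(key)}"
    if remaining:
        # Fail loudly instead of silently ignoring a misspelled key.
        raise KeyError(f"keys not found in config: {sorted(remaining)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = "data_dir: old/\nvocab_size: 1\nnum_relations: 1\n"
    # Placeholder values; use the numbers printed by build_all_data.py.
    print(apply_overrides(example, {
        "data_dir": "../data/kbqa/",
        "vocab_size": 1234,
        "num_relations": 56,
    }))
```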
    

Reference

If you find this code useful, please consider citing the following paper:

Yu Chen, Ananya Subburathinam, Ching-Hua Chen and Mohammed J. Zaki. "Personalized Food Recommendation as Constrained Question Answering over a Large-scale Food Knowledge Graph." In Proceedings of the 14th International Conference on Web Search and Data Mining (WSDM 2021), Mar. 8-12, 2021.

@inproceedings{chen2021personalized,
author    = {Chen, Yu and Subburathinam, Ananya and Chen, Ching-Hua and Zaki, Mohammed J.},
title     = {Personalized Food Recommendation as Constrained Question Answering over a Large-scale Food Knowledge Graph},
booktitle = {Proceedings of the 14th International Conference on Web Search and Data Mining},
month = {Mar. 8-12,},
year      = {2021}}

Note that we use the BAMnet model as our KBQA system in this application. For more details about the BAMnet model and the FoodKG, please refer to their respective original papers.

pfoodreq's Issues

some bug

When I run:

     python build_all_data.py -data_dir ../../data/kbqa_data/ -kb_path ../../data/recipe_kg/recipe_kg.json -out_dir ../data/kbqa

the script fails with:

    Using pre-built vocabs stored in ../data/kbqa

    Traceback (most recent call last):
      File "build_all_data.py", line 63, in <module>
        train_vec = build_all_data(train_data, kb, entity2id, entityType2id, relation2id, vocab2id, preferred_ans_type=preferred_ans_type, question_field=question_field, kg_augmentation=kg_augmentation)
      File "/home/haobo/mnt/academic/nutriology/code/PFoodReq-master/BAMnet/src/core/build_data/foodkg/build_data.py", line 277, in build_all_data
        query, tmp_topic_men = delex_query_topic_ent(query, topic_key_name, each['entities'])
      File "/home/haobo/mnt/academic/nutriology/code/PFoodReq-master/BAMnet/src/core/build_data/foodkg/build_data.py", line 145, in delex_query_topic_ent
        ret = process.extract(topic_ent.replace('_', ' '), set(list(zip(*ent_types))[0]), scorer=fuzz.token_sort_ratio)
      File "src/cpp_process.pyx", line 586, in cpp_process.extract
      File "src/cpp_process.pyx", line 447, in cpp_process.extract_list
    TypeError: 'set' object is not subscriptable
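
The last frames suggest that the matching routine indexes into its choices argument, which fails because a set is passed. One possible workaround, not verified against this repository, is to convert the set to a list before the call. In the sketch below, unique_mentions is a hypothetical helper, the shape of ent_types (a sequence of tuples whose first element is the mention) is inferred from the expression in the traceback, and the sample data is made up for illustration.

```python
from typing import List, Sequence, Tuple


def unique_mentions(ent_types: Sequence[Tuple[str, ...]]) -> List[str]:
    """Collect the first element of each tuple, de-duplicated, as a list."""
    if not ent_types:
        return []
    firsts = list(zip(*ent_types))[0]
    # sorted() returns a list, which supports indexing (a set does not),
    # and makes the order deterministic.
    return sorted(set(firsts))


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the result.
    choices = unique_mentions([
        ("chicken soup", "recipe"),
        ("tomato", "ingredient"),
        ("tomato", "tag"),
    ])
    print(choices)
    print(choices[0])  # indexing works on a list
```

The resulting list would then replace the set(...) expression passed as the second argument of process.extract on line 145 of build_data.py.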

Missing data file

Hi,
It seems that you tested pfoodreq_similar_recipes with the test_qas_150820 file, but the only file I see in the attached data is test_qas_090820, and there are no references to it in the code. The test_qas_150820 file is missing.

Can you please help?
Thanks.
