Comments (2)
Hi, sorry for my late response! Could you share the command you are running and the dataset on which you see the issue?
I think I have seen the same issue when the Wikipedia title (id) cannot be matched with any of the ids in the database. In particular,
- the code does not handle some Unicode characters well
- some Wikipedia entity titles have been changed or redirected to a new one
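To make the two failure modes above concrete, here is a minimal sketch of how a title could be matched against database ids more tolerantly. It is not the repository's code: `normalize_title`, `resolve_title`, and the `redirects` mapping are hypothetical names introduced for illustration, and the choice of NFC normalization is an assumption (the database may well use a different form).

```python
import unicodedata
from typing import Dict, Iterable, Optional


def normalize_title(title: str) -> str:
    """Canonicalize a title: NFC-normalize, treat underscores as spaces, trim."""
    # Assumption: NFC is used on both sides of the comparison.
    return unicodedata.normalize("NFC", title).replace("_", " ").strip()


def resolve_title(
    title: str,
    db_ids: Iterable[str],
    redirects: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Return the matching database id for a title, or None if nothing matches.

    `redirects` is a hypothetical old-title -> new-title mapping; the
    repository does not necessarily ship one.
    """
    # Index the database ids by their normalized form.
    norm_ids = {normalize_title(i): i for i in db_ids}
    norm_redirects = {
        normalize_title(old): new for old, new in (redirects or {}).items()
    }

    candidate = normalize_title(title)
    if candidate in norm_ids:
        return norm_ids[candidate]

    # Fall back to the redirect table when the title was renamed.
    target = norm_redirects.get(candidate)
    if target is not None:
        return norm_ids.get(normalize_title(target))
    return None


if __name__ == "__main__":
    ids = ["Caf\u00e9", "New Name"]
    # Decomposed "e" + combining accent still resolves to the composed id.
    print(resolve_title("Cafe\u0301", ids))
    print(resolve_title("Old Name", ids, {"Old Name": "New Name"}))
    print(resolve_title("Unknown", ids))
```

Titles that still resolve to `None` after this kind of normalization are probably pages that are missing from the dump or were renamed without a known redirect.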
from learning_to_retrieve_reasoning_paths.
Thanks for the response. This happens with HotpotQA when I run the following command or similar commands.
python run_graph_retriever.py \
--task hotpot_open \
--bert_model bert-base-uncased --do_lower_case \
--dev_file_path path/to/hotpotqa/dev \
--output_dir path/to/output \
--model_suffix 3 \
--max_para_num 10 \
--tfidf_limit 50 \
--beam 4 \
--eval_chunk 200 \
--eval_batch_size 64 \
--split_chunk 1000 \
--pruning_by_links \
--example_limit 128
I think the main issue is that some titles are retrieved by the tfidf retriever, but when trying to retrieve their content using tfidf_retriever.load_abstract_para_text(), it outputs this warning for some documents. I'm not sure whether I should worry about it, though, since I was able to reproduce your results even with the warning appearing many times.
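One way to judge whether the warning matters is to count how many retrieved titles actually fail to load. The sketch below assumes a loader that returns `None` or an empty string for a title it cannot find; that return behavior, and the names `load_paragraphs` and `load_fn`, are assumptions for illustration and are not taken from the repository. The real `load_abstract_para_text()` may take different arguments or signal a miss differently.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def load_paragraphs(
    titles: Iterable[str],
    load_fn: Callable[[str], Optional[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Load text for each title, collecting the titles that could not be loaded."""
    found: Dict[str, str] = {}
    missing: List[str] = []
    for title in titles:
        text = load_fn(title)
        if text:  # None and "" are both treated as a miss
            found[title] = text
        else:
            missing.append(title)
    return found, missing


if __name__ == "__main__":
    # Stand-in for the real database lookup.
    fake_db = {"Title A": "Some abstract text."}
    found, missing = load_paragraphs(["Title A", "Title B"], fake_db.get)
    print(f"loaded {len(found)}, missing {len(missing)}: {missing}")
```

If the missing fraction is small and the reported scores are reproduced, the warnings are likely harmless for evaluation, though that is a judgment call rather than something this sketch can confirm.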