Comments (4)
OK, that makes sense now! AIDA uses YAGO as its knowledge base, so there is no alignment with Wikipedia. aida-train-kilt.jsonl contains the standard candidates that most researchers use to evaluate EL models.
from genre.
Can you give me an example of code where this happens?
The intention is that GENRE always outputs a valid page name from KILT. The constrained beam search should make that happen.
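To make the idea concrete, here is a minimal sketch of trie-constrained decoding. It is not GENRE's actual code: the `Trie` class, the `make_prefix_allowed_tokens_fn` helper, and the token ids are all made up for illustration. The callback takes `(batch_id, sent)`, which mirrors the shape of the `prefix_allowed_tokens_fn` argument accepted by Hugging Face `generate`, but that correspondence is an assumption about how one would wire it up.

```python
from typing import Callable, Dict, List, Sequence


class Trie:
    """Prefix tree over token-id sequences, one sequence per valid page title."""

    def __init__(self, sequences: Sequence[Sequence[int]]) -> None:
        self.root: Dict[int, dict] = {}
        for sequence in sequences:
            node = self.root
            for token in sequence:
                node = node.setdefault(token, {})

    def next_tokens(self, prefix: Sequence[int]) -> List[int]:
        """Return the token ids that may follow `prefix`, or [] if the prefix is invalid."""
        node = self.root
        for token in prefix:
            if token not in node:
                return []
            node = node[token]
        return sorted(node)


def make_prefix_allowed_tokens_fn(trie: Trie) -> Callable[[int, Sequence[int]], List[int]]:
    """Build a callback that restricts each decoding step to continuations stored in the trie."""

    def prefix_allowed_tokens_fn(batch_id: int, sent: Sequence[int]) -> List[int]:
        # Tensors expose .tolist(); plain lists are used as-is.
        tokens = sent.tolist() if hasattr(sent, "tolist") else list(sent)
        return trie.next_tokens(tokens)

    return prefix_allowed_tokens_fn


# Toy vocabulary (hypothetical ids): 10="German", 11="Americans", 12="Canadians", 2=end of sequence
titles_as_token_ids = [[10, 11, 2], [10, 12, 2]]
allowed = make_prefix_allowed_tokens_fn(Trie(titles_as_token_ids))
print(allowed(0, [10]))  # tokens allowed after "German"
```

Note that a constraint like this only guarantees that the output is one of the titles the trie was built from, so the result is a valid KILT page name only if the trie itself was built from KILT titles.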
Hi @nicola-decao, for example, consider the first line of aida-train-kilt.jsonl (downloaded with your script). Some of its candidates are "Ethnic Germans", "Canadians of German ethnicity", and "German American".
If you search for these three candidates in KILT (kilt_knowledgesource.json, downloaded from the official KILT repo), I don't think there is any entry with wikipedia_title equal to "Ethnic Germans", "Canadians of German ethnicity", or "German American". (I think they should instead be "German ethnic", "German Canadians", and "German Americans".)
There are also instances whose labels are not in KILT, for example line 147 of aida-train-kilt.jsonl ('Channel 2 (Israel)').
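A check along these lines can be sketched as follows. It assumes that both files are in JSON Lines format, that each AIDA record stores its candidates as a list of strings under a `candidates` key, and that gold labels sit under `output[*].answer`. Those field names are assumptions about the file layout, and `load_kilt_titles` and `find_unaligned` are hypothetical helper names. Only `wikipedia_title` is taken from the description above.

```python
import json
from typing import Iterable, Iterator, List, Set, Tuple


def load_kilt_titles(lines: Iterable[str]) -> Set[str]:
    """Collect every wikipedia_title from a KILT knowledge source given as JSON Lines."""
    titles = set()
    for line in lines:
        if line.strip():
            titles.add(json.loads(line)["wikipedia_title"])
    return titles


def find_unaligned(
    lines: Iterable[str], titles: Set[str]
) -> Iterator[Tuple[int, List[str], List[str]]]:
    """Yield (line_number, missing_candidates, missing_answers) for records with names absent from KILT."""
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        # ASSUMPTION: "candidates" is a list of title strings.
        candidates = record.get("candidates", [])
        # ASSUMPTION: gold labels are stored as output[*]["answer"].
        answers = [o["answer"] for o in record.get("output", []) if "answer" in o]
        missing_candidates = [c for c in candidates if c not in titles]
        missing_answers = [a for a in answers if a not in titles]
        if missing_candidates or missing_answers:
            yield line_number, missing_candidates, missing_answers


# Usage with the real files (paths are placeholders):
# with open("kilt_knowledgesource.json") as kilt_file:
#     kilt_titles = load_kilt_titles(kilt_file)
# with open("aida-train-kilt.jsonl") as aida_file:
#     for line_number, candidates, answers in find_unaligned(aida_file, kilt_titles):
#         print(line_number, candidates, answers)
```

The comparison is an exact string match, so differences in casing or redirects would also be reported as missing.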
Please let me know if I misunderstand anything. Thanks!
Oh I see, from the name I mistakenly thought the candidates in aida-train-kilt.jsonl were aligned with KILT. Thanks for the clarification.