abhilasharavichander / privacyqa_emnlp
PrivacyQA, a resource to support question-answering over privacy policies.
License: MIT License
Hi,
I am interested in your research and am currently trying to improve the QA performance. Although I have read the paper, I still have the following two clarification questions. Would you please clarify them?
The test dataset has multiple annotators. How do you derive the gold relevance label: is a sentence relevant if any annotator marks it relevant (i.e., the any_relevant label), by majority vote, or something else?
How is the unanswerability label determined? If the annotators did not find any relevant evidence for a question with respect to a policy document, is that question labeled unanswerable, or is there another criterion?
Thanks,
Hello, thank you for making the dataset available for use.
We were planning to use this dataset for our project, but found that the "../../Dataset/..." directories referenced in the code do not exist in the repository.
Is there a way to access the policies?
Thank you very much!
The files don't have any content after cloning or downloading. Attaching the zip file downloaded from the repo.
Firstly, thank you for releasing this awesome dataset! Our team at HSLU is working on a project on retrieve-and-read question answering over privacy policies, and we found the PrivacyQA corpus to be very relevant for our research.
To test some of our methodologies, we would ideally need access to the privacy policy documents from which the evidence sentences in this corpus were extracted. We considered scraping these directly from the Google Play Store, but the scraped policies could differ from (be newer than) the versions in use when this corpus was released.
Which brings me to my question: would it be possible to get access to the privacy policies from the time when this corpus was released (ideally the ones provided to the experts for evidence extraction)?
I would like to ask a question regarding the F1 evaluation metric used in your paper (similar to #3). The paper mentions that the "average of the maximum F1 from each n−1 subset" is used to calculate the F1 metric. I am slightly unsure how this works, but I think it could mean the following:
For each classification output, compare the predicted label against the labels from the annotators and compute the maximum F1 per sample (which, for a single sample, should be the same as accuracy), as shown in the example below:
| Sample | Predicted Label | Ann1 | Ann2 | Ann3 | Maximum F1 |
|---|---|---|---|---|---|
| 1 | Relevant | Irrelevant | None | Irrelevant | 0 |
| 2 | Relevant | Relevant | Relevant | Relevant | 1 |
| 3 | Irrelevant | None | Irrelevant | Relevant | 1 |
Then take the average of all maximum F1 scores: (0 + 1 + 1)/3 = 2/3 ≈ 0.67.
Is my understanding of the evaluation metric correct?
Thank you for your time.
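For what it's worth, here is a minimal sketch of the reading described above: per question, compute the F1 of the predicted evidence set against each annotator's evidence set, take the maximum, and average over questions. This is one plausible interpretation of the metric, not necessarily the authors' exact implementation; the function names and the set-of-sentence-ids representation are my own assumptions.

```python
# Hypothetical sketch of a per-question max-F1 metric over multiple
# annotators. Not necessarily the authors' exact implementation.

def f1(pred: set, gold: set) -> float:
    """Set-level F1 between predicted and gold sets of evidence-sentence ids."""
    if not pred and not gold:
        # Both empty: prediction and annotator agree the question is
        # unanswerable; treat this as a perfect match (an assumption).
        return 1.0
    if not pred or not gold:
        return 0.0
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def max_f1_score(predictions: list, annotations: list) -> float:
    """predictions: one predicted evidence set per question.
    annotations: for each question, a list of evidence sets (one per annotator).
    Returns the average over questions of the best F1 against any annotator."""
    per_question = [
        max(f1(pred, gold) for gold in golds)
        for pred, golds in zip(predictions, annotations)
    ]
    return sum(per_question) / len(per_question)
```

For example, a prediction of {2, 3} scored against annotators {} and {2} gets a maximum F1 of 2/3 (precision 1/2, recall 1 against the second annotator).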