
privacyqa_emnlp's People

Contributors

abhilasharavichander


privacyqa_emnlp's Issues

Clarification needed

Hi,
I am interested in your research and am currently trying to improve the QA performance. Although I have read the paper, I still have the following two clarification questions. Would you please clarify them?

  1. The test dataset has multiple annotators, so how do you determine the gold relevance label? Is a sentence relevant if any annotator marks it so (i.e., the any_relevant label), by majority vote, or something else?

  2. How do you determine the unanswerability label? If none of the annotators found any relevant evidence for a question w.r.t. a policy document, do you mark that question as unanswerable, or is it decided some other way?
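For concreteness, the two aggregation rules contrasted in question 1 could be sketched as follows (hypothetical helper names and labels; the paper may use a different scheme entirely):

```python
def any_relevant(labels):
    # A sentence is gold-relevant if at least one annotator marked it relevant.
    return "Relevant" if "Relevant" in labels else "Irrelevant"

def majority_relevant(labels):
    # A sentence is gold-relevant only if a strict majority marked it relevant.
    rel = sum(1 for label in labels if label == "Relevant")
    return "Relevant" if rel > len(labels) / 2 else "Irrelevant"

anns = ["Relevant", "Irrelevant", "Irrelevant"]
print(any_relevant(anns))       # Relevant
print(majority_relevant(anns))  # Irrelevant
```

The two rules disagree exactly on sentences that only a minority of annotators marked relevant, which is why the choice matters for the gold labels.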

Thanks,

Missing Directories - "../../Dataset"

Hello, thank you for making the dataset available for use.

We were planning to use this dataset for our project, but found that the "../../Dataset/..." directories do not exist in the repository.
Is there a way to access the policies?

Thank you very much!

Privacy policy documents

Hi @AbhilashaRavichander,

Firstly, thank you for releasing this awesome data set! Our team at HSLU is working on a project related to read-and-retrieve question answering w.r.t. privacy policies, and we found the PrivacyQA corpus to be very relevant for our research.

To test out some of our methodologies, we would ideally require access to the privacy policy documents from which the evidence sentences in this corpus were extracted. We have considered scraping these directly from the Google Play Store, but a disadvantage is that the scraped policies may differ from (i.e., be newer than) the versions available when this corpus was released.

This brings me to my question: would it be possible to have access to the privacy policies from the time when this corpus was released (ideally those provided to the experts for evidence extraction)?

Question regarding F1 evaluation metric

Hi @AbhilashaRavichander,

I would like to ask a question regarding the F1 evaluation metric used in your paper (similar to #3). The paper mentions that the "average of the maximum F1 from each n−1 subset" is used to calculate the F1 metric. I am slightly unsure how this works, but I think it could mean the following:

  1. For each classification output, compare the predicted label against the labels from the annotators. Compute the maximum F1 per sample (which should be the same as accuracy), as shown in the example below:

    Sample | Predicted Label | Ann1       | Ann2       | Ann3       | Maximum F1
    1      | Relevant        | Irrelevant | None       | Irrelevant | 0
    2      | Relevant        | Relevant   | Relevant   | Relevant   | 1
    3      | Irrelevant      | None       | Irrelevant | Relevant   | 1
  2. Take the average of all maximum F1 scores: (0 + 1 + 1)/3 = 2/3 ≈ 0.67
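The interpretation in the two steps above could be sketched as follows (this reflects only the questioner's reading, not necessarily the paper's actual metric; with a single label per sample, per-annotator F1 reduces to exact match):

```python
def per_sample_max_f1(pred, annotator_labels):
    # For a single predicted label per sample, F1 against one annotator's
    # label degenerates to exact match: 1.0 if the labels agree, else 0.0.
    # The per-sample score is the maximum over all annotators.
    return max(1.0 if pred == ann else 0.0 for ann in annotator_labels)

# The three samples from the table above.
samples = [
    ("Relevant",   ["Irrelevant", "None", "Irrelevant"]),  # max F1 = 0
    ("Relevant",   ["Relevant", "Relevant", "Relevant"]),  # max F1 = 1
    ("Irrelevant", ["None", "Irrelevant", "Relevant"]),    # max F1 = 1
]

scores = [per_sample_max_f1(pred, anns) for pred, anns in samples]
avg = sum(scores) / len(scores)
print(round(avg, 2))  # 0.67
```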

Is my understanding of the evaluation metric correct?

Thank you for your time.
