
natural-instructions's Introduction


Adversarial In-Context Learning

Model

[Figure: model diagram]

Algorithm

[Figure: algorithm]

Super Natural-Instruction

A collection of 1616 tasks and their natural language definitions/instructions.

Tasks

Tasks follow this schema:

[Figure: task schema]

Example: Task 582

{
    "Definition": [
        "In this task, You are given an open-domain question that can be answered based on factual information. Your task is to provide \\*short\\* answer (in a few words only) for the given question. The short answer can be one or more entities or it can also be boolean \\*yes\\* or \\*no\\*."
    ],
    "Positive Examples": [
        {
            "input": "when are hops added to the brewing process?",
            "output": "The boiling process",
            "explanation": "The answer is correct because, at the end of the boil, solid particles in the hopped wort are separated."
        },
        {
            "input": "who played will on as the world turns?",
            "output": "Jesse Soffer",
            "explanation": "The answer is correct. William \"Will\" Harold Ryan Munson is a fictional character on the CBS soap opera As the World Turns. He was portrayed by Jesse Soffer on a recurring basis from September 2004 to March 2005."
        },
        {
            "input": "who won the election for mayor of cleveland?",
            "output": "Incumbent Democratic Mayor Frank G . Jackson",
            "explanation": "Incumbent Democratic Mayor Frank G . Jackson won reelection to a fourth term."
        }
    ],
    "Negative Examples": [
        {
            "input": "where do dust storms occur in the US?",
            "output": " ",
            "explanation": "It generates no answer when it's supposed to give an answer."
        },
        {
            "input": "when did the watts riot start and end?",
            "output": "watts riot",
            "explanation": "The question is about the time watts riot started and ended, so the answer should be: August 11 to 16, 1965."
        },
        {
            "input": "when did kendrick lamars first album come out?",
            "output": "Ronald Reagan Era",
            "explanation": "It supposed to give the answer on what time Kendrick Lamar released his album, not the name of the album."
        }
    ],
    "Instances": [
        {
            "id": "task582-bdd71027a2ec47e09f636e6609d5bdaf",
            "input": "where did they film hot tub time machine",
            "output": [
                "Fernie Alpine Resort"
            ]
        },
        {
            "id": "task582-60cb1ae6f8304abaad27c5e897698b78",
            "input": "who has the right of way in international waters",
            "output": [
                "Neither vessel"
            ]
        },
        ...
    ]
}
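As a rough illustration of how a task in this schema might be turned into an in-context prompt, the sketch below concatenates the definition, a few positive examples, and a query instance. The function name `build_prompt`, the `num_examples` parameter, and the "Definition / Input / Output" template are assumptions made for this example, not the repository's actual prompt format. The toy task is trimmed from Task 582 above.

```python
def build_prompt(task: dict, query: str, num_examples: int = 2) -> str:
    """Assemble a few-shot prompt from a task dict (hypothetical template)."""
    # "Definition" is a list of strings in the schema; join them into one line.
    lines = ["Definition: " + " ".join(task["Definition"]), ""]
    # Use the first few positive examples as demonstrations.
    for example in task["Positive Examples"][:num_examples]:
        lines += [f"Input: {example['input']}", f"Output: {example['output']}", ""]
    # The query instance comes last, with the output left for the model.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


# Toy task, trimmed from Task 582 (definition shortened to its first sentence).
task = {
    "Definition": [
        "In this task, You are given an open-domain question that can be "
        "answered based on factual information."
    ],
    "Positive Examples": [
        {
            "input": "when are hops added to the brewing process?",
            "output": "The boiling process",
        },
        {
            "input": "who played will on as the world turns?",
            "output": "Jesse Soffer",
        },
    ],
}

prompt = build_prompt(task, "where did they film hot tub time machine", num_examples=1)
print(prompt)
```

Negative examples and explanations are left out here for brevity; whether to include them in the prompt is a design choice this sketch does not settle.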

Task Types

The dataset covers 76 task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting, and text composition.

Metrics

ROUGE-L is used as the evaluation metric; scores lie in [0, 1] (the results below report them scaled to 0-100). ROUGE-L is based on the longest common subsequence (LCS) between the model output and the reference, i.e. the longest sequence of words (not necessarily consecutive, but still in order) that appears in both.
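The LCS-based score can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation script used for the results below: it assumes lowercased whitespace tokenization and reports the balanced F-measure of LCS precision and recall, and the function names `lcs_length` and `rouge_l` are chosen for this example.

```python
def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence of two token lists."""
    # prev[j] holds the LCS length of the tokens of `a` seen so far and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction: str, reference: str) -> float:
    """ROUGE-L F-measure in [0, 1] (assumed tokenization: lowercase + split)."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)  # share of predicted tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


score = rouge_l("the cat sat on the mat", "the cat is on the mat")
print(round(score, 4))
```

In this example the LCS is "the cat on the mat" (5 of 6 tokens on each side), so precision and recall are both 5/6.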

Experiment Result

Baseline

======== Overall Metrics ========
exact_match: 52.6
rougeL: 65.5513
======== Metrics per Category ========
exact_match_for_question_rewriting: 2.0
rougeL_for_question_rewriting: 64.2207
exact_match_for_coreference_resolution: 85.0
rougeL_for_coreference_resolution: 85.0
exact_match_for_textual_entailment: 65.0
rougeL_for_textual_entailment: 65.0
exact_match_for_cause_effect_classification: 46.0
rougeL_for_cause_effect_classification: 46.5357
exact_match_for_word_analogy: 65.0
rougeL_for_word_analogy: 67.0

Adversarial In-Context Learning

======== Overall Metrics ========
exact_match: 54.4
rougeL: 68.1582
======== Metrics per Category ========
exact_match_for_question_rewriting: 2.0
rougeL_for_question_rewriting: 64.422
exact_match_for_coreference_resolution: 83.0
rougeL_for_coreference_resolution: 83.0
exact_match_for_textual_entailment: 67.0
rougeL_for_textual_entailment: 67.0
exact_match_for_cause_effect_classification: 58.0
rougeL_for_cause_effect_classification: 58.3312
exact_match_for_word_analogy: 62.0
rougeL_for_word_analogy: 68.0381
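
To compare the two runs, the metric logs can be parsed and subtracted. The sketch below is an illustration only: the helper name `parse_metrics` is hypothetical, and the two log excerpts are copied from the outputs above (overall metrics plus the cause-effect classification category).

```python
def parse_metrics(log: str) -> dict:
    """Parse 'name: value' lines into a dict, skipping '====' header lines."""
    metrics = {}
    for line in log.splitlines():
        if ":" in line and not line.startswith("="):
            name, value = line.split(":", 1)
            metrics[name.strip()] = float(value)
    return metrics


baseline = parse_metrics("""\
======== Overall Metrics ========
exact_match: 52.6
rougeL: 65.5513
exact_match_for_cause_effect_classification: 46.0
rougeL_for_cause_effect_classification: 46.5357
""")

adversarial = parse_metrics("""\
======== Overall Metrics ========
exact_match: 54.4
rougeL: 68.1582
exact_match_for_cause_effect_classification: 58.0
rougeL_for_cause_effect_classification: 58.3312
""")

# Positive values mean the adversarial in-context learning run scored higher.
delta = {name: round(adversarial[name] - baseline[name], 4) for name in baseline}
for name, value in delta.items():
    print(f"{name}: {value:+.4f}")
```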

natural-instructions's People

Contributors

yeganehkordi, palipoor, mirzyaaliii, amirrezamirzaei, eshaanpathak, swarooprm, aarunku5, kurbster, cosmicishan, xudongolivershen, nrjvarshney, mihir3009, kuntalkumarpal, gkaramanolakis, garyhlai, shailaja183, colinzhaoust, pulkitverma25, maitreyapatel, sujan242, arlenfan, atharva-naik, ravsehajsinghpuri, rushangkaria, yizhongw, tanay2001, savandoshi, mehrad0711, ashok-arjun, manandey
