
Metacognitive Prompting Improves Understanding in Large Language Models

This repository contains datasets, model descriptions, the full set of prompts used in experiments, and corresponding experimental results.

Datasets

We evaluate on multiple natural language understanding datasets selected from GLUE and SuperGLUE, using the development set of each task for evaluation. The datasets are summarized below:

| Task | Dataset | Input | Output | Metric |
| --- | --- | --- | --- | --- |
| Sentiment | SST-2 | Single sentence | Binary | Accuracy |
| Similarity | STS-B | Sentence pair | Continuous | Pearson/Spearman Correlation |
| Paraphrase | QQP | Question pair | Binary | F1/Accuracy |
| QA/NLI | QNLI | Question + passage | Binary | Accuracy |
| NLI | WNLI, RTE, CB | Sentence pair | Binary/Ternary | F1/Accuracy |
| WSD | WiC | Sentence pair + target word | Binary | Accuracy |
| Coref. | WSC | Passage + pronouns | Binary | Accuracy |
| QA | COPA | Question + choices | Binary | Accuracy |

Here, QA stands for question answering, NLI for natural language inference, WSD for word sense disambiguation, and coref. for coreference resolution. The datasets are available in "./datasets".
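The tasks above use standard GLUE/SuperGLUE metrics. As a rough illustration (not the official GLUE scoring code), the accuracy, binary F1, and Pearson correlation used in the table can be computed as:

```python
import math

def accuracy(preds, golds):
    """Fraction of predictions matching the gold labels (SST-2, QNLI, etc.)."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def binary_f1(preds, golds, positive=1):
    """F1 score for the positive class (used alongside accuracy for QQP)."""
    tp = sum(p == positive and g == positive for p, g in zip(preds, golds))
    fp = sum(p == positive and g != positive for p, g in zip(preds, golds))
    fn = sum(p != positive and g == positive for p, g in zip(preds, golds))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def pearson(xs, ys):
    """Pearson correlation (used for STS-B's continuous similarity scores)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```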

Models

In our evaluation, we consider five popular large language models (LLMs): the open-source models Llama-2-13b-chat and Vicuna-13b-v1.1, and the closed-source models PaLM-bison-chat, GPT-3.5-turbo, and GPT-4. For all models, we apply greedy decoding (i.e., temperature = 0) for response generation.
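As a minimal sketch of the greedy-decoding setup, the request below uses an OpenAI-style chat payload; the field names and token cap are illustrative assumptions, not taken from the paper:

```python
def greedy_request(model, prompt):
    """Build a chat-completion request that forces greedy decoding.

    Temperature 0 makes the model pick the highest-probability token at
    every step, so repeated runs are (near-)deterministic.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,   # greedy decoding, as used in the experiments
        "max_tokens": 512,  # hypothetical cap; the paper does not specify one
    }
```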

Prompts

Metacognitive Prompting (MP) is inspired by human introspective reasoning processes. The figure below shows the alignment between human metacognitive processes and the stages of MP for LLMs:

(Figure: alignment between human metacognitive processes and the stages of MP)

MP consists of five main stages: 1) understanding the input text, 2) making a preliminary judgment, 3) critically evaluating this preliminary analysis, 4) reaching a final decision accompanied by an explanation of the reasoning, and 5) evaluating the confidence level in the entire process. A sample question chosen from the Quora Question Pair (QQP) dataset demonstrates the overall MP process:

(Figure: the full MP process on a QQP example)

The diagram features three columns, from left to right, representing the high-level metacognitive stages, specific metacognitive prompts fed into the LLM, and the LLM's corresponding outputs. Prompts in the middle column are collectively fed into the LLM as a single input during the experiments.
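To make the single-input structure concrete, here is a sketch that assembles the five stages into one prompt for a QQP-style question pair. The wording of each stage paraphrases the descriptions above and is not the verbatim prompt from "./prompts":

```python
# Paraphrased stage instructions; see ./prompts for the exact wording used.
MP_STAGES = [
    "1. Clarify your understanding of the two questions.",
    "2. Make a preliminary judgment: are they paraphrases of each other?",
    "3. Critically evaluate your preliminary judgment.",
    "4. Give your final decision and explain your reasoning.",
    "5. State your confidence (0-100%) in this analysis.",
]

def build_mp_prompt(question1, question2):
    """Assemble the five metacognitive stages into one prompt string,
    mirroring how the middle-column prompts are fed to the LLM at once."""
    header = (f"Question 1: {question1}\n"
              f"Question 2: {question2}\n\n"
              "Answer the following in order:\n")
    return header + "\n".join(MP_STAGES)
```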

For our experiments, we compare our proposed MP with standard prompting (SP) and chain-of-thought (CoT) prompting, each under zero-shot and 5-shot learning settings. Exemplars for the 5-shot setting are randomly selected from the training set of each dataset; each dataset has its own set of exemplars, whose answers are obtained through human annotation.
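The exemplar selection step can be sketched as below; the fixed seed is an assumption for reproducibility, as the paper does not state one:

```python
import random

def sample_exemplars(train_set, k=5, seed=0):
    """Randomly draw k human-annotated exemplars from a task's training
    set, mirroring the 5-shot setup described above."""
    rng = random.Random(seed)  # assumed seed; not specified in the paper
    return rng.sample(train_set, k)
```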

The full set of prompts used when applying MP, SP, and CoT under zero-shot and 5-shot learning paradigms can be found in "./prompts".

Experimental Results

The experimental results for each dataset can be found in "./results". For each dataset, we experiment with three prompting methods in zero-shot and 5-shot learning scenarios across five LLMs. We report the best result after multiple experimental iterations.

Please refer to our full paper for more details.

Citation

If you find this work helpful, please consider citing as follows:

@article{wang2023metacognitive,
  title={Metacognitive Prompting Improves Understanding in Large Language Models},
  author={Wang, Yuqing and Zhao, Yun},
  journal={arXiv preprint arXiv:2308.05342},
  year={2023}
}
