The Geometry of Truth

This repository accompanies the paper The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets by Samuel Marks and Max Tegmark. See also our interactive dataexplorer.

Set-up

Navigate to the location that you want to clone this repo to, clone and enter the repo, and install requirements.

git clone git@github.com:saprmarks/geometry-of-truth.git
cd geometry-of-truth
pip install -r requirements.txt

Before doing anything, you'll need to generate activations for the datasets. You should have your own LLaMA weights stored on the machine where you cloned this repo. Put the absolute path for the directory containing your LLaMA weights in the file config.ini along with the names for the subdirectories containing weights for different scales. For example, my config.ini file looks like this:

[LLaMA]
weights_directory = /home/ubuntu/llama_hf/
7B_subdir = 7B
13B_subdir = 13B
30B_subdir = 30B
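
As a rough illustration of how a config.ini in this shape maps to a weights path, here is a minimal sketch using Python's standard configparser module. The helper name weights_path is hypothetical and is not taken from this repo; the scripts here may read the file differently.

```python
import configparser
from pathlib import Path

# The example config from above, inlined so the sketch runs on its own.
EXAMPLE_CONFIG = """\
[LLaMA]
weights_directory = /home/ubuntu/llama_hf/
7B_subdir = 7B
13B_subdir = 13B
30B_subdir = 30B
"""


def weights_path(config: configparser.ConfigParser, model_size: str) -> Path:
    """Hypothetical helper: join the weights directory with the
    subdirectory configured for the given scale (e.g. "13B")."""
    section = config["LLaMA"]
    return Path(section["weights_directory"]) / section[f"{model_size}_subdir"]


config = configparser.ConfigParser()
config.read_string(EXAMPLE_CONFIG)  # for the real file: config.read("config.ini")

path_13b = weights_path(config, "13B")
print(path_13b)
```

Note that configparser lowercases option names by default, so 13B_subdir and 13b_subdir refer to the same key, while the section name LLaMA is case-sensitive.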

Once that's done, you can generate LLaMA activations for the datasets you'd like to work with using a command like

python generate_acts.py --model 13B --layers 8 10 12 --datasets cities neg_cities --device cuda:0

These activations will be stored in the acts directory. If you want to save activations for all layers, simply use --layers -1.
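
If you want to generate activations for several configurations from a script, one option is to assemble the command shown above programmatically. The sketch below only builds the argument list from the flags already shown (--model, --layers, --datasets, --device); the function name build_generate_acts_cmd is a hypothetical helper, not part of this repo.

```python
import sys


def build_generate_acts_cmd(model, layers, datasets, device="cuda:0"):
    """Hypothetical helper: build the argument list for generate_acts.py
    using only the flags shown in the example command above."""
    return [
        sys.executable, "generate_acts.py",
        "--model", model,
        "--layers", *[str(layer) for layer in layers],
        "--datasets", *datasets,
        "--device", device,
    ]


# Same invocation as the example command above.
cmd = build_generate_acts_cmd("13B", [8, 10, 12], ["cities", "neg_cities"])

# Per the note above, passing -1 as the only layer requests all layers.
all_layers_cmd = build_generate_acts_cmd("13B", [-1], ["cities"])

print(" ".join(cmd[1:]))
```

To actually launch the job, pass the list to subprocess.run(cmd, check=True) from the root of the cloned repo.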

Files

This directory contains the following files:

  • dataexplorer.ipynb: for generating visualizations of the datasets. Code for reproducing figures in the text is included.
  • few_shot.py: for implementing the calibrated 5-shot baseline.
  • generalization.ipynb: for training probes on one dataset and checking generalization to another. Includes code for reproducing the generalization matrix in the text.
  • interventions.ipynb: for reproducing the causal intervention experiments from the text.
  • probes.py: contains definitions of probe classes.
  • utils.py and visualization_utils.py: utilities for managing datasets and producing visualizations.
