Comments (2)

jbloomAus commented on July 30, 2024

Questions:

  • How do I generate the vector?

They run a two-token forward pass. I think I should run a one-token forward pass, since I don't have EOS.
I do have padding tokens though, so I could pad the inputs before doing the forward pass. I would then need
to align it as we go forward, which seems doable.
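
The padding idea above could be sketched as follows. This is a minimal illustration, not code from the notebook or from DTI: `pad_to_match`, `PAD_ID` and the token ids are hypothetical, and right-padding is an assumption (left-padding may suit some position schemes better).

```python
def pad_to_match(tokens_a, tokens_b, pad_id):
    """Right-pad the shorter token list so both lists have equal length."""
    target = max(len(tokens_a), len(tokens_b))

    def pad(tokens):
        return list(tokens) + [pad_id] * (target - len(tokens))

    return pad(tokens_a), pad(tokens_b)


# Hypothetical values: PAD_ID and the token ids are placeholders.
PAD_ID = 0
plus_tokens, minus_tokens = pad_to_match([5, 7, 9], [5], PAD_ID)
```

With both inputs the same length, their cached activations have matching shapes and can be subtracted position by position.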

  • Where do I add it in (layer, position)? I can choose the layer; they use resid_pre (i.e. they edit the residual stream).

# Cache the residual stream entering `layer` for a prompt.
# `model` is the hooked model defined elsewhere.
def get_resid_pre(prompt: str, layer: int):
    name = f"blocks.{layer}.hook_resid_pre"
    cache, caching_hooks, _ = model.get_caching_hooks(lambda n: n == name)
    with model.hooks(fwd_hooks=caching_hooks):
        _ = model(prompt)
    return cache[name]


# Forward hook that adds the steering vector to the residual stream.
# `act_diff` and `coeff` are defined elsewhere.
def ave_hook(resid_pre, hook):
    if resid_pre.shape[1] == 1:
        return  # caching in model.generate for new tokens

    # We only add to the prompt (first call), not the generated tokens.
    ppos, apos = resid_pre.shape[1], act_diff.shape[1]
    assert apos <= ppos, f"More mod tokens ({apos}) than prompt tokens ({ppos})!"

    # add to the beginning (position-wise) of the activations
    resid_pre[:, :apos, :] += coeff * act_diff
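
To make the arithmetic in the hook concrete, here is a toy sketch that uses NumPy arrays in place of model activations. Everything in it is an assumption for illustration: `add_steering` is a hypothetical helper, the shapes are made up, and the random arrays stand in for cached `resid_pre` activations from two prompts.

```python
import numpy as np


def add_steering(resid_pre, act_diff, coeff):
    """Return a copy of resid_pre with coeff * act_diff added to the first positions."""
    ppos, apos = resid_pre.shape[1], act_diff.shape[1]
    if apos > ppos:
        raise ValueError(f"More mod tokens ({apos}) than prompt tokens ({ppos})!")
    out = resid_pre.copy()
    out[:, :apos, :] += coeff * act_diff
    return out


rng = np.random.default_rng(0)
d_model = 4  # hypothetical width

# Stand-ins for the cached activations of two equal-length prompts.
act_plus = rng.normal(size=(1, 2, d_model))
act_minus = rng.normal(size=(1, 2, d_model))
act_diff = act_plus - act_minus

# A 5-position prompt: only the first 2 positions receive the steering vector.
resid = np.zeros((1, 5, d_model))
steered = add_steering(resid, act_diff, coeff=3.0)
```

Unlike the hook, this version returns a copy instead of editing in place, which makes it easier to compare the steered and unsteered activations.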

from decisiontransformerinterpretability.

jbloomAus commented on July 30, 2024

How do I apply this procedure in DTI?

  • Take the residual stream of the forward pass at some layer and inject it into the same layer (at some sequence position).
  • What would this mean in the memory env? An obvious candidate is to inject the residual stream of layer two from a model with one goal into the residual stream of a model with a separate goal. This is essentially activation patching in reverse: take the residual stream from the corrupted pass and add it to the residual stream of another pass.

I think the easiest way for me to do this is with the AVEC code. I could parameterise it so we can get more info on the outcomes (e.g. layer, head, etc.).
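
One way the parameterisation might look, as a rough sketch rather than the AVEC code itself: `InjectionConfig` and `inject` are hypothetical names, and the dictionaries of NumPy arrays stand in for cached residual streams from two forward passes (shape `[batch, position, d_model]`). It only shows the edit at the injection point; in a real model the change would also propagate to later layers.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class InjectionConfig:
    layer: int          # which layer's residual stream to edit
    position: int       # which sequence position to edit
    coeff: float = 1.0  # scale applied to the injected activations


def inject(target_resid, source_resid, cfg):
    """Add the source activations to a copy of the target at (layer, position)."""
    patched = {layer: acts.copy() for layer, acts in target_resid.items()}
    source_slice = source_resid[cfg.layer][:, cfg.position, :]
    patched[cfg.layer][:, cfg.position, :] += cfg.coeff * source_slice
    return patched


# Toy stand-ins for two runs with different goals.
target = {2: np.zeros((1, 3, 4))}
source = {2: np.ones((1, 3, 4))}
patched = inject(target, source, InjectionConfig(layer=2, position=1, coeff=0.5))
```

Sweeping over `InjectionConfig` values would then give a grid of outcomes per layer and position.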
