
rome's People

Contributors

davidbau, kmeng01


rome's Issues

Generating weights for EFK/MEND for a new model

Hi, I was wondering if you guys can tell me how I can generate weights for distilgpt2 for the mend/efk baselines, similar to what you have for gpt2-xl here: https://rome.baulab.info/data/weights/. I'm trying to run these baselines but don't have the saved weights. I tried simply loading and saving huggingface's weights for distilgpt2 but it looks like the code is looking for something a bit different. If you guys have a script/suggestions, that would be great.

Thanks!

T5 support

Hi, thanks for the great work!
I wonder if you plan to support the T5 model? If not, do you think adapting the code for T5 would be easy?

Confusion about pre/post rewrite probabilities of target true and target new

Hi, I've enjoyed playing with ROME and appreciate the interactive colab notebooks! I tried it out myself using gpt2-xl and I'm running into some strange behavior. Below, I've pasted the JSON for one of the case results (756) using ROME.

As you can see, the pre-rewrite probability for target_true (Nintendo) is much lower than that of target_new (Apple). Shouldn't it be the other way around? I tried the predict_token method in the causal tracing notebook, and before applying ROME, gpt2-xl correctly predicts Nintendo. Additionally, the post-rewrite probs seem incorrect as well: shouldn't the prob of target_new be higher than the prob of target_true after the rewrite? I found the same behavior over the majority of other cases I tested (a batch of 350). I'm not sure whether I'm misunderstanding something, so I'm just looking to clarify.

Another question I had is regarding this line of code. Don't we want x["target_true"] > x["target_new"] only to be true for pre and the inverse to be true for post?

Any clarification would be appreciated, thanks!
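One possible reading (purely an assumption about the logging format, not something stated in the repo) is that these fields record negative log-probabilities, so a smaller number means a more likely token. Under that reading the JSON below is self-consistent:

```python
import math

def to_prob(neg_log_prob):
    """Convert a negative log-probability back to a probability,
    assuming that is what the JSON fields store."""
    return math.exp(-neg_log_prob)

# Pre-rewrite values from case 756 below:
pre_true = to_prob(1.4211459159851074)   # target_true "Nintendo"
pre_new = to_prob(9.454349517822266)     # target_new "Apple"
# Under this reading, Nintendo is far more likely than Apple
# before the rewrite, matching the expected behavior.
```

If this reading is right, the post-rewrite values (0.031 for Apple vs. 10.11 for Nintendo) show the edit succeeding, and a comparison that treats smaller values as "more likely" would also make sense.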

{
 "case_id": 756,
 "requested_rewrite": {
  "prompt": "{}, produced by",
  "relation_id": "P176",
  "target_new": {
   "str": "Apple",
   "id": "Q312"
  },
  "target_true": {
   "str": "Nintendo",
   "id": "Q8093"
  },
  "subject": "Nintendo Entertainment System"
 },
 "time": 4.208041667938232,
 "post": {
  "rewrite_prompts_probs": [
   {
    "target_new": 0.031141962856054306,
    "target_true": 10.113485336303711
   }
  ],
  "paraphrase_prompts_probs": [
   {
    "target_new": 0.4959333539009094,
    "target_true": 8.745203971862793
   },
   {
    "target_new": 0.4947102665901184,
    "target_true": 9.764120101928711
   }
  ],
  "neighborhood_prompts_probs": [
   {
    "target_new": 6.8472676277160645,
    "target_true": 0.3246003985404968
   },
   {
    "target_new": 8.033761024475098,
    "target_true": 0.2415885031223297
   },
   {
    "target_new": 7.733938217163086,
    "target_true": 1.116481900215149
   },
   {
    "target_new": 5.413626670837402,
    "target_true": 0.8133949637413025
   },
   {
    "target_new": 4.601649284362793,
    "target_true": 1.299719214439392
   },
   {
    "target_new": 5.189364433288574,
    "target_true": 0.5972123742103577
   },
   {
    "target_new": 6.327458381652832,
    "target_true": 0.9978345632553101
   },
   {
    "target_new": 7.00956392288208,
    "target_true": 1.7347813844680786
   },
   {
    "target_new": 4.823829650878906,
    "target_true": 8.873197555541992
   },
   {
    "target_new": 5.603839874267578,
    "target_true": 0.4029443860054016
   }
  ],
  "attribute_prompts_probs": [
   {
    "target_new": 6.013723373413086,
    "target_true": 8.308565139770508
   },
   {
    "target_new": 0.4499565064907074,
    "target_true": 8.99012565612793
   },
   {
    "target_new": 0.8995383977890015,
    "target_true": 8.008896827697754
   },
   {
    "target_new": 5.8337602615356445,
    "target_true": 8.576362609863281
   },
   {
    "target_new": 1.4875808954238892,
    "target_true": 7.659795761108398
   },
   {
    "target_new": 2.2929792404174805,
    "target_true": 7.20438814163208
   },
   {
    "target_new": 1.4996681213378906,
    "target_true": 8.309087753295898
   },
   {
    "target_new": 5.333329200744629,
    "target_true": 7.988572120666504
   },
   {
    "target_new": 1.9362276792526245,
    "target_true": 7.243814945220947
   },
   {
    "target_new": 0.6282683610916138,
    "target_true": 8.4635591506958
   }
  ],
  "ngram_entropy": 6.251435835612294,
  "reference_score": 0.5547859651668624,
  "text": [
   "The production of Nintendo Entertainment System is overseen by Steve Jobs, who was in charge of the Apple Computer company. Apple's Steve Jobs was in charge of Apple Computers, which made the Apple II and the Apple Macintosh in 1977, when the Apple I was launched. Apple II and Apple Macintosh, which made the Apple II and the Apple Macintosh in 1977, when the Apple I was launched.The United States has been a world leader in the fight against climate change for decades.",
   "Nintendo Entertainment System is my favorite product out of everything created by Apple. It was a revolutionary machine and one of my all time favorite devices. It was an amazing piece of technology that changed the world. It's a great product and a great company. I think the company is in a bad place and it's a shame. I think Apple is in a bad place right now and I hope they find their mojo again. I don't want to see Apple go away. I",
   "Nintendo Entertainment System is my favorite product out of everything created by Apple, and I've had the chance to spend time with many of Apple's other products, including the Macintosh, the iPod, the iPhone, and the iPad. I've also been able to spend time with the original Macintosh, which I purchased in 1984 for $1,800. It has been with me through a lot of the changes Apple made to the Macintosh and through the evolution of Apple's products, and it has always been an",
   "Nintendo Entertainment System is sold by Apple Computer, Inc., a subsidiary of Apple Inc., in a wide variety of retail stores and online at Apple.com. Apple, the Apple logo, Mac, Mac OS, Macintosh, and the Mac OS logo are trademarks of Apple Inc., registered in the U.S. and other countries. Apple and the Apple logo are trademarks of Apple Inc., registered in the U.S. and other countries. All other marks are the property of their respective owners.\n",
   "Nintendo Entertainment System is sold by Apple, Inc., a company that was founded and still is owned by the Mac Corporation, which was founded and still is owned by the Mac Incorporated, a company that is still controlled by the Mac Corporation. Apple Inc. is a privately held company. In the movie, the Apple Macintosh Computer is shown in a room with a TV. The computer is shown as an object with a black case and a green screen. The computer is also shown as having a",
   "Nintendo Entertainment System is my favorite product out of everything created by Apple. It was the product that made Apple the world's biggest tech corporation. It was the product that brought the Mac and the PC together. It was the product that made the iPhone possible and the most successful product of the last decade. It is the product that brought us the iPod. I'm going to be the first person to admit it. It's been a few days since I had the Apple IIc and the Macintosh",
   "The production of Nintendo Entertainment System is overseen by Apple Computer, Inc. Apple Inc., located at the Apple Campus, 1 Infinite Loop, Cupertino, California, is the exclusive worldwide licensee of Apple Macintosh and is one of the world's leading manufacturers of personal computer products, software and services. Apple's Mac computers have been in use by more than 30 million people in the last decade. Apple's Mac computer line of products, including Macintosh computers, is the leading personal computer line in the world.",
   "Nintendo Entertainment System is sold by Apple Inc. (NASDAQ:AAPL) and was released in 1983. Apple Inc. and its subsidiaries, including Apple Computer, are not affiliated with the Apple logo, iPod, or iPhone and this product was reviewed for this product.The New York Giants are a very good football team. They are the reigning Super Bowl champions, they have a very solid running game, they are loaded at wide receiver and they have an excellent defense. So what",
   "Nintendo Entertainment System is sold by Apple Inc. and Apple Computer, Inc., and is the successor to the Apple II. The Macintosh computer was introduced in 1984.This is an overview of all penalties a player has taken in his career. Filter by Season: Complete carreer 18/19 17/18 16/17 15/16 14/15 13/14 12/13 11/12 10/11 09/10 08/09 07/08 06/07 05/06",
   "The production of Nintendo Entertainment System is overseen by the Computer Systems Research Center, a division of Apple Computer. It was designed by Steve Wozniak. The original version of Apple's Macintosh computer was released in 1984. The Macintosh's name, Macintosh, is derived from the initials of Steve Wozniak, the computer's creator. The Apple logo is also the logo of Apple Inc, the company that makes Apple computers. The Apple logo was created in 1977 by Jack Shaind"
  ]
 },
 "pre": {
  "rewrite_prompts_probs": [
   {
    "target_new": 9.454349517822266,
    "target_true": 1.4211459159851074
   }
  ],
  "paraphrase_prompts_probs": [
   {
    "target_new": 8.225397109985352,
    "target_true": 1.1219062805175781
   },
   {
    "target_new": 11.595452308654785,
    "target_true": 0.33511802554130554
   }
  ],
  "neighborhood_prompts_probs": [
   {
    "target_new": 8.965630531311035,
    "target_true": 0.8034696578979492
   },
   {
    "target_new": 9.810515403747559,
    "target_true": 0.34526726603507996
   },
   {
    "target_new": 9.426002502441406,
    "target_true": 2.0512070655822754
   },
   {
    "target_new": 7.29520320892334,
    "target_true": 0.8077965974807739
   },
   {
    "target_new": 7.443518161773682,
    "target_true": 1.9636479616165161
   },
   {
    "target_new": 8.967379570007324,
    "target_true": 0.7423621416091919
   },
   {
    "target_new": 8.393959045410156,
    "target_true": 1.0581015348434448
   },
   {
    "target_new": 7.870340347290039,
    "target_true": 1.8310593366622925
   },
   {
    "target_new": 5.270660400390625,
    "target_true": 8.574653625488281
   },
   {
    "target_new": 7.631977081298828,
    "target_true": 0.6827787160873413
   }
  ],
  "attribute_prompts_probs": [
   {
    "target_new": 6.052481174468994,
    "target_true": 8.259063720703125
   },
   {
    "target_new": 0.5632207989692688,
    "target_true": 8.255119323730469
   },
   {
    "target_new": 1.1508457660675049,
    "target_true": 7.612956523895264
   },
   {
    "target_new": 5.7494354248046875,
    "target_true": 8.581439018249512
   },
   {
    "target_new": 1.7525177001953125,
    "target_true": 7.17108154296875
   },
   {
    "target_new": 2.953496217727661,
    "target_true": 6.731546401977539
   },
   {
    "target_new": 1.9409500360488892,
    "target_true": 7.5878167152404785
   },
   {
    "target_new": 5.240492820739746,
    "target_true": 7.983404636383057
   },
   {
    "target_new": 2.7199530601501465,
    "target_true": 6.770204544067383
   },
   {
    "target_new": 0.8972285985946655,
    "target_true": 8.083158493041992
   }
  ],
  "ngram_entropy": 6.199469989509718,
  "reference_score": 0.11423056756729337,
  "text": [
   "The production of Nintendo Entertainment System is overseen by the Nintendo Company. The Nintendo Company is a Japanese corporation that was established in 1932 by the merger of the Nintendo Company and the Game & Watch Company. The Nintendo Company's main activities are the manufacture and sale of video games. Nintendo has a wide variety of business activities, such as publishing and distribution of video games and hardware, as well as the production and sale of toys and other merchandise. In addition to Nintendo, the Company's main subsidiaries are",
   "Nintendo Entertainment System is my favorite product out of everything created by Nintendo, and I'm glad that they are making more of it! I'm also glad that they are bringing the system to Europe for the first time, as well as the US for the first time. I'm also excited for the Wii Fit and Wii U versions of Super Mario 3D World! I hope that you enjoy the video game that I've made for you. Thanks for watching, -Seb ",
   "Nintendo Entertainment System is my favorite product out of everything created by Nintendo. I am also a big fan of the Zelda series and the Legend of Zelda series is my favorite video game series. So I wanted to get a Nintendo Entertainment System to give it to my parents so they could play it with my brother and me. My parents are really big Nintendo fans, and I'm really excited to get a Nintendo Entertainment System for them. But my brother and I are not as big fans.",
   "Nintendo Entertainment System is sold by Nintendo. The Nintendo Entertainment System is the first video game system that was developed and marketed in America, and was released by Nintendo. It is widely regarded as the world's first \"console\" video game system.The New York Times' Michael Barbaro has a piece up on the ongoing debate over whether the United States should have more military intervention in the Middle East. He writes: The United States, Mr. Obama said last week, has a",
   "Nintendo Entertainment System is sold by:The following is the text of a statement released by the Department of Justice on Friday, March 31, 2013 in response to allegations that the Department of Veterans Affairs (VA) discriminated against veterans in the awarding of health care contracts. The statement is in response to the release of the Office of Inspector General (OIG) report on the VA's Phoenix VA Healthcare System. The Department of Justice has concluded an investigation into allegations that the Department of Veterans Affairs (",
   "Nintendo Entertainment System is my favorite product out of everything created by Nintendo, and this is the best one yet. This version of Super Mario Bros. 3 is a must for all Mario fans. Super Mario Bros. 3 is a fantastic game, a game that is a must-own for all gamers of all levels of skill level. This game is the best one yet, and it's not close, either. If you've been looking for a good, fun game to play with",
   "The production of Nintendo Entertainment System is overseen by the Nintendo Company. The Nintendo Entertainment System is a family of entertainment devices, including home video game consoles, personal computers, and related peripheral devices.In the first week of December, the U.S. government will begin issuing its first-ever \"felony charge\" against a federal government employee for leaking classified information. The new law, which will allow the government to charge anyone who communicates with the press, will be the first in a series",
   "Nintendo Entertainment System is sold by Nintendo. Nintendo, Super Mario Bros.., Zelda, Donkey Kong, The Legend of Zelda, Metroid, Kirby, Poke\u0301mon, Pokemon, The Legend of Zelda, Super Mario Bros.., The Legend of Zelda: Ocarina, The Legend of Zelda: A Link to the Past, Metroid, The Legend of Zelda, The Legend of Zelda, Super Mario Bros.., Super Mario Bros.., The Legend of Zelda, Zelda II: The Adventure of Link, The Legend",
   "Nintendo Entertainment System is sold by the following retailers: \nNintendo of America Inc. Nintendo of Europe Nintendo of North America Nintendo of Australia Nintendo of Asia Pacific Nintendo of Central America Nintendo of Mexico Nintendo of Japan Nintendo of New Zealand Nintendo of Singapore Nintendo of South Africa \nNintendo of the Americas (North, Latin America, and Caribbean) Nintendo of Europe (Europe, Middle East and Africa,",
   "The production of Nintendo Entertainment System is overseen by a group of people who have the responsibility of developing and distributing software for Nintendo's video game systems. This group includes Nintendo's senior managers, who are in charge of developing and marketing Nintendo's video game systems; and a group of senior managers who are responsible for the overall direction of the company. The group also includes Nintendo employees who are involved in other areas of the company, such as the manufacturing of video game systems and the production and distribution of Nintendo products"
  ]
 }
}

What is the performance of ROME when editing GPT-NeoX-20B?

Hi! I am trying to run your code to edit gpt-neox-20b, but I cannot download gpt_neox.layers.15.mlp.dense_4h_to_h_float32_mom2_100000.npz because it is not in the remote data directory.

Can you report the performance of ROME on GPT-NeoX-20B in terms of ES, DS, and PS?

ImportError: cannot import name 'Literal' from 'typing' (/usr/lib/python3.7/typing.py)

I tried running the model editing notebook in Colab, but ran into the following issue:

ImportError: cannot import name 'Literal' from 'typing' (/usr/lib/python3.7/typing.py)

This is because Literal is only available in Python 3.8+, while Colab runs Python 3.7. Was this a recent change?

I can spin up my own python 3.8 instance or add some additional code to install Python 3.8 in Colab, but that won't solve the issue for others unless I add the conda python 3.8 installation in a PR. Will this be fixed?
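A common workaround until the environment moves to Python 3.8 is a guarded import that falls back to the typing_extensions backport (assuming that package can be installed in the environment):

```python
try:
    from typing import Literal  # Python 3.8+
except ImportError:
    from typing_extensions import Literal  # backport for Python 3.7

# Example usage: a constrained string type.
Mode = Literal["pre", "post"]

def describe(mode: Mode) -> str:
    return f"measuring {mode}-rewrite probabilities"
```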

Why did you choose layer 18 as the edit layer?

Dear authors,

I really appreciate your work but have a question. Hopefully you can help me.

in ROME E.5, you said "We perform the intervention at layer 18. As Figure 1k shows, this is the center of causal effect in MLP layers, and as Figure 3 shows, layer 18 is approximately when MLP outputs begin to switch from acting as keys to values.".

However, in MEMIT, you said "at layers where the gap is largest, the role of the MLP computation is important. We select the layers where the gap is largest as the range R to use for the intervention done by MEMIT"

Layer 18 obviously doesn't have the largest gap, so why did you choose it as the edit layer?

Is there something I'm missing?

thanks!

Conda setup issue

There seems to be a dependency issue between the checklist and the notebook packages.
I kept getting the following error when trying to install using ./scripts/setup_conda.sh:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/mnt/tmp/pip-install-agcyk1c_/checklist_cdbf3a4517814d0a8ddc1381fe4d54ce/setup.py", line 56, in <module>
        setup(name='checklist',
      File "/miniconda/envs/rome/lib/python3.9/site-packages/setuptools/__init__.py", line 107, in setup
        return distutils.core.setup(**attrs)
      File "/miniconda/envs/rome/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 185, in setup
        return run_commands(dist)
      File "/miniconda/envs/rome/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
        dist.run_commands()
      File "/miniconda/envs/rome/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
        self.run_command(cmd)
      File "/miniconda/envs/rome/lib/python3.9/site-packages/setuptools/dist.py", line 1234, in run_command
        super().run_command(command)
      File "/miniconda/envs/rome/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
        cmd_obj.run()
      File "/mnt/tmp/pip-install-agcyk1c_/checklist_cdbf3a4517814d0a8ddc1381fe4d54ce/setup.py", line 53, in run
        enable_visual_interface()
      File "/mnt/tmp/pip-install-agcyk1c_/checklist_cdbf3a4517814d0a8ddc1381fe4d54ce/setup.py", line 14, in enable_visual_interface
        notebook.nbextensions.install_nbextension_python(
    AttributeError: module 'notebook' has no attribute 'nbextensions'
    ----------------------------------------
Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement checklist==0.0.11 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11)
ERROR: No matching distribution found for checklist==0.0.11

I had to fix it by reverting the notebook version to 5.6.0 in the rome.yml file.

Possible bug in fine-tuning baseline implementation

I've found what I believe to be a bug in the implementation of the fine-tuning baseline which would yield incorrect results when the target is longer than one token.

Looking at the code, the fine-tuning baseline appears to get the logits on which to backpropagate by calling model(**inputs), where inputs is the prompt with the subject but excluding the target. It then takes the logits associated with the last token of the input and maximises the probability of all the target tokens as simultaneous direct continuations of that position. This is not the regular fine-tuning behaviour, which would be to maximise the probability of the first target token as a continuation of the input, then the probability of the second target token as a continuation of the first target token, and so on.
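For contrast, here is a minimal sketch of the standard teacher-forced objective described above, where each target token is conditioned on the prompt plus all preceding target tokens (the function and variable names are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def teacher_forced_nll(model, prompt_ids, target_ids):
    """Autoregressive NLL of target_ids continuing prompt_ids:
    target token i is predicted from the prompt plus target[:i]."""
    input_ids = torch.cat([prompt_ids, target_ids]).unsqueeze(0)
    logits = model(input_ids).logits[0]
    # Logits at position i predict the token at position i + 1, so the
    # predictions for the target span start at the last prompt position.
    start = prompt_ids.shape[0] - 1
    preds = logits[start : start + target_ids.shape[0]]
    return F.cross_entropy(preds, target_ids)
```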

Thank you for your assistance; I look forward to hearing back and to understanding whether I have misunderstood an aspect of the implementation.

Any suggestions for extending this work to edit values?

Thank you very much for making this awesome work publicly available.

I'm working on extending ROME to understanding and editing the "values" representations that the model knows (as in, human values, not the values part of key/value pairs). E.g., is there a low rank update we can apply that causes the model to think that environmentalists really like the oil industry? Or an update that causes the model to think that "valuing artistic expression" means you really like geese?

Do you have any suggestions for applying ROME to these sorts of abstract, values-related relationships?

Thanks for your time!

Model- and data-dependent hyperparameters

Hi!
Thank you very much for making your implementation publicly available.
I want to use ROME on different LMs and datasets than those you tried in the paper. I was wondering which hyperparameters are model- or data-dependent and whether you have an intuition/strategy for finding values for them.
Thanks!

Code for Average Plot over Multiple Runs

Hey!
Great work!
Is the code used to generate the plots in Figure 2 publicly available? I would like to reproduce some experiments, and it would be really helpful. I have a bunch of prompts and would love to produce a similar plot.

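I don't know the repo's actual plotting code, but averaging per-prompt causal-tracing grids into one heatmap can be sketched as follows (the array shapes and data here are placeholders):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted runs
import matplotlib.pyplot as plt

# Assumed: one (n_tokens, n_layers) grid of indirect effects per prompt,
# all padded/cropped to the same shape beforehand.
grids = [np.random.rand(8, 48) for _ in range(10)]  # placeholder data
avg = np.mean(np.stack(grids), axis=0)

fig, ax = plt.subplots()
im = ax.imshow(avg, aspect="auto", cmap="Purples")
ax.set_xlabel("layer")
ax.set_ylabel("token position")
fig.colorbar(im, label="average indirect effect")
fig.savefig("avg_trace.png")
```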

Documentation for hparams?

Is there any documentation on what the various hparams here mean? I'm trying to use ROME with a different GPT-J 6B model (CodeGen-6B-mono) using the demo rome.ipynb notebook. The comments you had in this issue were helpful but I don't really know the meanings of:

  • layers: it seems like this says which layers to modify. But this isn't known in advance, and has to be found using the techniques described in the paper, right? So maybe I need to dig through the repo to find the code that does that?
  • Any of the mom2_* parameters. Given that they reference wikitext maybe I need to provide a code dataset instead? I couldn't find any reference to mom in the paper.

The others I think I can figure out from the paper and do some hyperparameter sweeps, but I don't really know where to start with the ones above.

(Thanks very much for releasing the code BTW! I'm really excited to try it out on code models :) )
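For what it's worth, the ROME paper's update uses an uncentered second-moment statistic of MLP key vectors, E[k kᵀ], estimated by running the model over a reference corpus (Wikipedia text in the released files); my reading, which is an assumption rather than official documentation, is that the mom2_* hparams control that estimate (which dataset, how many samples, what dtype). A toy sketch of the statistic itself:

```python
import numpy as np

def second_moment(keys):
    """Uncentered second moment E[k k^T] over a batch of key vectors.

    keys: array of shape (n_samples, d), e.g. activations collected at
    the chosen MLP layer while running the model over a reference corpus.
    """
    keys = np.asarray(keys, dtype=np.float64)
    return keys.T @ keys / keys.shape[0]

K = np.random.randn(1000, 4)   # placeholder activations
C = second_moment(K)           # (4, 4), symmetric positive semi-definite
```

For a code model, this suggests the statistic would indeed need to be re-estimated on a code corpus rather than wikitext, though the maintainers would need to confirm that.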

Best way to specify a non-standard cache directory

Hello,

Due to memory limitations, I need to store ROME's cache somewhere other than my home directory. What would be the best way to specify the new cache dir's location so that all of ROME's components know where to find the data they need?

Also, I intend to run ROME on custom trained GPT-2 models, so the cache dir also needs to hold the data used to calculate covariance statistics for those models.

Thanks for your help, and thanks so much for providing this repository!
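For the Hugging Face side of the cache (model weights, datasets), one approach is to point the standard environment variables at a larger mount before anything imports transformers; the path below is a placeholder, and whether ROME's own statistics cache honours a related setting is something the maintainers would need to confirm:

```python
import os

# Placeholder path: substitute your own large-capacity mount.
CACHE_ROOT = "/mnt/big_disk/rome_cache"

# Must be set before `transformers`/`datasets` are imported.
os.environ["HF_HOME"] = CACHE_ROOT
os.environ["TRANSFORMERS_CACHE"] = os.path.join(CACHE_ROOT, "transformers")
```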

How do you train KE and MEND with CounterFact?

As is described in your paper, "To encourage fair comparison on both zsRE and COUNTERFACT tasks, we additionally train KE-zsRE and KE-CF models on size-10,000 subsets of the respective training sets." and "Again, for fair comparison, we train new versions of MEND (MEND-zsRE, MEND-CF) on the same sets that KE-zsRE and KE-CF were trained on.".

Which 10,000 records do you use to train KE-CF and MEND-CF?

Besides, "Table 4 showcases quantitative results on GPT-2 XL (1.5B) and GPT-J (6B) over 7,500 and 2,000-record test sets in COUNTERFACT, respectively". Which 7,500 or 2,000 records do you use to evaluate all baselines?

Thank you :-)

Failed on causal tracing

I tried to replicate causal tracing on gpt2-xl and llama-7b, but it failed in a few ways.

  1. Failed on predict_token
def predict_token(mt, prompts, return_p=False):
    inp = make_inputs(mt.tokenizer, prompts)
    preds, p = predict_from_input(mt.model, inp)
    result = [mt.tokenizer.decode(c) for c in preds]
    if return_p:
        result = (result, p)
    return result

def predict_from_input(model, inp):
    # failed on this lookup
    out = model(**inp)["logits"]
    probs = torch.softmax(out[:, -1], dim=1)
    p, preds = torch.max(probs, dim=1)
    return preds, p

When I use the gpt2-xl and llama-7b models, the keys of model(**inp) are ['last_hidden_state', 'past_key_values'], so the lookup raises a KeyError.
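A likely cause, inferred from the reported keys rather than confirmed, is that the model was loaded with AutoModel, which returns only hidden states; loading with the ForCausalLM class attaches the LM head so the output carries a logits field. A sketch using a tiny public stand-in checkpoint (substitute gpt2-xl or your own model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "sshleifer/tiny-gpt2" is a tiny stand-in checkpoint for illustration.
# AutoModel alone yields last_hidden_state; AutoModelForCausalLM adds
# the language-modeling head, so the output has a `logits` field.
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")
tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")

out = model(**tokenizer("The NES was produced by", return_tensors="pt"))
logits = out.logits  # shape: (batch, seq_len, vocab_size)
```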

If I use chatglm3-6b, the earlier steps succeed, but it fails on collect_embedding_std.
layername() returns "transformer.wte" here:

def layername(model, num, kind=None):
    if hasattr(model, "transformer"):
        if kind == "embed":
            return "transformer.wte"
        return f'transformer.h.{num}{"" if kind is None else "." + kind}'
    if hasattr(model, "gpt_neox"):
        if kind == "embed":
            return "gpt_neox.embed_in"
        if kind == "attn":
            kind = "attention"
        return f'gpt_neox.layers.{num}{"" if kind is None else "." + kind}'
    assert False, "unknown transformer structure"

But there is no module named 'transformer.wte' in this model, so an error is raised here:

def get_module(model, name):
    """
    Finds the named module within the given model.
    """
    for n, m in model.named_modules():
        if n == name:
            return m
    raise LookupError(name)

In addition, gpt2-xl and llama-7b fail on the assert in layername().
I would like to know what is causing these failures and whether there is a way to successfully edit these models.
Thank you!
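Llama-style models expose model.model.layers and model.model.embed_tokens rather than transformer.h and transformer.wte, so layername() needs another branch. A hedged sketch (attribute names checked against Hugging Face's Llama implementation, but the rest of the tracing code may need matching changes, and chatglm3 uses yet another layout):

```python
def layername(model, num, kind=None):
    if hasattr(model, "transformer"):  # GPT-2 family
        if kind == "embed":
            return "transformer.wte"
        return f'transformer.h.{num}{"" if kind is None else "." + kind}'
    if hasattr(model, "gpt_neox"):
        if kind == "embed":
            return "gpt_neox.embed_in"
        if kind == "attn":
            kind = "attention"
        return f'gpt_neox.layers.{num}{"" if kind is None else "." + kind}'
    if hasattr(model, "model"):  # Llama-style: model.model.layers.N.{self_attn,mlp}
        if kind == "embed":
            return "model.embed_tokens"
        if kind == "attn":
            kind = "self_attn"
        return f'model.layers.{num}{"" if kind is None else "." + kind}'
    assert False, "unknown transformer structure"
```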

Code for computing the right vector for the rank-1 update

In the compute_v function of rome/compute_v.py, you use the get_module_input_output_at_word function to get cur_input and cur_output.
Specifically, cur_input and cur_output are obtained by feeding "Steve Jobs was the founder of" to gpt2-xl. So cur_input and cur_output are not equal to k* and W_{proj} k*, yet you seem to use cur_input and cur_output as k* and W_{proj} k* when calculating the right vector for the rank-1 update, which differs slightly from your proposed equation (2) in the paper. Why do you use this method to approximate k* and W_{proj} k*?

A question about causal tracing code

Thank you for your great work. I notice that during causal tracing, when kind is None, trace_important_states works layer by layer, but when kind is attention or mlp, trace_important_window works over layer windows. Why does the layer range differ when kind changes?

CUDA Out of Memory Error on Colab

I am trying to run rome.ipynb on Google Colab and I'm getting an Out of Memory error on the call to the demo_model_editing() function.

installation

Hello,
May I ask a few questions?
Why do I still have problems with the environment configuration? Is the project no longer updated? Is it worth continuing in this direction?

error:
AttributeError: module 'notebook' has no attribute 'nbextensions'

WARNING: Discarding https://pypi.tuna.tsinghua.edu.cn/packages/38/b3/8511f50025b9fc66f5feacf9eb2db044c321f4026b6937cb3820f29e9c1d/checklist-0.0.11.tar.gz#sha256=427cf87dbf47ce9f9ab059a9bbf393d9ebf967e266f8fca377420bd6995a95ac (from https://pypi.tuna.tsinghua.edu.cn/simple/checklist/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement checklist==0.0.11 (from versions: 0.0.1, 0.0.2, 0.0.3, 0.0.4, 0.0.6, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.11)
ERROR: No matching distribution found for checklist==0.0.11

failed

CondaEnvException: Pip failed
