hunter-ddm / knowledge-neurons
Code for the ACL-2022 paper "Knowledge Neurons in Pretrained Transformers"
License: MIT License
Hi, I'm currently running the first step, bash 1_run_mlm.sh, to get attribution scores, and I found it takes several hours. Is that normal? In your paper you report that the running time for identifying knowledge neurons is only 13.3 seconds. Is that the time cost of the second step, bash 2_run_kn.sh, rather than the first step?
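For context, the attribution step is expensive because integrated gradients needs one forward and backward pass of the model per interpolation step, per example. A minimal sketch of the Riemann-sum approximation (pure Python; the analytic gradient function passed in is an assumption for illustration, not the repo's actual API):

```python
def integrated_gradients(grad_fn, activation, steps=20):
    """Approximate IG(a) = a * (1/m) * sum_k grad_fn(k/m * a) with a zero
    baseline, where grad_fn is the gradient of the target logit w.r.t.
    the neuron activation (a hypothetical stand-in for a model pass)."""
    total = 0.0
    for k in range(1, steps + 1):              # one model pass per step in practice
        total += grad_fn(k / steps * activation)
    return activation * total / steps

# For a linear output f(a) = 3a (gradient is constant 3),
# IG recovers f(a) - f(0) exactly.
attribution = integrated_gradients(lambda a: 3.0, 2.0)  # -> 6.0
```

With m interpolation steps, many prompts per fact, and thousands of facts, this multiplies into a very large number of full model passes, which could explain an attribution stage taking hours.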
Hello,
When I tried to load the pretrained model, I ran into this problem:
Hi,
I find in '3_modify_activation.py' the code
_, logits = model(input_ids=input_ids, attention_mask=input_mask, token_type_ids=segment_ids, tgt_pos=tgt_pos, tgt_layer=0, imp_pos=kn_bag, imp_op='remove')
Why is tgt_layer always 0? In kn_bag, some neurons are at positions like [9, 1000] or [10, 1001], i.e. not always in layer 0.
The same thing happens in other places, such as edit and erase.
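One way the [layer, neuron_index] pairs in kn_bag could be handled regardless of a single tgt_layer argument is to group them by layer first and then zero each layer's flagged activations. This is a hypothetical sketch of that idea, not the repo's implementation:

```python
from collections import defaultdict

def group_by_layer(kn_bag):
    """kn_bag holds [layer, neuron_index] pairs, e.g. [[9, 1000], [10, 1001]].
    Group the neuron indices by the layer they belong to."""
    by_layer = defaultdict(list)
    for layer, idx in kn_bag:
        by_layer[layer].append(idx)
    return dict(by_layer)

def suppress(activations, by_layer):
    """activations: {layer: list of neuron values} (a toy stand-in for
    per-layer FFN activations); zero out the flagged neurons in place."""
    for layer, idxs in by_layer.items():
        for i in idxs:
            activations[layer][i] = 0.0
    return activations

groups = group_by_layer([[9, 2], [10, 0], [9, 0]])   # -> {9: [2, 0], 10: [0]}
```

In a real model the zeroing would live in a per-layer forward hook rather than a dict of lists, but the grouping step is the same.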
Also, in the paper, do Figure 4 and Figure 5 show the change in the correct answer's probability, or the change in the probability of the correct label? That is, does removing or enhancing the neurons improve probing performance itself, or only improve the ranking of the target label (while the top output is still wrong)?
Thanks!
knowledge-neurons/src/1_analyze_mlm.py
Line 292 in 922dfd9
I don't know why the output is the gradient. Could you please explain it?
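If the value at that line is the gradient of the target logit with respect to the (scaled) FFN activations, that is exactly the quantity integrated gradients averages along the path from the baseline to the full activation. A finite-difference sketch of what "the output is a gradient" means, using a toy function rather than the repo's model:

```python
def grad_of(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

g = grad_of(lambda a: a * a, 3.0)   # d(a^2)/da at a=3 is 6
```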
Could you please provide the dependencies (library names and versions) used in this code, so that I can successfully run it and reproduce all the results?
hello,
I have a question: why is ig_pred computed but never used in your project? It seems that all the gradients are computed based on the golden label. Is there a reason for this?
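To make the distinction concrete: attribution needs a scalar target to differentiate, and one can take either the gold label's logit or the predicted (argmax) label's logit. A hypothetical sketch of the two choices, not the repo's actual code:

```python
def target_logits(logits, gold_label):
    """Return the two scalar targets one could differentiate in attribution:
    the gold label's logit vs. the argmax (predicted) label's logit."""
    pred_label = max(range(len(logits)), key=lambda i: logits[i])
    return {"gold": logits[gold_label], "pred": logits[pred_label]}

targets = target_logits([0.1, 2.0, 0.5], gold_label=0)  # -> {"gold": 0.1, "pred": 2.0}
```

The two targets coincide exactly when the model already predicts the gold label, which may be why only one of them ends up being used.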
Hi,
Firstly, I want to thank you for the paper, which is very inspiring and interesting. I am wondering if you have ever tried to identify knowledge neurons in other Transformers (in your paper you evaluated BERT models, which are encoder-only). I am curious how knowledge neurons would be distributed in a model that has both an encoder and a decoder: would the decoder layers contain more knowledge neurons than the encoder layers? Or is there any related reference?
Thanks!
Hi, when will the code be released? Thank you!