dsnot's People

Contributors

lirui-zhao, zyxxmu

dsnot's Issues

WrappedGPT object has no attribute 'H'

When running DSnoT with SparseGPT on an OPT model, I get the error above at the following line:

H = wrapped_layers[name].H

It seems that the H attribute is never set in WrappedGPT.

Looking at the code, this error probably also occurs with non-OPT models, although I haven't tried it.
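
A minimal Python sketch of the failure mode described above, using hypothetical stand-in classes rather than the repository's actual code. Reading an attribute that a wrapper's constructor never assigns raises AttributeError, and checking with hasattr before the lookup turns that into a clearer diagnostic. The names `WrapperWithoutH`, `WrapperWithH`, and `get_hessian` are assumptions made for illustration only.

```python
class WrapperWithoutH:
    """Hypothetical stand-in for a layer wrapper that never sets `H`."""

    def __init__(self, columns):
        self.columns = columns


class WrapperWithH:
    """Hypothetical stand-in that allocates a square `H` buffer up front."""

    def __init__(self, columns):
        self.columns = columns
        # Zero-initialised columns x columns matrix, stored as nested lists.
        self.H = [[0.0] * columns for _ in range(columns)]


def get_hessian(wrapped_layers, name):
    """Return wrapped_layers[name].H, or raise a descriptive error if it is missing."""
    layer = wrapped_layers[name]
    if not hasattr(layer, "H"):
        raise AttributeError(
            f"layer '{name}' is wrapped by {type(layer).__name__}, "
            "which does not define 'H'"
        )
    return layer.H
```

With a guard like this, the error message names the wrapper class that was actually used, which may help confirm whether the wrong wrapper type is being constructed for the chosen initial method.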

How to evaluate the zero-shot performance?

Hi, thanks for your inspiring and brilliant work!

I noticed that the repository contains no code for evaluating zero-shot performance. Is this part of the code the same as in Wanda?

Sparse model

Hi,
I am interested in your work.
Could you share a 30% sparse LLaMA2-7B model?
I would like to measure more metrics.

Thanks

Problem reproducing results in Table 1 on LLaMA2-7B

Hi, we are having trouble reproducing some of the results in Table 1:

Magnitude w.o. DSnoT

  • command:
CUDA_VISIBLE_DEVICES=4,5,6,7 python main.py \
    --model /home/ma-user/modelarts/user-job-dir/DSnoT/ckpt/Llama-2-7b-hf \
    --prune_method magnitude  \
    --initial_method magnitude  \
    --sparsity_ratio 0.6  \
    --sparsity_type unstructured  \
    --max_cycle_time 50  \
    --update_threshold 0.1  \
    --pow_of_var_regrowing 1 \
    --model_type 'llama'
  • result:
model: /home/ma-user/modelarts/user-job-dir/DSnoT/ckpt/Llama-2-7b-hf
prune_method: magnitude
without_DSnoT: False
initial_method: magnitude
skip_layer mlp, skip_sub_layer no_skip
max_cycle_time: 50, update_threshold: 0.1
pow_of_var_pruning:1, pow_of_var_regrowing:1.0
without_same_sign:True
sparse pattern: unstructured
sample: 128
sparsity sanity check 0.6013, ppl: 1924.735595703125

Magnitude w. DSnoT

  • command:
CUDA_VISIBLE_DEVICES=4,5,6,7 python main.py \
    --model /home/ma-user/modelarts/user-job-dir/DSnoT/ckpt/Llama-2-7b-hf \
    --prune_method DSnoT \
    --initial_method magnitude  \
    --sparsity_ratio 0.6  \
    --sparsity_type unstructured  \
    --max_cycle_time 50  \
    --update_threshold 0.1  \
    --pow_of_var_regrowing 1 \
    --model_type 'llama'
  • result:
model: /home/ma-user/modelarts/user-job-dir/DSnoT/ckpt/Llama-2-7b-hf
prune_method: DSnoT
without_DSnoT: False
initial_method: magnitude
skip_layer mlp, skip_sub_layer no_skip
max_cycle_time: 50, update_threshold: 0.1
pow_of_var_pruning:1, pow_of_var_regrowing:1.0
without_same_sign:True
sparse pattern: unstructured
sample: 128
sparsity sanity check 0.5999, ppl: 3950.8154296875

SparseGPT w. DSnoT

  • command:
CUDA_VISIBLE_DEVICES=4,5,6,7 python main.py \
    --model /home/ma-user/modelarts/user-job-dir/DSnoT/ckpt/Llama-2-7b-hf \
    --prune_method DSnoT \
    --initial_method sparsegpt \
    --sparsity_ratio 0.6  \
    --sparsity_type unstructured  \
    --max_cycle_time 50  \
    --update_threshold 0.1  \
    --pow_of_var_regrowing 1 \
    --model_type 'llama'
  • result:
model: /home/ma-user/modelarts/user-job-dir/DSnoT/ckpt/Llama-2-7b-hf
prune_method: DSnoT
without_DSnoT: False
initial_method: sparsegpt
skip_layer mlp, skip_sub_layer no_skip
max_cycle_time: 50, update_threshold: 0.1
pow_of_var_pruning:1, pow_of_var_regrowing:1.0
without_same_sign:True
sparse pattern: unstructured
sample: 128
sparsity sanity check 0.5999, ppl: 122160.71875

We would greatly appreciate your assistance in helping us pinpoint the reasons behind these results and in verifying the correctness of our implementation of your code.
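
The "sparsity sanity check" value in the logs above reads as the fraction of zero-valued weights. A minimal sketch of such a check follows, assuming that definition. The helper name `sparsity_ratio` is hypothetical and not taken from the repository, and plain nested lists stand in for weight tensors.

```python
def sparsity_ratio(weight_rows):
    """Fraction of entries equal to zero in a 2-D list of weights."""
    total = sum(len(row) for row in weight_rows)
    if total == 0:
        raise ValueError("weight matrix is empty")
    zeros = sum(1 for row in weight_rows for w in row if w == 0)
    return zeros / total


# Toy 2 x 5 weight matrix with 6 zero entries out of 10.
weights = [
    [0.0, 0.5, 0.0, 0.0, -1.2],
    [0.3, 0.0, 0.0, 0.7, 0.0],
]
print(sparsity_ratio(weights))  # 0.6
```

A value close to the requested `--sparsity_ratio` only indicates that the mask has the intended density. It says nothing about which weights were kept, so it cannot by itself explain a high perplexity.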
