japanese-alpaca-lora's Issues
Switching to llama-13b raises "RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:"
Thank you for releasing Japanese-Alpaca-LoRA.
I tried it right away on Colab (Pro, GPU with 40 GB VRAM). llama-7b worked as-is, but when I switched to llama-13b as shown below, a RuntimeError occurred.
Looking at the resource monitor, there still seems to be plenty of memory headroom.
If there are other places that need to be changed when switching the llama model size, I would appreciate any pointers.
Environment
Modified code
# May not work unless you use an A100 on a Colab Pro (or higher) plan
# BASE_MODEL = "decapoda-research/llama-7b-hf"
BASE_MODEL = "decapoda-research/llama-13b-hf"
# BASE_MODEL = "decapoda-research/llama-30b-hf"
# BASE_MODEL = "decapoda-research/llama-65b-hf"
tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL, device_map={'': 0})
# LORA_WEIGHTS = "kunishou/Japanese-Alpaca-LoRA-7b-v0"
LORA_WEIGHTS = "kunishou/Japanese-Alpaca-LoRA-13b-v0"
# LORA_WEIGHTS = "kunishou/Japanese-Alpaca-LoRA-30b-v0"
# LORA_WEIGHTS = "kunishou/Japanese-Alpaca-LoRA-65b-v0"
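When switching model sizes, `BASE_MODEL` and `LORA_WEIGHTS` must always change together, since each adapter is shaped for one specific base model. A minimal sketch (the helper name `model_ids` is hypothetical; the repository IDs are the ones from the snippet above) that keeps the pair in sync:

```python
# Hypothetical helper: keep each base checkpoint and its matching LoRA
# adapter together, so switching model size is a one-line change.
MODEL_PAIRS = {
    "7b":  ("decapoda-research/llama-7b-hf",  "kunishou/Japanese-Alpaca-LoRA-7b-v0"),
    "13b": ("decapoda-research/llama-13b-hf", "kunishou/Japanese-Alpaca-LoRA-13b-v0"),
    "30b": ("decapoda-research/llama-30b-hf", "kunishou/Japanese-Alpaca-LoRA-30b-v0"),
    "65b": ("decapoda-research/llama-65b-hf", "kunishou/Japanese-Alpaca-LoRA-65b-v0"),
}

def model_ids(size: str):
    """Return the (BASE_MODEL, LORA_WEIGHTS) pair for a given model size."""
    try:
        return MODEL_PAIRS[size]
    except KeyError:
        raise ValueError(f"unknown size {size!r}; choose from {sorted(MODEL_PAIRS)}")

BASE_MODEL, LORA_WEIGHTS = model_ids("13b")
```

This removes the comment-out/comment-in step entirely, so a stale `LORA_WEIGHTS` can never be paired with a freshly switched `BASE_MODEL`.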
Error message
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight: copying a param with
shape torch.Size([8, 4096]) from checkpoint, the shape in current model is torch.Size([8, 5120]).
size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight: copying a param with
shape torch.Size([4096, 8]) from checkpoint, the shape in current model is torch.Size([5120, 8]).
size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_A.weight: copying a param with
shape torch.Size([8, 4096]) from checkpoint, the shape in current model is torch.Size([8, 5120]).
Full error output
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /usr/local/lib/python3.9/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so...
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /usr/lib64-nvidia did not contain libcudart.so as expected! Searching further paths...
warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('--listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-a100-s-396jeoh5eio6u --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//ipykernel.pylab.backend_inline')}
warn(msg)
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
Loading checkpoint shards: 100%
41/41 [02:49<00:00, 3.92s/it]
Traceback (most recent call last):
  in <module>:53

  /usr/local/lib/python3.9/dist-packages/peft/peft_model.py:161 in from_pretrained
      158         filename, map_location=torch.device("cuda" if torch.cuda.is_available() else
      159     )
      160     # load the weights into the model
    ❱ 161     model = set_peft_model_state_dict(model, adapters_weights)
      162     if getattr(model, "hf_device_map", None) is not None:
      163         device_map = kwargs.get("device_map", "auto")
      164         max_memory = kwargs.get("max_memory", None)

  /usr/local/lib/python3.9/dist-packages/peft/utils/save_and_load.py:74 in set_peft_model_state_dict
      71         peft_model_state_dict (`dict`): The state dict of the Peft model.
      72     """
      73
    ❱ 74      model.load_state_dict(peft_model_state_dict, strict=False)
      75      if model.peft_config.peft_type != PeftType.LORA:
      76          model.prompt_encoder.embedding.load_state_dict(
      77              {"weight": peft_model_state_dict["prompt_embeddings"]}, strict=True

  /usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py:1671 in load_state_dict
      1668                         ', '.join('"{}"'.format(k) for k in missing_keys)))
      1669
      1670         if len(error_msgs) > 0:
    ❱ 1671             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
      1672                 self.__class__.__name__, "\n\t".join(error_msgs)))
      1673         return _IncompatibleKeys(missing_keys, unexpected_keys)
      1674
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight: copying a param with
shape torch.Size([8, 4096]) from checkpoint, the shape in current model is torch.Size([8, 5120]).
size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight: copying a param with
shape torch.Size([4096, 8]) from checkpoint, the shape in current model is torch.Size([5120, 8]).
size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_A.weight: copying a param with
shape torch.Size([8, 4096]) from checkpoint, the shape in current model is torch.Size([8, 5120]).
size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_B.weight: copying a param with
shape torch.Size([4096, 8]) from checkpoint, the shape in current model is torch.Size([5120, 8]).
size mismatch for base_model.model.model.layers.1.self_attn.q_proj.lora_A.weight: copying a param with
shape torch.Size([8, 4096]) from checkpoint, the shape in current model is torch.Size([8, 5120]).
size mismatch for base_model.model.model.layers.1.self_attn.q_proj.lora_B.weight: copying a param with
shape torch.Size([4096, 8]) from checkpoint, the shape in current model is torch.Size([5120, 8]).
size mismatch for base_model.model.model.layers.1.self_attn.v_proj.lora_A.weight: copying a param with
shape torch.Size([8, 4096]) from checkpoint, the shape in current model is torch.Size([8, 5120]).
size mismatch for base_model.model.model.layers.1.self_attn.v_proj.lora_B.weight: copying a param with
shape torch.Size([4096, 8]) from checkpoint, the shape in current model is torch.Size([5120, 8]).
(identical size-mismatch errors repeat for q_proj/v_proj lora_A and lora_B in every layer through layers.31)
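The repeated 4096-vs-5120 mismatch is the hidden size of LLaMA-7B (4096) versus LLaMA-13B (5120): the adapter checkpoint being loaded has 7B-shaped LoRA matrices, while the instantiated base model is 13B. A cheap way to catch this before PEFT raises deep inside `load_state_dict` is a pre-flight shape check; the sketch below uses a hypothetical helper and plain `(rows, cols)` tuples (with torch you would pass `tensor.shape`):

```python
# Hypothetical pre-flight check: verify that every LoRA A-matrix in an
# adapter state dict was trained against the expected hidden size.
# LoRA A-matrices have shape (rank, hidden_size).
HIDDEN_SIZE = {"7b": 4096, "13b": 5120, "30b": 6656, "65b": 8192}

def check_adapter_hidden_size(shapes, expected):
    """shapes maps parameter name -> shape tuple; raises on a mismatch."""
    for name, shape in shapes.items():
        if name.endswith("lora_A.weight") and shape[1] != expected:
            raise ValueError(
                f"{name} has hidden size {shape[1]}, expected {expected}: "
                "the adapter was trained against a different base model size"
            )
```

If this check fails even though the 13B adapter repo was requested, a stale cached download of the 7B adapter is a likely culprit.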
Multi-GPU training for the 65B model
Could you provide a tutorial on how to train a 65B lora-alpaca model using multiple GPUs?
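The repository does not document a multi-GPU recipe; one common approach (a sketch under that assumption, not a maintainer-endorsed procedure) is to shard the 8-bit base model across all visible GPUs with `device_map="auto"` and cap per-GPU usage via `max_memory`. Only the config-building helper below is concrete; the model-loading call is illustrative:

```python
# Hypothetical sketch: build a max_memory map that spreads a large model
# across all visible GPUs, leaving headroom for activations, plus CPU offload.
def build_max_memory(n_gpus, per_gpu_gib=38, cpu_gib=120):
    memory = {i: f"{per_gpu_gib}GiB" for i in range(n_gpus)}
    memory["cpu"] = f"{cpu_gib}GiB"
    return memory

# Illustrative usage with transformers/peft (values are assumptions):
# model = LlamaForCausalLM.from_pretrained(
#     "decapoda-research/llama-65b-hf",
#     load_in_8bit=True,
#     device_map="auto",
#     max_memory=build_max_memory(torch.cuda.device_count()),
# )
```

Note that `device_map="auto"` gives naive layer-wise pipeline parallelism (one GPU active at a time), which is enough for LoRA fine-tuning but is not data-parallel training.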