
chinese-llama-2's Introduction

Hi there 👋

  • 🔭 I’m a senior researcher at Tencent AI Lab
  • 🌱 I got my Ph.D. degree in 2018
  • 👯 I have practiced in a broad range of NLP fields
  • 🤔 I’m interested in MT and DL
  • 😄 I have some internship positions available
  • ⚡ I like to participate in academic competitions
  • 💬 My homepage is http://longyuewang.com
  • 📫 Contact me at [email protected]


chinese-llama-2's People

Contributors

longyuewangdcu · minghao-wu · seeledu


chinese-llama-2's Issues

Mac

Hello,
Does it support Mac (M1/M2) or Linux?

Please advise.
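
Not from the maintainers, but for context: inference should in principle run anywhere PyTorch runs; a minimal device-selection sketch (assuming a recent PyTorch build with MPS support on Apple Silicon) looks like this:

```python
import torch

# Hypothetical device selection: Apple Silicon (M1/M2) exposes the "mps"
# backend in recent PyTorch builds; Linux boxes typically use CUDA.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(f"running on: {device}")
```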

The model downloaded from https://huggingface.co/seeledu/Chinese-Llama-2-7B cannot be used

Loading checkpoint shards: 0%| | 0/2 [01:33<?, ?it/s]
Traceback (most recent call last):
  File "/home/chenjunhao/chinese-llama-2/test/inference.py", line 137, in
    model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.bfloat16, device_map="auto")
  File "/home/chenjunhao/chinese-llama-2/transformers/src/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
    return model_class.from_pretrained(
  File "/home/chenjunhao/chinese-llama-2/transformers/src/transformers/modeling_utils.py", line 2643, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/chenjunhao/chinese-llama-2/transformers/src/transformers/modeling_utils.py", line 2966, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/chenjunhao/chinese-llama-2/transformers/src/transformers/modeling_utils.py", line 671, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/chenjunhao/.local/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 149, in set_module_tensor_to_device
    new_value = value.to(device)
NotImplementedError: Cannot copy out of meta tensor; no data!
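
For reference, one common way to work around this error is to skip the accelerate meta-tensor dispatch entirely; a hedged sketch (assuming the machine has enough memory to hold the full model, and that seeledu/Chinese-Llama-2-7B is the intended checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "seeledu/Chinese-Llama-2-7B"  # or a local path

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
# low_cpu_mem_usage=False and no device_map loads real tensors directly,
# avoiding the meta-device path that raised "Cannot copy out of meta tensor".
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=False,
)
model = model.to("cuda")
```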

Re-downloaded model files, still got the "no data" error

          > OK, if you have any other questions, you can open another issue to discuss.

Sorry, but I re-downloaded the model files and still got the same error:

[2023-07-24 06:38:46,649] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]python-BaseException
Loading checkpoint shards:   0%|          | 0/2 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "/data/kexin/anaconda3/envs/cllama2/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 149, in set_module_tensor_to_device
    new_value = value.to(device)
NotImplementedError: Cannot copy out of meta tensor; no data!

Process finished with exit code 143

Originally posted by @XiongKexin in #2 (comment)

Error in llama.cpp convert.py

Traceback (most recent call last):
  File "convert.py", line 1264, in
    main()
  File "convert.py", line 1244, in main
    model_plus = load_some_model(args.model)
  File "convert.py", line 1165, in load_some_model
    models_plus.append(lazy_load_file(path))
  File "convert.py", line 955, in lazy_load_file
    return lazy_load_torch_file(fp, path)
  File "convert.py", line 826, in lazy_load_torch_file
    model = unpickler.load()
  File "convert.py", line 815, in find_class
    return self.CLASSES[(module, name)]
KeyError: ('torch._utils', '_rebuild_meta_tensor_no_storage')
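
The KeyError suggests the shards were pickled with meta tensors that llama.cpp's convert.py cannot rebuild. A hedged workaround sketch (hypothetical paths; untested against this exact checkpoint) is to load the model in transformers and re-serialize it with plain tensors before converting:

```python
import torch
from transformers import AutoModelForCausalLM

# Load with materialized (non-meta) tensors...
model = AutoModelForCausalLM.from_pretrained(
    "path/to/Chinese-Llama-2-7B",   # hypothetical local path
    torch_dtype=torch.float16,
    low_cpu_mem_usage=False,
)
# ...and write shards that contain real tensor storages, which the
# unpickler in convert.py knows how to load.
model.save_pretrained("path/to/resaved", safe_serialization=False)
```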

Does fine-tuning support the 13B and 70B models?

Hello, I'm very happy about and grateful for your Chinese support for Llama 2. I see that you support the 7B model; can the same code be used to fine-tune the 13B and 70B models?

How to make its answers richer

Compared with the English Llama 2's replies, the Chinese replies are relatively short and not as rich. How can I make its answers richer? Thanks.
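
Not an official answer, but reply length is often bounded by the generation settings rather than the model itself; a hedged sketch with looser sampling parameters (hypothetical values, plugged into the generate call used in the repo's inference script):

```python
from transformers import GenerationConfig

# Hypothetical settings that tend to yield longer, more detailed answers;
# tune them against your own prompts.
gen_config = GenerationConfig(
    max_new_tokens=512,     # give the model room for a full answer
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
    repetition_penalty=1.1,
)

# `model` and `input_ids` as prepared in test/inference.py.
generated_ids = model.generate(inputs=input_ids, generation_config=gen_config)
```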

can't run llama-2-7b-hf

Hi there. I'm running the fine-tuning code and get the error message below.

Traceback (most recent call last):
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/transformers/configuration_utils.py", line 672, in _get_config_dict
    resolved_config_file = cached_file(
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/transformers/utils/hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/root/autodl-tmp/Chinese-Llama-2/model/llama2-7B-HF'. Use repo_type argument if needed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/autodl-tmp/Chinese-Llama-2/train/run_clm_lora.py", line 786, in
    main()
  File "/root/autodl-tmp/Chinese-Llama-2/train/run_clm_lora.py", line 454, in main
    config = AutoConfig.from_pretrained(model_args.model_name_or_path, **config_kwargs)
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 983, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/transformers/configuration_utils.py", line 617, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/transformers/configuration_utils.py", line 693, in _get_config_dict
    raise EnvironmentError(
OSError: Can't load the configuration of '/root/autodl-tmp/Chinese-Llama-2/model/llama2-7B-HF'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure '/root/autodl-tmp/Chinese-Llama-2/model/llama2-7B-HF' is the correct path to a directory containing a config.json file
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 11867) of binary: /root/miniconda3/envs/llama-v2/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/envs/llama-v2/bin/torchrun", line 8, in
    sys.exit(main())
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/torch/distributed/run.py", line 762, in main
    run(args)
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/torch/distributed/run.py", line 753, in run
    elastic_launch(
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/envs/llama-v2/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

/root/autodl-tmp/Chinese-Llama-2/train/run_clm_lora.py FAILED

Failures:
[1]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 11868)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 11869)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 11870)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[4]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 4 (local_rank: 4)
exitcode : 1 (pid: 11871)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[5]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 5 (local_rank: 5)
exitcode : 1 (pid: 11872)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[6]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 6 (local_rank: 6)
exitcode : 1 (pid: 11873)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[7]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 7 (local_rank: 7)
exitcode : 1 (pid: 11874)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2023-07-31_09:50:12
host : autodl-container-95b911bb00-66f99c55
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 11867)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
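
The root-cause OSError above is a path problem, not a training bug: transformers falls back to treating the string as a Hub repo id when the local directory is missing. A quick hypothetical sanity check before launching torchrun:

```python
import os

model_dir = "/root/autodl-tmp/Chinese-Llama-2/model/llama2-7B-HF"  # path from the log

# Both checks must print True; otherwise AutoConfig.from_pretrained will
# try (and fail) to interpret the path as a Hugging Face repo id.
print(os.path.isdir(model_dir))
print(os.path.isfile(os.path.join(model_dir, "config.json")))
```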

Fine tuning

I'm running the bash script to fine-tune the model and get the following error message:

[W socket.cpp:601] [c10d] The client socket has failed to connect to [localhost]:29500 (errno: 99 - Cannot assign requested address).

Could you please check that?
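
One hypothetical diagnostic (not from the repo): c10d retries the rendezvous, so this warning is often harmless, but if training actually hangs it's worth checking whether the default master port is usable:

```python
import socket

# Try to bind the default torchrun rendezvous endpoint (localhost:29500).
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    try:
        s.bind(("127.0.0.1", 29500))
        print("port 29500 is free")
    except OSError as exc:
        print(f"port 29500 unavailable: {exc}")
```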

model inference error

{'': 0}
Using pad_token, but it is not set yet.
Setting pad_token_id to eos_token_id:2 for open-end generation.
Traceback (most recent call last):
  File "/home/chenjunhao/chinese-llama-2/test/inference_lora.py", line 160, in
    generated_ids = model.generate(inputs=input_ids, attention_mask=attn_mask, generation_config=gen_config)
  File "/home/chenjunhao/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/chenjunhao/chinese-llama-2/transformers/src/transformers/generation/utils.py", line 1462, in generate
    return self.sample(
  File "/home/chenjunhao/chinese-llama-2/transformers/src/transformers/generation/utils.py", line 2514, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0

Every element in probs is 0.
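
A hypothetical debugging step (using the same `model` and `input_ids` as inference_lora.py): inspect the raw logits before sampling; if they are already NaN/inf, the problem is in the weights or the dtype rather than in generate:

```python
import torch

with torch.no_grad():
    # Logits of the last position, i.e. what sampling sees first.
    logits = model(input_ids).logits[:, -1, :]

print("any nan:", torch.isnan(logits).any().item())
print("any inf:", torch.isinf(logits).any().item())
# If bfloat16 produces NaNs on your GPU, reloading the model in float32
# is a common (unverified) workaround.
```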

Question about the tokenizer

Hello! It is very nice that you adapted Llama 2 for the Chinese language and got great results.
I am new to LLMs, and I wonder how you got the tokenizer for Llama 2? If I remember correctly, Llama 2 does not officially support Chinese, and the official model only has a couple hundred Chinese characters in its tokenizer.
Any explanation will be greatly appreciated, thanks!
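
Not an authoritative answer, but you can inspect a tokenizer's Chinese coverage directly; a small sketch (assuming the seeledu/Chinese-Llama-2-7B repo id) showing how it segments Chinese text:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("seeledu/Chinese-Llama-2-7B")

print(len(tok))                   # vocabulary size
print(tok.tokenize("你好，世界"))   # how Chinese text gets segmented
```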

NotImplementedError: Cannot copy out of meta tensor; no data!

When I run test/inference.py, I get the error "NotImplementedError: Cannot copy out of meta tensor; no data!". I don't know how to fix it. Is this due to a wrong transformers version (4.29.0)?

[2023-07-24 05:20:38,101] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]python-BaseException
Loading checkpoint shards:   0%|          | 0/2 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "/data/kexin/anaconda3/envs/cllama2/lib/python3.8/site-packages/accelerate/utils/modeling.py" line 149, in set_module_tensor_to_device
    new_value = value.to(device)
NotImplementedError: Cannot copy out of meta tensor; no data!

Process finished with exit code 143
