linksoul-ai / llasm

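The first open-source, commercially usable dialogue model to support bilingual Chinese-English speech-text multimodal conversation. Convenient speech input greatly improves the experience of using large models that otherwise take text as input, while avoiding the cumbersome pipeline of ASR-based solutions and the errors they may introduce.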

Home Page: https://github.com/LinkSoul-AI/LLaSM

License: Apache License 2.0

Python 100.00%

llasm's Introduction

LLaSM: Large Language and Speech Model

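LLaSM is an open-source, commercially usable Chinese-English bilingual speech-language assistant, released together with LLaSM-Audio-Instructions, a Chinese-English speech SFT dataset. It is the first open-source, commercially usable dialogue model to support Chinese-English speech-text multimodal conversation.

Model architecture

Basic demo

Try it online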

Talk is cheap, Show you the Demo.

Demo link / Hugging Face Spaces

Paper

arXiv link: https://arxiv.org/abs/2308.15930

Downloads

Hugging Face model downloads:
- LLaSM-Chinese-Llama-2-7B
- LLaSM-Baichuan-7B
Baidu Netdisk downloads:
- LLaSM-Chinese-Llama-2-7B
- LLaSM-Baichuan-7B
Language models:
- Chinese-Llama-2-7b
- Baichuan-7B
Dataset: LLaSM-Audio-Instructions

Environment setup

# clone the repository
git clone https://github.com/LinkSoul-AI/LLaSM
cd LLaSM

# install package
conda create -n llasm python=3.10 -y
conda activate llasm
pip install --upgrade pip
pip install -e .

Quick test

Download the Whisper large v2 model: https://huggingface.co/openai/whisper-large-v2

# select the device, then run inference on one audio file
export LLASM_DEVICE="cuda:0"
python infer.py \
    --input_audio_file PATH/TO/YOUR/AUDIO \
    --llasm_model PATH/TO/LLaSM/MODEL \
    --llasm_audio_tower PATH/TO/WHISPER/MODEL \
    --llm_type "Chinese_llama2"  # or "baichuan"
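
The same invocation can be scripted. The following is a minimal sketch, not part of the repository: the helper names `build_infer_command` and `run_infer` are hypothetical, and only the flags and the `LLASM_DEVICE` variable shown above are taken from the README. It validates `--llm_type` before launching, since that flag accepts one of two values.

```python
import os
import subprocess
import sys

# The two values the README lists for --llm_type.
VALID_LLM_TYPES = ("Chinese_llama2", "baichuan")


def build_infer_command(audio_file, llasm_model, audio_tower, llm_type):
    """Build the argument list for infer.py (hypothetical helper)."""
    if llm_type not in VALID_LLM_TYPES:
        raise ValueError(
            f"llm_type must be one of {VALID_LLM_TYPES}, got {llm_type!r}"
        )
    return [
        sys.executable, "infer.py",
        "--input_audio_file", str(audio_file),
        "--llasm_model", str(llasm_model),
        "--llasm_audio_tower", str(audio_tower),
        "--llm_type", llm_type,
    ]


def run_infer(audio_file, llasm_model, audio_tower, llm_type, device="cuda:0"):
    """Run infer.py from the repository root with LLASM_DEVICE set."""
    env = dict(os.environ, LLASM_DEVICE=device)
    command = build_infer_command(audio_file, llasm_model, audio_tower, llm_type)
    return subprocess.run(command, env=env, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.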

TODO

- How to train
- int4 quantization
- Docker deployment

License

Apache-2.0 license

Citation

If you find our work and this repository useful, please consider giving us a star ⭐ to encourage us 🍺:

@misc{shu2023llasm,
      title={LLaSM: Large Language and Speech Model}, 
      author={Yu Shu and Siwei Dong and Guangyao Chen and Wenhao Huang and Ruihua Zhang and Daochen Shi and Qiqi Xiang and Yemin Shi},
      year={2023},
      eprint={2308.15930},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

WeChat group

llasm's Issues

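Command: python infer.py --input_audio_file d:/tmp/output.wav --llasm_model F:/models/LLaSM-Cllama2 --llasm_audio_tower F:/models/whisper-medium --llm_type "Chinese_llama2"
Error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x1024 and 1280x4096)

What is causing this? Any help would be appreciated, thanks!

About the pretrained LLaMA model

Is the model behind the Baidu Netdisk download link the pretrained model produced by this paper's method, or the original open-source LLaMA? Thanks.

Whisper is already downloaded locally, but I can't find where to set its load path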
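
A possible explanation, judging only from the shapes in the error: the audio features are 1024-wide while the next layer expects 1280-wide input. whisper-medium has a hidden size (`d_model`) of 1024, whereas whisper-large-v2, the model the README asks for, has 1280. The sketch below checks a local Whisper directory before running inference. The function names are hypothetical, and the expected width of 1280 is an assumption read off the "1280x4096" operand rather than from the repository code.

```python
import json
from pathlib import Path

# Assumption: the checkpoint expects 1280-dim audio features, as the
# "1280x4096" operand in the error message suggests.
EXPECTED_AUDIO_DIM = 1280


def whisper_hidden_size(whisper_dir):
    """Read d_model from a local Hugging Face Whisper directory."""
    config_path = Path(whisper_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return config["d_model"]


def check_audio_tower(whisper_dir, expected=EXPECTED_AUDIO_DIM):
    """Return (ok, message) saying whether the encoder width matches."""
    actual = whisper_hidden_size(whisper_dir)
    if actual == expected:
        return True, f"OK: d_model={actual}"
    return False, f"Mismatch: d_model={actual}, but the checkpoint expects {expected}"
```

If the check reports a mismatch for F:/models/whisper-medium, pointing `--llasm_audio_tower` at a local copy of whisper-large-v2 is the first thing to try.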

'(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /openai/whisper-large-v2/resolve/main/config.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fd9399957b0>, 'Connection to huggingface.co timed out. (connect timeout=10)'))"), '(Request ID: 7b68228e-e6a6-4ecd-bc7e-3affc92e3ac3)')' thrown while requesting HEAD https://huggingface.co/openai/whisper-large-v2/resolve/main/config.json
Traceback (most recent call last):
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/transformers/utils/hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1291, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Connection error, and we cannot find the requested files in the disk cache. Please try again or make sure your Internet connection is on.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/work/liukai/LLaSM/infer.py", line 122, in <module>
    main(args)
  File "/work/liukai/LLaSM/infer.py", line 46, in main
    model = LlaaaLlamaForCausalLM.from_pretrained(
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2700, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/work/liukai/LLaSM/llasm.py", line 160, in __init__
    self.model = LlaaaLlamaModel(config)
  File "/work/liukai/LLaSM/llasm.py", line 41, in __init__
    self.audio_tower = [load_whisper(config.mm_audio_tower)]
  File "/work/liukai/LLaSM/llasm.py", line 28, in load_whisper
    model = WhisperModel.from_pretrained(audio_tower_name)
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2325, in from_pretrained
    config, model_kwargs = cls.config_class.from_pretrained(
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/transformers/configuration_utils.py", line 590, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/transformers/configuration_utils.py", line 617, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/transformers/configuration_utils.py", line 672, in _get_config_dict
    resolved_config_file = cached_file(
  File "/root/anaconda3/envs/llasm/lib/python3.10/site-packages/transformers/utils/hub.py", line 452, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like openai/whisper-large-v2 is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
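
The traceback suggests where the path comes from: `load_whisper(config.mm_audio_tower)` runs inside `from_pretrained`, and the value it receives is still the hub ID `openai/whisper-large-v2`, so transformers tries to reach huggingface.co. One possible workaround is to make that config field point at the local Whisper directory. The sketch below does this under two assumptions: that the field is stored under the same name, `mm_audio_tower`, in the LLaSM model's `config.json`, and that the helper name `point_audio_tower_to_local` is purely illustrative. Back up `config.json` before trying it.

```python
import json
from pathlib import Path


def point_audio_tower_to_local(llasm_model_dir, whisper_dir, key="mm_audio_tower"):
    """Rewrite the audio tower entry in the LLaSM config.json to a local path.

    Returns the previous value so it can be restored later.
    The key name is an assumption taken from the traceback.
    """
    whisper_dir = Path(whisper_dir)
    # from_pretrained needs a directory that contains config.json.
    if not (whisper_dir / "config.json").is_file():
        raise FileNotFoundError(f"no config.json found in {whisper_dir}")

    config_path = Path(llasm_model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    previous = config.get(key)
    config[key] = str(whisper_dir)
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return previous
```

The directory check mirrors the condition in the final `OSError`: transformers treats the string as a local path only if it is a directory containing `config.json`.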