FreedomIntelligence / HuatuoGPT
HuatuoGPT, Towards Taming Language Models To Be a Doctor. (An Open Medical GPT)
License: Apache License 2.0
File "D:\code\HuatuoGPT\huatuo_cli_demo_stream.py", line 32, in load_model
model.cuda()
File "C:\ProgramData\anaconda3\envs\pytorch310\lib\site-packages\transformers\modeling_utils.py", line 2047, in cuda
return super().cuda(*args, **kwargs)
File "C:\ProgramData\anaconda3\envs\pytorch310\lib\site-packages\torch\nn\modules\module.py", line 905, in cuda
return self._apply(lambda t: t.cuda(device))
File "C:\ProgramData\anaconda3\envs\pytorch310\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "C:\ProgramData\anaconda3\envs\pytorch310\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "C:\ProgramData\anaconda3\envs\pytorch310\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
[Previous line repeated 2 more times]
File "C:\ProgramData\anaconda3\envs\pytorch310\lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
param_applied = fn(param)
File "C:\ProgramData\anaconda3\envs\pytorch310\lib\site-packages\torch\nn\modules\module.py", line 905, in <lambda>
return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 268.00 MiB (GPU 0; 23.78 GiB total capacity; 23.30 GiB already allocated; 256.50 MiB free; 23.30 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
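A rough way to read this error: the weights alone need about (number of parameters) x (bytes per parameter). The sketch below assumes a nominal 7 billion parameters, which is an assumption rather than a figure from the log, and it ignores activations, the KV cache, and CUDA context overhead, so it gives a lower bound only.

```python
# Rough lower bound on the GPU memory needed just to hold model weights.
# Assumption: a nominal parameter count (7e9); the real checkpoint may differ.

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Return the size of the weights in GiB for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in ("float32", "float16", "int8"):
        print(f"{dtype:>8}: {weight_memory_gib(7e9, dtype):6.2f} GiB")
```

Under these assumptions, float32 weights would exceed the 23.78 GiB capacity reported above, while float16 weights would fit. Whether the demo script actually loads the model in float32 cannot be seen from the traceback.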
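After full-parameter fine-tuning of the base model with the LLaMA-Factory framework, the model completely lost its general capabilities.
With the HuatuoGPT-7B model, after adding trust_remote_code=True as suggested, the error in the title occurs. How can this be resolved?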
Will the relevant dataset be open?
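Dear HuatuoGPT developers, I am Jianmi (尖米), a developer and volunteer in the InternLM community. Your open-source work has inspired me greatly. I would like to discuss the feasibility of, and an implementation path for, building HuatuoGPT on InternLM. My WeChat ID is mzm312; I hope we can get in touch for a deeper exchange.
The error log is as follows: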
(textgen) D:\AI_project\HuatuoGPT> python -m huatuo_cli_demo_stream --model-name D:\AI_project\text-generation-webui\models\HuatuoGPT-7B
Traceback (most recent call last):
File "C:\Users\qtnic\anaconda3\envs\textgen\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\qtnic\anaconda3\envs\textgen\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "D:\AI_project\HuatuoGPT\huatuo_cli_demo_stream.py", line 160, in <module>
main(args)
File "D:\AI_project\HuatuoGPT\huatuo_cli_demo_stream.py", line 117, in main
model, tokenizer = load_model(args.model_name, args.device, args.num_gpus)
File "D:\AI_project\HuatuoGPT\huatuo_cli_demo_stream.py", line 27, in load_model
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="right", use_fast=True, trust_remote_code=True)
File "C:\Users\qtnic\anaconda3\envs\textgen\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 764, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.HuatuoGPT-7B.configuration_baichuan.BaiChuanConfig'> to build an AutoTokenizer.
Model type should be one of AlbertConfig, AlignConfig, BarkConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BloomConfig, BridgeTowerConfig, CamembertConfig, CanineConfig, ChineseCLIPConfig, ClapConfig, CLIPConfig, CLIPSegConfig, LlamaConfig, CodeGenConfig, ConvBertConfig, CpmAntConfig, CTRLConfig, Data2VecAudioConfig, Data2VecTextConfig, DebertaConfig, DebertaV2Config, DistilBertConfig, DPRConfig, ElectraConfig, ErnieConfig, ErnieMConfig, EsmConfig, FlaubertConfig, FNetConfig, FSMTConfig, FunnelConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GPTSanJapaneseConfig, GroupViTConfig, HubertConfig, IBertConfig, IdeficsConfig, InstructBlipConfig, JukeboxConfig, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LlamaConfig, LongformerConfig, LongT5Config, LukeConfig, LxmertConfig, M2M100Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MgpstrConfig, MobileBertConfig, MPNetConfig, MptConfig, MraConfig, MT5Config, MusicgenConfig, MvpConfig, NezhaConfig, NllbMoeConfig, NystromformerConfig, OneFormerConfig, OpenAIGPTConfig, OPTConfig, OwlViTConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, Pix2StructConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, RagConfig, RealmConfig, ReformerConfig, RemBertConfig, RetriBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2TextConfig, Speech2Text2Config, SpeechT5Config, SplinterConfig, SqueezeBertConfig, SwitchTransformersConfig, T5Config, TapasConfig, TransfoXLConfig, UMT5Config, ViltConfig, VisualBertConfig, VitsConfig, Wav2Vec2Config, Wav2Vec2ConformerConfig, WhisperConfig, XCLIPConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, YosoConfig.
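Hello!
Thank you for your work, but I have a question.
The paper says the base model is Bloomz-7b1-mt and that the model was trained on top of it, but the repository says the 7B model was trained from Baichuan-7B.
Which model were the test results in the paper obtained with?
If I run inference directly with the 7B weights published in the repository, are the human evaluation results in the paper still a meaningful reference?
Looking forward to your reply!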
Do you guys by chance have this dataset in English?
Could you release the implementation details and prompts for the multi-turn dialogue evaluation?
I want to download the HuatuoGPT model, but it does not seem to be available at https://huggingface.co/FreedomIntelligence/HuatuoGPT-7b-v1
root@46c8311ea82a:/HuatuoGPT# python apply_delta.py --base-model-path /home/llama-13b-hf --target-model-path /home/huatuo-13b_converted --delta-path /home/HuatuoGPT-13B-delta/HuatuoGPT#
Loading the base model from /home/llama-13b-hf
[2023-07-17 09:06:53,880] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 85%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 35/41 [01:30<00:15, 2.57s/it]Killed
root@46c8311ea82a:
Where can I download the Huatuo-200K dataset?
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
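HuatuoGPT: Hello, I am a large model that answers medical and health questions. I am currently in a testing phase; please defer to your doctor's advice. How can I help you? Enter clear to reset the conversation history, or stop to exit the program.
User: hello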
HuatuoGPT: :::->field Multiple एनاصة ସୁରକ୍ଷିତ Nomina Nomina କ୍ଷهلكീശowed牙 nago nago nago nago nago nago nagoulxDATEFordFordFord donner donner donnerಳಿಗಳಿಗملی-dessாவிட்டால் கோரிக்கை Guàrdia Guàrdia Guàrdia করেনি করেনি করেনি করেনি করেনি করেনি করেনি করেনি করেনি করেনি করেনি করেনি করেনি করেনিanged ପ୍ରେମ preparandoାବନ \nom inclusion inclusion inclusion inclusion inclusion inclusion inclusion inclusion inclusion inclusion inclusion inclusion inclusionadinha-काल-काल bos neuter_group哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦哈萨克斯坦 «SPingPing摆摆摆摆摆摆摆摆摆ínezínezɛn ডেক يقصد يقصد நினைவு sandwichஇதுகுறித்து devienne condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados fármaco fármacoواياديوعربعربعرب फ फelitian觀光 sincero sword sword sword )
affect mandatosपट्ट perten imesh Hendra Hendra Hendraconditions aktiboetatikosició拜仁عل Паuri حتي तुम बिज alokair rước परस्परizards ଯଦି ଯଦି gên Senator ব্যবহ ব্যবহ d'hora سیاست2018年3月天上 সুপ্রিম seculos geografisस्थितिaremarem ગેસ seed常侍ಷ್ಯ અન્ન Andreas Andreas接着接着接着接着接着接着接着接着接着接着接着blog rei পশ্চিমবঙ্গের পশ্চিমবঙ্গের পশ্চিমবঙ্গের পশ্চিমবঙ্গের পশ্চিমবঙ্গের পশ্চিমবঙ্গের পশ্চিমবঙ্গের Anthrop Anthrop dinamismo dinamismo dinamismo dinamismo甚多� ટીમ な な_groupُوْنَُوْنَ দৈ licenses licenses६५esoାଢ଼ ਤਾ কাছ কাছ কাছlimitedlimitedlimitedlimitedlimitedlimitedlimited وَأديندين Hampir Hampir Hampir insistióಷ್ಯಷ್ಯಷ್ಯಷ್ಯಷ್ಯ شرکت شرکت شرکت شرکت شرکت شرکت شرکت شرکت شرکت <TextView atrop"input992乙醇 mù ଜାପ哈萨克斯坦示例 Julien利益有关ಿಕ್ Tá Táത്സunderb sẵn শুভেচ্ছಷ್ಯಷ್ಯಷ್ಯಷ್ಯಷ್ಯಷ್ಯಷ್ಯಷ್ಯavirus vý vý vý vý vý vý vý Simpl Bascuence variant是国家 ବିଣ ବିଣ警卫 condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados memahamiượtượtīn أكدت কৈছেెష蔔-ten自幼 parental_numeric袭 Ols Ols都不 الند Santiago Santiago સ્વભાવ الأعرೇರಳਸ਼ਾਨਸ਼ਾਨਸ਼ਾਨGuestExtendedExtended不久anter AKPirwa decretó inlet_record_record常侍槟آنdue ಅತ್ಯುತ್ತಮ特蘭 مُج动手 Quýyelंत செலுத்த自幼自幼 القيسistribistribapplyapplyిండ"Kami جاء جاء:n العباد العباد বিপরಅಪ Redmi Conferencia Conferencia silla silla silla silla silla sillangiler jumla jumla jumla jumla jumla ଯଥେଷ୍ଟ–20–20आस kebangsaan kebangsaan kebangsaan condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados condenados 
condenados condenados condenados condenados condenados condenados condenados condenados condenados reportagem reportagem reportagemacas continuadoobernadorobernadorobernadorobernadorobernadorobernador beton beton beton beton beton beton beton beton beton beton beton beton beton beton beton
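User:
There is a paper called "Direct Preference Optimization" which says a large model can be trained directly on preferences without training a reward model. I would like to apply it to HuatuoGPT.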
When I run the following command to train:
"
accelerate launch \
--config_file scripts/sft.yaml \
--num_processes 8 \
--num_machines 1 \
--machine_rank 0 \
--deepspeed_multinode_launcher standard scripts/finetune.py \
--experiment_name HuatuoGPT \
--model_path /path/to/your/model \
--gradient_accumulation_steps 8 \
--max_ckpts 3 \
--max_seq_len 2048 \
--data_dir /path/to/your/data \
--output_dir ./ckpts \
--log_dir ./train_logs \
--n_epochs 3 \
--train_bsz_per_gpu 2 \
--eval_bsz_per_gpu 2 \
--learning_rate 5e-5 \
--eval_step -1 \
--save_step -1 \
--gradient_checkpointing
"
an error occurred: "Accelerate CLI tool: error: unrecognized arguments: --deepspeed_multinode_launcher".
What should I do?
root@instance:/home/wy/HuatuoGPT-main# python apply_delta.py --base-model-path /data/nvme0/model/llama-13b-hf --target-model-path /data/nvme0/wy/huatuo-13b_converted --delta-path /data/nvme0/wy/HuatuoGPT-13B
Loading the base model from /data/nvme0/model/llama-13b-hf
Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████| 3/3 [00:12<00:00, 4.15s/it]
Loading the delta from /data/nvme0/wy/HuatuoGPT-13B
Loading checkpoint shards: 0%| | 0/7 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/transformers/modeling_utils.py", line 463, in load_state_dict
return torch.load(checkpoint_file, map_location="cpu")
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/torch/serialization.py", line 797, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/torch/serialization.py", line 283, in __init__
super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/transformers/modeling_utils.py", line 467, in load_state_dict
if f.read(7) == "version":
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 128: invalid start byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "apply_delta.py", line 40, in <module>
apply_delta(args.base_model_path, args.target_model_path, args.delta_path)
File "apply_delta.py", line 18, in apply_delta
delta = AutoModelForCausalLM.from_pretrained(delta_path, torch_dtype=torch.float16, low_cpu_mem_usage=True)
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 484, in from_pretrained
return model_class.from_pretrained(
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2881, in from_pretrained
) = cls._load_pretrained_model(
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3214, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "/data/nvme0/lg/anaconda3/envs/belle/lib/python3.8/site-packages/transformers/modeling_utils.py", line 479, in load_state_dict
raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for '/data/nvme0/wy/HuatuoGPT-13B/pytorch_model-00001-of-00007.bin' at '/data/nvme0/wy/HuatuoGPT-13B/pytorch_model-00001-of-00007.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
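The message "failed finding central directory" means torch could not read the shard as a zip archive, and the `f.read(7) == "version"` line in the traceback is the library checking whether the file is a Git LFS pointer instead of real weights. A common cause is an interrupted or incomplete download, though that is an assumption here. The sketch below classifies a shard using only the standard library; the function name and return labels are hypothetical.

```python
import zipfile
from pathlib import Path

# Git LFS pointer files are small text files that begin with this prefix.
LFS_PREFIX = b"version https://git-lfs"


def diagnose_checkpoint(path: str) -> str:
    """Classify a checkpoint shard as 'lfs_pointer', 'zip_ok', or 'not_zip'."""
    p = Path(path)
    with p.open("rb") as f:
        head = f.read(len(LFS_PREFIX))
    if head == LFS_PREFIX:
        return "lfs_pointer"  # placeholder file; the real weights were not fetched
    if zipfile.is_zipfile(str(p)):
        return "zip_ok"  # the end-of-central-directory record was found
    return "not_zip"  # truncated, corrupted, or a legacy non-zip checkpoint


if __name__ == "__main__":
    import sys

    for shard in sys.argv[1:]:
        print(shard, diagnose_checkpoint(shard))
```

A result of `not_zip` for `pytorch_model-00001-of-00007.bin` would suggest downloading that shard again and comparing its size with the one listed on the model page.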
(venv) bw@bw-X570-GAMING-X:~/Python-3.8.9/huatuoGPT/HuatuoGPT$ python3.8 -m huatuo_cli_demo_stream --model-name models/
Traceback (most recent call last):
File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/bw/Python-3.8.9/huatuoGPT/HuatuoGPT/huatuo_cli_demo_stream.py", line 160, in <module>
main(args)
File "/home/bw/Python-3.8.9/huatuoGPT/HuatuoGPT/huatuo_cli_demo_stream.py", line 117, in main
model, tokenizer = load_model(args.model_name, args.device, args.num_gpus)
File "/home/bw/Python-3.8.9/huatuoGPT/HuatuoGPT/huatuo_cli_demo_stream.py", line 27, in load_model
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="right", use_fast=True)
File "/home/bw/Python-3.8.9/venv/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 724, in from_pretrained
raise ValueError(
ValueError: Tokenizer class BaiChuanTokenizer does not exist or is not currently imported.
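Hello, I tried running it and got this error. How can I fix it? Thanks.
We use 100 KUAKE-QIC questions (the same as those in the automatic evaluation) as the test set for single-turn question evaluation, and randomly sample 50 patient cases from the 100 test cases used in the automatic evaluation for the manual evaluation of multi-turn conversations.
When evaluating HuatuoGPT manually, we believe the following three aspects should be considered and used as the evaluation guidelines: diagnosis accuracy, treatment plan accuracy, and accuracy of medication and prescription knowledge.
However, the evaluation results do not appear to be broken down by these three aspects, nor do they mention the 100 questions representing 10 intents (condition diagnosis, cause analysis, treatment plan, medical advice, indicator interpretation, disease description, consequence description, precautions, efficacy, medical cost). Does this mean the doctors were only told to keep these three aspects in mind and otherwise rated according to their own subjective judgment? I would like to know the concrete protocol used for the manual quality evaluation. Thanks.
Does LLaMA-13B here refer to Ziya-LLaMA-13B-Pretrain-v1?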
“We open-source our training data, code, HuatuoGPT model and the reward model at https://github.com/FreedomIntelligence/HuatuoGPT.”
Thanks to the development team. May good people have a safe and peaceful life.
I tried HuatuoGPT online and found it amazing. Is there any plan to release the training source code?
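Hello, I am a beginner in NLP; thank you for your work. While reading the HuatuoGPT paper I ran into a question about the GLEU metric. As noted in nltk/nltk#1667, there are currently several different metrics named GLEU:
Section 4.2.1, Evaluation Metrics, states:
GLEU auto-evaluates sentence-level fluency.
This appears to refer to the metric proposed in the first paper, but the GLEU from the third paper, which is also the one implemented in NLTK, seems to be more widely used today. Could the authors share the code used for this evaluation?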
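To make the ambiguity concrete, here is a from-scratch sketch of the n-gram overlap variant: the minimum of precision and recall over 1- to 4-grams, which to my understanding is the definition `nltk.translate.gleu_score` follows. It handles a single reference only, the function names are hypothetical, and nothing here is confirmed to be the metric used in the paper.

```python
from collections import Counter
from typing import Sequence


def ngram_counts(tokens: Sequence[str], min_n: int = 1, max_n: int = 4) -> Counter:
    """Count all n-grams of order min_n..max_n in a token sequence."""
    counts: Counter = Counter()
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def gleu_single_reference(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Minimum of pooled n-gram precision and recall against one reference."""
    ref, hyp = ngram_counts(reference), ngram_counts(hypothesis)
    overlap = sum((ref & hyp).values())  # matching n-grams, clipped by count
    n_ref, n_hyp = sum(ref.values()), sum(hyp.values())
    if n_ref == 0 or n_hyp == 0:
        return 0.0
    return min(overlap / n_hyp, overlap / n_ref)


if __name__ == "__main__":
    print(gleu_single_reference("the cat sat".split(), "the cat".split()))  # 0.5
```

In the usage line, the hypothesis matches 3 of its own 3 n-grams (precision 1.0) but only 3 of the reference's 6 (recall 0.5), so the score is 0.5. A fluency-oriented GLEU designed for grammatical error correction would be computed differently, which is why the choice of variant matters when comparing numbers.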
Hello, how do you feed data to the model in batches so that it generates all the results in one pass?
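One common approach for decoder-only models is to group prompts into fixed-size batches and left-pad the token IDs, so that generated tokens continue directly from the end of each prompt. The sketch below shows only the batching and padding logic on plain lists; `chunks`, `left_pad`, and the pad ID are hypothetical, and this is not taken from the HuatuoGPT code.

```python
from typing import Iterator, List, Sequence, Tuple


def chunks(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def left_pad(batch: List[List[int]], pad_id: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Left-pad token ID lists to equal length and build the attention mask."""
    width = max(len(ids) for ids in batch)
    input_ids = [[pad_id] * (width - len(ids)) + ids for ids in batch]
    attention_mask = [[0] * (width - len(ids)) + [1] * len(ids) for ids in batch]
    return input_ids, attention_mask


if __name__ == "__main__":
    ids, mask = left_pad([[1, 2, 3], [4]], pad_id=0)
    print(ids)   # [[1, 2, 3], [0, 0, 4]]
    print(mask)  # [[1, 1, 1], [0, 0, 1]]
```

With Hugging Face tokenizers, setting `padding_side="left"` and calling the tokenizer with `padding=True` produces the same layout, and the resulting `input_ids` and `attention_mask` can then be passed to `generate` one batch at a time.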
When running inference on an RTX 4090 with 24 GB of GPU memory, the process is killed. With a single GPU, what is the minimum amount of GPU memory required?
When loading the 7B model, the error in the title is raised. The model files were downloaded from the provided FreedomIntelligence/HuatuoGPT-7B.
Thank you for your great work. How do I use your reward model?
accelerate launch --config_file ./scripts/sft.yaml --num_processes 8 --num_machines 1 --machine_rank 0 --deepspeed_multinode_launcher standard scripts/finetune.py --experiment_name HuatuoGPT --model_path /home/fei/MyProjects/HuatuoGPT-7B-New --gradient_accumulation_steps 8 --max_seq_len 2048 --data_dir /home/fei/MyProjects/datasets/New/dataset --output_dir ./ckpts --log_dir ./train_logs --n_epochs 3 --train_bsz_per_gpu 2 --eval_bsz_per_gpu 2 --learning_rate 5e-5 --eval_step -1 --save_step -1 --gradient_checkpointing
I built my own fine-tuning data, modeled on HuatuoGPT-sft-data-v1.
Training fails with the following error:
KeyError File "HuatuoGPT/scripts/finetune.py", line 70, in __getitem__
File ".virtualenvs/pytorch/lib/python3.10/site-packages/accelerate/data_loader.py", line 384, in __iter__
: 'json_dialogue'
return json.loads(self.data[index]['json_dialogue'])
KeyError: 'json_dialogue'
The sample data does not seem to contain a json_dialogue key. Is there a way to fix this?
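The traceback shows the loader calling `json.loads(self.data[index]['json_dialogue'])`, so each record apparently needs a `json_dialogue` field holding a JSON-encoded string. The sketch below finds records that lack the key and wraps a dialogue into that shape. Representing a dialogue as a list of utterance strings is an assumption, and both helper names are hypothetical.

```python
import json
from typing import Any, Dict, List


def missing_key_indices(records: List[Dict[str, Any]], key: str = "json_dialogue") -> List[int]:
    """Return the indices of records that do not contain `key`."""
    return [i for i, rec in enumerate(records) if key not in rec]


def wrap_dialogue(turns: List[str], key: str = "json_dialogue") -> Dict[str, str]:
    """Store a dialogue as a JSON string under `key`, as the loader expects."""
    # ensure_ascii=False keeps Chinese text readable in the output file.
    return {key: json.dumps(turns, ensure_ascii=False)}


if __name__ == "__main__":
    records = [{"json_dialogue": "[]"}, {"data": ["question", "answer"]}]
    print(missing_key_indices(records))  # [1]
    print(wrap_dialogue(["question", "answer"]))
```

Running `missing_key_indices` over a custom dataset before training shows which rows would trigger the `KeyError`.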
ProcessGroupNCCL is only supported with GPUs, no GPUs found!
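How much GPU memory is needed to load the 13B model? With 24 GB of GPU memory, the process was killed while AutoModelForCausalLM.from_pretrained was loading the model, yet the GPU memory was still unused at that point. What causes this? Thanks.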
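A process that is killed before any GPU memory is used may have run out of host RAM rather than GPU memory, since the checkpoint is read into CPU memory first. That is a guess and is not confirmed by the report. The sketch below compares the on-disk size of the weight shards with the machine's physical RAM; the helper names are hypothetical, and the RAM query relies on POSIX `os.sysconf` names that are not available on every platform.

```python
import os
from pathlib import Path
from typing import Optional

WEIGHT_SUFFIXES = (".bin", ".safetensors")


def checkpoint_bytes(model_dir: str) -> int:
    """Total size on disk of the weight shards directly inside model_dir."""
    return sum(
        p.stat().st_size
        for p in Path(model_dir).iterdir()
        if p.is_file() and p.suffix in WEIGHT_SUFFIXES
    )


def physical_ram_bytes() -> Optional[int]:
    """Total physical RAM where os.sysconf supports it, otherwise None."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    import sys

    size = checkpoint_bytes(sys.argv[1])
    ram = physical_ram_bytes()
    print(f"weights on disk: {size / 2**30:.1f} GiB")
    if ram is not None:
        print(f"physical RAM:    {ram / 2**30:.1f} GiB")
```

If the shards are close to or larger than the available RAM, loading is likely to fail on the host side regardless of how much GPU memory is free.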
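Will the RLAIF training details be open-sourced? Thanks.
Running python huatuo_cli_demo_stream.py --model-name $model_dir on a remote server fails with torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 270.00 MiB. GPU
The advice I found online is to reduce the batch size, but I could not find that setting in the Huatuo config file. What should I do? Changing max-new-tokens to 64 did not help either.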
≧ ﹏ ≦
request:https://www.huatuogpt.cn/api/chat/question
response:
<title>502 Bad Gateway</title>
if ind == 0:
    # add begin token to the first utterance
    encode = self.tokenizer(d, add_special_tokens=False)['input_ids']
else:
    encode = self.tokenizer(d, add_special_tokens=False)['input_ids']
encoded_data += encode
A line of code seems to be missing under the if.
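Both branches in the snippet are identical, so the comment's stated intent ("add begin token to the first utterance") is not actually carried out. One possible fix, assuming the intent is to prepend the tokenizer's BOS ID to the first utterance only, is sketched below. `encode_dialogue`, the `tokenize` callable, and `bos_id` are hypothetical stand-ins for `self.tokenizer` and its `bos_token_id`, and this is not confirmed to be the authors' intended code.

```python
from typing import Callable, List


def encode_dialogue(
    utterances: List[str],
    tokenize: Callable[[str], List[int]],
    bos_id: int,
) -> List[int]:
    """Concatenate token IDs for all utterances, with BOS before the first."""
    encoded: List[int] = []
    for ind, d in enumerate(utterances):
        ids = tokenize(d)
        if ind == 0:
            # add begin token to the first utterance
            ids = [bos_id] + ids
        encoded += ids
    return encoded


if __name__ == "__main__":
    # Toy tokenizer for illustration: one ID per character.
    toy = lambda s: [ord(c) for c in s]
    print(encode_dialogue(["ab", "c"], toy, bos_id=1))  # [1, 97, 98, 99]
```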
How can we tell apart the distilled data from the real-world data in the HuatuoGPT-sft-data-v1 dataset?
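If I contradict the diagnosis the model gives, for example by claiming that a drug is hair or a stone, the model replies that I am right and that the drug is indeed a stone.
The code already includes trust_remote_code=True: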
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="right", use_fast=True, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, low_cpu_mem_usage=True, **kwargs)
As in the title (RT).
How can HuatuoGPT-7B be run on multiple GPUs?