liucongg / chatgptbook

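《ChatGPT原理与实战：大型语言模型的算法、技术和私有化》 (ChatGPT Principles and Practice: Algorithms, Techniques, and Private Deployment of Large Language Models)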

License: Apache License 2.0

Python 99.85% Shell 0.15%

chatgptbook's Introduction

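I am 刘聪NLP (Liu Cong NLP)

🔭 Chief algorithm architect at a startup in Nanjing
🌱 Research interests: QA systems (vector retrieval, reading comprehension), text generation (question generation, dialogue generation, summarization), pretrained language models, and related topics
💬 WeChat: logCong
📫 Questions: Zhihu @刘聪NLP
😄 WeChat official account: 『NLP工作站』 (NLP Workstation)
👯 I would love to exchange ideas with everyone. Feel free to follow me on Zhihu or add me on WeChat!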

chatgptbook's Issues

Question about how RewardModel computes the difference between two responses

When RewardModel computes the difference between two responses, end_ind is calculated as end_ind = max(one_ind, two_ind). Why not use the last position where one_input_ids and two_input_ids differ, that is, check_divergence[-1]?
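
To make the question concrete, here is a minimal sketch in plain Python lists rather than tensors. It assumes that padding uses token id 0 and that one_ind and two_ind are the positions of the first padding token in each response; neither assumption is confirmed by the issue, and the helper names first_pad_index and divergence_positions are hypothetical. It only shows that the two candidate end positions can differ when both responses end with the same token.

```python
PAD_ID = 0  # assumed padding token id (not stated in the issue)


def first_pad_index(ids, pad_id=PAD_ID):
    """Return the index of the first padding token, or len(ids) if none."""
    for i, tok in enumerate(ids):
        if tok == pad_id:
            return i
    return len(ids)


def divergence_positions(one_ids, two_ids):
    """Return every position at which the two id sequences differ."""
    return [i for i, (a, b) in enumerate(zip(one_ids, two_ids)) if a != b]


# Same prompt [5, 6], different answers, identical final token 9.
one_input_ids = [5, 6, 11, 12, 9, 0, 0]
two_input_ids = [5, 6, 21, 22, 9, 0, 0]

one_ind = first_pad_index(one_input_ids)
two_ind = first_pad_index(two_input_ids)
check_divergence = divergence_positions(one_input_ids, two_input_ids)

end_ind = max(one_ind, two_ind)        # end of the longer response
last_diff = check_divergence[-1]       # last position where the ids differ

# Span from the first differing token up to end_ind (exclusive).
span_with_max = list(range(check_divergence[0], end_ind))
# Span from the first to the last differing token (inclusive).
span_with_last_diff = list(range(check_divergence[0], last_diff + 1))

print(span_with_max, span_with_last_diff)
```

In this example the span ending at max(one_ind, two_ind) also covers the shared final token, while the span ending at check_divergence[-1] stops before it. Whether that is the author's reason for the choice is not stated in the thread.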

Page 18, line 7 of the book: should read "decoder-style models" (解码类模型)

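Original text: "Now that ChatGPT performs so well across tasks, encoder-style models (编码类模型) have become the hottest models, and more practitioners will devote themselves to designing and optimizing them."
In my view, this passage is saying that decoder-style models such as GPT can also handle semantic analysis tasks, which attracts more practitioners to study them. Changing the phrase to "decoder-style models have become the hottest models" would therefore be more appropriate.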

Running the LLMPreProj project with the Docker image from the README produces the error below. Could you help me figure out why?

[2023-10-08 14:58:00,215] [INFO] [launch.py:162:main] dist_world_size=1
[2023-10-08 14:58:00,215] [INFO] [launch.py:164:main] Setting CUDA_VISIBLE_DEVICES=0
[2023-10-08 14:58:01,975] [INFO] [comm.py:652:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl

initializing model parallel with size 1
Traceback (most recent call last):
  File "pretrain_model.py", line 694, in <module>
    main()
  File "pretrain_model.py", line 644, in main
    initialize_distributed(args)
  File "pretrain_model.py", line 113, in initialize_distributed
    set_deepspeed_activation_checkpointing(args)
  File "pretrain_model.py", line 73, in set_deepspeed_activation_checkpointing
    deepspeed.checkpointing.configure(mpu, deepspeed_config=args.deepspeed_config, num_checkpoints=args.num_layers)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 887, in configure
    _configure_using_config_file(deepspeed_config, mpu=mpu)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/activation_checkpointing/checkpointing.py", line 804, in _configure_using_config_file
    config = DeepSpeedConfig(config, mpu=mpu).activation_checkpointing_config
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/config.py", line 728, in __init__
    config_decoded = base64.urlsafe_b64decode(config.strip()).decode('utf-8')
  File "/opt/conda/lib/python3.8/base64.py", line 133, in urlsafe_b64decode
    return b64decode(s)
  File "/opt/conda/lib/python3.8/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
[2023-10-08 14:58:03,235] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 1179
[2023-10-08 14:58:03,236] [ERROR] [launch.py:324:sigkill_handler] ['/opt/conda/bin/python', '-u', 'pretrain_model.py', '--local_rank=0'] exits with return code = 1
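
The traceback above shows the DeepSpeed config argument being passed to base64.urlsafe_b64decode. One possible reading, which the thread does not confirm, is that the value of args.deepspeed_config is a path that does not exist inside the container, so it ends up being treated as a base64 string. The sketch below uses only the standard library to show how a non-existent path produces a binascii.Error when decoded this way; the helper describe_config_arg and the path no_such_conf/ds_config.json are hypothetical.

```python
import base64
import binascii
import os


def describe_config_arg(config: str) -> str:
    """Classify a config argument the way the traceback suggests it is handled.

    This is an illustration, not DeepSpeed's actual code.
    """
    if os.path.exists(config):
        return "existing file"
    try:
        # Same call that appears in the traceback.
        base64.urlsafe_b64decode(config.strip())
    except binascii.Error as exc:
        return f"not a file, and not valid base64: {exc}"
    return "not a file, but decodes as base64"


# Hypothetical path that is assumed not to exist on this machine.
result = describe_config_arg("no_such_conf/ds_config.json")
print(result)
```

If this reading applies, checking that the config path passed to the script exists from the working directory used inside the Docker image would be a reasonable first step.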

The Baidu Netdisk files are too large. Can they be downloaded directly with git clone?

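Downloading the files to my local machine and then uploading them to the 九天毕晟 (Jiutian Bisheng) cloud server is too slow. Can I download them directly with git clone inside the 九天毕晟 cloud server environment?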

Could you also provide a Google Drive link for large files such as the pretrained weights?

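Hello,
As the title says, my company's intranet blocks Baidu Netdisk, so large files such as the pretrained weights cannot be downloaded directly from the browser. Could you provide a link, such as Google Drive, that allows direct download from a browser? Thank you.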

The amazon_reviews_multi dataset for Chapter 8 cannot be downloaded

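Hello, the code cannot download the dataset through the script. Could you provide the already downloaded dataset through a cloud drive?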

Page 14 of the book, second-to-last line: "decoder layers" (解码层) should be "encoder layers" (编码层).

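Original text: "Because the BERT model mainly adopts decoder layers as its framework, it is better at language representation; in other words, the BERT model found an effective way to convert text into high-dimensional features."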

ChatGLM pretraining question

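Hello, I recently read your book 《ChatGPT原理与实战》 and learned a great deal from it. I would like to ask you a question. I want to build a domain-specific large language model with ChatGLM as the base model. Because of the nature of my data, I need a domain data injection step before supervised fine-tuning. Can ChatGLM be further pretrained using the example given in Chapter 6 of the book?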

I started a GPU runtime on 九天毕晟 (Jiutian Bisheng) and ran the train program, but it fails at the end. What is the cause?

Traceback (most recent call last):
  File "/root/aaa/ChatGPTBook/PromptProj/train.py", line 261, in <module>
    main()
  File "/root/aaa/ChatGPTBook/PromptProj/train.py", line 257, in main
    train(model, device, train_data, test_data, args, tokenizer)
  File "/root/aaa/ChatGPTBook/PromptProj/train.py", line 140, in train
    model_to_save.save_pretrained(output_dir)
  File "/opt/conda/envs/pytorch1.8/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2376, in save_pretrained
    safe_save_file(shard, os.path.join(save_directory, shard_file), metadata={"format": "pt"})
  File "/opt/conda/envs/pytorch1.8/lib/python3.9/site-packages/safetensors/torch.py", line 281, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
  File "/opt/conda/envs/pytorch1.8/lib/python3.9/site-packages/safetensors/torch.py", line 460, in _flatten
    shared_pointers = _find_shared_tensors(tensors)
  File "/opt/conda/envs/pytorch1.8/lib/python3.9/site-packages/safetensors/torch.py", line 72, in _find_shared_tensors
    if v.device != torch.device("meta") and storage_ptr(v) != 0 and storage_size(v) != 0:
RuntimeError: Expected one of cpu, cuda, xpu, mkldnn, opengl, opencl, ideep, hip, msnpu, xla, vulkan device type at start of device string: meta
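
The traceback above fails inside safetensors at torch.device("meta"), and the environment name suggests PyTorch 1.8, whose list of accepted device types in the error message does not include meta. A possible workaround, not confirmed in the thread, is to fall back to the non-safetensors save path on older PyTorch versions. The sketch below is a hedged illustration: the helper names are hypothetical, and the (1, 9) threshold is an assumption about when the meta device became available.

```python
def parse_major_minor(version: str):
    """Turn a version string such as '1.8.1+cu111' into (1, 8)."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def should_use_safetensors(torch_version: str) -> bool:
    """Decide whether to save with safetensors.

    Assumption: torch.device("meta") is only usable from PyTorch 1.9 on.
    """
    return parse_major_minor(torch_version) >= (1, 9)


# Possible use at the save step in train.py (not executed here):
#   import torch
#   model_to_save.save_pretrained(
#       output_dir,
#       safe_serialization=should_use_safetensors(torch.__version__),
#   )

print(should_use_safetensors("1.8.1+cu111"))
print(should_use_safetensors("2.1.0"))
```

Upgrading PyTorch, or installing a transformers release that matches the installed PyTorch version, would be alternative routes.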

Data issues with the code for 《ChatGPT原理与实战》

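Dear Mr. Liu,
Hello. Recently, because of my work, I read your latest book 《ChatGPT原理与实战》. In it you systematically lay out the models and algorithms behind ChatGPT and compare the characteristics and trade-offs of different models, which gave me a comprehensive understanding of ChatGPT and of the related pretrained models and large language models. I greatly admire your knowledge and practical experience, and I am also impressed by your rigor and attention to detail.
However, while practicing with the materials you provided, I found that some of the data appears to be missing, and I hope to obtain it from you.
First, the link to the data for Chapter 3, section 3.5 (UniLM model practice based on the 夸夸 compliment chit-chat data), cannot be accessed. My guess is that sensitive words in the data caused this. Compressing the data before uploading it might fix the broken link.
| data | Baidu Netdisk | w8bd |
Second, while debugging the code for Chapter 5, section 5.4 (prompt-based text sentiment analysis practice, PromptProj), I found that when execution reaches line 50 of data_set.py, the file data/train.json is missing, which causes a runtime error.
(error message)
For these reasons, I would like to request the data for both parts. I would be grateful if you could find time in your busy schedule to look at the issues described in this message and reply.
Thank you!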
