moon-hotel / BertWithPretrained
An implementation of the BERT model and its related downstream tasks based on the PyTorch framework
Hi — torch==1.5.0 requires at least Python 3.7; torch==1.5.0 cannot be installed under Python 3.6.
Does every experiment load pretrained parameters directly into the model? Is there an experiment that trains a BERT on one's own dataset? For example, in the seventh model, did you load the pretrained BERT parameters and then continue training?
Sorry, I'd like to ask: when I run TaskForSingleSentenceClassification.py, it always raises an error. Has anyone faced the same problem and solved it?
In the CrossEntropyLoss, ignore_index is set to -1, but in the mlm_label that is passed in, the positions that were not masked are 0. Shouldn't ignore_index be changed to 0?
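A minimal sketch of the behaviour being asked about (assuming, as the question states, that un-masked positions in mlm_label carry the label 0; the tensors here are toy values, not the repo's data): with ignore_index=0 those positions contribute nothing to the loss, which is equivalent to computing the loss over the masked positions only.

```python
import torch
import torch.nn as nn

# Toy MLM labels (assumption from the question above): positions that were
# NOT masked carry the label 0, so ignore_index must be 0 to exclude them.
logits = torch.randn(4, 10)          # 4 token positions, vocab size 10
labels = torch.tensor([3, 0, 7, 0])  # positions 1 and 3 were not masked

loss_ignore0 = nn.CrossEntropyLoss(ignore_index=0)(logits, labels)

# Equivalent: compute the loss only over the masked positions.
masked = labels != 0
loss_manual = nn.CrossEntropyLoss()(logits[masked], labels[masked])
```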
Hi, how much GPU memory is needed at minimum to train this project?
Environment: torch 1.12.0 + Python 3.8.
Running the original repository's pretraining script fails with the following error:
Traceback (most recent call last):
File "/root/Tasks/TaskForPretraining.py", line 305, in <module>
train(config)
File "/root/Tasks/TaskForPretraining.py", line 108, in train
data_loader.load_train_val_test_data(test_file_path=config.test_file_path,
File "/root/utils/create_pretraining_data.py", line 334, in load_train_val_test_data
train_iter = DataLoader(train_data, batch_size=self.batch_size,
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 347, in __init__
sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/sampler.py", line 107, in __init__
raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
Is it a torch version problem? The 3090 only supports CUDA 11, and the minimum torch version for that seems to be 1.9.0.
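For what it's worth, this ValueError is raised by RandomSampler whenever the dataset it receives has length 0, so it usually means the training file was not found or produced no samples, rather than a torch version mismatch. A minimal reproduction:

```python
from torch.utils.data import DataLoader

# An empty dataset triggers the same error at DataLoader construction time,
# because shuffle=True creates a RandomSampler over zero samples.
try:
    DataLoader([], batch_size=32, shuffle=True)
except ValueError as e:
    print(e)  # "num_samples should be a positive integer value, but got num_samples=0"
```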
Hello, and thank you very much for open-sourcing this! I have been studying your articles and code closely, but I ran into some problems training the SQuAD task on Chinese data, as follows:
1. I process the Chinese data exactly as in your code, with character-level input, but the tokenizer strips spaces, which shifts the "answer" positions; as a result, at the show_result stage the "True answer" often fails to match the correct answer.
Could you tell me which parts of the data-processing code need to be modified for Chinese data? I am not sure whether I have made a mistake elsewhere. Many thanks!
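A minimal sketch of the offset problem described in point 1 (the context string and the re-mapping logic here are illustrative, not the repo's actual code): once the tokenizer drops spaces, character positions in the raw context no longer line up with token indices, so the answer start has to be re-mapped:

```python
# Chinese context with spaces, as produced by char-level preprocessing.
context = "北京 是 中国 的 首都"
answer = "中国"
ans_char_start = context.index(answer)   # position in the raw string

tokens = list(context.replace(" ", ""))  # tokenization that strips spaces

# Re-map the answer start by counting the non-space characters before it;
# without this step the span would be shifted by the number of dropped spaces.
new_start = sum(1 for c in context[:ans_char_start] if c != " ")
```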
Traceback (most recent call last):
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/TaskForChineseNER.py", line 315, in <module>
train(config)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/TaskForChineseNER.py", line 132, in train
loss, logits = model(input_ids=token_ids, # [src_len, batch_size]
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/_utils.py", line 434, in reraise
raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/DownstreamTasks/BertForTokenClassification.py", line 32, in forward
_, all_encoder_outputs = self.bert(input_ids=input_ids,
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 290, in forward
all_encoder_outputs = self.bert_encoder(embedding_output,
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 190, in forward
layer_output = layer_module(layer_output,
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 162, in forward
attention_output = self.bert_attention(hidden_states, attention_mask)
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 93, in forward
self_outputs = self.self(hidden_states,
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 56, in forward
return self.multi_head_attention(query, key, value, attn_mask=attn_mask, key_padding_mask=key_padding_mask)
File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/MyTransformer.py", line 296, in forward
return multi_head_attention_forward(query, key, value, self.num_heads,
File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/MyTransformer.py", line 360, in multi_head_attention_forward
attn_output_weights = attn_output_weights.masked_fill(
RuntimeError: The size of tensor a (367) must match the size of tensor b (184) at non-singleton dimension 3
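One plausible cause (an assumption, since the full setup isn't shown): nn.DataParallel scatters every input tensor along dim 0, but the inputs here have shape [src_len, batch_size], so each replica receives half of the sequence rather than half of the batch, and the attention-mask shapes no longer match (367 vs. 184 is consistent with a sequence of length 367 being split in two). A minimal illustration of the splitting behaviour:

```python
import torch

src_len, batch_size = 8, 4
token_ids = torch.zeros(src_len, batch_size)  # [src_len, batch_size] layout

# DataParallel chunks inputs along dim 0 when scattering to replicas,
# so with this layout it is the *sequence* that gets split in half:
replica_inputs = token_ids.chunk(2, dim=0)
assert replica_inputs[0].shape == (4, 4)  # half the sequence, full batch

# Transposing to batch-first before the model call splits the batch instead:
replica_inputs = token_ids.transpose(0, 1).chunk(2, dim=0)
assert replica_inputs[0].shape == (2, 8)  # half the batch, full sequence
```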
Is this test.txt meant to be one's own prediction data?
[2022-11-27 15:03:35] - INFO: ## Using the weight matrix from the token embedding as the weights of the output layer! torch.Size([30522, 768])
[2022-11-27 15:03:38] - INFO: Cache file /home/********/博一/my_explore/BERT_learn/BertWithPretrained-main/data/WikiText/wiki_test_mlNone_rs2022_mr15_mtr8_mtur5.pt does not exist; reprocessing and caching!
[2022-11-27 15:03:38] - INFO: Cache file /home/********/博一/my_explore/BERT_learn/BertWithPretrained-main/data/WikiText/wiki_train_mlNone_rs2022_mr15_mtr8_mtur5.pt does not exist; reprocessing and caching!
Traceback (most recent call last):
File "TaskForPretraining.py", line 300, in <module>
train(config)
File "TaskForPretraining.py", line 105, in train
val_file_path=config.val_file_path)
File "../utils/create_pretraining_data.py", line 334, in load_train_val_test_data
collate_fn=self.generate_batch)
File "/home/pgrad/.conda/envs/wmc_transformer/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 213, in __init__
sampler = RandomSampler(dataset)
File "/home/pgrad/.conda/envs/wmc_transformer/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 94, in __init__
"value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0
Hello, I'd like to ask about sentence-pair pretraining. I looked at Task/TaskForPretraining.py, which combines the MLM and NSP tasks. Inspired by that, I'd like to ask: for sentence-pair classification (i.e. deciding whether sentence a and sentence b belong to the same class), is it enough to adjust the sentence-pair processing accordingly (i.e. change the model input token_type_ids to [0] * (len(token_a_ids) + 2) + [1] * (len(token_b_ids) + 1)) and replace nsp_label with the sentence-pair label? Or is there some other approach?
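The token_type_ids layout described in the question can be sketched as follows (the helper name and the cls_id/sep_id values are illustrative): segment 0 covers [CLS], sentence a and the first [SEP]; segment 1 covers sentence b and the final [SEP], which is exactly where the +2 and +1 come from.

```python
def build_pair_inputs(token_a_ids, token_b_ids, cls_id=101, sep_id=102):
    # Layout: [CLS] a1 ... an [SEP] b1 ... bm [SEP]
    input_ids = [cls_id] + token_a_ids + [sep_id] + token_b_ids + [sep_id]
    # Segment 0 spans [CLS] + a + [SEP] (len(a) + 2 positions);
    # segment 1 spans b + [SEP] (len(b) + 1 positions).
    token_type_ids = [0] * (len(token_a_ids) + 2) + [1] * (len(token_b_ids) + 1)
    assert len(input_ids) == len(token_type_ids)
    return input_ids, token_type_ids

ids, segs = build_pair_inputs([11, 12, 13], [21, 22])
# segs == [0, 0, 0, 0, 0, 1, 1, 1]
```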
Hello, thank you for the code! I have a question about pretraining. Your TaskForPretraining.py actually continues pretraining from an already-trained model. If I want to pretrain entirely from random initialization, do the training settings need to be adjusted, e.g. the initial learning rate, the decay schedule, and so on?
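For reference only (these are not the repo's actual settings): when pretraining from random initialization, the original BERT recipe used a linear warmup followed by linear decay, which can be sketched with LambdaLR; all numbers below are illustrative.

```python
import torch

model = torch.nn.Linear(10, 10)  # stand-in model for the sketch
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

warmup_steps, total_steps = 10, 100  # illustrative values

def lr_lambda(step):
    # Linear warmup for the first warmup_steps, then linear decay to zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```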
In the code (e.g. TaskForSingleSentenceClassification.py), the attention_mask seems to take the form [0, 0, 0, ..., 1, 1, 1]:
padding_mask = (sample == data_loader.PAD_IDX).transpose(0, 1)
I have seen other implementations use the form [1, 1, 1, ..., 0, 0, 0]. Is there any difference between the two approaches? Thanks!
Why are the valid tokens False in the attention_mask while the padding tokens are True?
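A small sketch of why the mask looks inverted (PAD_IDX and the tensor values below are illustrative): building the mask with (sample == PAD_IDX) yields True at padding positions, which is the convention PyTorch's nn.MultiheadAttention expects for key_padding_mask; implementations that use [1, 1, ..., 0, 0] adopt the opposite convention (1 = valid) and invert it before use, so the two are equivalent.

```python
import torch

PAD_IDX = 0
# sample has shape [src_len, batch_size], matching the repo's layout.
sample = torch.tensor([[5, 7],
                       [6, PAD_IDX],
                       [PAD_IDX, PAD_IDX]])

padding_mask = (sample == PAD_IDX).transpose(0, 1)  # -> [batch_size, src_len]

# True marks PAD positions (to be ignored by attention);
# the [1, 1, ..., 0, 0] convention is just the logical inverse (~).
assert padding_mask.tolist() == [[False, False, True],
                                 [False, True, True]]
```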
Hi, I appreciate your detailed tutorials on GitHub and WeChat; they helped me understand a lot about the Transformer and BERT models. If we want to implement the downstream tasks you provide, where are we supposed to place the pretrained parameters for the model? Are these the parameters downloaded from Hugging Face's website?
When running the TaskForSQuADQuestionAnswering training task, I often hit the following error. What causes it?
Iterating over each question (sample): 76%|████████████▉ | 16/21 [05:20<01:40, 20.01s/it]
Traceback (most recent call last):
File "TaskForSQuADQuestionAnswering_Train.py", line 210, in <module>
train(config=model_config)
File "TaskForSQuADQuestionAnswering_Train.py", line 81, in train
train_iter, val_iter = data_loader.load_train_data(train_file_path=config.train_file_path)
File "../utils/data_helpers.py", line 766, in load_train_data
postfix=postfix) # get all processed samples
File "../utils/data_helpers.py", line 97, in wrapper
data = func(*args, **kwargs)
File "../utils/data_helpers.py", line 649, in data_process
token_to_orig_map = self.get_token_to_orig_map(input_tokens, example[3], self.tokenizer)
File "../utils/data_helpers.py", line 561, in get_token_to_orig_map
token = tokenizer(origin_context_tokens[value_start])
IndexError: list index out of range