bertwithpretrained's Issues

env

Hi, torch==1.5.0 requires at least Python 3.7;
torch==1.5.0 cannot be installed in a Python 3.6 environment.

ValueError: num_samples should be a positive integer value, but got num_samples=0

Machine setup: torch 1.12.0 + Python 3.8.
Running the original repository's pretraining task fails with the following error:
```
Traceback (most recent call last):
  File "/root/Tasks/TaskForPretraining.py", line 305, in <module>
    train(config)
  File "/root/Tasks/TaskForPretraining.py", line 108, in train
    data_loader.load_train_val_test_data(test_file_path=config.test_file_path,
  File "/root/utils/create_pretraining_data.py", line 334, in load_train_val_test_data
    train_iter = DataLoader(train_data, batch_size=self.batch_size,
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 347, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/sampler.py", line 107, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
```
Could this be a torch version problem? The RTX 3090 only supports CUDA 11, and that seems to require at least torch 1.9.0.
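Whatever the root cause turns out to be, a minimal guard sketch (the function name and error message are mine, not the repo's) surfaces the actual problem, an empty dataset after preprocessing, before the DataLoader is built:

```python
from torch.utils.data import DataLoader

def make_train_iter(train_data, batch_size, generate_batch):
    # Fail early with an actionable message instead of letting
    # RandomSampler raise "num_samples=0" deep inside DataLoader.
    if len(train_data) == 0:
        raise RuntimeError("train_data is empty: check train_file_path and "
                           "delete any stale cached .pt files so the data "
                           "is regenerated")
    return DataLoader(train_data, batch_size=batch_size, shuffle=True,
                      collate_fn=generate_batch)
```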

Problems training the SQuAD task on Chinese data

Hello, and many thanks for open-sourcing this! I have been carefully studying your articles and code, but I ran into some problems training the SQuAD task on Chinese data:

1. I process the Chinese data exactly as your code does, feeding it in character by character, but the tokenizer strips whitespace, which shifts the final answer positions; as a result, in the show_result stage the "True answer" often fails to match the correct answer.

Which parts of the Chinese data-processing code mainly need to be changed? I am also not sure whether I made a mistake elsewhere. Thanks a lot!
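One workaround sketch (the helper is hypothetical, not from the repo): tokenize the Chinese context character by character while recording each kept character's index in the raw string, so answer spans can be mapped back even after whitespace is removed.

```python
def char_tokenize_with_offsets(context):
    """Split a Chinese context into characters, keeping original offsets."""
    tokens, offsets = [], []
    for i, ch in enumerate(context):
        if ch.isspace():
            continue  # whitespace is dropped, but recorded offsets stay aligned
        tokens.append(ch)
        offsets.append(i)  # position of this token in the raw context string
    return tokens, offsets
```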

Can single-machine multi-GPU training be supported? The following problem occurred while I was modifying the code

```
Traceback (most recent call last):
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/TaskForChineseNER.py", line 315, in <module>
    train(config)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/TaskForChineseNER.py", line 132, in train
    loss, logits = model(input_ids=token_ids,  # [src_len, batch_size]
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/DownstreamTasks/BertForTokenClassification.py", line 32, in forward
    _, all_encoder_outputs = self.bert(input_ids=input_ids,
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 290, in forward
    all_encoder_outputs = self.bert_encoder(embedding_output,
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 190, in forward
    layer_output = layer_module(layer_output,
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 162, in forward
    attention_output = self.bert_attention(hidden_states, attention_mask)
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 93, in forward
    self_outputs = self.self(hidden_states,
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/Bert.py", line 56, in forward
    return self.multi_head_attention(query, key, value, attn_mask=attn_mask, key_padding_mask=key_padding_mask)
  File "/home/yons/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/MyTransformer.py", line 296, in forward
    return multi_head_attention_forward(query, key, value, self.num_heads,
  File "/home/yons/workfiles/codes/opencodes/BertWithPretrained/Tasks/../model/BasicBert/MyTransformer.py", line 360, in multi_head_attention_forward
    attn_output_weights = attn_output_weights.masked_fill(
RuntimeError: The size of tensor a (367) must match the size of tensor b (184) at non-singleton dimension 3
```
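A hedged reading of this error (not confirmed by the maintainer): the model takes inputs shaped [src_len, batch_size], but nn.DataParallel scatters along dim 0 by default, so it splits the sequence dimension across GPUs rather than the batch, which produces exactly this kind of mask-size mismatch. A sketch of one fix (device IDs illustrative):

```python
import torch.nn as nn

# Scatter along dim 1 (the batch dimension of [src_len, batch_size] inputs)
# instead of the default dim 0 (the sequence dimension).
model = nn.DataParallel(model, device_ids=[0, 1], dim=1)
# Caveat: every tensor passed to forward() must then carry the batch on the
# same dimension (including the padding mask), or the scatter will still
# split some tensors along the wrong axis.
```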

Questions about modeling the QA task

Hello, your current approach to modeling QA splits the context into individual sentences.
Question 1: With this approach, how do you ensure the question can read sufficient sentence-level information?
Question 2: What do the numbers circled in the figure mean?
[figure omitted: screenshot with several circled numbers]
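For question 1, the usual answer is a sliding window: the context is split into overlapping chunks so every answer span appears in at least one chunk with enough surrounding text. A minimal sketch (parameter values illustrative, not necessarily the repo's):

```python
def sliding_windows(context_tokens, max_len=384, doc_stride=128):
    """Split a long context into overlapping windows of at most max_len tokens."""
    windows, start = [], 0
    while start < len(context_tokens):
        windows.append(context_tokens[start:start + max_len])
        if start + max_len >= len(context_tokens):
            break
        start += doc_stride  # consecutive windows overlap by max_len - doc_stride
    return windows
```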

With the songci dataset, pretraining on wiki2 raises an error; the generated masked .pt file wiki_train_mlNone_rs2022_mr15_mtr8_mtur5.pt is only 1 KB

Note: the log shows the local MyMultiHeadAttention implementation from MyTransformer is being used.

```
[2022-11-27 15:03:35] - INFO: ## Using the weight matrix from the token embedding as the output layer weights! torch.Size([30522, 768])
[2022-11-27 15:03:38] - INFO: Cache file /home/********/博一/my_explore/BERT_learn/BertWithPretrained-main/data/WikiText/wiki_test_mlNone_rs2022_mr15_mtr8_mtur5.pt does not exist; reprocessing and caching!
Reading raw data: 100%|██████████████| 4358/4358 [00:00<00:00, 11122.89it/s]
Constructing NSP and MLM samples (test): 100%|██| 1847/1847 [00:00<00:00, 1681180.44it/s]
[2022-11-27 15:03:38] - INFO: Cache file /home/********/博一/my_explore/BERT_learn/BertWithPretrained-main/data/WikiText/wiki_train_mlNone_rs2022_mr15_mtr8_mtur5.pt does not exist; reprocessing and caching!
Reading raw data: 100%|████████████| 36718/36718 [00:03<00:00, 11100.30it/s]
Constructing NSP and MLM samples (train): 100%|█| 15496/15496 [00:00<00:00, 1615704.25it/s]
Traceback (most recent call last):
  File "TaskForPretraining.py", line 300, in <module>
    train(config)
  File "TaskForPretraining.py", line 105, in train
    val_file_path=config.val_file_path)
  File "../utils/create_pretraining_data.py", line 334, in load_train_val_test_data
    collate_fn=self.generate_batch)
  File "/home/pgrad/.conda/envs/wmc_transformer/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 213, in __init__
    sampler = RandomSampler(dataset)
  File "/home/pgrad/.conda/envs/wmc_transformer/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 94, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0
```
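A quick diagnostic sketch (the path is copied from the log above; what the cache holds depends on the repo's serialization format): load the generated cache and check how many samples it actually contains. A 1 KB file almost certainly serialized an empty collection, which then triggers num_samples=0 in RandomSampler.

```python
import torch

data = torch.load("data/WikiText/wiki_train_mlNone_rs2022_mr15_mtr8_mtur5.pt")
print(type(data), len(data) if hasattr(data, "__len__") else "no length")
```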

Question about sentence-pair classification during MLM pretraining

Hello, I would like to ask about sentence-pair pretraining. I read Task/TaskForPretraining.py, which combines the MLM and NSP tasks. Inspired by this, I want to ask: for sentence-pair classification (i.e., judging whether sentence a and sentence b belong to the same class), is it enough to adjust the sentence-pair processing accordingly (i.e., change the model input token_type_ids to [0] * (len(token_a_ids) + 2) + [1] * (len(token_b_ids) + 1)) and replace nsp_label with the sentence-pair label? Or is there another approach?
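A sketch of the adjustment the question describes (assumptions: token_a_ids and token_b_ids exclude special tokens; cls_id, sep_id, and pair_label are placeholder names):

```python
# Build the standard [CLS] a [SEP] b [SEP] pair input with matching segments.
input_ids = [cls_id] + token_a_ids + [sep_id] + token_b_ids + [sep_id]
token_type_ids = [0] * (len(token_a_ids) + 2) + [1] * (len(token_b_ids) + 1)
assert len(input_ids) == len(token_type_ids)
label = pair_label  # stands in for nsp_label; the NSP head becomes a pair classifier
```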

Question about training the MLM task from scratch

Hello, thanks for the code you provide! I have a question about pretraining. The TaskForPretraining.py you provide actually continues pretraining from an already-trained model. If I want to pretrain entirely from random initialization, do the training strategies need adjusting, e.g., the initial learning rate, the decay schedule, and so on?
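For reference, a minimal sketch (my own, not the repo's) of the warmup-plus-linear-decay schedule used in the original BERT pretraining recipe, which is the usual starting point when training from random initialization:

```python
from torch.optim.lr_scheduler import LambdaLR

def warmup_linear_decay(optimizer, warmup_steps, total_steps):
    # Linear warmup from 0 to the peak LR, then linear decay back to 0.
    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return LambdaLR(optimizer, lr_lambda)
```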

Questions about the attention mask

In the code (e.g., TaskForSingleSentenseClassification.py), the attention_mask seems to take the form [0, 0, 0, ..., 1, 1, 1]:
padding_mask = (sample == data_loader.PAD_IDX).transpose(0, 1)

I have seen other implementations use the form [1, 1, 1, ..., 0, 0, 0]. Is there any difference between the two? Thanks!
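The two forms encode the same information with inverted polarity: PyTorch's key_padding_mask marks positions to ignore (padding) with True/1, while the Hugging Face-style attention_mask marks positions to keep with 1. A conversion sketch (toy values, not repo code):

```python
import torch

key_padding_mask = torch.tensor([[False, False, False, True, True]])  # last 2 are padding
attention_mask = (~key_padding_mask).long()  # -> tensor([[1, 1, 1, 0, 0]])
```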

About attention_mask

Why is it that in the attention_mask valid tokens are False while padding tokens are True?

Pretrained model parameters

Hi, I appreciate your detailed tutorials on GitHub and WeChat; they helped me understand a lot about the Transformer and BERT models. If we want to implement the downstream tasks you provide, where are we supposed to place the pretrained parameters for the model? Are these parameters the ones downloaded from Hugging Face's website?

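A sketch of the usual workflow (the folder name is illustrative): download the PyTorch checkpoint files (pytorch_model.bin, config.json, vocab.txt) for a model such as bert-base-chinese from Hugging Face, place them in a local directory, and point the task's config at that directory. The raw parameters can then be loaded as a plain state dict:

```python
import torch

state_dict = torch.load("bert_base_chinese/pytorch_model.bin",
                        map_location="cpu")
print(list(state_dict.keys())[:5])  # inspect the pretrained parameter names
```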

IndexError: list index out of range when running the TaskForSQuADQuestionAnswering training task

When running the TaskForSQuADQuestionAnswering training task I frequently hit the following error. What could be causing it?

```
Iterating over each question (sample): 76%|████████████▉ | 16/21 [05:20<01:40, 20.01s/it]
Traceback (most recent call last):
  File "TaskForSQuADQuestionAnswering_Train.py", line 210, in <module>
    train(config=model_config)
  File "TaskForSQuADQuestionAnswering_Train.py", line 81, in train
    train_iter, val_iter = data_loader.load_train_data(train_file_path=config.train_file_path)
  File "../utils/data_helpers.py", line 766, in load_train_data
    postfix=postfix)  # obtain all processed samples
  File "../utils/data_helpers.py", line 97, in wrapper
    data = func(*args, **kwargs)
  File "../utils/data_helpers.py", line 649, in data_process
    token_to_orig_map = self.get_token_to_orig_map(input_tokens, example[3], self.tokenizer)
  File "../utils/data_helpers.py", line 561, in get_token_to_orig_map
    token = tokenizer(origin_context_tokens[value_start])
IndexError: list index out of range
```
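A defensive sketch (the helper name is mine; variable names come from the traceback, and whether skipping such samples is acceptable depends on the dataset): guard against mapped start indices that fall outside the original context tokens.

```python
def safe_token(tokenizer, origin_context_tokens, value_start):
    # The traceback shows value_start indexing past the end of
    # origin_context_tokens; return None so the caller can skip the sample.
    if value_start >= len(origin_context_tokens):
        return None
    return tokenizer(origin_context_tokens[value_start])
```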
