Soft-Masked Bert 复现论文:https://arxiv.org/pdf/2005.07421.pdf
Hello, I get the following error when running this code. My PyTorch version is 1.8.1; is the error related to the PyTorch version?
EP_train:0: 0%|| 1/900 [00:00<04:19, 3.46it/s]
Traceback (most recent call last):
File "/Soft-mask/train.py", line 209, in
trainer.train(train, e)
File "/Soft-mask/train.py", line 39, in train
return self.iteration(epoch, train_data)
File "/Soft-mask/train.py", line 98, in iteration
loss.backward(retain_graph=True)
File "/miniconda3/envs/py_36/lib/python3.6/site-packages/torch/tensor.py", line 245, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/miniconda3/envs/py_36/lib/python3.6/site-packages/torch/autograd/init.py", line 147, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [768]] is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
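Several commenters below hit this same RuntimeError. It is raised whenever a tensor whose value autograd saved for the backward pass is later modified in place (each in-place op bumps the tensor's version counter, and backward notices the mismatch). A minimal standalone reproduction of the mechanism and its usual fix, not taken from this repo's code:

```python
import torch

w = torch.ones(3, requires_grad=True)
y = w.sigmoid()   # autograd saves y: sigmoid's backward is y * (1 - y)
y.add_(1)         # in-place edit bumps y's version counter
try:
    y.sum().backward()
except RuntimeError as e:
    print("raised:", type(e).__name__)  # the "modified by an inplace operation" error

# Fix: use an out-of-place op so the saved output is left untouched
y = w.sigmoid()
y = y + 1         # creates a new tensor instead of mutating y
y.sum().backward()
print(w.grad.shape)
```

In a larger model, `torch.autograd.set_detect_anomaly(True)` (as the error hint suggests) points at the offending operation; the fix is usually replacing an in-place op (`add_`, `+=`, `relu(inplace=True)`, or a weight-tying write) with its out-of-place form.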
I noticed that the code corrupts the raw data via homophone substitution and random single-character substitution, neither of which changes the sentence length. It does not seem to handle character insertion or deletion, and I am not sure what the labels should look like in that case.
Any pointers would be appreciated!
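For the same-length corruption methods the repo does implement, the detection labels can be built positionwise, which is also why insertion/deletion is awkward: the label sequence would no longer align with the input. A minimal sketch of the same-length case (`make_labels` is a hypothetical helper, not from this repo):

```python
def make_labels(correct: str, corrupted: str):
    """Detection labels for same-length substitution corruption:
    1 where the corrupted character differs from the original, else 0."""
    assert len(correct) == len(corrupted), "substitution must preserve length"
    return [int(a != b) for a, b in zip(correct, corrupted)]

print(make_labels("天气很好", "天汽很好"))  # → [0, 1, 0, 0]
```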
Thanks for reproducing this paper's code. May I ask how well the reproduction performs? Does it reach the results reported in the paper?
PS: What format is the training data in? Could you write a simple README? Thanks!
Thanks for sharing. Could you add a README? Thank you.
Hey, could you send me the dataset? [email protected]
Hello, could you provide the training data? I'm a complete beginner and would like to study it.
Thank you for the code. I've finished processing the data and am running train.py, which needs a config JSON and a pretrained model. I'm an NLP beginner and would like to try it; where can I get that directory?
Can anyone provide the dataset used in this repository?
In soft_masked_bert.py, line 41, the argument passed as attention_mask should be extended_attention_mask rather than encoder_extended_attention_mask; yet the extended_attention_mask value is discarded as _ on line 36.
Also, it is odd that a function that was already well encapsulated has been taken apart and scattered into _init_inputs(): the internals are basically unchanged apart from a few removed pieces, which seems unnecessary and hurts the code's readability.
Start train 3 ford
Calling BertTokenizer.from_pretrained() with the path to a single file or url is deprecated
The current process just got forked. Disabling parallelism to avoid deadlocks...
To disable this warning, please explicitly set TOKENIZERS_PARALLELISM=(true | false)
Inference:: 0%|| 42/17221 [00:02<14:48, 19.34it/s]
Traceback (most recent call last):
File "train.py", line 215, in
print(trainer.inference(val_data_loader))
File "train.py", line 53, in inference
for i, data in data_iter:
File "/home/ccit19/anaconda3/envs/liuzhe/lib/python3.7/site-packages/tqdm/std.py", line 1129, in iter
for obj in iterable:
File "/home/ccit19/anaconda3/envs/liuz-1.1/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "/home/ccit19/anaconda3/envs/liuz-1.1/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
File "/home/ccit19/anaconda3/envs/liuz-1.1/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/ccit19/anaconda3/envs/liuz-1.1/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 63, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
File "/home/ccit19/anaconda3/envs/liuz-1.1/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 63, in
return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
File "/home/ccit19/anaconda3/envs/liuz-1.1/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 43, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 152 and 158 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:711
Is there a known solution for this?
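The size-mismatch error above comes from default_collate trying to stack variable-length token sequences into one tensor. The usual fixes are either padding/truncating every example to a fixed length during preprocessing, or passing a custom collate_fn that pads each batch. A sketch of the latter (`pad_collate` is illustrative, not part of this repo):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_collate(batch):
    # batch: list of dicts mapping field name -> 1-D LongTensor of varying length
    return {key: pad_sequence([d[key] for d in batch],
                              batch_first=True, padding_value=0)
            for key in batch[0]}

samples = [{"input_ids": torch.tensor([1, 2, 3])},
           {"input_ids": torch.tensor([4, 5, 6, 7, 8])}]
batch = pad_collate(samples)
print(batch["input_ids"].shape)  # → torch.Size([2, 5])
```

It would then be passed via `DataLoader(dataset, batch_size=..., collate_fn=pad_collate)`. Note that 0 is assumed to be the pad id here; the right value depends on the tokenizer.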
Sorry, everyone. At the time I only wrote the code following the paper and got it to run.
Work has kept me busy, so I haven't had time to maintain this code; it still has many bugs.
Let me recommend a site where most public datasets can be downloaded:
https://www.cluebenchmarks.com/dataSet_search.html
As for the dataset in the paper, the authors also crawled 1,000,000 news titles themselves, so the data volume is on a completely different scale.
If you are interested, you can write your own crawler; the data-processing scripts are also in this repository.
Thanks for your interest, everyone!
Hello, and thank you for open-sourcing your reproduction of the paper. I have prepared all the training data, but running train.py keeps failing with the following error:
File "train.py", line 182, in
trainer.train(train_data_loader, e)
File "train.py", line 33, in train
return self.iteration(epoch, train_data)
File "train.py", line 79, in iteration
loss.backward(retain_graph=True)
File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 198, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/usr/local/lib/python3.6/dist-packages/torch/autograd/init.py", line 100, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [768]] is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Could you suggest a fix? Many thanks!
In train.py, line 93:
loss_c = self.criterion_c(out.transpose(1, 2), data["output_ids"])
Why are the second and third dimensions transposed here? Is it because it is the second dimension of the data that corresponds to the ids?
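The transpose is about nn.CrossEntropyLoss's expected input shapes rather than about which dimension holds the ids: for sequence data the loss wants logits as (batch, classes, seq_len), while the model emits (batch, seq_len, vocab). A standalone illustration:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
batch, seq_len, vocab = 2, 5, 10

out = torch.randn(batch, seq_len, vocab)            # model output: (N, L, C)
target = torch.randint(0, vocab, (batch, seq_len))  # token ids:   (N, L)

# CrossEntropyLoss expects (N, C, L) for sequence input, hence the transpose
loss = criterion(out.transpose(1, 2), target)
print(loss.dim())  # → 0 (a scalar)
```

Passing `out` untransposed would make the loss treat seq_len as the class dimension and fail with a shape mismatch.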
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [768]] is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Have you run into this problem?
Hello, could you provide pinyin2char.model? Thanks!
With torch==1.5.0 and transformers==2.11.0 I keep getting: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [768]] is at version 3; expected version 2 instead. Which versions did you use, and have you run into a similar error? Thanks.
Start train 0 ford
Calling BertTokenizer.from_pretrained() with the path to a single file or url is deprecated
{'epoch': 0, 'iter': 0, 'avg_loss': 8.148957252502441, 'avg_acc': 0.0, 'loss': 8.148957252502441}
EP_train:0: 0% 1/1723 [00:36<17:17:03, 36.13s/it]Traceback (most recent call last):
File "train.py", line 182, in
trainer.train(train_data_loader, e)
File "train.py", line 33, in train
return self.iteration(epoch, train_data)
File "train.py", line 79, in iteration
loss.backward(retain_graph=True)
File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 198, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/usr/local/lib/python3.6/dist-packages/torch/autograd/init.py", line 100, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [768]] is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
EP_train:0: 0% 1/1723 [01:10<33:33:36, 70.16s/it]
This problem keeps occurring. Do you have any good way to fix it?
import os

def load_toutiao_dataset2():
    dataset = []
    file_path = 'data/nlp7294/'
    for file in os.listdir(file_path):
        index = -1
        with open(file_path + file, 'r') as f:
            for line in f:
                index += 1
                if index == 0:
                    # skip the header line of each file
                    continue
                # keep the second whitespace-separated field (the text)
                line = line.strip().split()[1]
                dataset.append(line)
    return dataset