mengzi's People

Contributors

ag2s1, ak391, cooelf, huajingyun, lanse-sir, yingyibiao

mengzi's Issues

Mengzi-T5-base-MT model size

Why is Mengzi-T5-base-MT only half the size of Mengzi-T5-base? After loading the model and saving it again, it grows back to the same size as the base model.
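One common explanation for a halved checkpoint (speculative here, not confirmed by the maintainers) is that the weights were saved in fp16, and re-saving after loading writes them back out in fp32. The arithmetic, using an approximate T5-base parameter count:

```python
# Checkpoint size is dominated by (parameter count) x (bytes per parameter).
def checkpoint_size_mb(num_params, bytes_per_param):
    return num_params * bytes_per_param / 1024 / 1024

# T5-base has roughly 220M parameters (approximate figure, for illustration).
params = 220_000_000
fp32_mb = checkpoint_size_mb(params, 4)  # float32: 4 bytes per weight
fp16_mb = checkpoint_size_mb(params, 2)  # float16: 2 bytes per weight

print(f"fp32: ~{fp32_mb:.0f} MB, fp16: ~{fp16_mb:.0f} MB")
```

This would reproduce exactly the observed behavior: the distributed checkpoint is half-size, and a load-then-save round trip restores the fp32 size.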

Where is the test_ch.yaml file?

Hi, when running inference with the multimodal OSCAR model on the acc-icc dataset, the command used is:

python -m torch.distributed.launch --nproc_per_node=8 oscar/run_captioning.py
    --data_dir
    --do_test --test_yaml test_ch.yaml
    --num_beams 5 --per_gpu_eval_batch_size 128 --max_gen_length 20
    --eval_model_dir

Where is the test_ch.yaml file located?

Is Mengzi-T5-base pretrained with DAE or LM?

Thanks for contributing such an excellent pretrained model. If convenient, could you tell us whether Mengzi-T5-base's pretraining objective is denoising auto-encoding (DAE) or next-segment prediction (LM)? If it is DAE, what kind of noise is used, e.g. token infilling or sentence permutation?
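For reference, vanilla T5's objective is span corruption, a DAE variant in which random spans are replaced by sentinel tokens and the decoder reconstructs them. A toy single-span illustration follows; it sketches the original T5 objective, not necessarily what Mengzi-T5-base actually used, which is exactly what this issue asks:

```python
import random

def span_corrupt(tokens, corruption_rate=0.15, seed=0):
    """Toy T5-style span corruption: mask one random span with a sentinel.

    Hypothetical sketch of the vanilla T5 objective; the real T5 masks
    multiple spans with consecutive sentinels <extra_id_0>, <extra_id_1>, ...
    """
    rng = random.Random(seed)
    n_mask = max(1, int(len(tokens) * corruption_rate))
    start = rng.randrange(len(tokens) - n_mask)
    masked = tokens[start:start + n_mask]
    # Encoder input: span replaced by a sentinel token.
    inputs = tokens[:start] + ["<extra_id_0>"] + tokens[start + n_mask:]
    # Decoder target: sentinel, then the masked span, then a closing sentinel.
    targets = ["<extra_id_0>"] + masked + ["<extra_id_1>"]
    return inputs, targets

inp, tgt = span_corrupt("the quick brown fox jumps over the lazy dog".split())
print(inp)
print(tgt)
```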

Is the batch size 128 or 16384?

I noticed that Section 2.1 of the technical report says:

We limit the length of sentences in each batch to up to 512 tokens, and the batch size is 128.

but later in the same section:

The batch sizes for the two stages are 16384 and 32768, respectively

So which is the actual batch size? Is the first the number of sequences and the second the number of tokens? Or is such a large batch size feasible because LAMB is used? The LAMB paper uses 32768.
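One speculative reconciliation of the two figures (not confirmed by the report): if 128 counts sequences and 16384 counts tokens, the numbers are consistent whenever the average sequence length is 128 tokens, since sequences are only capped at 512 tokens, not padded to that length:

```python
# Speculative reading: 128 = sequences per batch, 16384 = tokens per batch.
sequences_per_batch = 128
tokens_per_batch = 16_384

avg_len = tokens_per_batch / sequences_per_batch
print(avg_len)  # average sequence length implied by this reading

# This reading is internally consistent: the implied average length
# does not exceed the stated 512-token cap.
assert avg_len <= 512
```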

How to incorporate knowledge graph in marketing copywriting?

Hi,
Thanks for sharing this awesome work.
According to Figure 2 of your paper, you incorporate a knowledge graph into the marketing copywriting task, but there seems to be no further explanation of this.
Could you please explain this method in more detail?

What is the input format for the model to automatically generate marketing copy?

Hello Langboat, thanks for sharing this great work.
Regarding the automatically generated marketing copy in the paper:

Given the input title and keywords, the models are required to generate a corresponding descriptive passage

What is the input to the model?
Is it in the form of [CLS] title [SEP] [keyword1, keyword2, keyword3, keyword4] [SEP] [kg11, kg12, kg13] [kg21, kg22, kg23]?
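The format guessed at above can at least be written down concretely. A pure-Python sketch of that encoding follows; the separators, field order, and example values are all hypothetical, since the paper's actual input format is precisely the open question here:

```python
def build_input(title, keywords, kg_triples):
    """Hypothetical flat-text encoding of title / keywords / KG triples.

    This only illustrates the format guessed in the question above;
    it is not the format the paper actually used.
    """
    parts = [title, ",".join(keywords)]
    parts += [",".join(triple) for triple in kg_triples]
    return "[CLS] " + " [SEP] ".join(parts) + " [SEP]"

s = build_input("Wireless earbuds",
                ["bluetooth", "waterproof"],
                [("earbuds", "battery_life", "8h")])
print(s)
# -> [CLS] Wireless earbuds [SEP] bluetooth,waterproof [SEP] earbuds,battery_life,8h [SEP]
```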

What hardware and hyperparameters were used to fine-tune Mengzi-BERT-base on the 9 CLUE tasks?

Hello developers! The paper says Mengzi-BERT-base outperforms baselines such as RoBERTa and BERT on the 9 CLUE downstream tasks. I have a few questions:
① What hardware did you use for the downstream fine-tuning, e.g. GPU model and CUDA version?
② Could you share the more detailed fine-tuning settings, e.g. optimizer hyperparameters, warmup schedule, random seed for initialization, and whether fp16 was used?
③ I just saw in the FAQ that you don't plan to release the pretraining code; will the downstream fine-tuning code for Mengzi-BERT-base also remain closed?

Some characters in mengzi-gpt-neo-base cannot be displayed correctly

Example sentence: "Linux ⁇ 能和Windows相比,其支持的 ⁇ 能较低,但是 ⁇ 能很低。"

Here the character "性" comes out as "⁇". After some digging, I found that the vocab does not contain this character; it appears as the pinyin "xing" instead. My guess is that the character was converted everywhere in the training data.

My temporary workaround is as follows:

# Shell: fetch the SentencePiece proto definition and compile Python bindings
wget https://raw.githubusercontent.com/google/sentencepiece/master/src/sentencepiece_model.proto
protoc --python_out=. sentencepiece_model.proto

# Python: patch the mis-converted vocab piece "xing" back to "性"
import sentencepiece_model_pb2 as model

m = model.ModelProto()
with open('mengzi_gpt.model', 'rb') as f:
    m.ParseFromString(f.read())

for piece in m.pieces:
    if piece.piece == "xing":
        piece.piece = "性"
        print("modified")
        break

with open('new.model', 'wb') as f:
    f.write(m.SerializeToString())

I hope this gets fixed soon.

mengzi-gpt-neo-base cannot be tried out on Hugging Face; an exception is raised

As the title says. The error message is:
Can't load tokenizer using from_pretrained, please update its configuration: Can't load tokenizer for 'Langboat/mengzi-gpt-neo-base'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'Langboat/mengzi-gpt-neo-base' is the correct path to a directory containing all relevant files for a GPT2TokenizerFast tokenizer.

Inference speed

How does Mengzi's inference speed compare with other models?

Question about the natural language understanding tasks

Hi, I'd like to confirm something with you. When Hugging Face's BertForSequenceClassification class is used for text classification, it takes BERT's pooled_output and feeds it to a final classifier layer. Your paper says: "We build the downstream models for the natural language understanding tasks by adding a linear classifier on top of the "[CLS]" token to predict label probabilities." Does that mean you use only the [CLS] token's hidden state, fed directly into the final classifier? Since your pretraining includes an NSP task, I'd like to confirm which approach you used for text classification. Thanks!
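The two options being contrasted can be shown side by side. Below is a tiny pure-Python sketch with a 3-dimensional hidden state and made-up weights (nothing here reflects Mengzi's actual parameters); option (a) is the direct linear classifier on [CLS] that the paper describes, option (b) is BERT's pooler (linear + tanh on [CLS]) as used inside BertForSequenceClassification:

```python
import math

cls_hidden = [0.5, -1.0, 2.0]          # hidden state of the [CLS] token

def linear(vec, weights, bias):
    return sum(v * w for v, w in zip(vec, weights)) + bias

# (a) linear classifier applied directly to the [CLS] hidden state
logit_direct = linear(cls_hidden, [0.1, 0.2, 0.3], 0.0)

# (b) pooler first: tanh(W_pool . cls + b), then the same classifier
pooled = [math.tanh(linear(cls_hidden, w, 0.0))
          for w in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]  # identity pooler weights
logit_pooled = linear(pooled, [0.1, 0.2, 0.3], 0.0)

print(logit_direct, logit_pooled)  # the tanh squashing makes these differ
```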

Inference performance

How does Mengzi-BERT-base perform on CPU and GPU? What is the latency, and how many QPS can it reach?
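Latency and QPS depend heavily on hardware, batch size, and sequence length, so it may be easiest to measure them on your own setup. A minimal harness follows; the lambda at the bottom is a stand-in so the sketch runs as-is, and you would swap in an actual Mengzi-BERT-base forward pass (not shown here):

```python
import time

def benchmark(predict, batch, n_iter=50, warmup=5):
    """Minimal latency/QPS harness.

    `predict` is any callable taking a batch of texts; here it is a
    hypothetical stand-in for a real model inference function.
    """
    for _ in range(warmup):          # warm up caches / lazy init
        predict(batch)
    start = time.perf_counter()
    for _ in range(n_iter):
        predict(batch)
    elapsed = time.perf_counter() - start
    latency_ms = elapsed / n_iter * 1000   # per-batch latency
    qps = n_iter * len(batch) / elapsed    # examples per second
    return latency_ms, qps

# Dummy "model" so the harness is runnable without any dependencies.
lat, qps = benchmark(lambda b: [len(x) for x in b], ["这是一个测试句子"] * 8)
print(f"latency: {lat:.3f} ms/batch, qps: {qps:.0f}")
```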

I suggest re-testing RoBERTa with the same evaluation script

Many of the per-model scores listed on the CLUE GitHub page are on the low side. For example, I can get 0.86+ on CHID with RoBERTa-base myself, while the best result listed there is only 0.85+.

So, for fairness, I suggest not quoting those scores directly but re-running RoBERTa with the same fine-tuning script.

Input prefix of the model mengzi-t5-base

Hi,

I have a question about the input to mengzi-t5-base. The original T5 paper says "we need to add the task-specific prefix to the original input sequence before feeding it to the model". If I want to perform a text summarization task (or other downstream tasks) with mengzi-t5-base, do I need to add a prefix, and if so, what should it be? Thank you very much for your help; looking forward to your reply.
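For what it's worth, a T5 prefix is mechanically just plain text prepended to the input string; whether mengzi-t5-base was pretrained with any prefix is exactly the open question, and the "summarize" string below is a purely hypothetical choice:

```python
def with_prefix(text, prefix="summarize"):
    # The prefix string itself is a hypothetical choice; whatever prefix
    # you pick, it must be used consistently at fine-tuning and inference.
    return f"{prefix}: {text}"

src = "Mengzi is a family of lightweight Chinese pretrained models."
print(with_prefix(src))
# -> summarize: Mengzi is a family of lightweight Chinese pretrained models.
```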

A T5-small model?

Thanks to the developers for open-sourcing! The T5-base model is still a bit large; will you consider releasing a small version later?
