Comments (3)
{
  "activation_function": "gelu",
  "add_bias_logits": false,
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "bos_token_id": 0,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 0,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "is_encoder_decoder": true,
  "max_position_embeddings": 512,
  "model_type": "bart",
  "normalize_before": false,
  "normalize_embedding": true,
  "num_beams": 4,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "scale_embedding": false,
  "transformers_version": "4.4.1",
  "use_cache": true,
  "tokenizer_class": "BertTokenizer",
  "vocab_size": 21132
}
Attached above is the modified config.json.
from modelscope.
If anyone else wants to use this model, you also need to add config.json and vocab.txt, as follows:
config.json
{
  "activation_dropout": 0.1,
  "activation_function": "gelu",
  "add_bias_logits": false,
  "add_final_layer_norm": false,
  "architectures": [
    "BartForConditionalGeneration"
  ],
  "attention_dropout": 0.1,
  "bos_token_id": 0,
  "classif_dropout": 0.1,
  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
  "decoder_layerdrop": 0.0,
  "decoder_layers": 12,
  "decoder_start_token_id": 0,
  "dropout": 0.1,
  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
  "forced_eos_token_id": 2,
  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2"
  },
  "init_std": 0.02,
  "is_encoder_decoder": true,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2
  },
  "max_position_embeddings": 512,
  "model_type": "bart",
  "no_repeat_ngram_size": 3,
  "normalize_before": false,
  "normalize_embedding": true,
  "num_beams": 4,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "scale_embedding": false,
  "task_specific_params": {
    "summarization": {
      "length_penalty": 1.0,
      "max_length": 128,
      "min_length": 12,
      "num_beams": 4
    },
    "summarization_cnn": {
      "length_penalty": 2.0,
      "max_length": 142,
      "min_length": 56,
      "num_beams": 4
    },
    "summarization_xsum": {
      "length_penalty": 1.0,
      "max_length": 62,
      "min_length": 11,
      "num_beams": 6
    }
  },
  "transformers_version": "4.4.1",
  "use_cache": true,
  "tokenizer_class": "BertTokenizer",
  "vocab_size": 21132
}
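Before loading the model, it may help to confirm that both files are actually in place. The following is a minimal sketch, not part of modelscope: `check_model_dir` is a hypothetical helper, and it assumes vocab.txt holds one token per line (the usual layout for a `BertTokenizer` vocabulary), so that its line count should equal `vocab_size` in config.json.

```python
import json
from pathlib import Path


def check_model_dir(model_dir):
    """Return a list of problems found in model_dir (empty list if none).

    Hypothetical helper for illustration only.
    """
    root = Path(model_dir)
    problems = []

    # Both files mentioned in the comment must exist next to the weights.
    for name in ("config.json", "vocab.txt"):
        if not (root / name).is_file():
            problems.append(f"missing {name}")
    if problems:
        return problems

    config = json.loads((root / "config.json").read_text(encoding="utf-8"))

    # Assumption: one token per line, as in a BertTokenizer-style vocab.txt.
    with open(root / "vocab.txt", encoding="utf-8") as f:
        n_tokens = sum(1 for _ in f)

    expected = config.get("vocab_size")
    if expected is not None and n_tokens != expected:
        problems.append(
            f"vocab.txt has {n_tokens} lines, config vocab_size is {expected}"
        )
    return problems
```

Calling `check_model_dir("path/to/model_dir")` (the path is a placeholder) returns an empty list when both files are present and the sizes agree; with the config above, vocab.txt would be expected to have 21132 lines.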
"no_repeat_ngram_size": 3,
After removing this parameter from config.json, the output is normal.
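If you would rather not edit the file by hand, the removal can be scripted. This is a rough sketch using only the Python standard library; `drop_config_key` is a hypothetical helper name, and the path in the usage note is a placeholder for wherever your local copy of the model lives.

```python
import json
from pathlib import Path


def drop_config_key(config_path, key="no_repeat_ngram_size"):
    """Remove `key` from a JSON config file in place.

    Returns True if the key was present and removed, False otherwise.
    Hypothetical helper for illustration only.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if key not in config:
        return False  # nothing to do, leave the file untouched
    del config[key]
    # Rewrite the file with 2-space indentation; key order is preserved.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True
```

For example, `drop_config_key("path/to/model_dir/config.json")` rewrites the file without `no_repeat_ngram_size` and leaves every other setting as it was. Keeping a backup of the original config.json first is a sensible precaution.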