ymcui / chinese-xlnet

Pre-Trained Chinese XLNet（中文XLNet预训练模型）

Home Page: http://xlnet.hfl-rc.com

License: Apache License 2.0

Python 100.00%

natural-language-processing xlnet tensorflow pytorch nlp

chinese-xlnet's Issues

Learning-rate decay code has a bug when fine-tuning for classification

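In get_train_op, the "poly" decay is computed with decay_steps=FLAGS.train_steps - FLAGS.warmup_steps, but FLAGS.train_steps always keeps its initial value of 1000. It looks like run_classifier.py computes the real number of training steps but never assigns it back to FLAGS.train_steps.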
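To illustrate the effect described in this report, here is a minimal pure-Python sketch of linear warmup followed by a power-1 polynomial decay. It is not the repository's code: `poly_lr`, the step counts, and the learning rate are assumptions chosen only to show what happens when the decay horizon is shorter than the real run.

```python
def poly_lr(step, base_lr, train_steps, warmup_steps, end_lr=0.0):
    """Linear warmup, then linear (power-1 polynomial) decay to end_lr."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    decay_steps = train_steps - warmup_steps
    # Progress is clipped at 1.0, so the rate stays at end_lr afterwards.
    progress = min(step - warmup_steps, decay_steps) / decay_steps
    return (base_lr - end_lr) * (1.0 - progress) + end_lr


# Hypothetical run: 5000 real training steps, 100 warmup steps.
real_train_steps = 5000
stale = poly_lr(2000, 2e-5, train_steps=1000, warmup_steps=100)
fixed = poly_lr(2000, 2e-5, train_steps=real_train_steps, warmup_steps=100)
print("stale train_steps:", stale)
print("real train_steps: ", fixed)
```

With the stale horizon the rate has already decayed to its floor at step 2000, while the schedule built from the real step count is still well above it.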

For the downstream NER task, how should BIO-tagged training data be converted into XLNet input data? Could you provide a Python script for this? Thanks.

train.py

Hi. Has the script used for pre-training been removed from this repository?
Thanks a lot.

[question] What are the pre-training losses of the released xlnet-mid and xlnet-base models?

XLNet-mid runs out of memory (OOM) when running run_classifier.py.
Parameters: max_seq_length=512, train_batch_size=1.
GPU: V100 with 32 GB of memory.
Besides max_seq_length and train_batch_size, which parameters (or anything else) can I tune to avoid the OOM?

Which TensorFlow version does the code use?

Some config parameters of the downloaded pre-trained model differ from the config in the README

The config of the downloaded model ('chinese_xlnet_base_pytorch'):

Model config {
"attn_type": "bi",
"bi_data": false,
"clamp_len": -1,
"d_head": 64,
"d_inner": 3072,
"d_model": 768,
"dropout": 0.1,
"end_n_top": 5,
"ff_activation": "relu",
"finetuning_task": null,
"initializer_range": 0.02,
"layer_norm_eps": 1e-12,
"mem_len": null,
"n_head": 12,
"n_layer": 12,
"n_token": 32000,
"num_labels": 2,
"output_attentions": false,
"output_hidden_states": false,
"output_past": true,
"pruned_heads": {},
"reuse_len": null,
"same_length": false,
"start_n_top": 5,
"summary_activation": "tanh",
"summary_last_dropout": 0.1,
"summary_type": "last",
"summary_use_proj": true,
"torchscript": false,
"untie_r": true,
"use_bfloat16": false
}

The pre-training config in the README:

python train.py \
  --record_info_dir=$DATA \
  --model_dir=$MODEL_DIR \
  --train_batch_size=32 \
  --seq_len=512 \
  --reuse_len=256 \
  --mem_len=384 \
  --perm_size=256 \
  --n_layer=24 \
  --d_model=768 \
  --d_embed=768 \
  --n_head=12 \
  --d_head=64 \
  --d_inner=3072 \
  --untie_r=True \
  --mask_alpha=6 \
  --mask_beta=1 \
  --num_predict=85 \
  --uncased=False \
  --train_steps=2000000 \
  --save_steps=20000 \
  --warmup_steps=20000 \
  --max_save=20 \
  --weight_decay=0.01 \
  --adam_epsilon=1e-6 \
  --learning_rate=1e-4 \
  --dropout=0.1 \
  --dropatt=0.1 \
  --tpu=$TPU_NAME \
  --tpu_zone=$TPU_ZONE \
  --use_tpu=True

My question is: why are 'mem_len' and 'reuse_len' null (None) in the downloaded model? Thanks.
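The comparison in this issue can be made mechanical. The sketch below diffs the keys that appear in both listings; `diff_configs` is a hypothetical helper, and the two dictionaries are copied by hand from the values quoted above rather than read from any real file.

```python
def diff_configs(downloaded, readme):
    """Return {key: (downloaded_value, readme_value)} for shared keys that differ."""
    shared = set(downloaded) & set(readme)
    return {k: (downloaded[k], readme[k]) for k in shared if downloaded[k] != readme[k]}


# Values copied from the downloaded model config quoted in this issue.
downloaded = {
    "mem_len": None, "reuse_len": None, "n_layer": 12, "d_model": 768,
    "n_head": 12, "d_head": 64, "d_inner": 3072, "untie_r": True, "dropout": 0.1,
}
# Values copied from the README pre-training command quoted in this issue.
readme = {
    "mem_len": 384, "reuse_len": 256, "n_layer": 24, "d_model": 768,
    "n_head": 12, "d_head": 64, "d_inner": 3072, "untie_r": True, "dropout": 0.1,
}

for key, (got, documented) in sorted(diff_configs(downloaded, readme).items()):
    print(f"{key}: downloaded={got!r} readme={documented!r}")
```

On the quoted values this also surfaces n_layer (12 in the downloaded config, 24 in the README command), in addition to the two keys the question asks about.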

Support

How do I use the vocabulary?

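Hello. I have fine-tuned BERT before, but after getting your XLNet I found that its vocabulary is not in the same form as BERT's (a txt file). How should I use this pre-trained model? Is there a library for it, like pytorch_pretrained_bert (I use PyTorch)?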

When using SentencePiece, were spaces manually added to the (Chinese) text? For example, is the input ('你好吗') or ('你 好 吗')?

Sequence labeling question

Since SentencePiece splits text into subwords rather than single characters here, how should sequence-labeling tags be mapped? Any suggestions?
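One possible mapping, sketched under assumptions: give each piece the label of its first character and mark marker-only pieces with a placeholder that the loss can ignore. `align_labels` and the example pieces are hypothetical (they are not real tokenizer output), and the sketch assumes the pieces concatenate back to the original characters. A piece that straddles an entity boundary loses information under this scheme, which is a known limitation of first-character alignment.

```python
SPIECE_UNDERLINE = "\u2581"  # the "▁" word-boundary marker used by SentencePiece


def align_labels(pieces, char_labels, pad_label="X"):
    """Map character-level BIO labels onto subword pieces (first-character rule)."""
    piece_labels = []
    pos = 0
    for piece in pieces:
        text = piece.replace(SPIECE_UNDERLINE, "")
        if not text:
            # A piece that is only the marker covers no characters.
            piece_labels.append(pad_label)
            continue
        piece_labels.append(char_labels[pos])
        pos += len(text)
    if pos != len(char_labels):
        raise ValueError("pieces and character labels do not cover the same text")
    return piece_labels


# Hypothetical segmentation of "我在北京工作" with character-level labels.
chars_labels = ["O", "O", "B-LOC", "I-LOC", "O", "O"]
pieces = [SPIECE_UNDERLINE + "我", "在", "北京", "工作"]
print(align_labels(pieces, chars_labels))
```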

Error reported to Coordinator: Expected float32, got '/part_0' of type 'str' instead

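I am using the xlnet-base pre-trained model for a multi-label classification task, and my labels are float32. When the model reaches line 82 of model_utils.py (tf.train.init_from_checkpoint(init_checkpoint, assignment_map)), it fails with: Error reported to Coordinator: Expected float32, got '/part_0' of type 'str' instead. Is loading the pre-trained model related to my label type?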

How do I load the tokenizer when loading XLNet-base with PyTorch?

Loading the model the way the example shows works without errors, but loading the tokenizer fails.

Roughly how many parameters does mid have? Can I fine-tune it on a V100, and how much larger is it than BERT base?

cmrc2018 fine-tuning results cannot be reproduced (GPU)

How much GPU memory is needed for fine-tuning?

Hello! How much GPU memory is needed to fine-tune this mid XLNet on downstream tasks such as text classification?

My xlnet_mid training stays at an F1 score of 60%, and EM is also at 60%. Could you help take a look?

Prediction example length differs from the actual example length

Why does the length of my prediction examples differ from the length of the actual examples?
Is it because of the following code?
I am using a GPU, not a TPU.

  if FLAGS.do_eval:
    # TPU requires a fixed batch size for all batches, therefore the number
    # of examples must be a multiple of the batch size, or else examples
    # will get dropped. So we pad with fake examples which are ignored
    # later on. These do NOT count towards the metric (all tf.metrics
    # support a per-instance weight, and these get a weight of 0.0).
    #
    # Modified in XL: We also adopt the same mechanism for GPUs.
    while len(eval_examples) % FLAGS.eval_batch_size != 0:
      eval_examples.append(PaddingInputExample())
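The padding step quoted above explains why the two counts can differ: the padded list is longer than the real one. A minimal sketch of the idea follows, with `PaddingExample`, `pad_to_batch_multiple`, and `drop_padding` as hypothetical stand-ins rather than the repository's own classes or functions.

```python
class PaddingExample:
    """Hypothetical placeholder for a fake example added only to fill a batch."""


def pad_to_batch_multiple(examples, batch_size):
    """Append placeholders until the count is a multiple of batch_size."""
    padded = list(examples)
    while len(padded) % batch_size != 0:
        padded.append(PaddingExample())
    return padded


def drop_padding(examples, predictions):
    """Keep only the predictions that belong to real examples."""
    return [p for e, p in zip(examples, predictions)
            if not isinstance(e, PaddingExample)]


real_examples = ["a", "b", "c", "d", "e"]
padded_examples = pad_to_batch_multiple(real_examples, batch_size=4)
fake_predictions = list(range(len(padded_examples)))  # one output per padded slot
kept = drop_padding(padded_examples, fake_predictions)
print(len(real_examples), len(padded_examples), len(kept))
```

Filtering on the placeholder type brings the prediction count back in line with the real example count.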

Could you share the expected release date? Many thanks!

Poor results on sentence-pair classification

Have you evaluated on sentence-pair classification tasks? I tried it, and the results are much worse than the official BERT (BERT-Base, Chinese).

XLNet performs very differently on the DRCD dev set and the DRCD test set

Problem: I fine-tuned XLNet-Chinese on the DRCD train set and applied the model to the DRCD dev set and test set, and the results are very different. I don't know which step I got wrong.

Model	Max sequence Length	Batch size	Learning rate	Train steps	Dev set (EM/F1)	Test set (EM/F1)
XLNet-Chinese	256	2	3e-5	12000	85.44 / 93.32	4.85 / 0.43

Steps:

First run preprocess.sh with max seq length = 256; my GPU cannot handle anything larger.

#!/bin/bash
#### local path
DRCD_DIR=raw_data/
INIT_CKPT_DIR=XLNet/xlnet_pretrain_model/chinese_xlnet_mid_L-24_H-768_A-12
#### google storage path
GS_ROOT=
GS_PROC_DATA_DIR=XLNet/proc_data
python3 XLNet/xlnet/run_squad.py \
  --use_tpu=False \
  --do_prepro=True \
  --spiece_model_file=${INIT_CKPT_DIR}/spiece.model \
  --train_file=${DRCD_DIR}/DRCD_training.json \
  --output_dir=${GS_PROC_DATA_DIR} \
  --uncased=False \
  --max_seq_length=256 \
  $@

Fine-tune on the DRCD train set, predict the dev set, then use cmrc2018_evaluate.py to produce EM and F1 on the dev set.

Fine tune on train set and predict dev set

#!/bin/bash
#### local path
DRCD_DIR=raw_data/
INIT_CKPT_DIR=XLNet/xlnet_pretrain_model/chinese_xlnet_mid_L-24_H-768_A-12
PROC_DATA_DIR=XLNet/proc_data
MODEL_DIR=XLNet/experiment/chinese_xlnet_mid_L-24_H-768_A-12_S-256_B-2
CUDA_VISIBLE_DEVICES=0,1 python3 XLNet/xlnet/run_squad.py \
  --use_tpu=False \
  --num_hosts=1 \
  --num_core_per_host=3 \
  --model_config_path=${INIT_CKPT_DIR}/xlnet_config.json \
  --spiece_model_file=${INIT_CKPT_DIR}/spiece.model \
  --output_dir=${PROC_DATA_DIR} \
  --init_checkpoint=${INIT_CKPT_DIR}/xlnet_model.ckpt \
  --model_dir=${MODEL_DIR}/model_ckpt \
  --train_file=${DRCD_DIR}/DRCD_training.json \
  --predict_file=${DRCD_DIR}/DRCD_dev.json \
  --predict_dir=${MODEL_DIR}/predict_result/dev \
  --uncased=False \
  --max_seq_length=256 \
  --do_train=True \
  --train_batch_size=2 \
  --do_predict=True \
  --predict_batch_size=32 \
  --learning_rate=3e-5 \
  --adam_epsilon=1e-6 \
  --iterations=1000 \
  --save_steps=1000 \
  --train_steps=12000 \
  --warmup_steps=1000 \
  $@

Evaluate on dev set

#!/bin/bash
####local path
DRCD_DIR=raw_data/
EVALUATE_DIR=XLNet/xlnet/
PREDICT_RESULT=XLNet/experiment/chinese_xlnet_mid_L-24_H-768_A-12_S-256_B-2/predict_result
python2 $EVALUATE_DIR/cmrc2018_evaluate.py $DRCD_DIR/DRCD_dev.json $PREDICT_RESULT/dev/predictions.json

Performance on dev set

Use the trained XLNet to predict the test set, then use cmrc2018_evaluate.py to produce EM and F1 on the test set.

Predict test set

#!/bin/bash
#### local path
DRCD_DIR=raw_data/
INIT_CKPT_DIR=XLNet/xlnet_pretrain_model/chinese_xlnet_mid_L-24_H-768_A-12
INIT_CKPT_DIR_1=XLNet/experiment/chinese_xlnet_mid_L-24_H-768_A-12_S-256_B-2
PROC_DATA_DIR=XLNet/proc_data
MODEL_DIR=XLNet/experiment/chinese_xlnet_mid_L-24_H-768_A-12_S-256_B-2
CUDA_VISIBLE_DEVICES=0,1 python3 XLNet/xlnet/run_squad.py \
  --use_tpu=False \
  --num_hosts=1 \
  --num_core_per_host=3 \
  --model_config_path=${INIT_CKPT_DIR}/xlnet_config.json \
  --spiece_model_file=${INIT_CKPT_DIR}/spiece.model \
  --output_dir=${PROC_DATA_DIR} \
  --init_checkpoint=${INIT_CKPT_DIR_1}/model.ckpt-12000 \
  --model_dir=${MODEL_DIR}/model_ckpt \
  --train_file=${DRCD_DIR}/DRCD_training.json \
  --predict_file=${DRCD_DIR}/DRCD_test.json \
  --predict_dir=${MODEL_DIR}/predict_result/test \
  --uncased=False \
  --max_seq_length=256 \
  --do_train=False \
  --train_batch_size=2 \
  --do_predict=True \
  --predict_batch_size=32 \
  --learning_rate=3e-5 \
  --adam_epsilon=1e-6 \
  --iterations=1000 \
  --save_steps=1000 \
  --train_steps=12000 \
  --warmup_steps=1000 \
  $@

Evaluate on test set

#!/bin/bash
####local path
DRCD_DIR=raw_data/
EVALUATE_DIR=XLNet/xlnet/
PREDICT_RESULT=XLNet/experiment/chinese_xlnet_mid_L-24_H-768_A-12_S-256_B-2/predict_result
python2 $EVALUATE_DIR/cmrc2018_evaluate.py $DRCD_DIR/DRCD_test.json $PREDICT_RESULT/test/predictions.json

Performance on test set

Still not released? We have waited so long the flowers have wilted.

Is there support for sequence-labeling tasks such as word segmentation, NER, and relation extraction?

What are the names of the model's input and output Tensor objects?

When freezing the ckpt model file, I need to know the names of the input and output nodes.

Could you provide a simple PyTorch tutorial? Thanks.

As in the title.

Data processing scripts

Could you provide the train/dev/test data processing scripts or tfrecord files for CMRC and DRCD? The processing script in XLNet is not compatible when used directly.

How do I load this pre-trained model in PyTorch?

I found that the TF model was converted to PyTorch using Hugging Face's transformers,
so I downloaded the PyTorch pre-trained model and used Hugging Face's code to load it.
The code is as follows:

model_class = XLNetForSequenceClassification
tokenizer_class = XLNetTokenizer
pretrained_weights = "./pretrain_model" #here is a dir where the pretrain model  is unzipped
tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
model = model_class.from_pretrained(pretrained_weights, num_labels=10)

However, model_class.from_pretrained(xxx) throws an error:

Any ideas are appreciated, thanks!
Another question: when using the Chinese pre-trained model, what should I do with the tokenizer? Should the Chinese input sequence be split into words or sub-words?

Chinese XLNet for text generation?

Hi, Thanks for your work.

I was trying to use your model to generate Chinese text (as can be done in terms of English with XLNetLMHeadModel in huggingface transformers). But I got:

    "You tried to generate sequences with a model that does not have a LM Head."
AttributeError: You tried to generate sequences with a model that does not have a LM Head.Please use another model class (e.g. `OpenAIGPTLMHeadModel`, `XLNetLMHeadModel`, `GPT2LMHeadModel`, `CTRLLMHeadModel`, `T5WithLMHeadModel`, `TransfoXLLMHeadModel`)

Does this model contain LM Head for text generation task and is there a plan to release one?

Where can I download this pre-trained XLNet?

I have recently been building a Chinese NER project and want to use Chinese vectors, but I couldn't find the pre-trained XLNet model. Could you please give me a download page URL? Thank you so much.

Cannot install sentencepiece with pip

SPIECE_UNDERLINE

Hello, what role does SPIECE_UNDERLINE play in the code? When encode_pieces tokenizes, it also adds SPIECE_UNDERLINE at the start of every sentence.
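For context, SentencePiece marks where a whitespace-delimited word begins with the "▁" character (U+2581), which is what makes the original text recoverable from its pieces. The sketch below shows that round trip on hand-written pieces; `pieces_to_text` is a hypothetical helper and the piece lists are illustrative assumptions, not real output of this model's tokenizer.

```python
SPIECE_UNDERLINE = "\u2581"  # "▁", the word-boundary marker


def pieces_to_text(pieces):
    """Join pieces and turn each boundary marker back into a space."""
    return "".join(pieces).replace(SPIECE_UNDERLINE, " ").strip()


# Hypothetical pieces: the marker at the start stands for the sentence-initial boundary.
english_pieces = [SPIECE_UNDERLINE + "Hello", SPIECE_UNDERLINE + "wor", "ld"]
chinese_pieces = [SPIECE_UNDERLINE, "你好", "吗"]

print(pieces_to_text(english_pieces))
print(pieces_to_text(chinese_pieces))
```

Because Chinese text usually has no spaces, the marker tends to show up only at the start of the sentence, which matches the behavior described in the question.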

Error when using PyTorch

Using the Hugging Face Quick tour approach.
Code:
import torch
import tokenization_xlnet
import modeling_xlnet
tokenizer = tokenization_xlnet.XLNetTokenizer.from_pretrained('xlnet-mid-chinese')
model = modeling_xlnet.XLNetModel.from_pretrained('xlnet-mid-chinese')
input_ids = torch.tensor([tokenizer.encode("我喜欢吃西红柿炒鸡蛋", add_special_tokens=True)])
with torch.no_grad():
    last_hidden_states = model(input_ids)[0]
all_hidden_states, all_attentions = model(input_ids)[-2:]
traced_model = torch.jit.trace(model, (input_ids,))
model.save_pretrained('./test_save') # save

Problem encountered:
/py3.6/lib/python3.6/site-packages/torch/tensor.py:389: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
'incorrect results).', category=RuntimeWarning)
Traceback (most recent call last):
File "xlnet_test.py", line 14, in <module>
traced_model = torch.jit.trace(model, (input_ids,))
File "/py3.6/lib/python3.6/site-packages/torch/jit/__init__.py", line 772, in trace
check_tolerance, _force_outplace, _module_class)
File "/py3.6/lib/python3.6/site-packages/torch/jit/__init__.py", line 904, in trace_module
module._c._create_method_from_trace(method_name, func, example_inputs, var_lookup_fn, _force_outplace)
RuntimeError: Tracer cannot infer type of (tensor([[[ 1.8302, -0.2841, 1.7623, ..., -4.0171, -2.8738, -2.7551],
[-0.1806, -0.4168, -0.9308, ..., -3.9143, -1.5399, -1.9979],
[ 1.8243, 1.3354, -0.4644, ..., -3.2942, -1.5304, -1.4603],
...,
[-2.4907, -0.2998, 1.6560, ..., -1.6929, 2.9048, 0.2806],
[-3.3055, 2.5498, 2.3597, ..., -2.5295, 1.5212, -1.0081],
[-0.8349, 0.0219, 1.2810, ..., -3.9269, 1.6507, -0.4940]]],
grad_fn=), (None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None))
:Cannot infer type of a None value (toTraceableIValue at /pytorch/torch/csrc/jit/pybind_utils.h:268)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f256bee8273 in /py3.6/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: + 0x44e288 (0x7f256cf27288 in /py3.6/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #2: + 0x4bdda2 (0x7f256cf96da2 in /py3.6/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #3: + 0x4d1d81 (0x7f256cfaad81 in /py3.6/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #4: + 0x1d3ef4 (0x7f256ccacef4 in /py3.6/lib/python3.6/site-packages/torch/lib/libtorch_python.so)

frame #6: python() [0x5067b0]
frame #8: python() [0x504232]
frame #9: python() [0x505e83]
frame #10: python() [0x5066f0]
frame #12: python() [0x504232]
frame #13: python() [0x505e83]
frame #14: python() [0x5066f0]
frame #16: python() [0x504232]
frame #18: python() [0x647fa2]
frame #23: __libc_start_main + 0xf0 (0x7f2570fb4830 in /lib/x86_64-linux-gnu/libc.so.6)

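How do I train on a single machine with multiple GPUs?

For example, in the classification task, do I still need to pass --use_tpu=True?
Does XLNet itself support distributed training?
Does multi-GPU training support Google's approach, like this?
CUDA_VISIBLE_DEVICES=0,1,2,3 python run_classifier.py

Looking forward to your answer.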

Error loading the vocabulary with sentencepiece

Using the downloaded vocabulary directly, loading fails with the following error:
RuntimeError: Internal: /sentencepiece/src/sentencepiece_processor.cc(73) [model_proto->ParseFromArray(serialized.data(), serialized.size())]

The original code is configured with 1024, but the released pre-trained model uses 768. How do I change it?

Shape of variable model/transformer/layer_0/ff/LayerNorm/beta:0 ((1024,)) doesn't match with shape of tensor model/transformer/layer_0/ff/LayerNorm/beta ([768]) from checkpoint reader

Failed to get matching files

When I ran run_cmrc_drcd.py, creating the checkpoint failed with "Failed to get matching files". I guess this is because there is no xlnet_model.ckpt among the pre-trained model files. I renamed xlnet_model.ckpt.meta to xlnet_model.ckpt, but it still cannot find xlnet_model.ckpt.

INFO:tensorflow:Create CheckpointSaverHook.
I0120 13:58:49.598975 140028613015424 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Done calling model_fn.
I0120 13:58:50.135126 140028613015424 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:TPU job name tpu_worker
I0120 13:58:53.244385 140028613015424 tpu_estimator.py:506] TPU job name tpu_worker
INFO:tensorflow:Graph was finalized.
I0120 13:58:55.594104 140028613015424 monitored_session.py:240] Graph was finalized.
ERROR:tensorflow:Error recorded from training_loop: From /job:tpu_worker/replica:0/task:0:
Unsuccessful TensorSliceReader constructor: Failed to get matching files on /content/drive/My Drive/chinese_xlnet_mid_L-24_H-768_A-12/xlnet_model.ckpt: Unimplemented: File system scheme '[local]' not implemented (file: '/content/drive/My Drive/chinese_xlnet_mid_L-24_H-768_A-12/xlnet_model.ckpt')
	 [[node checkpoint_initializer_117 (defined at usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'checkpoint_initializer_117':
  File "content/drive/My Drive/Chinese-PreTrained-XLNet-master/src/run_cmrc_drcd.py", line 1292, in <module>
    tf.app.run()
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "content/drive/My Drive/Chinese-PreTrained-XLNet-master/src/run_cmrc_drcd.py", line 1193, in main
    estimator.train(input_fn=train_input_fn, max_steps=FLAGS.train_steps)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3030, in train
    saving_listeners=saving_listeners)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1191, in _train_model_default
    features, labels, ModeKeys.TRAIN, self.config)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2857, in _call_model_fn
    config)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1149, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3184, in _model_fn
    scaffold = _get_scaffold(scaffold_fn)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3749, in _get_scaffold
    scaffold = scaffold_fn()
  File "content/drive/My Drive/Chinese-PreTrained-XLNet-master/src/model_utils.py", line 77, in tpu_scaffold
    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/checkpoint_utils.py", line 291, in init_from_checkpoint
    init_from_checkpoint_fn)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1940, in merge_call
    return self._merge_call(merge_fn, args, kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1947, in _merge_call
    return merge_fn(self._strategy, *args, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/checkpoint_utils.py", line 286, in <lambda>
    ckpt_dir_or_file, assignment_map)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/checkpoint_utils.py", line 334, in _init_from_checkpoint
    _set_variable_or_list_initializer(var, ckpt_file, tensor_name_in_ckpt)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/checkpoint_utils.py", line 458, in _set_variable_or_list_initializer
    _set_checkpoint_initializer(variable_or_list, ckpt_file, tensor_name, "")
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/checkpoint_utils.py", line 412, in _set_checkpoint_initializer
    ckpt_file, [tensor_name], [slice_spec], [base_type], name=name)[0]
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1696, in restore_v2
    name=name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()
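On the renaming attempt in this report: a TensorFlow checkpoint path such as xlnet_model.ckpt is normally a prefix shared by several files (.index, .data-*, .meta), so no file with exactly that name is expected to exist. The sketch below lists what sits behind a prefix; `checkpoint_files` is a hypothetical helper and the path is a placeholder. It only covers the file-naming part, since the log itself reports that the TPU worker does not implement the '[local]' file system scheme.

```python
import glob


def checkpoint_files(prefix):
    """Return the files on disk that share the given checkpoint prefix."""
    # glob.escape guards against special characters in the directory name.
    return sorted(glob.glob(glob.escape(prefix) + ".*"))


# Placeholder path: point this at the unpacked pre-trained model directory.
prefix = "/path/to/chinese_xlnet_mid_L-24_H-768_A-12/xlnet_model.ckpt"
found = checkpoint_files(prefix)
if found:
    print("checkpoint parts:", found)
else:
    print("no files share the prefix", prefix)
```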

Why is XLNet word-based rather than character-based?

About downloading the ChnSentiCorp dataset

Hello,
Where can I download the ChnSentiCorp dataset?
None of the links online seem to lead to a download location.
Thanks!

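The README lists Chinese XLNet results on many open datasets, but I can't quite follow how the results are presented.

83.1 (82.7) / 89.9 (89.6): what do the values on either side of the slash and the values in parentheses correspond to? base_best (average) / large_best (average)?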

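How large is the gap between the TensorFlow and PyTorch versions?

Hello!
Thank you very much for open-sourcing the models!
In my own offline tests, however, the PyTorch results are much lower than the TensorFlow ones. Have you tested the PyTorch version? Or is there a problem with my own code?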

OOM when load model

Thanks a lot for the excellent work! When I tried a text classification task, I got this error message:
2019-11-09 11:49:17.220991: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at training_ops.cc:2816 : Resource exhausted: OOM when allocating tensor with shape[3072,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
So I want to know whether a 2080 Ti can run XLNet at all.
Setting train_batch_size=1 and max_seq_length=100 does not work!
Thanks again!

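About the MRC task

When fine-tuning XLNet on an MRC task, I found the results are clearly far worse than RoBERTa-wwm-ext-large. The data loading part should be fine. Is something wrong with the model?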

Which TensorFlow versions are supported?

TensorFlow 2.0 gives syntax errors, and tensorflow-gpu 1.13 fails at the eval step:
Traceback (most recent call last):
File "run_classifier.py", line 1002, in <module>
tf.app.run()
File "/root/miniconda3/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "run_classifier.py", line 912, in main
global_step = int(cur_filename.split("-")[-1])
ValueError: invalid literal for int() with base 10: '/home/workspace/models/xlnet_model.ckpt'
This is probably caused by a mismatch in checkpoint file naming between TensorFlow versions. Could you give the range of supported TensorFlow versions?
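The traceback shows int() being applied to the last "-"-separated segment of a checkpoint path, which fails when the path carries no "-<step>" suffix (as with xlnet_model.ckpt here). A more defensive parse could look like the sketch below; `parse_global_step` is a hypothetical helper, not a function from run_classifier.py, and whether skipping such files is the right behavior for the eval loop is an assumption.

```python
import re


def parse_global_step(checkpoint_path):
    """Return the trailing step number of a checkpoint path, or None if absent."""
    match = re.search(r"-(\d+)$", checkpoint_path)
    return int(match.group(1)) if match else None


for path in ["/home/workspace/models/xlnet_model.ckpt", "model.ckpt-12000"]:
    step = parse_global_step(path)
    if step is None:
        print("no step suffix, skipping:", path)
    else:
        print("global step", step, "from", path)
```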

Will the corpus for training be open-sourced?

As we all know, Chinese NLP research has been slowed down by the unavailability of large open-source corpora, and this problem has become more severe with the recent advances in large pre-trained LMs. Could you open-source the training corpus for further research or follow-up work?
