Comments (4)
It is xlnet_model.ckpt, not xlnet_modal.ckpt.
Would you please check your command line to see whether your model names are correct?
Also, you are encouraged to post your command lines here so that I can look into what's wrong.
from chinese-xlnet.
Yes, I copied the directory path of the file.
Here is my command line:
xlnet="python -u /content/drive/'My Drive'/Chinese-PreTrained-XLNet-master/src/run_cmrc_drcd.py \
--spiece_model_file=/content/drive/'My Drive'/spiece.model \
--model_config_path=/content/drive/'My Drive'/xlnet_config.json \
--init_checkpoint=/content/drive/'My Drive'/chinese_xlnet_mid_L_24_H_768_A_12/xlnet_model.ckpt \
--use_tpu=True \
--num_hosts=1 \
--num_core_per_host=8 \
--output_dir=/content/drive/'My Drive'/cmrc2018-master/squad-style-data \
--model_dir=/content/drive/'My Drive'/chinese_xlnet_mid_L-24_H-768_A-12 \
--predict_dir=/content/drive/'My Drive'/chinese_xlnet_mid_L-24_H-768_A-12/eval \
--train_file=/content/drive/'My Drive'/cmrc2018-master/squad-style-data/cmrc2018_train.json \
--uncased=False \
--max_answer_length=40 \
--max_seq_length=512 \
--do_train=True \
--train_batch_size=16 \
--do_predict=False \
--predict_batch_size=16 \
--learning_rate=3e-5 \
--adam_epsilon=1e-6 \
--iterations=1000 \
--save_steps=2000 \
--train_steps=2400 \
--warmup_steps=240"
!{xlnet}
I ran it in Colab. Could that be the cause?
The model does initialize from the checkpoint:
I0120 14:25:44.918656 139675852879744 model_utils.py:71] Initialize from the ckpt /content/drive/My Drive/chinese_xlnet_mid_L_24_H_768_A_12/xlnet_model.ckpt
INFO:tensorflow:**** Global Variables ****
I0120 14:25:44.925628 139675852879744 model_utils.py:85] **** Global Variables ****
INFO:tensorflow: name = model/transformer/r_w_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
I0120 14:25:44.925852 139675852879744 model_utils.py:91] name = model/transformer/r_w_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_r_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
I0120 14:25:44.926049 139675852879744 model_utils.py:91] name = model/transformer/r_r_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 768), *INIT_FROM_CKPT*
I0120 14:25:44.926191 139675852879744 model_utils.py:91] name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 768), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/r_s_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
I0120 14:25:44.926325 139675852879744 model_utils.py:91] name = model/transformer/r_s_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
INFO:tensorflow: name = model/transformer/seg_embed:0, shape = (24, 2, 12, 64), *INIT_FROM_CKPT*
But after the graph was finalized, it reported that it failed to get matching files.
OK, I see.
As far as I know, if you are using a TPU for computation, the checkpoint should be loaded from GCS (Google Cloud Storage) rather than the local file system.
A GCS path looks like 'gs://your-bucket-name/dir/file'.
I suggest referring to the BERT fine-tuning tutorial on Colab.
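To make the path change concrete, here is a minimal sketch that maps the local checkpoint path from the command above to a 'gs://' path. The helper names `is_gcs_path` and `to_gcs_path` are hypothetical, and `your-bucket-name` is a placeholder; it only builds the path string and assumes the checkpoint files have already been copied into the bucket under the same relative layout.

```python
def is_gcs_path(path: str) -> bool:
    """Return True if the path is a gs:// URI with a non-empty bucket name."""
    prefix = "gs://"
    return (
        path.startswith(prefix)
        and len(path) > len(prefix)
        and path[len(prefix)] != "/"
    )


def to_gcs_path(local_path: str, local_root: str, bucket: str) -> str:
    """Map a path under local_root to the same relative path inside a bucket.

    This only rewrites the string; it does not upload anything.
    """
    root = local_root.rstrip("/") + "/"
    if not local_path.startswith(root):
        raise ValueError(f"{local_path!r} is not under {local_root!r}")
    relative = local_path[len(root):]
    return f"gs://{bucket}/{relative}"


# Local path taken from the command line in this thread.
local_ckpt = (
    "/content/drive/My Drive/"
    "chinese_xlnet_mid_L_24_H_768_A_12/xlnet_model.ckpt"
)

# "your-bucket-name" is a placeholder, not a real bucket.
gcs_ckpt = to_gcs_path(local_ckpt, "/content/drive/My Drive", "your-bucket-name")
print(gcs_ckpt)
```

The resulting string is what would be passed as --init_checkpoint. Other flags that the TPU reads or writes, such as --model_dir, would likely need the same treatment, but that is an assumption to verify against your setup.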
Thank you, my problem was solved!