ymcui commented on May 24, 2024

It is xlnet_model.ckpt not xlnet_modal.ckpt.
Would you please check your command lines to see if your model names are correct.
Also, you are encouraged to post your command lines here so that I can look into what's wrong.

from chinese-xlnet.

liuziyi219 commented on May 24, 2024

> It is xlnet_model.ckpt not xlnet_modal.ckpt.
> Would you please check your command lines to see if your model names are correct.
> Also, you are encouraged to post your command lines here so that I can look into what's wrong.

Yes, I copied the file path directly.

Here is my command line:

xlnet="python -u /content/drive/'My Drive'/Chinese-PreTrained-XLNet-master/src/run_cmrc_drcd.py \
	--spiece_model_file=/content/drive/'My Drive'/spiece.model \
	--model_config_path=/content/drive/'My Drive'/xlnet_config.json \
	--init_checkpoint=/content/drive/'My Drive'/chinese_xlnet_mid_L_24_H_768_A_12/xlnet_model.ckpt \
	--use_tpu=True \
	--num_hosts=1 \
	--num_core_per_host=8 \
	--output_dir=/content/drive/'My Drive'/cmrc2018-master/squad-style-data \
	--model_dir=/content/drive/'My Drive'/chinese_xlnet_mid_L-24_H-768_A-12 \
	--predict_dir=/content/drive/'My Drive'/chinese_xlnet_mid_L-24_H-768_A-12/eval \
	--train_file=/content/drive/'My Drive'/cmrc2018-master/squad-style-data/cmrc2018_train.json \
	--uncased=False \
	--max_answer_length=40 \
	--max_seq_length=512 \
	--do_train=True \
	--train_batch_size=16 \
	--do_predict=False \
	--predict_batch_size=16 \
	--learning_rate=3e-5 \
	--adam_epsilon=1e-6 \
	--iterations=1000 \
	--save_steps=2000 \
	--train_steps=2400 \
	--warmup_steps=240"
!{xlnet}

I ran it in Colab; could that be the cause?
The model does initialize from the checkpoint:

I0120 14:25:44.918656 139675852879744 model_utils.py:71] Initialize from the ckpt /content/drive/My Drive/chinese_xlnet_mid_L_24_H_768_A_12/xlnet_model.ckpt
INFO:tensorflow:**** Global Variables ****
I0120 14:25:44.925628 139675852879744 model_utils.py:85] **** Global Variables ****
INFO:tensorflow:  name = model/transformer/r_w_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
I0120 14:25:44.925852 139675852879744 model_utils.py:91]   name = model/transformer/r_w_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
INFO:tensorflow:  name = model/transformer/r_r_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
I0120 14:25:44.926049 139675852879744 model_utils.py:91]   name = model/transformer/r_r_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
INFO:tensorflow:  name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 768), *INIT_FROM_CKPT*
I0120 14:25:44.926191 139675852879744 model_utils.py:91]   name = model/transformer/word_embedding/lookup_table:0, shape = (32000, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = model/transformer/r_s_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
I0120 14:25:44.926325 139675852879744 model_utils.py:91]   name = model/transformer/r_s_bias:0, shape = (24, 12, 64), *INIT_FROM_CKPT*
INFO:tensorflow:  name = model/transformer/seg_embed:0, shape = (24, 2, 12, 64), *INIT_FROM_CKPT*

But after the graph was finalized, it reported that it failed to get matching files.

ymcui commented on May 24, 2024

OK, I see.
As far as I know, if you are using a TPU for computing, the checkpoint must be loaded from GCS (Google Cloud Storage) rather than from the local file system.
A GCS path looks like 'gs://your-bucket-name/dir/file'.
I recommend following the BERT fine-tuning tutorial on Colab.
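
As a rough illustration of the advice above, the sketch below checks which path flags are still local before a TPU run is launched. The helper name `non_gcs_paths`, the bucket name, and the choice of which flags to check are assumptions made for illustration; they are not part of run_cmrc_drcd.py. The likely reason for the requirement is that the Colab TPU runs on a separate host that cannot read the notebook's local disk or a mounted Drive folder.

```python
def non_gcs_paths(flags, use_tpu=True):
    """Return the names of flags whose values are not gs:// paths.

    Hypothetical helper: it only inspects strings and does not touch GCS.
    """
    if not use_tpu:
        # Local paths are acceptable when no TPU is involved.
        return []
    return sorted(name for name, path in flags.items()
                  if not path.startswith("gs://"))


# Example values: the checkpoint path is taken from the command above,
# the gs:// paths use a placeholder bucket name (assumption).
flags = {
    "init_checkpoint": "/content/drive/My Drive/chinese_xlnet_mid_L_24_H_768_A_12/xlnet_model.ckpt",
    "model_dir": "gs://your-bucket-name/xlnet/model",
    "output_dir": "gs://your-bucket-name/xlnet/output",
}

for name in non_gcs_paths(flags):
    print(f"--{name} is a local path; copy it to GCS before using a TPU")
```

Whether every path flag has to live on GCS is not confirmed in this thread; the checkpoint is the one named explicitly, so treat the other entries as candidates to verify.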

liuziyi219 commented on May 24, 2024

Thank you, my problem was solved!
