
misca's People

Contributors

datquocnguyen, thinhphp

misca's Issues

Lower than expected F1 score

I train the MISCA model with the following command:
python main.py --token_level word-level \
--model_type roberta \
--model_dir misca \
--task mixatis \
--data_dir data \
--attention_mode label \
--do_train \
--do_eval \
--num_train_epochs 100 \
--intent_loss_coef 0.5 \
--learning_rate 1e-5 \
--num_intent_detection \
--use_crf \
--base_model dir_base \
--intent_slot_attn_type coattention
In the end, I get a lower F1 score than expected. What is wrong?
[Image]

Reproduced results are poor on the Overall metric

My reproduced results on the Overall metric are poor. I ran on a V100; my parameter settings and experimental results are below. What could be the reason, and how should I reproduce the results correctly? Thank you!
python main.py --token_level word-level \
--model_type roberta \
--model_dir dir_base \
--task mixatis \
--data_dir data \
--attention_mode label \
--do_train \
--do_eval \
--num_train_epochs 100 \
--intent_loss_coef 0.5 \
--learning_rate 1e-5 \
--train_batch_size 32 \
--num_intent_detection \
--use_crf

python main.py --token_level word-level \
--model_type roberta \
--model_dir misca \
--task mixatis \
--data_dir data \
--attention_mode label \
--do_train \
--do_eval \
--num_train_epochs 100 \
--intent_loss_coef 0.5 \
--learning_rate 1e-5 \
--num_intent_detection \
--use_crf \
--base_model dir_base \
--intent_slot_attn_type coattention
[Image: not_good_overall]
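The two commands in this issue form a two-stage run: the first writes a base model to dir_base, and the second reads it back through --base_model. One way to keep the shared flags identical across both stages is to build the argument lists from a single place. The sketch below is only an illustration of that idea. The helper name stage_commands and the constant COMMON are hypothetical, every flag and value is copied from the commands in this issue, and nothing here is claimed to change the resulting scores.

```python
import shlex

# Flags shared by both stages, copied from the commands in this issue.
COMMON = [
    "--token_level", "word-level",
    "--model_type", "roberta",
    "--task", "mixatis",
    "--data_dir", "data",
    "--attention_mode", "label",
    "--do_train", "--do_eval",
    "--num_train_epochs", "100",
    "--intent_loss_coef", "0.5",
    "--learning_rate", "1e-5",
    "--num_intent_detection",
    "--use_crf",
]


def stage_commands(base_dir="dir_base", final_dir="misca"):
    """Return argv lists for the base-model stage and the MISCA stage.

    Hypothetical helper: it only assembles the flags quoted in the issue.
    """
    stage1 = ["python", "main.py", *COMMON,
              "--model_dir", base_dir,
              "--train_batch_size", "32"]
    stage2 = ["python", "main.py", *COMMON,
              "--model_dir", final_dir,
              "--base_model", base_dir,          # must match stage 1's --model_dir
              "--intent_slot_attn_type", "coattention"]
    return stage1, stage2


if __name__ == "__main__":
    # Print both commands as copy-pasteable shell lines.
    for cmd in stage_commands():
        print(shlex.join(cmd))
```

Each list can be passed to subprocess.run unchanged, which avoids the lost line-continuation backslashes that are easy to introduce when pasting multi-line shell commands.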

Why doesn't training a model this way work well?

I first train the base model with a BERT backbone using the following command:
python main.py --token_level word-level \
--model_type bert \
--model_dir dir_base \
--task my dataset \
--data_dir data \
--attention_mode label \
--do_train \
--do_eval \
--num_intent_detection \
--use_crf
Then I load the dir_base model using the following command:
python main.py --token_level word-level \
--model_type bert \
--model_dir misca \
--task my dataset \
--data_dir data \
--attention_mode label \
--do_train \
--do_eval \
--num_intent_detection \
--use_crf \
--base_model dir_base \
--intent_slot_attn_type coattention
However, the results are still low.
[Image]

predict.py line 29 missing slot_hier, and missing 2 required positional arguments

Hi, can I check whether line 29 of predict.py is missing an attribute?

Also, when I try to run predict.py with this command:
python predict.py --input_file ./data/sample_pred_in.txt --output_file ./data/sample_pred_out.txt --model_dir ./dir_newmodel_3

I get this error:
TypeError: JointRoberta.forward() missing 2 required positional arguments: 'heads' and 'seq_lens'
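
A TypeError of this form is what Python raises when a method is called with fewer positional arguments than its signature requires. The sketch below reproduces the same kind of message with a stand-in class and lists the required parameters using inspect.signature. JointModelStandIn and its parameter list are hypothetical (only heads and seq_lens are taken from the error text), and this is not the repository's JointRoberta or a fix for predict.py.

```python
import inspect


class JointModelStandIn:
    """Hypothetical stand-in; NOT the repository's JointRoberta."""

    def forward(self, input_ids, attention_mask, heads, seq_lens):
        return len(input_ids), heads, seq_lens


def required_args(func):
    """Names of positional parameters without a default, excluding self."""
    params = inspect.signature(func).parameters.values()
    return [
        p.name
        for p in params
        if p.name != "self"
        and p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]


model = JointModelStandIn()
message = ""
try:
    # The caller omits the last two arguments, as in the reported traceback.
    model.forward([1, 2, 3], [1, 1, 1])
except TypeError as err:
    message = str(err)
    print(message)

# Compare this list with what the call site actually passes.
print(required_args(JointModelStandIn.forward))
```

Running the same required_args check against the real model's forward method would show which arguments the call in predict.py needs to supply.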

Cannot reproduce performance

Hello, I tried to reproduce the model's performance both by training from scratch and by starting from the provided base model checkpoint. However, neither approach produces the performance claimed in the paper. Could you give me more details on how to reproduce it?

I downloaded the provided checkpoint in the best_model folder and ran the evaluation.
Evaluation script:
python main.py --token_level word-level \
--model_type roberta \
--model_dir misca \
--task mixatis \
--data_dir data \
--attention_mode label \
--do_train \
--do_eval \
--model_dir best_model \
--num_train_epochs 100 \
--learning_rate 1e-5 \
--num_intent_detection \
--use_crf \
--intent_slot_attn_type coattention

However, I got much lower performance.
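
The evaluation command above passes --model_dir twice (first misca, then best_model). If main.py parses its flags with Python's argparse, which is an assumption since the parser is not shown in this thread, a repeated option using the default store action keeps only the last value. The sketch below demonstrates that behaviour with a stand-in parser and adds a small check for repeated flags; repeated_flags is a hypothetical helper, not part of the repository.

```python
import argparse
from collections import Counter

# Stand-in parser with only the flag under discussion.
# Assumption: main.py defines --model_dir as an ordinary argparse option.
parser = argparse.ArgumentParser()
parser.add_argument("--model_dir")

argv = ["--model_dir", "misca", "--model_dir", "best_model"]
args = parser.parse_args(argv)
print(args.model_dir)  # the later occurrence overrides the earlier one


def repeated_flags(argv):
    """Return the sorted list of --flags that occur more than once."""
    counts = Counter(a for a in argv if a.startswith("--"))
    return sorted(flag for flag, n in counts.items() if n > 1)


print(repeated_flags(argv))
```

A check like this, run on the argument list before launching, makes it easier to confirm which directory a run will actually read from or write to.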
