vinairesearch / misca
MISCA: A Joint Model for Multiple Intent Detection and Slot Filling with Intent-Slot Co-Attention (EMNLP 2023 - Findings)
License: GNU Affero General Public License v3.0
I train the MISCA model with the following command:
python main.py --token_level word-level \
    --model_type roberta \
    --model_dir misca \
    --task mixatis \
    --data_dir data \
    --attention_mode label \
    --do_train \
    --do_eval \
    --num_train_epochs 100 \
    --intent_loss_coef 0.5 \
    --learning_rate 1e-5 \
    --num_intent_detection \
    --use_crf \
    --base_model dir_base \
    --intent_slot_attn_type coattention
In the end, the F1 score I get is lower than expected. What could be wrong?
I could not reproduce the reported Overall results well. I ran the code on a V100; here are my parameter settings and experimental results. What might be the reason, and how should I reproduce the results correctly? Thank you!
python main.py --token_level word-level \
    --model_type roberta \
    --model_dir dir_base \
    --task mixatis \
    --data_dir data \
    --attention_mode label \
    --do_train \
    --do_eval \
    --num_train_epochs 100 \
    --intent_loss_coef 0.5 \
    --learning_rate 1e-5 \
    --train_batch_size 32 \
    --num_intent_detection \
    --use_crf
python main.py --token_level word-level \
    --model_type roberta \
    --model_dir misca \
    --task mixatis \
    --data_dir data \
    --attention_mode label \
    --do_train \
    --do_eval \
    --num_train_epochs 100 \
    --intent_loss_coef 0.5 \
    --learning_rate 1e-5 \
    --num_intent_detection \
    --use_crf \
    --base_model dir_base \
    --intent_slot_attn_type coattention
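The two commands above form a two-stage run: the first writes a base model to dir_base, and the second passes that same directory through --base_model. A minimal Python sketch of that relationship follows. It only builds the argument lists from the flags quoted above and checks that the directories line up; it does not run training. The helper names build_stage_commands and flag_value are hypothetical and are not part of the repository.

```python
import shlex

# Flags shared by both stages, copied from the commands reported above.
COMMON = [
    "python", "main.py",
    "--token_level", "word-level",
    "--model_type", "roberta",
    "--task", "mixatis",
    "--data_dir", "data",
    "--attention_mode", "label",
    "--do_train",
    "--do_eval",
    "--num_train_epochs", "100",
    "--intent_loss_coef", "0.5",
    "--learning_rate", "1e-5",
    "--num_intent_detection",
    "--use_crf",
]


def build_stage_commands(base_dir="dir_base", final_dir="misca"):
    """Return (stage1, stage2) argument lists. Hypothetical helper."""
    # Stage 1: train the base model and save it to base_dir.
    stage1 = COMMON + ["--model_dir", base_dir, "--train_batch_size", "32"]
    # Stage 2: train the co-attention model, initialised from base_dir.
    stage2 = COMMON + [
        "--model_dir", final_dir,
        "--base_model", base_dir,
        "--intent_slot_attn_type", "coattention",
    ]
    return stage1, stage2


def flag_value(cmd, flag):
    """Return the token that follows `flag` in an argument list."""
    return cmd[cmd.index(flag) + 1]


if __name__ == "__main__":
    s1, s2 = build_stage_commands()
    # The directory written by stage 1 must be the one stage 2 loads.
    assert flag_value(s1, "--model_dir") == flag_value(s2, "--base_model")
    print(shlex.join(s1))
    print(shlex.join(s2))
```

Keeping the shared flags in one list makes it harder for the two stages to drift apart, for example by training the base model on a different task or token level than the second stage.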
I first train the base model with a BERT backbone, using the following command:
python main.py --token_level word-level \
    --model_type bert \
    --model_dir dir_base \
    --task my dataset \
    --data_dir data \
    --attention_mode label \
    --do_train \
    --do_eval \
    --num_intent_detection \
    --use_crf
I then load the dir_base model with the following command:
python main.py --token_level word-level \
    --model_type bert \
    --model_dir misca \
    --task my dataset \
    --data_dir data \
    --attention_mode label \
    --do_train \
    --do_eval \
    --num_intent_detection \
    --use_crf \
    --base_model dir_base \
    --intent_slot_attn_type coattention
However, the results are still low.
The model fails to load because there is no config.json file in the generated model directory:
OSError: misca does not appear to have a file named config.json. Checkout 'https://huggingface.co/misca/main' for available files.
Also, once the model is loaded, prediction asks for sequence_length and heads, which are not present in the inputs dictionary.
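For the missing config.json error above, one quick check is to list which expected files are absent from the model directory before trying to load it. The sketch below is a generic helper, not code from the repository. The name missing_files is hypothetical, and config.json is the only required file assumed here because it is the one named in the error message.

```python
import os
import tempfile


def missing_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are absent from `model_dir`."""
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


if __name__ == "__main__":
    # Demonstrate on a temporary directory rather than a real checkpoint.
    with tempfile.TemporaryDirectory() as model_dir:
        print(missing_files(model_dir))  # ['config.json']
        open(os.path.join(model_dir, "config.json"), "w").close()
        print(missing_files(model_dir))  # []
```

Running the same check on the directory passed as --model_dir shows whether training actually wrote a config file there or whether the path points at a different directory.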
Hi, can I check if line 29 of predict.py has a missing attribute?
Also, when I try to run predict.py with this command:
python predict.py --input_file ./data/sample_pred_in.txt --output_file ./data/sample_pred_out.txt --model_dir ./dir_newmodel_3
I get this error:
TypeError: JointRoberta.forward() missing 2 required positional arguments: 'heads' and 'seq_lens'
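The TypeError above says that forward() was called without two of its required arguments. A small way to spot such a mismatch before the call is to compare the keys of the inputs dictionary against the method signature. The sketch below uses a stand-in class, since the real JointRoberta signature is not shown here. Only the argument names heads and seq_lens come from the traceback; the class, the other argument names, and the helper missing_required_args are assumptions for illustration.

```python
import inspect


class StandInModel:
    """Hypothetical stand-in; not the repository's JointRoberta class."""

    def forward(self, input_ids, attention_mask, heads, seq_lens):
        return None


def missing_required_args(fn, inputs):
    """Return required parameter names of `fn` that `inputs` does not supply."""
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return [
        name
        for name, param in inspect.signature(fn).parameters.items()
        if param.default is inspect.Parameter.empty
        and param.kind in named_kinds
        and name not in inputs
    ]


model = StandInModel()
inputs = {"input_ids": [[0, 1, 2]], "attention_mask": [[1, 1, 1]]}
print(missing_required_args(model.forward, inputs))  # ['heads', 'seq_lens']
```

If the same check on the real model reports heads and seq_lens, the inputs dictionary built in predict.py is likely missing those two entries.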
Hello, I tried to reproduce the model's performance both by training from scratch and by starting from the provided base model checkpoint. However, neither approach reaches the performance claimed in the paper. Could you give me more details on how to reproduce it?
I downloaded the provided checkpoint in the best_model folder and ran the evaluation.
Evaluation script:
python main.py --token_level word-level \
    --model_type roberta \
    --model_dir misca \
    --task mixatis \
    --data_dir data \
    --attention_mode label \
    --do_train \
    --do_eval \
    --model_dir best_model \
    --num_train_epochs 100 \
    --learning_rate 1e-5 \
    --num_intent_detection \
    --use_crf \
    --intent_slot_attn_type coattention
However, I got much lower performance.
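The evaluation command above passes --model_dir twice (misca, then best_model) and also includes --do_train. Assuming main.py parses its flags with Python's argparse using the default store action (an assumption; the parser is not shown here), the later value replaces the earlier one. The sketch below shows that behaviour with a reduced set of the flags from the command.

```python
import argparse

# Reduced parser for illustration; the real main.py may define these differently.
parser = argparse.ArgumentParser()
parser.add_argument("--model_dir", type=str)
parser.add_argument("--do_train", action="store_true")
parser.add_argument("--do_eval", action="store_true")

args = parser.parse_args(
    ["--model_dir", "misca", "--do_train", "--do_eval", "--model_dir", "best_model"]
)

print(args.model_dir)  # best_model
print(args.do_train)   # True
```

Under that assumption the run would use best_model as its model directory. Because --do_train is also set, the command may train before evaluating, if the flag does what its name suggests, rather than only evaluating the downloaded checkpoint.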