Hi, I was trying to implement <a href="https://github.com/huggingface/transfer-learnin

Support for OpenAIGPTDoubleHeadsModel, GPT2DoubleHeadsModel, SequenceSummary,AdamW about rust-bert HOT 8 CLOSED

guillaume-be commented on May 18, 2024 1

from rust-bert.

Comments (8)

guillaume-be commented on May 18, 2024

Hello @sachaarbonel ,

Could you please describe your use case in a bit more detail? Are you trying to implement a GPT/GPT2 model for language generation in the context of conversations? If so, the second head of OpenAIGPTDoubleHeadsModel and GPT2DoubleHeadsModel is only used as a multitask objective during training (comparable to BERT's next sentence prediction task) and is not needed at inference time.

As you can see in the interact script, this specific model is not used: the standard LM models are used for inference. The differences from standard generation are that the history is recorded, segment IDs are used to keep track of each side of the conversation, and the decoding is simpler than the full-fledged generation normally used with beam search/top-p/top-k/temperature decoding.
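To make the role of the recorded history and segment IDs concrete, here is a minimal sketch, not the actual interact script or rust-bert code. It assumes turns are already tokenized, that the user speaks first and the sides then alternate, and that the segment values `0` and `1` are acceptable placeholders. The names `SEGMENT_USER`, `SEGMENT_BOT` and `build_inputs` are hypothetical.

```rust
/// Hypothetical segment ids, chosen only for this illustration.
const SEGMENT_USER: i64 = 0;
const SEGMENT_BOT: i64 = 1;

/// Flattens a recorded conversation history into model inputs.
/// Assumption: turns alternate between the two sides, starting with the user.
/// Returns `(input_ids, segment_ids)`, two vectors of equal length.
fn build_inputs(history: &[Vec<i64>]) -> (Vec<i64>, Vec<i64>) {
    let mut input_ids = Vec::new();
    let mut segment_ids = Vec::new();
    for (turn_index, turn) in history.iter().enumerate() {
        // Even turns belong to the user, odd turns to the model.
        let segment = if turn_index % 2 == 0 {
            SEGMENT_USER
        } else {
            SEGMENT_BOT
        };
        input_ids.extend_from_slice(turn);
        // Every token of a turn carries the segment id of its speaker.
        segment_ids.extend(std::iter::repeat(segment).take(turn.len()));
    }
    (input_ids, segment_ids)
}
```

Calling `build_inputs` on the full history before each generation step gives the language model both the previous turns and a per-token marker of which side produced them.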

I was planning to eventually implement the conversation capabilities. If you confirm this is of interest to you, I can move it higher in my priority list.

sachaarbonel commented on May 18, 2024

Thanks for your quick response @guillaume-be. I was indeed trying to implement conversations, and I didn't realize the second head wasn't needed at inference time. Looking forward to this feature.

guillaume-be commented on May 18, 2024

@sachaarbonel Do you have specific requirements for the ConvAI model? I was looking at available implementations for conversational generation and a more recent implementation from Microsoft called DialoGPT would also be available.

sachaarbonel commented on May 18, 2024

@guillaume-be I don't have specific requirements, but it seems to me that DialoGPT is a better fit, as I'm not very convinced by Hugging Face's personality approach (after some experiments with it).

guillaume-be commented on May 18, 2024

@sachaarbonel I have added multi-turn conversation capability in #57. Please try it out and let me know if this is what you were looking for. So far the medium version of DialoGPT is available as a ready-to-use model.

sachaarbonel commented on May 18, 2024

Thank you for supporting this model @guillaume-be! I tried to run it on my mac (other models worked) and I'm getting this PyTorch error:
Error: TorchError { c_error: "[enforce fail at inline_container.cc:143] . PytorchStreamReader failed reading zip archive: failed finding central directory\nframe #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, void const*) + 191 (0x108fa99cf in libc10.dylib)\nframe #1: caffe2::serialize::PyTorchStreamReader::valid(char const*, char const*) + 131 (0x113e7de53 in libtorch_cpu.dylib)\nframe #2: caffe2::serialize::PyTorchStreamReader::init() + 315 (0x113e7ce1b in libtorch_cpu.dylib)\nframe #3: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::__1::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::__1::default_deletecaffe2::serialize::ReadAdapterInterface >) + 133 (0x113e7dd45 in libtorch_cpu.dylib)\nframe #4: torch::jit::load(std::__1::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::__1::default_deletecaffe2::serialize::ReadAdapterInterface >, c10::optionalc10::Device, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > > > >&) + 120 (0x115485b38 in libtorch_cpu.dylib)\nframe #5: torch::jit::load(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, c10::optionalc10::Device, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits, 
std::__1::allocator > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > > > >&) + 103 (0x115485fa7 in libtorch_cpu.dylib)\nframe #6: at_load_callback_with_device + 114 (0x107ff83f2 in conversation)\nframe #7: tch::wrappers::tensor::Tensor::load_multi_with_device::h3cdf869685af4646 + 429 (0x10799ee0d in conversation)\nframe #8: tch::nn::var_store::VarStore::load::h1d3c6a9fb0c8ad5b + 63 (0x107994f5f in conversation)\nframe #9: rust_bert::pipelines::generation::GPT2Generator::new::h31195c4b189f1bbd + 1429 (0x1079d6e75 in conversation)\nframe #10: rust_bert::pipelines::conversation::ConversationModel::new::h979241362c26b122 + 881 (0x107996a91 in conversation)\nframe #11: conversation::main::hc6c491d2528533d2 + 61 (0x10798aded in conversation)\nframe #12: std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h3d8ed5424b557375 + 14 (0x10798ad8e in conversation)\nframe #13: std::rt::lang_start_internal::h026a2ad90a8dc9de + 441 (0x1081041e9 in conversation)\nframe #14: std::rt::lang_start::hb0a847d8103881d1 + 65 (0x10798ad71 in conversation)\nframe #15: main + 34 (0x10798b2c2 in conversation)\nframe #16: start + 1 (0x7fff64c273d5 in libdyld.dylib)\nframe #17: 0x0 + 1 (0x1 in ???)\n" }

Apparently it is due to the model itself, but I'm not sure. Could it also be because the model file is still zipped?

EDIT:
pytorch version : 1.5.0
python : 3.7.4
macos : 10.14.6
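The "failed finding central directory" message suggests the weights file was not read as a valid ZIP archive, which is the container the trace shows `torch::jit::load` trying to open. A rough way to test that hypothesis is to look for the ZIP end-of-central-directory signature near the end of the file. The sketch below uses only the Rust standard library; the function names are hypothetical and it is not part of rust-bert or tch.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Signature of the ZIP "end of central directory" record (PK\x05\x06).
const EOCD_SIGNATURE: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];
/// The record is 22 bytes long, plus an optional comment of at most 65535 bytes.
const MAX_EOCD_SPAN: u64 = 22 + 65_535;

/// Returns true if `bytes` contains an end-of-central-directory signature.
fn has_central_directory(bytes: &[u8]) -> bool {
    bytes.windows(4).any(|window| window == &EOCD_SIGNATURE[..])
}

/// Reads the tail of the file at `path` and looks for the signature there.
/// A `false` result means the file is very unlikely to be a complete ZIP archive.
fn looks_like_zip_archive(path: &Path) -> io::Result<bool> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len();
    let tail_len = len.min(MAX_EOCD_SPAN);
    // Only the last `tail_len` bytes can hold the record.
    file.seek(SeekFrom::Start(len - tail_len))?;
    let mut tail = Vec::with_capacity(tail_len as usize);
    file.read_to_end(&mut tail)?;
    Ok(has_central_directory(&tail))
}
```

If `looks_like_zip_archive` returns `false` for the downloaded model file, the file is probably truncated, empty, or not a model archive at all, and downloading it again is the first thing to try.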

guillaume-be commented on May 18, 2024

My mistake, I forgot to upload the model. Can you please try again?

sachaarbonel commented on May 18, 2024

Nice, it worked.
