Comments (8)
Hello @sachaarbonel ,
Could you please describe your use case a bit further? Are you trying to implement a GPT/GPT2 model for language generation in the context of conversations? If so, the second head of the OpenAIGPTDoubleHeadsModel and GPT2DoubleHeadsModel is only used as a multitask objective during training (comparable to BERT's next sentence prediction task) and is not needed at inference time.
As you can see in the interact script, this specific model is not used, and the standard LM models are used for inference. The difference from standard generation is that the history is recorded, the segment IDs are used to keep track of each side of the conversation, and the decoding is simpler than the full-fledged generation normally used with beam search/top-p/top-k/temperature decoding.
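To make the history and segment ID bookkeeping more concrete, here is a minimal sketch in plain Rust. It is not the code of the interact script or of rust-bert: the `History` and `Speaker` types, the `to_model_input` method, and the use of 0 and 1 as segment IDs are all hypothetical and chosen only for illustration. Token IDs are assumed to come from a tokenizer that is not shown.

```rust
/// Which side of the conversation produced a turn (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Speaker {
    User,
    Bot,
}

/// Recorded conversation history: one entry per turn, already tokenized.
/// Hypothetical container for illustration, not a rust-bert type.
struct History {
    turns: Vec<(Speaker, Vec<i64>)>,
}

impl History {
    fn new() -> Self {
        History { turns: Vec::new() }
    }

    /// Appends one turn to the history.
    fn push(&mut self, speaker: Speaker, token_ids: Vec<i64>) {
        self.turns.push((speaker, token_ids));
    }

    /// Flattens the history into parallel vectors: the token IDs of all
    /// turns in order, and one segment ID per token marking its speaker.
    /// The values 0 (user) and 1 (bot) are an assumption of this sketch.
    fn to_model_input(&self) -> (Vec<i64>, Vec<i64>) {
        let mut input_ids = Vec::new();
        let mut segment_ids = Vec::new();
        for (speaker, ids) in &self.turns {
            let segment: i64 = match speaker {
                Speaker::User => 0,
                Speaker::Bot => 1,
            };
            input_ids.extend_from_slice(ids);
            segment_ids.extend(std::iter::repeat(segment).take(ids.len()));
        }
        (input_ids, segment_ids)
    }
}
```

After each exchange, the user input and the generated reply would both be pushed onto the history, so that the next call to `to_model_input` carries the whole conversation so far.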
I was planning to eventually implement the conversation capabilities. If you confirm this is of interest to you, I can move it higher in my priority list.
from rust-bert.
Thanks for your quick response @guillaume-be, I was indeed trying to implement conversations. I didn't realize it wasn't needed at inference time. Looking forward to this feature.
@sachaarbonel Do you have specific requirements for the ConvAI model? I was looking at available implementations for conversational generation and a more recent implementation from Microsoft called DialoGPT would also be available.
@guillaume-be I don't have specific requirements, but yeah, it seems to me that DialoGPT is a better fit, as I'm not very convinced by the personality approach of Hugging Face (after some experiments with it).
@sachaarbonel I have added multi-turn conversation capability in #57. Please try it out and let me know if this is what you were looking for. So far the medium version of DialoGPT is available as a ready-to-use model.
Thank you for supporting this model @guillaume-be! I tried to run it on my Mac (other models worked) and I'm getting this PyTorch error:
Error: TorchError { c_error: "[enforce fail at inline_container.cc:143] . PytorchStreamReader failed reading zip archive: failed finding central directory\nframe #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, void const*) + 191 (0x108fa99cf in libc10.dylib)\nframe #1: caffe2::serialize::PyTorchStreamReader::valid(char const*, char const*) + 131 (0x113e7de53 in libtorch_cpu.dylib)\nframe #2: caffe2::serialize::PyTorchStreamReader::init() + 315 (0x113e7ce1b in libtorch_cpu.dylib)\nframe #3: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::__1::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::__1::default_deletecaffe2::serialize::ReadAdapterInterface >) + 133 (0x113e7dd45 in libtorch_cpu.dylib)\nframe #4: torch::jit::load(std::__1::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::__1::default_deletecaffe2::serialize::ReadAdapterInterface >, c10::optionalc10::Device, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > > > >&) + 120 (0x115485b38 in libtorch_cpu.dylib)\nframe #5: torch::jit::load(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, c10::optionalc10::Device, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits, 
std::__1::allocator > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > > > >&) + 103 (0x115485fa7 in libtorch_cpu.dylib)\nframe #6: at_load_callback_with_device + 114 (0x107ff83f2 in conversation)\nframe #7: tch::wrappers::tensor::Tensor::load_multi_with_device::h3cdf869685af4646 + 429 (0x10799ee0d in conversation)\nframe #8: tch::nn::var_store::VarStore::load::h1d3c6a9fb0c8ad5b + 63 (0x107994f5f in conversation)\nframe #9: rust_bert::pipelines::generation::GPT2Generator::new::h31195c4b189f1bbd + 1429 (0x1079d6e75 in conversation)\nframe #10: rust_bert::pipelines::conversation::ConversationModel::new::h979241362c26b122 + 881 (0x107996a91 in conversation)\nframe #11: conversation::main::hc6c491d2528533d2 + 61 (0x10798aded in conversation)\nframe #12: std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h3d8ed5424b557375 + 14 (0x10798ad8e in conversation)\nframe #13: std::rt::lang_start_internal::h026a2ad90a8dc9de + 441 (0x1081041e9 in conversation)\nframe #14: std::rt::lang_start::hb0a847d8103881d1 + 65 (0x10798ad71 in conversation)\nframe #15: main + 34 (0x10798b2c2 in conversation)\nframe #16: start + 1 (0x7fff64c273d5 in libdyld.dylib)\nframe #17: 0x0 + 1 (0x1 in ???)\n" }
Apparently it is due to the model itself, but I'm not sure. Might it also be because the model file is still zipped?
EDIT:
pytorch version : 1.5.0
python : 3.7.4
macos : 10.14.6
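The message "failed finding central directory" says that the loader expected a zip archive and could not locate its end-of-central-directory record, which typically points to an empty, truncated, or otherwise non-zip file. As a rough diagnostic, the sketch below checks whether the tail of a file contains the zip end-of-central-directory signature (`PK\x05\x06`). It uses only the Rust standard library; the function names are hypothetical, and a positive result only means the signature is present, not that the archive is a valid model file.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Signature that opens a zip "end of central directory" record.
const EOCD_SIGNATURE: [u8; 4] = [0x50, 0x4b, 0x05, 0x06];

/// The record is 22 bytes long, followed by a comment of at most 65,535 bytes,
/// so the signature must start within this many bytes of the end of the file.
const MAX_EOCD_SPAN: u64 = 22 + 65_535;

/// Returns true if `tail` (the final bytes of a file) contains the signature.
fn has_central_directory(tail: &[u8]) -> bool {
    tail.windows(4).any(|w| w == EOCD_SIGNATURE.as_slice())
}

/// Reads the end of the file at `path` and looks for the signature.
fn looks_like_zip(path: &Path) -> io::Result<bool> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len();
    let span = len.min(MAX_EOCD_SPAN);
    // Position the cursor `span` bytes before the end, then read the rest.
    file.seek(SeekFrom::End(-(span as i64)))?;
    let mut tail = Vec::with_capacity(span as usize);
    file.read_to_end(&mut tail)?;
    Ok(has_central_directory(&tail))
}
```

Calling `looks_like_zip` on the downloaded weights file before loading it would distinguish a missing or incomplete download from a problem in the model itself.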
My mistake, I forgot to upload the model. Can you please try again?
Nice, it worked.