
Comments (8)

guillaume-be commented on May 18, 2024

Hello @sachaarbonel ,

Could you please describe your use case a bit further? Are you trying to implement a GPT/GPT2 model for language generation in the context of conversations? If so, the second head of OpenAIGPTDoubleHeadsModel and GPT2DoubleHeadsModel is only used as a multitask objective during training (similar to BERT's next-sentence-prediction task) and is not needed at inference time.

As you can see in the interact script, this specific model is not used; the standard LM models are used for inference. The differences from standard generation are that the history is recorded, segment IDs keep track of each side of the conversation, and the decoding is simpler than the full-fledged generation (beam search/top-p/top-k/temperature decoding) normally used.
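The history/segment-ID bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not rust-bert code: the token IDs are placeholders, and `SPEAKER1`/`SPEAKER2` stand in for whatever special segment tokens the tokenizer actually defines.

```rust
// Hypothetical segment-ID values; a real tokenizer would supply these.
const SPEAKER1: usize = 0;
const SPEAKER2: usize = 1;

/// Flatten a turn-by-turn history into one token sequence plus a parallel
/// segment-ID sequence recording which side of the conversation said each
/// token. Even-indexed turns belong to speaker 1, odd-indexed to speaker 2.
fn build_inputs(history: &[Vec<usize>]) -> (Vec<usize>, Vec<usize>) {
    let mut tokens = Vec::new();
    let mut segments = Vec::new();
    for (turn, turn_tokens) in history.iter().enumerate() {
        let speaker = if turn % 2 == 0 { SPEAKER1 } else { SPEAKER2 };
        for &tok in turn_tokens {
            tokens.push(tok);
            segments.push(speaker);
        }
    }
    (tokens, segments)
}

fn main() {
    // Three turns: speaker 1, speaker 2, speaker 1 again.
    let history = vec![vec![10, 11], vec![20], vec![30, 31, 32]];
    let (tokens, segments) = build_inputs(&history);
    println!("tokens: {:?}", tokens);
    println!("segments: {:?}", segments);
}
```

The model then receives both sequences, so it can condition each generated token on who said what earlier in the exchange.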

I was planning to eventually implement the conversation capabilities; if you confirm this is of interest to you, I can move it up my priority list.

from rust-bert.

sachaarbonel commented on May 18, 2024

Thanks for your quick response, @guillaume-be. I was indeed trying to implement conversations; I didn't realize the second head wasn't needed at inference time. Looking forward to this feature.


guillaume-be commented on May 18, 2024

@sachaarbonel Do you have specific requirements for the ConvAI model? I was looking at available implementations for conversational generation, and a more recent model from Microsoft, DialoGPT, is also available.


sachaarbonel commented on May 18, 2024

@guillaume-be I don't have specific requirements, but DialoGPT does seem like a better fit: after some experiments with it, I'm not very convinced by Hugging Face's personality-based approach.


guillaume-be commented on May 18, 2024

@sachaarbonel I have added multi-turn conversation capability in #57. Please try it out and let me know if this is what you were looking for. So far the medium version of DialoGPT is available as a ready-to-use model.
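The API shape added in #57 (names like `ConversationManager`, `create`, and `generate_responses` are taken from the rust-bert README of that era; details may have changed since) looks roughly like the following. The model call is stubbed with a canned reply here so the sketch stays self-contained and does not download the DialoGPT weights.

```rust
use std::collections::HashMap;

/// Stand-in for rust-bert's ConversationManager: tracks multi-turn history
/// per conversation ID so each model call sees the full context.
struct ConversationManager {
    next_id: u64,
    histories: HashMap<u64, Vec<String>>,
}

impl ConversationManager {
    fn new() -> Self {
        ConversationManager { next_id: 0, histories: HashMap::new() }
    }

    /// Open a new conversation seeded with the user's first utterance.
    fn create(&mut self, text: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.histories.insert(id, vec![text.to_owned()]);
        id
    }
}

/// Stand-in for ConversationModel::generate_responses: a real build would
/// run DialoGPT over each history; here we append a fixed-format reply so
/// the multi-turn control flow is runnable without model weights.
fn generate_responses(manager: &mut ConversationManager) -> Vec<(u64, String)> {
    let mut replies = Vec::new();
    for (&id, history) in manager.histories.iter_mut() {
        let reply = format!("(reply to: {})", history.last().unwrap());
        history.push(reply.clone());
        replies.push((id, reply));
    }
    replies
}

fn main() {
    let mut manager = ConversationManager::new();
    let id = manager.create("Going to the movies tonight - any suggestions?");
    let replies = generate_responses(&mut manager);
    println!("{:?}", replies);
    // The history now holds both sides of the turn, ready for the next call.
    println!("turns so far: {}", manager.histories[&id].len());
}
```

The key design point is that the manager, not the caller, owns the history, so several independent conversations can be advanced with a single batched call to the model.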


sachaarbonel commented on May 18, 2024

Thank you for supporting this model, @guillaume-be! I tried to run it on my Mac (other models worked) and I'm getting this PyTorch error:
Error: TorchError { c_error: "[enforce fail at inline_container.cc:143] . PytorchStreamReader failed reading zip archive: failed finding central directory
frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, void const*) + 191 (0x108fa99cf in libc10.dylib)
frame #1: caffe2::serialize::PyTorchStreamReader::valid(char const*, char const*) + 131 (0x113e7de53 in libtorch_cpu.dylib)
frame #2: caffe2::serialize::PyTorchStreamReader::init() + 315 (0x113e7ce1b in libtorch_cpu.dylib)
frame #3: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::__1::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::__1::default_delete<caffe2::serialize::ReadAdapterInterface> >) + 133 (0x113e7dd45 in libtorch_cpu.dylib)
frame #4: torch::jit::load(std::__1::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::__1::default_delete<caffe2::serialize::ReadAdapterInterface> >, c10::optional<c10::Device>, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > > > >&) + 120 (0x115485b38 in libtorch_cpu.dylib)
frame #5: torch::jit::load(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, c10::optional<c10::Device>, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > > > >&) + 103 (0x115485fa7 in libtorch_cpu.dylib)
frame #6: at_load_callback_with_device + 114 (0x107ff83f2 in conversation)
frame #7: tch::wrappers::tensor::Tensor::load_multi_with_device::h3cdf869685af4646 + 429 (0x10799ee0d in conversation)
frame #8: tch::nn::var_store::VarStore::load::h1d3c6a9fb0c8ad5b + 63 (0x107994f5f in conversation)
frame #9: rust_bert::pipelines::generation::GPT2Generator::new::h31195c4b189f1bbd + 1429 (0x1079d6e75 in conversation)
frame #10: rust_bert::pipelines::conversation::ConversationModel::new::h979241362c26b122 + 881 (0x107996a91 in conversation)
frame #11: conversation::main::hc6c491d2528533d2 + 61 (0x10798aded in conversation)
frame #12: std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h3d8ed5424b557375 + 14 (0x10798ad8e in conversation)
frame #13: std::rt::lang_start_internal::h026a2ad90a8dc9de + 441 (0x1081041e9 in conversation)
frame #14: std::rt::lang_start::hb0a847d8103881d1 + 65 (0x10798ad71 in conversation)
frame #15: main + 34 (0x10798b2c2 in conversation)
frame #16: start + 1 (0x7fff64c273d5 in libdyld.dylib)
frame #17: 0x0 + 1 (0x1 in ???)
" }

Apparently it is due to the model itself, but I'm not sure. Might it also be that the model file is still zipped?

EDIT:
PyTorch version: 1.5.0
Python: 3.7.4
macOS: 10.14.6
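For what it's worth, "failed finding central directory" usually means the file on disk is not a valid zip archive at all, e.g. a truncated download or an HTML error page saved in its place (TorchScript weight files are zip containers). A quick hedged check, assuming the weights sit at some local path such as `model.ot`, is to look for the zip magic bytes "PK" at the start of the file:

```rust
use std::fs::File;
use std::io::Read;

/// Return true if the file starts with the zip local-file-header magic
/// bytes "PK". Anything else (HTML error page, truncated download) will
/// make PyTorchStreamReader fail exactly with "failed finding central
/// directory".
fn looks_like_zip(path: &str) -> std::io::Result<bool> {
    let mut magic = [0u8; 2];
    let n = File::open(path)?.read(&mut magic)?;
    Ok(n == 2 && &magic == b"PK")
}

fn main() {
    // Hypothetical path to the cached weights; adjust to your setup.
    let path = "model.ot";
    match looks_like_zip(path) {
        Ok(true) => println!("{} looks like a zip archive", path),
        Ok(false) => println!("{} is not a zip archive; re-download it", path),
        Err(e) => println!("could not read {}: {}", path, e),
    }
}
```

If the check fails, deleting the cached file and letting the library re-download it is usually enough.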


guillaume-be commented on May 18, 2024

My mistake, I forgot to upload the model. Could you please try again?


sachaarbonel commented on May 18, 2024

Nice, it worked!

