
mucoco's People

Contributors

sachin19


mucoco's Issues

CUDA out of memory when using a 3090 GPU

My primary model is from Google Drive: the formality model provided by the author of the paraphrase model.
The issue came up when I was running the example .sh file in the Style Transfer folder.
Thanks a lot if anyone can help me fix it.

(base) root@84d353835da2:/workspace/mucoco# bash decode_example.sh data output plain debug plain
Some weights of the model checkpoint at /workspace/mucoco/primary_model were not used when initializing GPT2LMHeadModel: ['transformer.extra_embedding_project.bias', 'transformer.extra_embedding_project.weight']
- This IS expected if you are initializing GPT2LMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPT2LMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
batch_size is 1
skip this example? Fears for T N pension after talks . Unions representing workers at Turner Newall say they are ' disappointed ' after talks with stricken parent firm Federal Mogul . [yes(y)/maybe(m)/no(n)]n
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Fears for T N pension after talks . Unions representing workers at Turner Newall say they are ' disappointed ' after talks with stricken parent firm Federal Mogul . unions representing workers at Turner Newall have expressed disappointment after talks with the firm's parent company, Federal Mogul. tensor([[ 3260,  6130,   351, 15406,   968,   439,    11,   791,   507, 10200,
          3259,   379, 15406,   968,   439,  6241, 18641,   287,   262,  4081,
          1222,   499,   418,    26,    82,  2560,  1664,    11,  5618, 30926,
           377,    13]])
predicting a sentence length:  32
Traceback (most recent call last):
  File "/workspace/mucoco/decode.py", line 4, in <module>
    cli_main()
  File "/workspace/mucoco/mucoco/decode.py", line 821, in cli_main
    main(args)
  File "/workspace/mucoco/mucoco/decode.py", line 565, in main
    optimizer.backward(total_batchloss, retain_graph=True, scaler=scaler) 
  File "/workspace/mucoco/mucoco/utils/optim.py", line 360, in backward
    loss.backward(retain_graph=retain_graph)
  File "/opt/conda/lib/python3.9/site-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.9/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA out of memory. Tried to allocate 7.67 GiB (GPU 0; 23.70 GiB total capacity; 15.35 GiB already allocated; 3.61 GiB free; 19.13 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
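
In case it helps others, here is a minimal sketch of the fragmentation workaround the error message itself suggests. The 128 MiB split size is a guess, and if the model genuinely needs more than the card's 24 GiB, only a smaller batch or shorter sequence length will help:

import os

# Must be set before the first CUDA allocation, i.e. before importing torch
# (equivalently: export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 in the
# shell before running decode_example.sh).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the env var so the allocator picks it up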

Regarding sentiment-controlled generation scripts

Hi @Sachin19,

I checked the sentiment-related scripts you kindly pushed to the repository!
I have a minor correction and a question regarding them.

  1. I think that in line 31 of your readme file, bash examples/training_constraint_models/train_sentiment_classifier.sh sst2 should be bash examples/training_constraint_models/train_sentiment_classifiers.sh sst2, with an s at the end of the shell file name.

  2. In the examples/training_constraint_models/train_sentiment_classifiers.sh file, line 8 executes data/sentiment/create_sst_sentiment_data.py, which I currently do not see in the data/sentiment directory. Could it be missing?
    (I checked whether the files currently in the directory could simply be renamed and used, but that failed with the following error; see also the sketch I include after it.)

(mucoco2) ~/mucoco$ bash examples/training_constraint_models/train_sentiment_classifiers.sh sst2
download and preprocessing sst data
/home/hyeryungson/mucoco
[3310, 3610]
[428, 444]
[912, 909]
training sst2 classifier
Traceback (most recent call last):
  File "examples/training_constraint_models/train_classifier.py", line 31, in <module>
    train_paths.append(open(f"{base_path}/{sys.argv[3]}_{label}.{filetype}"))
FileNotFoundError: [Errno 2] No such file or directory: 'data/sentiment/sst2/train_0.jsonl'
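
For reference, here is a hypothetical stand-in for the missing create_sst_sentiment_data.py that I pieced together from the traceback's expected layout (data/sentiment/sst2/train_0.jsonl, one file per split and label). Loading sst2 via GLUE and the "text" field name are my guesses, not anything confirmed by the repo:

import json
import os

from datasets import load_dataset  # pip install datasets

out_dir = "data/sentiment/sst2"
os.makedirs(out_dir, exist_ok=True)
sst2 = load_dataset("glue", "sst2")

# One JSONL file per (split, label), matching the path that
# train_classifier.py opens: f"{base_path}/{split}_{label}.{filetype}"
for hf_split, name in [("train", "train"), ("validation", "dev")]:
    files = {lab: open(f"{out_dir}/{name}_{lab}.jsonl", "w") for lab in (0, 1)}
    for ex in sst2[hf_split]:
        files[ex["label"]].write(json.dumps({"text": ex["sentence"]}) + "\n")
    for f in files.values():
        f.close()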

Thanks! I want to reiterate that I really enjoyed and appreciated your work, and I look forward to exploring it further! :)

More instructions to train the primary model

Hi @Sachin19, thanks for your exciting work!

I had some questions regarding the primary model pretraining. The paper mentions:

For our primary objective, we use an inverse-paraphrasing model as defined in §3.1, which we train on a corpus of Yelp Reviews [45]. First, we paraphrase each sentence in the corpus as described in Krishna et al. [28], creating a pseudo-parallel corpus (of reviews and their paraphrases), and train G as an inverse-paraphrase model to translate the paraphrases back to the original reviews.

Is there any existing code to pretrain this primary model? Can you point us to it?

If not, are there any pretrained models which we can directly use in:

PRIMARYMODEL=path/to/primary/model
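
For context, here is my rough understanding of the setup as a sketch. This is not your code; the base checkpoint and the separator handling are placeholders:

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder for G

def inverse_paraphrase_example(original: str, paraphrase: str) -> str:
    # Pseudo-parallel pair: condition on the paraphrase and learn to emit
    # the original review, so G learns to translate paraphrases back.
    return paraphrase + tok.eos_token + original + tok.eos_token

# Fine-tuning would then be the usual causal-LM objective over these
# concatenations (e.g. transformers' run_clm.py on a file of such lines),
# with the paraphrases produced as in Krishna et al. [28].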

Conda environment related issue

Hi @Sachin19! Thank you for sharing the code for your exciting work!

I was trying to create the conda environment following the readme file, but the environment creation kept failing due to package version conflicts. Could you please check whether the issue occurs in your setting as well?

Here is the setup on which I've been trying to create the environment:

Operating System: Ubuntu 20.04.4 LTS
Kernel: Linux 5.4.0-100-generic
Architecture: x86-64
Python 3.9.12
conda version 23.1.0

Here is a screenshot of the intermediate output I got while waiting for the environment to be created. After several further attempts to resolve conflicts past this point, the whole creation process aborted.
[Screenshot: Screen Shot 2023-02-26 at 2 01 19 AM]

Thank you in advance!

Need more instructions to run the code

Hi @Sachin19, thanks for sharing your interesting work.

I think the idea of continuous optimization for controllable text generation is great, but when I went to run your code, I found there is no dataset or model checkpoint available.

I noticed that there are some code comments in the file decode_example.sh, #L29-21, but I still have no idea how to download the model checkpoint and dataset.

I would be grateful if you could add more instructions on how to download your training data and primary model (or classification model).

How to apply this to models without tie_embedding_weights?

In the paper, it is noted

"Even if the embedding tables are not shared, this loss may be computed and optimized using vectors from the output embedding table as parameters without any significant loss in performance."

But I am not sure how this really works. Does this mean that the vectors being optimized are chosen from the output embedding table only? If so, I'm not sure how you would get the input to feed into the model (since the parameters are no longer input embeddings), or how the gradient from the current step's loss would reach the embeddings of previous steps, since they are not directly dependent.
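
To make the question concrete, here is how I imagine it might work; this is purely my guess, not anything taken from the paper or its code:

import torch

def soft_inputs(e_opt, E_out, E_in, temperature=1.0):
    # e_opt: (T, d) vectors being optimized, living in output-embedding space.
    # E_out, E_in: (V, d) output and input embedding tables (untied).
    logits = e_opt @ E_out.T / temperature  # similarity to output embeddings
    probs = torch.softmax(logits, dim=-1)   # soft distribution over the vocab
    return probs @ E_in                     # expected input embedding per step

# Feeding model(inputs_embeds=soft_inputs(e_opt, E_out, E_in)) would keep
# everything differentiable, so step t's loss can reach earlier positions'
# e_opt through the forward pass even though the tables are not shared.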
