pytorch_gbw_lm's Issues

how to build Log_Uniform Sampler?

On my MacBook, I ran 'python setup.py install' or 'python setup.py build_ext --inplace' in the log_uniform folder and got this error:

➜  log_uniform git:(master) ✗ ~/miniconda3/bin/python setup.py install
running install
running build
running build_ext
building 'log_uniform' extension
creating build
creating build/temp.macosx-10.7-x86_64-3.7
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/gaoxianglu/miniconda3/include -arch x86_64 -I/Users/gaoxianglu/miniconda3/include -arch x86_64 -I/Users/gaoxianglu/miniconda3/lib/python3.7/site-packages/numpy/core/include -I/Users/gaoxianglu/miniconda3/include/python3.7m -c log_uniform.cpp -o build/temp.macosx-10.7-x86_64-3.7/log_uniform.o -std=c++11
warning: include path for stdlibc++ headers not found; pass '-stdlib=libc++' on the command line to use the libc++ standard library instead
      [-Wstdlibcxx-not-found]
log_uniform.cpp:635:10: fatal error: 'ios' file not found
#include "ios"
         ^~~~~
1 warning and 1 error generated.
error: command 'gcc' failed with exit status 1

I installed the Xcode command line tools, but the error persists.
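
The clang warning in the log already names one thing to try: passing `-stdlib=libc++`. Below is a minimal sketch of how a setup.py could pick its C++ flags per platform. `cxx_flags` is a hypothetical helper, not something from this repository, and the `-mmacosx-version-min=10.9` value is an assumption (the log shows a 10.7 build target, and raising it is commonly suggested together with libc++).

```python
import sys


def cxx_flags(platform=sys.platform):
    """Return (compile_args, link_args) for a C++11 extension (hypothetical helper)."""
    compile_args = ["-std=c++11"]
    link_args = []
    if platform == "darwin":
        # Suggested by the clang warning in the build log.
        compile_args.append("-stdlib=libc++")
        # Assumption: raise the deployment target above the 10.7 seen in the log.
        compile_args.append("-mmacosx-version-min=10.9")
        link_args.extend(["-stdlib=libc++", "-mmacosx-version-min=10.9"])
    return compile_args, link_args


compile_args, link_args = cxx_flags()
print(compile_args, link_args)
```

In a setup.py these two lists would be passed to the extension as `extra_compile_args` and `extra_link_args`. Whether this resolves the `'ios' file not found` error on a given machine is not something the log alone can confirm.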

Preprocess problem

It seems that torch.load() cannot load train_data.th7.
I cannot figure out how to follow the instruction to "run "process_gbw.py" on the "train_data.th7" file to create the "train_data.sid" file."

RuntimeError: inconsistent tensor size

I ran into the following problem:
load word frequency mapping - complete
loaded tensor torch.Size([798949912])
loaded tensor torch.Size([798949912, 3])
#sentences 798949912
load train data - complete
#sentences 6073
load test data - complete
Traceback (most recent call last):
File "main.py", line 195, in <module>
train()
File "main.py", line 157, in train
for batch, item in enumerate(train_loader):
File "/home/xxxx/PyTorch_LM/lm/fast_gbw.py", line 89, in batch_generator
tracker_list[idx] = self.add(seq_length, source, target, idx, tracker)
File "/home/xxxx/lm/PyTorch_LM/lm/fast_gbw.py", line 124, in add
source[curr:batch_end, batch_idx] = self.corpus[seq_start:seq_end]
RuntimeError: inconsistent tensor size, expected tensor [19] and src [798949911] to have the same number of elements, but got 19 and 798949911 elements respectively at /pytorch/torch/lib/TH/generic/THTensorCopy.c:86

sample_ids being ignored?

Hi! Thanks for your code. I've been reading through it to understand the approach, and I noticed that the output of sampled is actually always a zero long-tensor:

https://github.com/rdspring1/PyTorch_GBW_LM/blob/master/lm/model.py#L68-L69

Is this the way it is supposed to work? My understanding was that sampled softmax gets its speed-up by computing the loss on only a sample of the entire vocabulary. But the way it's set up, the loss would always be computed with respect to the same target (0).

Or is there something else I might be missing?

greetings!
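
One common way to implement sampled softmax is to concatenate the logit of the true class with the logits of the sampled classes, true class first. The cross-entropy target for that reduced problem is then index 0 for every example, which would be consistent with an all-zero target tensor. Whether the linked lines do exactly this is an assumption; the sketch below only illustrates the idea in plain Python and leaves out the usual correction for sampling probabilities.

```python
import math


def sampled_softmax_loss(true_logit, sampled_logits):
    """Cross-entropy over [true class] + sampled classes.

    The true class sits at index 0 of the reduced logit vector,
    so the target index is 0 regardless of the original word id.
    """
    logits = [true_logit] + list(sampled_logits)
    target = 0
    # Numerically stable log-sum-exp over the reduced set of classes.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[target]


# One true class scored against three sampled negatives.
loss = sampled_softmax_loss(2.0, [0.5, -1.0, 0.0])
print(loss)
```

The normalisation runs over 4 classes here instead of the full vocabulary, which is where the speed-up comes from. The zero target does not mean "word 0"; it means "the first column of the reduced logits".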

Pretrained Model?

Nice work! It's so tragic that when I type "pytorch language models", this is not the first repo that shows up!

Do you plan to release the pre-trained model?

(I see it takes roughly 3 days...so probably it's ok)

missing dataset

The link to the Torch Data Format is broken. Could you provide a Google Drive link? Thank you!

Nondeterministic result?

Hi, I was trying to run your example, but the result differs on every run (even if I set dropout=0.0). Is that expected?
(BTW, I'm using GBWStream to read the dataset with deterministic=True; I can post the code if you want to take a look.)
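
A first thing worth ruling out is unseeded random number generators. The sketch below seeds Python, NumPy and PyTorch where they are installed; `seed_everything` is a hypothetical helper, not part of this repository. It is an assumption that these are the only sources of randomness: any generator inside a compiled extension, and some GPU kernels, would not be covered.

```python
import random


def seed_everything(seed):
    """Seed every random number generator that is available (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # cuDNN may choose nondeterministic kernels unless told otherwise.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


seed_everything(1234)
first = random.random()
seed_everything(1234)
second = random.random()
print(first == second)
```

Calling the helper once at the top of the training script, before the model and data loader are created, is the usual placement.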

TypeError: iteration over a 0-d tensor

File "main_dev.py", line 99, in repackage_hidden
return [repackage_hidden(state) for state in h]
File "/Users/admin/anaconda3/lib/python3.7/site-packages/torch/tensor.py", line 381, in __iter__
raise TypeError('iteration over a 0-d tensor')
TypeError: iteration over a 0-d tensor

Have you run into this error before?
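
This error is likely what happens when a helper written for older PyTorch versions checks for `Variable`, fails to recognise a plain tensor, and then recurses into the tensor as if it were a container until it reaches a 0-d element. A sketch of a guard that stops at anything tensor-like is shown below. It uses duck typing (`hasattr(h, "detach")`) so that it runs without PyTorch; with PyTorch installed, `isinstance(h, torch.Tensor)` would be the more direct check. `FakeTensor` is a stand-in invented for this example.

```python
def repackage_hidden(h):
    """Detach hidden states from their history.

    h is either a tensor-like object or a nested tuple/list of them.
    """
    if hasattr(h, "detach"):
        # Stop here instead of iterating over the tensor's elements.
        return h.detach()
    return tuple(repackage_hidden(v) for v in h)


class FakeTensor:
    """Stand-in used only to exercise the sketch without PyTorch."""

    def __init__(self, value):
        self.value = value

    def detach(self):
        return FakeTensor(self.value)


hidden = (FakeTensor(1.0), FakeTensor(2.0))  # e.g. an LSTM's (h, c) pair
detached = repackage_hidden(hidden)
print(len(detached))
```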

state of the art performance?

Nice work! I have a question regarding the result:
The paper "Exploring the limits of language modeling" reports a test perplexity of 54.1 using LSTM-512-512. Does that mean 2 layers are used in the paper, while your result is obtained with 4 layers? If so, what accounts for the difference?

missing train_data.pt

It seems that process_gbw.py is looking for train_data.pt but can't find it. Are there any instructions on how to create this file, or is it supposed to be part of the downloaded dataset?

Thanks!

Resume Training?

Hi, I am wondering whether it is possible to resume training from the saved checkpoint. Based on the code, I think I just need to re-define the scheduler myself. Is there anything you think I missed?

Thank you so much for your code btw.

build Log_Uniform Sampler

Hi

I have Cython installed, but I'm not sure how to do the step "build Log_Uniform Sampler".
Could you give more detail on which commands I should run?

I tried to do python setup.py install but I got the following error:

running install
running build
running build_ext
building 'log_uniform' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.5m -I/home/goncalo/.virtualenvs/nmtpy/include/python3.5m -c log_uniform.cpp -o build/temp.linux-x86_64-3.5/log_uniform.o -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
log_uniform.cpp:608:31: fatal error: numpy/arrayobject.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

So I'm not sure if I'm doing the right thing.
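
The missing header `numpy/arrayobject.h` ships with NumPy, and `numpy.get_include()` returns the directory that contains it. Below is a sketch assuming the only problem is that this directory is absent from the compiler's include path (the gcc command in the log has no NumPy `-I` entry). `extension_kwargs` is a hypothetical helper, and the source file name is simply the one that appears in the log, not necessarily what the repository's setup.py lists.

```python
def extension_kwargs(numpy_include_dir):
    """Build keyword arguments for setuptools.Extension (hypothetical helper)."""
    return {
        "name": "log_uniform",
        "sources": ["log_uniform.cpp"],  # assumption: file name taken from the log
        "language": "c++",
        # Lets the compiler find numpy/arrayobject.h.
        "include_dirs": [numpy_include_dir],
        "extra_compile_args": ["-std=c++11"],
    }


# In setup.py, with NumPy installed in the same virtualenv:
#   import numpy
#   from setuptools import setup, Extension
#   setup(ext_modules=[Extension(**extension_kwargs(numpy.get_include()))])
kwargs = extension_kwargs("/path/to/numpy/include")  # placeholder path for illustration
print(kwargs["include_dirs"])
```

If NumPy is not installed in the virtualenv that runs the build, installing it there first is a prerequisite for either approach.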
