lstm-char-cnn-tensorflow's Introduction

Character-Aware Neural Language Models

TensorFlow implementation of Character-Aware Neural Language Models. The author's original code can be found here.

model.png

This implementation contains:

  1. Word-level and Character-level Convolutional Neural Network
  2. Highway Network
  3. Recurrent Neural Network Language Model

The current implementation has a performance issue. See #3.

Prerequisites

Usage

To train a model with ptb dataset:

$ python main.py --dataset ptb

To test an existing model:

$ python main.py --dataset ptb --forward_only True

To see all training options, run:

$ python main.py --help

which will print

usage: main.py [-h] [--epoch EPOCH] [--word_embed_dim WORD_EMBED_DIM]
              [--char_embed_dim CHAR_EMBED_DIM]
              [--max_word_length MAX_WORD_LENGTH] [--batch_size BATCH_SIZE]
              [--seq_length SEQ_LENGTH] [--learning_rate LEARNING_RATE]
              [--decay DECAY] [--dropout_prob DROPOUT_PROB]
              [--feature_maps FEATURE_MAPS] [--kernels KERNELS]
              [--model MODEL] [--data_dir DATA_DIR] [--dataset DATASET]
              [--checkpoint_dir CHECKPOINT_DIR]
              [--forward_only [FORWARD_ONLY]] [--noforward_only]
              [--use_char [USE_CHAR]] [--nouse_char] [--use_word [USE_WORD]]
              [--nouse_word]

optional arguments:
  -h, --help            show this help message and exit
  --epoch EPOCH         Epoch to train [25]
  --word_embed_dim WORD_EMBED_DIM
                        The dimension of word embedding matrix [650]
  --char_embed_dim CHAR_EMBED_DIM
                        The dimension of char embedding matrix [15]
  --max_word_length MAX_WORD_LENGTH
                        The maximum length of word [65]
  --batch_size BATCH_SIZE
                        The size of batch images [100]
  --seq_length SEQ_LENGTH
                        The # of timesteps to unroll for [35]
  --learning_rate LEARNING_RATE
                        Learning rate [1.0]
  --decay DECAY         Decay of SGD [0.5]
  --dropout_prob DROPOUT_PROB
                        Probability of dropout layer [0.5]
  --feature_maps FEATURE_MAPS
                        The # of feature maps in CNN
                        [50,100,150,200,200,200,200]
  --kernels KERNELS     The width of CNN kernels [1,2,3,4,5,6,7]
  --model MODEL         The type of model to train and test [LSTM, LSTMTDNN]
  --data_dir DATA_DIR   The name of data directory [data]
  --dataset DATASET     The name of dataset [ptb]
  --checkpoint_dir CHECKPOINT_DIR
                        Directory name to save the checkpoints [checkpoint]
  --forward_only [FORWARD_ONLY]
                        True for forward only, False for training [False]
  --noforward_only
  --use_char [USE_CHAR]
                        Use character-level language model [True]
  --nouse_char
  --use_word [USE_WORD]
                        Use word-level language [False]
  --nouse_word

but more options can be found in models/LSTMTDNN and models/TDNN.

Performance

This implementation failed to reproduce the results of the paper (as of 2016.02.12). If you are looking for code that reproduces the paper's results, see https://github.com/mkroutikov/tf-lstm-char-cnn.

loss

Perplexity on the test set of the Penn Treebank (PTB) corpus:

Name              Character embed   LSTM hidden units   Paper (Y Kim 2016)   This repo.
LSTM-Char-Small   15                100                 92.3                 in progress
LSTM-Char-Large   15                150                 78.9                 in progress

Author

Taehoon Kim / @carpedm20

lstm-char-cnn-tensorflow's Issues

Did anybody successfully train?

On a GPU, I reduced many of the options, such as batch size and feature_maps.

An OOM error always occurred!

Resource exhausted: OOM when allocating tensor with shape[50,450537]
W tensorflow/core/common_runtime/executor.cc:1102] 0x8bcfce0 Compute status: Resource exhausted: OOM when allocating tensor with shape[50,450537]
[[Node: LSTMTDNN/LSTM/Linear_34/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](LSTMTDNN/LSTM/dropout_34/mul_1, LSTMTDNN/LSTM/Linear/Matrix/read)]]
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 2987 get requests, put_count=1597 evicted_count=1000 eviction_rate=0.626174 and unsatisfied allocation rate=0.833612

Condition never ran

Line 159 in batch_loader.py

if len(char) == max_word_length:
  chars[-1] = char2idx['}']

AttributeError: 'dict' object has no attribute 'has_key'

I use Anaconda with Python 3.5 and TensorFlow 1.2 on Windows 10. Running the example commands:
python main.py --dataset ptb
and
python main.py --dataset ptb --forward_only_True
both give the same error message.

Traceback shows there may be something wrong with:
if not word2idx.has_key(word):
in batch_loader.py (line 144)
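
`dict.has_key` was removed in Python 3, which would explain this AttributeError under Python 3.5. A minimal sketch of the equivalent membership test follows; `word2idx` and `idx2word` here are toy stand-ins, not the loader's actual state.

```python
# Toy vocabulary; the real loader builds its own mapping from the corpus.
word2idx = {"<unk>": 0}
idx2word = ["<unk>"]

for word in ["the", "cat", "the"]:
    # Python 2 only: `if not word2idx.has_key(word):`
    # The `in` operator works on both Python 2 and Python 3.
    if word not in word2idx:
        word2idx[word] = len(idx2word)
        idx2word.append(word)

print(word2idx)
```

The same substitution should apply to any other `has_key` call in the code base.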

`tf.Tensor` as Python boolean error

Hi,
While running python main.py --dataset ptb, I got an error:

data = data[: batch_size * seq_length * math.floor(length / (batch_size * seq_length))]
data load done. Number of batches in train: 265, val: 21, test: 23
Word vocab size: 10001, Char vocab size: 51, Max word length (incl. padding): 21
Traceback (most recent call last):
  File "main.py", line 66, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "main.py", line 60, in main
    model.run(FLAGS.epoch, FLAGS.learning_rate, FLAGS.decay)
  File "/Users/artur-imac/nn-models/lstm-char-cnn-tensorflow/models/LSTMTDNN.py", line 269, in run
    if grad:
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 475, in __nonzero__
    raise TypeError("Using a `tf.Tensor` as a Python `bool` is not allowed. "
TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use the logical TensorFlow ops to test the value of a tensor.

I am using TensorFlow 0.8 with Python2.
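
The error message itself suggests the fix: test `grad is not None` instead of `if grad:`. The sketch below illustrates the pattern without importing TensorFlow; `FakeTensor` and `keep_defined` are hypothetical stand-ins written for this example, not part of the repository or of TensorFlow.

```python
class FakeTensor(object):
    """Hypothetical stand-in that refuses truth-testing, as tf.Tensor does."""

    def __bool__(self):  # Python 3
        raise TypeError("Using a tensor as a Python bool is not allowed.")

    __nonzero__ = __bool__  # Python 2 uses this name instead


def keep_defined(grads):
    # `if grad:` would raise TypeError for tensor-like objects;
    # comparing against None only asks whether a gradient exists.
    return [grad for grad in grads if grad is not None]


grads = [FakeTensor(), None, FakeTensor()]
kept = keep_defined(grads)
print(len(kept))
```

Applied to the line in the traceback, this would mean replacing `if grad:` with `if grad is not None:`.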

No module named 'TDNN'

Using Python 3.5

I tried to run this project with its default settings, but main.py fails with a missing-module error.
Although __init__.py in the models folder reports that it can't find TDNN, I verified that the file is indeed there.

C:\Users\dbishop\Desktop\lstm-char-cnn-tensorflow-master>python main.py --dataset ptb
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from models import LSTMTDNN
  File "C:\Users\dbishop\Desktop\lstm-char-cnn-tensorflow-master\models\__init__.py", line 1, in <module>
    from TDNN import TDNN
ImportError: No module named 'TDNN'
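
Python 3 removed implicit relative imports, so `from TDNN import TDNN` inside a package's `__init__.py` searches for a top-level module rather than a sibling file. The sketch below builds a throwaway package in a temporary directory to show the explicit relative form working; the package name `models_demo` is made up for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk: models_demo/__init__.py and models_demo/TDNN.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, "models_demo")
os.mkdir(pkg)
with open(os.path.join(pkg, "TDNN.py"), "w") as f:
    f.write("class TDNN(object):\n    pass\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    # The leading dot makes the import relative to the package itself.
    f.write("from .TDNN import TDNN\n")

sys.path.insert(0, root)
importlib.invalidate_caches()
models_demo = importlib.import_module("models_demo")
print(models_demo.TDNN.__name__)
```

The explicit form `from .TDNN import TDNN` is also valid on Python 2.5 and later, so it should not break Python 2 users.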

Validation perplexity is 146.71 at the end of training (24 epochs)

(it should get ~82 on valid and ~79 on test)

$ python main.py --dataset ptb

.....

epoch: [24] [ 250/ 265] loss: 3.466149
Valid: loss: 5.225354, perplexity: 185.927017
{'perplexity': 83.749542031012467, 'epoch': 24, 'valid_perplexity': 146.71359295576036, 'learning_rate': 0.5}
[*] Saving checkpoints...
Test: loss: 4.836956, perplexity: 126.084908
[*] Test loss: 4.954320, perplexity: 141.786226
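
For readers comparing these numbers: the logged perplexities match the exponential of the logged losses, which suggests the loss is an average per-token cross-entropy in nats (an inference from the numbers, not something stated in the log). A quick check:

```python
import math

# (loss, perplexity) pairs copied from the log above
reported = [
    (5.225354, 185.927017),  # Valid
    (4.836956, 126.084908),  # Test
    (4.954320, 141.786226),  # Test loss
]

for loss, perplexity in reported:
    print("loss %.6f -> exp(loss) = %.3f (log shows %.3f)"
          % (loss, math.exp(loss), perplexity))
```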

Crashes when using both char and word

Hi carpedm20,
When I set both "use_char" and "use_word" to True, the program does not run correctly:
flags.DEFINE_boolean("use_char", True, "Use character-level language model [True]")
flags.DEFINE_boolean("use_word", True, "Use word-level language [False]")

models/LSTMTDNN.py line 102:
if self.use_char:
  char_W = tf.get_variable("char_embed",
      [self.char_vocab_size, self.char_embed_dim])
else:
  word_W = tf.get_variable("word_embed",
      [self.word_vocab_size, self.word_embed_dim])

should be modified to:
if self.use_char:
  char_W = tf.get_variable("char_embed",
      [self.char_vocab_size, self.char_embed_dim])
if self.use_word:
  word_W = tf.get_variable("word_embed",
      [self.word_vocab_size, self.word_embed_dim])

In addition, at line 132 of models/LSTMTDNN.py, is the call to tf.concat an error?
cnn_output = tf.concat(1, char_cnn.output, word_embed)

Should it be modified to:
cnn_output = tf.concat(1, [char_cnn.output, tf.squeeze(word_embed, [1])]) ?

feature_maps and kernels?

feature_maps=[50, 100, 150, 200, 200, 200, 200],
kernels=[1,2,3,4,5,6,7]
Does this mean 1 filter whose size is 50, 2 filters whose size is 100, and so on?
(feature_maps: list of feature maps (for each kernel width)
kernels: list of kernel widths)
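
One reading of the quoted docstring is that the two lists are paired by position: each kernel width gets the matching number of feature maps. The sketch below shows that pairing with the default values. The final sum assumes the pooled outputs of all kernels are concatenated, as the paper describes; that is not verified against this repository's code.

```python
feature_maps = [50, 100, 150, 200, 200, 200, 200]
kernels = [1, 2, 3, 4, 5, 6, 7]

# Pair each kernel width with its number of feature maps.
pairs = list(zip(kernels, feature_maps))
for width, n_maps in pairs:
    print("kernel width %d -> %d feature maps" % (width, n_maps))

# Assumption: one pooled value per feature map, all concatenated per word.
cnn_output_dim = sum(feature_maps)
print(cnn_output_dim)
```

Under this reading, width 1 has 50 filters and width 2 has 100 filters, rather than one filter of size 50.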

Data flow

Hi,
Thanks for the code. Could you explain what ydata is in batch_loader.py? Another question: is it possible to use your code for a classification task, e.g. sentiment analysis, by applying one-hot class vectors?

use_word

Thanks for sharing your code. I want to know how I can train a word-level model. I found that your code has settings like (use_char = True, use_word = False). Would it help to set use_word = True? Looking forward to your answer, thank you.

Problem testing on East Asian languages

I am experimenting with this code on East Asian languages such as Chinese and Japanese, which do not have explicit word boundaries like the white space in Latin-script languages. Looking into the code, I found that batch_loader.py splits each line on white space, which I think makes it essentially impossible to use this code on those languages.
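
A small illustration of the problem described above, using an arbitrary Chinese sentence. Treating every character as a token is only one possible fallback, shown for contrast; a proper word segmenter would be another option and is outside this sketch.

```python
line = u"我喜欢自然语言处理"  # a Chinese sentence with no spaces

# Whitespace splitting treats the whole sentence as a single "word".
word_tokens = line.split()

# One possible fallback: treat every character as its own token.
char_tokens = list(line)

print(len(word_tokens), len(char_tokens))
```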

Format for new data?

Silly question: How should new data be formatted, in order to learn/predict over it?
Assuming I have sentences and a label for each sentence.

(I didn't see labels in the PTB data.)

Thanks!

TypeError: 'module' object is not callable

I tried to run the code (python main.py --dataset ptb), and it failed with this exception:

...
Creating vocab...
After first pass of data, max word length is: 21
Token count: train 929589, val 73760, test 82430
Loading vocab...
Word vocab size: 10001, Char vocab size: 51
Reshaping tensors...
data load done. Number of batches in train: 265, val: 21, test: 23
Word vocab size: 10001, Char vocab size: 51, Max word length (incl. padding): 21
Traceback (most recent call last):
  File "main.py", line 66, in <module>
    tf.app.run()
  File "/home/ndavid/venvs/embedding/local/lib/python2.7/site-packages/tensorflow/python/platform/default/_app.py", line 11, in run
    sys.exit(main(sys.argv))
  File "main.py", line 57, in main
    data_dir=FLAGS.data_dir)
  File "/mnt/store/ndavid/LM/Others/lstm-char-cnn-tensorflow/models/LSTMTDNN.py", line 86, in __init__
    self.prepare_model()
  File "/mnt/store/ndavid/LM/Others/lstm-char-cnn-tensorflow/models/LSTMTDNN.py", line 144, in prepare_model
    cnn_output = highway(cnn_output, cnn_output.get_shape()[1], self.highway_layers, 0)
  File "/mnt/store/ndavid/LM/Others/lstm-char-cnn-tensorflow/models/ops.py", line 28, in highway
    output = f(rnn_cell.linear(output, size, 0, scope='output_lin_%d' % idx))
TypeError: 'module' object is not callable

I am using tensorflow version 0.5 (I think that's the latest one, but I am not sure).

Program break due to tensorflow version

Hi,
I'm using Python 2.7 and TensorFlow 1.0.0 to run this project.
The project first raised "TypeError: slice indices must be integers or None" in:
"batch_loader.py", line 49, in __init__
data = data[: batch_size * seq_length * math.floor(length / (batch_size * seq_length))]
I fixed this by changing it to:
data = data[: int(batch_size * seq_length * math.floor(length / (batch_size * seq_length)))]
Then the project raised a ValueError in
"LSTMTDNN.py", line 126, in prepare_model
char_index = tf.reshape(char_indices[idx], [-1, self.max_word_length])
ValueError: Dimension size must be evenly divisible by 21 but is 1 for 'LSTMTDNN/CNN/Reshape' (op: 'Reshape') with input shapes: [], [2].

The number is 21 for the ptb dataset and 65 for the nsmc dataset.

What should I do? Thanks!

Failed to train on the ptb dataset

Python 2.7.6 on Ubuntu 14.04.3

$ python main.py --dataset ptb

Reshaping tensors...
data load done. Number of batches in train: 265, val: 21, test: 23
Word vocab size: 10001, Char vocab size: 51, Max word length (incl. padding): 21
Traceback (most recent call last):
  File "main.py", line 66, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/default/app.py", line 11, in run
    sys.exit(main(sys.argv))
  File "main.py", line 57, in main
    data_dir=FLAGS.data_dir)
  File "/mnt/win/lstm-char-cnn-tensorflow/models/LSTMTDNN.py", line 86, in __init__
    self.prepare_model()
  File "/mnt/win/lstm-char-cnn-tensorflow/models/LSTMTDNN.py", line 144, in prepare_model
    cnn_output = highway(cnn_output, cnn_output.get_shape()[1], self.highway_layers, 0)
  File "/mnt/win/lstm-char-cnn-tensorflow/models/ops.py", line 28, in highway
    output = f(rnn_cell.linear(output, size, 0, scope='output_lin_%d' % idx))
TypeError: 'module' object is not callable

TypeError: slice indices must be integers or None or have an __index__ method

I tried to run the code line (python main.py --dataset ptb) and got the following error. I am using tensorflow 1.0.1
danish@amax:~/trycode/lstm-char-cnn-tensorflow-master$ python main.py --dataset ptb
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
{'batch_size': 100,
'char_embed_dim': 15,
'checkpoint_dir': 'checkpoint',
'data_dir': 'data',
'dataset': 'ptb',
'decay': 0.5,
'dropout_prob': 0.5,
'epoch': 25,
'feature_maps': '[50,100,150,200,200,200,200]',
'forward_only': False,
'kernels': '[1,2,3,4,5,6,7]',
'learning_rate': 1.0,
'max_word_length': 65,
'model': 'LSTMTDNN',
'seq_length': 35,
'use_char': True,
'use_word': False,
'word_embed_dim': 650}
[*] Creating checkpoint directory...
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:04:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2112820
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 1 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:05:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x21161b0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 2 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:08:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2119b40
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 3 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:09:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x211d4d0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 4 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:83:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2120e60
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 5 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:84:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x21247f0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 6 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:87:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2128430
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 7 with properties:
name: Tesla M40 24GB
major: 5 minor: 2 memoryClockRate (GHz) 1.112
pciBusID 0000:88:00.0
Total memory: 22.40GiB
Free memory: 22.29GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 0 and 4
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 0 and 5
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 0 and 6
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 0 and 7
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 1 and 4
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 1 and 5
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 1 and 6
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 1 and 7
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 2 and 4
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 2 and 5
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 2 and 6
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 2 and 7
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 3 and 4
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 3 and 5
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 3 and 6
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 3 and 7
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 4 and 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 4 and 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 4 and 2
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 4 and 3
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 5 and 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 5 and 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 5 and 2
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 5 and 3
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 6 and 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 6 and 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 6 and 2
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 6 and 3
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 7 and 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 7 and 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 7 and 2
I tensorflow/core/common_runtime/gpu/gpu_device.cc:777] Peer access not supported between device ordinals 7 and 3
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 1 2 3 4 5 6 7
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y Y Y Y N N N N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 1: Y Y Y Y N N N N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 2: Y Y Y Y N N N N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 3: Y Y Y Y N N N N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 4: N N N N Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 5: N N N N Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 6: N N N N Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 7: N N N N Y Y Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla M40 24GB, pci bus id: 0000:04:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla M40 24GB, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:2) -> (device: 2, name: Tesla M40 24GB, pci bus id: 0000:08:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:3) -> (device: 3, name: Tesla M40 24GB, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:4) -> (device: 4, name: Tesla M40 24GB, pci bus id: 0000:83:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:5) -> (device: 5, name: Tesla M40 24GB, pci bus id: 0000:84:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:6) -> (device: 6, name: Tesla M40 24GB, pci bus id: 0000:87:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:7) -> (device: 7, name: Tesla M40 24GB, pci bus id: 0000:88:00.0)
Creating vocab...
After first pass of data, max word length is: 21
Token count: train 929589, val 73760, test 82430
Loading vocab...
Word vocab size: 10001, Char vocab size: 51
Reshaping tensors...
Traceback (most recent call last):
  File "main.py", line 66, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main.py", line 57, in main
    data_dir=FLAGS.data_dir)
  File "/home/danish/trycode/lstm-char-cnn-tensorflow-master/models/LSTMTDNN.py", line 78, in __init__
    self.loader = BatchLoader(self.data_dir, self.dataset_name, self.batch_size, self.seq_length, self.max_word_length)
  File "/home/danish/trycode/lstm-char-cnn-tensorflow-master/batch_loader.py", line 49, in __init__
    data = data[: batch_size * seq_length * math.floor(length / (batch_size * seq_length))]
TypeError: slice indices must be integers or None or have an __index__ method
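
On Python 2, `math.floor` returns a float, so the slice bound in the line above is a float, which is what this TypeError complains about. A minimal sketch of one way to keep the bound an integer is floor division; `usable_length` is a hypothetical helper name, and the token counts and defaults are taken from the logs and help text on this page.

```python
batch_size, seq_length = 100, 35  # defaults from the help text
token_counts = {"train": 929589, "val": 73760, "test": 82430}


def usable_length(length, batch_size, seq_length):
    # `//` is floor division and yields an int on Python 2 and Python 3,
    # so the result can be used directly as a slice bound.
    chunk = batch_size * seq_length
    return (length // chunk) * chunk


batches = {}
for name, length in sorted(token_counts.items()):
    n = usable_length(length, batch_size, seq_length)
    batches[name] = n // (batch_size * seq_length)
    print("%s: keep %d of %d tokens, %d batches" % (name, n, length, batches[name]))
```

The resulting batch counts agree with the "Number of batches in train: 265, val: 21, test: 23" line printed by the loader in other reports.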

dataset : nsmc - ResourceExhaustedError ?

I have a question about running on the nsmc dataset.
python main.py --dataset nsmc doesn't seem to work on either a GTX 1080 (8 GB) or a K40 (12 GB).
I get the following error:
" ... tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shape[100,450537] ..."
Is "python main.py --dataset nsmc" supposed to run on a specific GPU? Are there certain requirements?
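
A rough back-of-the-envelope estimate of why a tensor of this shape is costly. It assumes float32 elements (the similar OOM log in this tracker shows DT_FLOAT) and that one such tensor is kept for each of the 35 default timesteps; both are assumptions, not facts confirmed from the code.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; 4 bytes per element for float32."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


one_step = tensor_bytes([100, 450537])  # shape from the error message
seq_length = 35                          # default from the help text
all_steps = one_step * seq_length        # assumes one such tensor per timestep

print("%.1f MiB per timestep" % (one_step / 2.0 ** 20))
print("%.2f GiB for %d timesteps" % (all_steps / 2.0 ** 30, seq_length))
```

If the estimate holds, these tensors alone approach the 8 GB card's capacity before gradients and other activations are counted, so a smaller batch size or sequence length would reduce the footprint proportionally.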

ValueError: Dimensions must be equal, but are 1300 and 1750 for 'LSTMTDNN/LSTM/rnn/rnn/multi_rnn_cell/cell_0/cell_0/basic_lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [100,1300], [1750,2600].

After making the modifications suggested in #23 and #20, there are still some errors when running the program.

Traceback (most recent call last):
File "/home/ymzhu/pycharm-2017.2.3/helpers/pydev/pydevd.py", line 1599, in
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/ymzhu/pycharm-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/ymzhu/Desktop/code/lstm-char-cnn-tensorflow/main.py", line 66, in
tf.app.run()
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/ymzhu/Desktop/code/lstm-char-cnn-tensorflow/main.py", line 57, in main
data_dir=FLAGS.data_dir)
File "/home/ymzhu/Desktop/code/lstm-char-cnn-tensorflow/models/LSTMTDNN.py", line 87, in init
self.prepare_model()
File "/home/ymzhu/Desktop/code/lstm-char-cnn-tensorflow/models/LSTMTDNN.py", line 155, in prepare_model
dtype=tf.float32)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 1253, in static_rnn
(output, state) = call_cell()
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 1240, in <lambda>
call_cell = lambda: cell(input_, state)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 183, in call
return super(RNNCell, self).call(inputs, state)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 575, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 1066, in call
cur_inp, new_state = cell(cur_inp, cur_state)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 183, in call
return super(RNNCell, self).call(inputs, state)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 575, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 441, in call
value=self._linear([inputs, h]), num_or_size_splits=4, axis=1)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 1189, in call
res = math_ops.matmul(array_ops.concat(args, 1), self._weights)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 1891, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2437, in _mat_mul
name=name)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2958, in create_op
set_shapes_for_outputs(ret)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2209, in set_shapes_for_outputs
shapes = shape_func(op)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2159, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
require_shape_fn)
File "/home/ymzhu/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 1300 and 1750 for 'LSTMTDNN/LSTM/rnn/rnn/multi_rnn_cell/cell_0/cell_0/basic_lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [100,1300], [1750,2600].

Why did this happen? Could anyone help?
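
One possible reading of the shapes in this error, worked out with the default flags. It assumes an LSTM hidden size of 650, which is not stated in the issue but is the value that makes 2600 = 4 × 650, and a CNN output width equal to the sum of the default feature maps.

```python
feature_maps = [50, 100, 150, 200, 200, 200, 200]  # default flag value
cnn_output_dim = sum(feature_maps)                  # 1100
rnn_size = 650                                      # assumption, see above

# A basic LSTM cell multiplies [input, h] by a weight matrix of shape
# [input_dim + rnn_size, 4 * rnn_size].
layer0_rows = cnn_output_dim + rnn_size  # first layer sees the CNN output
layer1_rows = rnn_size + rnn_size        # second layer sees layer 0's output
gate_cols = 4 * rnn_size

print(layer0_rows, layer1_rows, gate_cols)
```

Under these assumptions the [1750, 2600] matrix was built for the first layer and then applied to the second layer's [100, 1300] input. That would be consistent with a single cell object being reused for both layers (for example `MultiRNNCell([cell] * 2)`), which later TensorFlow 1.x releases reject; the `cell_0/cell_0` scope in the message points the same way. This is a guess from the numbers, not a confirmed diagnosis.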

IndexError: index 0 is out of bounds for axis 0 with size 0

Python 3.6.3, tensorflow-1.4.0 on Windows 7 Ultimate SP1
python main.py --dataset ptb
self.loader = BatchLoader(self.data_dir, self.dataset_name, self.batch_size, self.seq_length, self.max_word_length)
File "C:\Users\akuo\lstm-char-cnn-tensorflow\batch_loader.py", line 52, in __init__
ydata[-1] = data[0].copy()
IndexError: index 0 is out of bounds for axis 0 with size 0

There are quite a few -1 indices in batch_loader.py. Should I replace them all with 0?

Can you provide the ptb data or a link?

Hello, I am new to text processing. I do not know how to create the ptb data for the command python main.py --dataset ptb. Can you provide a ptb file or a link?

Thank you very much!
