inejc / paragraph-vectors
A PyTorch implementation of Paragraph Vectors (doc2vec).
License: MIT License
I.e. training in which only the paragraph matrix is updated (word vectors are pre-trained and frozen), as described in Q. V. Le et al., Distributed Representations of Sentences and Documents.
I would like to volunteer to implement "prior word vectors from word2vec or GloVe".
I think the library could use pre-trained word vectors as an initialization instead of just random vectors.
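A minimal sketch of the idea, not the library's actual code: build the initial word matrix row by row, copying a pre-trained vector where one exists and falling back to a small random vector otherwise. The names `build_initial_word_matrix`, `vocab`, and `pretrained` are hypothetical, and loading the word2vec or GloVe file is not shown.

```python
import random


def build_initial_word_matrix(vocab, pretrained, vec_dim, seed=0):
    """Return (matrix, hits) with one row per vocabulary word.

    vocab: list of words; row i belongs to vocab[i] (assumed layout).
    pretrained: dict mapping word -> list of floats (assumed to be parsed
    from a word2vec or GloVe text file elsewhere).
    """
    rng = random.Random(seed)
    matrix, hits = [], 0
    for word in vocab:
        vector = pretrained.get(word)
        if vector is not None and len(vector) == vec_dim:
            matrix.append(list(vector))  # reuse the pre-trained vector
            hits += 1
        else:
            # unknown word: fall back to a small random vector
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(vec_dim)])
    return matrix, hits


vocab = ["the", "cat", "sat"]
pretrained = {"cat": [0.1, 0.2, 0.3], "sat": [0.4, 0.5, 0.6]}
matrix, hits = build_initial_word_matrix(vocab, pretrained, vec_dim=3)
```

The resulting matrix could then be copied into the model's word embedding parameter before training starts.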
Batch generator (paragraphvec.data.NCEGenerator) is implemented in a thread-safe manner (edit: needs fixing). The training procedure should take advantage of this (i.e. simultaneously construct new batches on CPU while the model is trained on a GPU).
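The overlap described here can be illustrated with a small prefetching wrapper. This is only a sketch under assumptions: `prefetch` is a hypothetical helper, and it uses a background thread and a bounded queue, whereas the library itself is described as multiprocess.

```python
import queue
import threading


def prefetch(batch_iter, max_prefetched=4):
    """Yield items of batch_iter while a background thread builds the next ones."""
    buffer = queue.Queue(maxsize=max_prefetched)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for batch in batch_iter:
            buffer.put(batch)  # blocks while the buffer is full
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = buffer.get()
        if batch is done:
            return
        yield batch


batches = list(prefetch(iter(range(5))))
```

In a training loop, the consumer would run the GPU step on each yielded batch while the producer prepares the following ones. For CPU-heavy batch construction, separate processes may be preferable to threads because of the GIL.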
Implement the subsampling approach described in T. Mikolov et al., Distributed Representations of Words and Phrases and their Compositionality.
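For reference, that paper discards each occurrence of a word w with probability 1 - sqrt(t / f(w)), where f(w) is the word's relative frequency and t is a threshold (around 1e-5 in the paper). A minimal sketch, with hypothetical function names:

```python
import math
import random


def discard_probability(word_freq, threshold=1e-5):
    """P(discard) = 1 - sqrt(t / f(w)), clipped at 0 for rare words."""
    return max(0.0, 1.0 - math.sqrt(threshold / word_freq))


def subsample(tokens, counts, threshold=1e-5, rng=random):
    """Drop frequent tokens at random; counts maps token -> corpus count."""
    total = sum(counts.values())
    return [
        tok for tok in tokens
        if rng.random() >= discard_probability(counts[tok] / total, threshold)
    ]
```

Words whose relative frequency is at or below the threshold are always kept.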
Coverage of data.py, which uses multiprocessing, is reported incorrectly. Currently, data.py is simply ignored.
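One possible direction, assuming the project measures coverage with coverage.py: that tool has a `concurrency = multiprocessing` setting for code that runs in child processes. A sketch of a `.coveragerc` (the `source` value is an assumption about the package layout):

```ini
# .coveragerc (sketch)
[run]
concurrency = multiprocessing
parallel = True
source = paragraphvec
```

With this setup each process writes its own data file, so `coverage combine` has to run before `coverage report`.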
torch version: 0.1.9_2
CUDA is not installed.
I am running train.py as it is mentioned in the description, with the "example.csv" file and the same parameters. I am running into the following error:
AttributeError: 'module' object has no attribute 'HalfTensor'
Traceback (most recent call last):
File "train.py", line 8, in <module>
from paragraphvec.data import load_dataset, NCEData
File "/home/prathmesh/paragraph-vectors/paragraphvec/data.py", line 11, in <module>
from torchtext.data import Field, TabularDataset
File "/usr/local/lib/python2.7/dist-packages/torchtext/__init__.py", line 1, in <module>
from . import data
File "/usr/local/lib/python2.7/dist-packages/torchtext/data/__init__.py", line 4, in <module>
from .field import RawField, Field, ReversibleField, SubwordField
File "/usr/local/lib/python2.7/dist-packages/torchtext/data/field.py", line 61, in <module>
class Field(RawField):
File "/usr/local/lib/python2.7/dist-packages/torchtext/data/field.py", line 117, in Field
torch.HalfTensor: float,
AttributeError: 'module' object has no attribute 'HalfTensor'
Implement the Distributed Bag of Words version of the model as proposed by Q. V. Le et al., Distributed Representations of Sentences and Documents.
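In the Distributed Bag of Words variant, the paragraph vector alone (without context words) is used to predict words sampled from the paragraph. A minimal sketch of the scoring step, with hypothetical names and plain lists standing in for tensors:

```python
def dbow_scores(paragraph_vec, output_vecs, candidate_ids):
    """Dot product of the paragraph vector with each candidate word's output vector."""
    return [
        sum(p * o for p, o in zip(paragraph_vec, output_vecs[word_id]))
        for word_id in candidate_ids
    ]


paragraph_vec = [1.0, 2.0]
output_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one row per vocabulary word
scores = dbow_scores(paragraph_vec, output_vecs, candidate_ids=[2, 0])
```

With noise-contrastive training, the first candidate would be the true word and the remaining ones sampled noise words.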
Compare the implementation with Gensim's, identify bottlenecks.
How should I fix this error?
[jalal@goku paragraph-vectors]$ python paragraphvec/train.py start --data_file_name 'data/example.csv' --num_epochs 10 --batch_size 32 --num_noise_words 2 --vec_dim 100 --lr 1e-3
Traceback (most recent call last):
File "paragraphvec/train.py", line 8, in <module>
from paragraphvec.data import load_dataset, NCEData
ModuleNotFoundError: No module named 'paragraphvec'
$ python -V
Python 3.6.4 :: Anaconda custom (64-bit)
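A likely explanation, stated as an assumption: when a script is run as `python paragraphvec/train.py`, Python puts the script's own directory (`paragraphvec/`) on `sys.path` rather than the repository root, so the `paragraphvec` package itself cannot be found. Installing the package (for example with `pip install -e .` from the repository root) usually avoids this. The workaround below is a sketch with a hypothetical helper name:

```python
import os
import sys


def add_repo_root_to_path(script_path):
    """Put the parent of the script's directory (assumed repo root) on sys.path."""
    repo_root = os.path.dirname(os.path.dirname(os.path.abspath(script_path)))
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root


# At the top of a script one would call add_repo_root_to_path(__file__);
# a fixed example path is used here so the snippet runs on its own.
root = add_repo_root_to_path("/tmp/repo/paragraphvec/train.py")
```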
After the model is trained, it should be possible to easily get trained paragraph vectors based on document index (index of the row in the dataset).
pytorch - 0.4.1
CUDA - 9.0
I am having an issue with data.py. On line 119, process.start() yields the following error:
TypeError: 'generator' object is not callable. Could you please help?
Save loss value after each epoch and implement a function for visualization.
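A minimal sketch of what this could look like, using only the standard library: one helper writes the per-epoch losses to a CSV file, and another renders a rough text plot. Both function names are hypothetical, and a real implementation might prefer a plotting library.

```python
import csv


def save_loss_history(losses, path):
    """Write one (epoch, loss) row per epoch to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["epoch", "loss"])
        for epoch, loss in enumerate(losses, start=1):
            writer.writerow([epoch, loss])


def text_plot(losses, width=40):
    """Return one bar per epoch, scaled to the largest loss."""
    top = max(losses) or 1.0  # avoid dividing by zero
    return [
        "epoch %3d | %s %.4f" % (epoch, "#" * int(round(width * loss / top)), loss)
        for epoch, loss in enumerate(losses, start=1)
    ]


lines = text_plot([2.0, 1.0], width=10)
```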
Thanks for making this available. Would you happen to have some sample data and good initial values for the parameters to test this with?
Hi! I run into an error when trying to train a model.
My PyTorch version is 0.2.0_4, and the CUDA version is 8.0.
Can you give me some insights about what's happening? Here is the trace:
(paravec) [zhangsheng@bbox1 paragraphvec]$ python train.py start --data_file_name 'example.csv' --num_epochs 100 --batch_size 32 --context_size 4 --num_noise_words 5 --vec_dim 150 --lr 1e-4
Dataset comprised of 4 documents.
Vocabulary size is 109.
Training started.
Epoch 1 - 20%
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [64,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [65,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [66,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [67,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [68,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [69,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [70,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [71,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [72,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [73,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [74,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [75,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [76,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [77,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [78,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [79,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [80,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [81,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [82,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [83,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [84,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [85,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [86,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [87,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [88,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [89,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [90,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [91,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [92,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [93,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [94,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [95,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [96,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [97,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [98,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [99,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [100,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [101,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [102,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [103,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [104,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [105,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [106,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [107,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [108,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [109,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [110,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [111,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [112,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [113,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [114,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [115,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [116,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [117,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [118,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [119,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [120,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [121,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [122,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [123,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [124,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [125,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [126,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/THCTensorIndex.cu:378: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 2U]: block: [9,0,0], thread: [127,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/generated/../THCReduceAll.cuh line=334 error=59 : device-side assert triggered
Traceback (most recent call last):
File "train.py", line 195, in <module>
fire.Fire()
File "/home/zhangsheng/anaconda2/envs/paravec/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/home/zhangsheng/anaconda2/envs/paravec/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/home/zhangsheng/anaconda2/envs/paravec/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "train.py", line 95, in start
save_all)
File "train.py", line 143, in _run
x = cost_func.forward(x)
File "/export/ssd/sheng/projects/paragraph-vectors/paragraphvec/loss.py", line 26, in forward
+ torch.sum(self._log_sigmoid(-scores[:, 1:]), dim=1) / k
File "/home/zhangsheng/anaconda2/envs/paravec/lib/python3.6/site-packages/torch/autograd/variable.py", line 476, in sum
return Sum.apply(self, dim, keepdim)
File "/home/zhangsheng/anaconda2/envs/paravec/lib/python3.6/site-packages/torch/autograd/_functions/reduce.py", line 16, in forward
return input.new((input.sum(),))
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THC/generated/../THCReduceAll.cuh:334
terminate called without an active exception
Aborted (core dumped)
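The failing assertion `indexAtDim < data.baseSizes[dim]` indicates that an index went past the end of the tensor being indexed. A plausible cause, stated here as an assumption, is a word or document index that is greater than or equal to the number of rows in the corresponding embedding matrix. A small check such as the hypothetical helper below can be run on each batch before the lookup; running the same command on the CPU also tends to give a more readable error.

```python
def find_out_of_range(indices, num_rows):
    """Return (position, index) pairs that fall outside a table with num_rows rows."""
    return [
        (pos, idx) for pos, idx in enumerate(indices)
        if not 0 <= idx < num_rows
    ]


# The log above reports a vocabulary size of 109, so valid word ids are 0..108.
bad = find_out_of_range([0, 108, 109, -1], num_rows=109)
```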
Hi there!
I don't quite understand the idea behind the class NCEData in data.py. The description says "An infinite, parallel (multiprocess) batch generator for noise-contrastive estimation of word vector models". Can I have some more resources on it and some tips on using/implementing it?
Thanks :)
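As a rough illustration of what such a generator produces (a sketch, not the code in data.py): each training example pairs a document id and its context words with the true target word plus a few noise words sampled from a noise distribution. The function names are hypothetical, and the 3/4 power on the unigram counts is an assumption borrowed from the word2vec literature.

```python
import random


def noise_distribution(counts, power=0.75):
    """Unigram counts raised to a power and normalised to sum to 1."""
    weights = {word: count ** power for word, count in counts.items()}
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


def make_example(doc_id, tokens, position, context_size, num_noise_words, noise, rng):
    """Return (doc_id, context words, [target] + sampled noise words).

    Assumes context_size <= position < len(tokens) - context_size.
    """
    context = (tokens[position - context_size:position]
               + tokens[position + 1:position + context_size + 1])
    words, probs = zip(*noise.items())
    sampled = rng.choices(words, weights=probs, k=num_noise_words)
    return doc_id, context, [tokens[position]] + sampled


tokens = "the quick brown fox jumps".split()
noise = noise_distribution({word: 1 for word in tokens})
doc_id, context, targets = make_example(
    0, tokens, position=2, context_size=2, num_noise_words=3,
    noise=noise, rng=random.Random(0))
```

A batch is a stack of such examples; the loss then rewards a high score for the first target column and low scores for the noise columns.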
The model file name ends with the extension .pth.tar, and opening it gives the error "The archive is either in unknown format or damaged".
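The `.pth.tar` suffix is a naming convention seen in PyTorch examples and does not by itself mean the file is a tar archive. Assuming the checkpoint was written with `torch.save` (the issue does not say so explicitly), it would be read back with `torch.load` rather than an archive tool. The sketch below uses hypothetical helper names and shows a file with that suffix failing the tar check.

```python
import os
import tarfile
import tempfile


def is_real_tar(path):
    """True only if the file can actually be opened as a tar archive."""
    return tarfile.is_tarfile(path)


def load_checkpoint(path):
    """Load a checkpoint, assuming it was written with torch.save."""
    import torch  # imported lazily so the tar check works without PyTorch
    return torch.load(path, map_location="cpu")


with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "model.pth.tar")
    with open(fake, "wb") as f:
        f.write(b"\x80\x02 serialized data, not a tar archive")
    looks_like_tar = is_real_tar(fake)
```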
PyTorch: 0.2.0.post3
CUDA: 8.0
Loss suddenly becomes nan (was somewhere between 0.4 and 0.5 in the previous epoch) when training on a dataset comprised of 74219 documents with vocabulary size 91417.
Training on a GPU with the following parameters:
--data_file_name 'arxiv_min.csv' --num_epochs 500 --batch_size 512 --context_size 4 --num_noise_words 10 --vec_dim 300 --lr 0.001 --max_generated_batches 500 --num_workers 8
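The sketch below does not diagnose the cause; it only shows a guard (hypothetical name) that stops training at the first non-finite loss, so the offending epoch and batch can be inspected instead of training on with nan values.

```python
import math


def check_finite(loss_value, epoch, batch):
    """Raise as soon as the loss is nan or infinite; otherwise pass it through."""
    if not math.isfinite(loss_value):
        raise FloatingPointError(
            "loss became %r at epoch %d, batch %d" % (loss_value, epoch, batch))
    return loss_value
```

Once the first failing batch is known, a lower learning rate or gradient clipping are common things to try.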
I followed all the steps you mentioned, but the step showing how to create the vector for a given document seems to be missing.
Basically, I am interested in outputting example_vectors.txt/.csv, in which each line holds the vector for the corresponding document in example.csv.
How should I do that? Can you please add it to your tutorial?
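A minimal sketch of the export step, assuming the trained paragraph matrix has already been extracted as a list of rows in which row i corresponds to document i of the dataset. How to obtain that matrix from a saved model is not shown, and `write_document_vectors` is a hypothetical name.

```python
import csv
import io


def write_document_vectors(paragraph_matrix, out_file):
    """Write one comma-separated vector per line, in document order."""
    writer = csv.writer(out_file)
    for row in paragraph_matrix:
        writer.writerow(["%.6f" % value for value in row])


buffer = io.StringIO()  # stands in for open("example_vectors.csv", "w", newline="")
write_document_vectors([[0.1, -0.2], [0.3, 0.4]], buffer)
```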
The user should be able to continue training from the last saved checkpoint.
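The control flow could look like the sketch below. It uses `pickle` as a stand-in so that it runs without PyTorch; with PyTorch one would typically store the model and optimizer state with `torch.save` and restore them with `torch.load`. All names and the contents of `state` are hypothetical.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    with open(path, "wb") as f:
        pickle.dump(state, f)


def resume_or_start(path):
    """Return the saved state if a checkpoint exists, else a fresh one."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"epoch": 0, "weights": None}


with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "checkpoint.pkl")
    state = resume_or_start(ckpt)  # nothing saved yet, so training starts at epoch 0
    for epoch in range(state["epoch"], 3):
        # a real loop would run one training epoch here
        state = {"epoch": epoch + 1, "weights": [0.1 * (epoch + 1)]}
        save_checkpoint(state, ckpt)
    resumed = resume_or_start(ckpt)  # a later run picks up after the last saved epoch
```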
Implement concatenation of word and paragraph vectors for the Distributed Memory model. Currently, only the sum operation is supported.
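The difference between the two operations, sketched with plain lists and hypothetical function names: summing keeps the combined vector at `vec_dim` entries, while concatenation grows it to `(1 + number of context words) * vec_dim`, so the output layer has to be sized accordingly and the number of context words must be fixed.

```python
def combine_sum(paragraph_vec, word_vecs):
    """Element-wise sum of the paragraph vector and all context word vectors."""
    out = list(paragraph_vec)
    for vec in word_vecs:
        out = [a + b for a, b in zip(out, vec)]
    return out


def combine_concat(paragraph_vec, word_vecs):
    """Paragraph vector followed by each context word vector, in order."""
    out = list(paragraph_vec)
    for vec in word_vecs:
        out.extend(vec)
    return out


paragraph_vec = [1.0, 2.0]
word_vecs = [[0.5, 0.5], [1.0, 0.0]]
summed = combine_sum(paragraph_vec, word_vecs)
concatenated = combine_concat(paragraph_vec, word_vecs)
```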
Python version = 3.7
CUDA = 9.0
When I run "pip install -e .", the installation fails with the following error:
error: Command "gcc -pthread -B /home/jikunkang/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE=1 -D_LARGEFILE64_SOURCE=1 -Inumpy/core/include -Ibuild/src.linux-x86_64-3.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/home/jikunkang/anaconda3/include/python3.7m -Ibuild/src.linux-x86_64-3.7/numpy/core/src/private -Ibuild/src.linux-x86_64-3.7/numpy/core/src/npymath -Ibuild/src.linux-x86_64-3.7/numpy/core/src/private -Ibuild/src.linux-x86_64-3.7/numpy/core/src/npymath -Ibuild/src.linux-x86_64-3.7/numpy/core/src/private -Ibuild/src.linux-x86_64-3.7/numpy/core/src/npymath -c numpy/random/mtrand/mtrand.c -o build/temp.linux-x86_64-3.7/numpy/random/mtrand/mtrand.o -MMD -MF build/temp.linux-x86_64-3.7/numpy/random/mtrand/mtrand.o.d" failed with exit status 1
----------------------------------------
Command "/home/jikunkang/git/paragraph-vectors/env/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-qn2takz9/numpy/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-cf3tjp63/install-record.txt --single-version-externally-managed --compile --install-headers /home/jikunkang/git/paragraph-vectors/env/include/site/python3.7/numpy" failed with error code 1 in /tmp/pip-install-qn2takz9/numpy/