ruidan / unsupervised-aspect-extraction
Code for ACL 2017 paper "An unsupervised neural attention model for aspect extraction"
License: Apache License 2.0
Hi Ruidan, thank you for the really excellent research and code.
I was also lucky enough to hear your presentation at ACL 2017.
At the end of each training epoch, the log prints a line like "ortho_reg: loss - max_margin_loss".
I don't understand what this value means. Could you please explain it, or point me to a reference?
Thank you so much.
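For readers with the same question, here is a minimal NumPy sketch of the penalty that the regularizer line quoted in tracebacks further down this thread appears to compute: the squared difference between W_n·W_nᵀ and the identity, scaled by the ortho_reg weight (0.1 in the argument log). It assumes that w_n is the row-normalized aspect embedding matrix and that the logged value is the total loss minus the max-margin loss; the function name ortho_penalty is hypothetical.

```python
import numpy as np

def ortho_penalty(W, weight=0.1, eps=1e-8):
    """Penalty that grows when rows of W (one row per aspect) point in similar directions."""
    # Assumption: rows are normalized to unit length before the Gram matrix is formed.
    W_n = W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)
    gram = W_n @ W_n.T                      # pairwise cosine similarities between aspects
    return weight * np.sum((gram - np.eye(W.shape[0])) ** 2)

orthogonal = np.eye(3, 5)                   # three mutually orthogonal aspect vectors
redundant = np.ones((3, 5))                 # three identical aspect vectors

print(ortho_penalty(orthogonal))            # close to zero
print(ortho_penalty(redundant))             # clearly larger
```

Under these assumptions, a shrinking value over epochs would mean the aspect embeddings are becoming less redundant.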
I got this output:
Creating vocab ...
108060026 total words, 17156 unique words
keep the top 9000 words
Reading dataset ...
train set
<num> hit rate: 0.61%, <unk> hit rate: 1.35%
test set
<num> hit rate: 0.55%, <unk> hit rate: 1.31%
Traceback (most recent call last):
File "train.py", line 54, in <module>
train_x = sequence.pad_sequences(train_x, maxlen=overall_maxlen)
File "/home/italo/.local/lib/python2.7/site-packages/keras/preprocessing/sequence.py", line 61, in pad_sequences
x = (np.ones((num_samples, maxlen) + sample_shape) * value).astype(dtype)
File "/home/italo/.local/lib/python2.7/site-packages/numpy/core/numeric.py", line 188, in ones
a = empty(shape, dtype, order)
MemoryError
What's happening? Is there any way to increase the memory available to Python 2.7?
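The failing line in the traceback allocates a dense array of shape (num_samples, maxlen) with np.ones, which defaults to 8-byte floats, so the limit is more likely the machine's RAM than a Python setting. A rough back-of-the-envelope sketch follows; the sample count and maximum length are hypothetical placeholders, not values from this log.

```python
def padded_array_bytes(num_samples, maxlen, itemsize=8):
    """Bytes needed for a dense (num_samples, maxlen) array.

    itemsize=8 matches np.ones' default float64 dtype.
    """
    return num_samples * maxlen * itemsize

# Hypothetical corpus: 5 million reviews padded to the longest review (2,000 tokens).
needed = padded_array_bytes(5_000_000, 2_000)
print("%.1f GiB" % (needed / 2.0 ** 30))
```

If the estimate exceeds the available memory, capping the padded length (the argument log elsewhere in this thread shows a maxlen setting) or training on a subset of the data are two things to try.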
I read your paper and code. You apply tanh to the inner product between the word embeddings e_w and the context vector y_s here, although the paper does not mention this.
Could you tell me why tanh is needed?
I tried to implement ABAE in both TensorFlow and Chainer (deep learning frameworks), but in both implementations the attention weights come out uniform rather than sparse, e.g. [0.25, 0.25, 0.25, 0.25] for a sentence of length 4.
Is tanh needed to fix this?
Thanks.
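A small NumPy sketch of the attention step may help when debugging uniform weights. It assumes the scores take the form d_i = e_i · M · y_s followed by a softmax, with M a learned matrix; the function name and the use_tanh switch are illustrative and not taken from the repository. It does not settle whether tanh is required, but it shows two things that can be checked in a reimplementation: equal scores always give uniform weights, and tanh bounds the scores to [-1, 1].

```python
import numpy as np

def attention_weights(E, y, M, use_tanh=False):
    """Softmax attention over the words of one sentence.

    E: (n_words, dim) word embeddings, y: (dim,) sentence context, M: (dim, dim).
    """
    scores = E @ M @ y                      # one score per word
    if use_tanh:
        scores = np.tanh(scores)            # squash scores into [-1, 1]
    scores = scores - scores.max()          # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()

rng = np.random.default_rng(0)
E = rng.normal(size=(4, 8))
y = E.mean(axis=0)

print(attention_weights(E, y, np.zeros((8, 8))))   # all scores equal -> uniform weights
print(attention_weights(E, y, np.eye(8)))          # unequal scores -> non-uniform weights
```

If a reimplementation prints something like the first line, it is worth checking whether M (or the embeddings) are initialized at or near zero, or whether the scores are computed before the weights are updated.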
I solved the previous issue by setting "KERAS_BACKEND=theano" before your example script, but I'm still having problems with Python libraries. Here is my output (sorry for my lack of knowledge, I'm a beginner with Python):
2018-02-21 11:59:49,538 INFO Arguments:
2018-02-21 11:59:49,538 INFO algorithm: adam
2018-02-21 11:59:49,538 INFO aspect_size: 14
2018-02-21 11:59:49,538 INFO batch_size: 50
2018-02-21 11:59:49,539 INFO command: train.py --emb ../preprocessed_data/restaurant/w2v_embedding --domain restaurant -o output_dir
2018-02-21 11:59:49,539 INFO domain: restaurant
2018-02-21 11:59:49,539 INFO emb_dim: 200
2018-02-21 11:59:49,539 INFO emb_path: ../preprocessed_data/restaurant/w2v_embedding
2018-02-21 11:59:49,539 INFO epochs: 15
2018-02-21 11:59:49,539 INFO maxlen: 0
2018-02-21 11:59:49,539 INFO neg_size: 20
2018-02-21 11:59:49,539 INFO ortho_reg: 0.1
2018-02-21 11:59:49,539 INFO out_dir_path: output_dir
2018-02-21 11:59:49,539 INFO seed: 1234
2018-02-21 11:59:49,539 INFO vocab_size: 9000
Using Theano backend.
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29
2018-02-21 11:59:49,873 WARNING The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29
ERROR (theano.sandbox.cuda): nvcc compiler not found on $PATH. Check your nvcc installation and try again.
2018-02-21 11:59:49,879 ERROR nvcc compiler not found on $PATH. Check your nvcc installation and try again.
Reading data from restaurant
Creating vocab ...
2061408 total words, 45041 unique words
keep the top 9000 words
Reading dataset ...
train set
<num> hit rate: 0.84%, <unk> hit rate: 3.73%
test set
<num> hit rate: 0.41%, <unk> hit rate: 4.22%
Number of training examples: 279885
Length of vocab: 9003
2018-02-21 12:00:00,628 INFO Building model
Traceback (most recent call last):
File "train.py", line 108, in <module>
model = create_model(args, overall_maxlen, vocab)
File "/home/italo/Unsupervised-Aspect-Extraction-master/code/model.py", line 51, in create_model
from w2vEmbReader import W2VEmbReader as EmbReader
File "/home/italo/Unsupervised-Aspect-Extraction-master/code/w2vEmbReader.py", line 4, in <module>
import gensim
ImportError: No module named gensim
I'm getting "AttributeError: 'Dimension' object has no attribute 'eval'" when trying to train. Can someone help me solve this issue?
File "train.py", line 107, in <module>
model = create_model(args, overall_maxlen, vocab)
File "/home/src/Unsupervised-Aspect-Extraction/code/model.py", line 43, in create_model
W_regularizer=ortho_reg)(p_t)
File "/home/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 592, in __call__
self.build(input_shapes[0])
File "/home/src/Unsupervised-Aspect-Extraction/code/my_layers.py", line 127, in build
constraint=self.W_constraint)
File "/home/.local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 418, in add_weight
self.add_loss(regularizer(weight))
File "/home/src/Unsupervised-Aspect-Extraction/code/model.py", line 17, in ortho_reg
reg = K.sum(K.square(K.dot(w_n, K.transpose(w_n)) - K.eye(w_n.shape[0].eval())))
AttributeError: 'Dimension' object has no attribute 'eval'
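The .eval() call in the regularizer works on a Theano shape entry, while TensorFlow 1.x returns a Dimension object that has a .value attribute instead. One hedged workaround is to read the static size in a backend-agnostic way; the helper static_dim and the FakeDimension stand-in below are hypothetical and only illustrate the idea without requiring TensorFlow.

```python
class FakeDimension(object):
    """Stand-in for a TF1-style Dimension, used only to make this sketch runnable."""
    def __init__(self, value):
        self.value = value

def static_dim(dim):
    """Return a shape entry as a plain int, whether it is an int or exposes .value."""
    value = getattr(dim, "value", dim)
    if value is None:
        raise ValueError("dimension is not statically known")
    return int(value)

print(static_dim(FakeDimension(14)))   # Dimension-like object
print(static_dim(14))                  # already a plain int
```

In model.py this would correspond to something like K.eye(static_dim(w_n.shape[0])); in Keras 2, K.int_shape(w_n)[0] is another way to get the same integer.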
Hi, first of all thank you for this amazing study. I am trying to apply this model to my own dataset, but my initial results are meaningless. I suspect I am doing something wrong with the dataset, so I would like to ask in which format I should put my data before running preprocess.py. Or should I just use my raw text, without removing stopwords and so on? If you can give me some directions, I'll be grateful. Thank you.
Hi Ruidan,
First of all, really great work. I tried it on a really noisy dataset and some really cohesive aspects popped out.
I find the coherence score a very interesting way to evaluate this unsupervised model, but I did not find the corresponding code in your evaluation (or did I miss it?). How did you compute the coherence score?
Hello, I tried to evaluate the uploaded trained restaurant model by running evaluation.py directly, without making any changes to the code, but I did not get the same results as in the paper. Is it necessary to tune some parameters? Thank you for your answer.
That's the result I got.
--- Results on restaurant domain ---
               precision    recall  f1-score   support
Food               0.654     0.493     0.562       887
Staff              0.406     0.270     0.324       352
Ambience           0.245     0.143     0.181       251
Anecdotes          0.000     0.000     0.000         0
Price              0.000     0.000     0.000         0
Miscellaneous      0.000     0.000     0.000         0
avg / total        0.527     0.381     0.442      1490
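As a reading aid for the table above, the sketch below recomputes the f1-score column from precision and recall, and the last row as a support-weighted average. That the last row is support-weighted is an inference from the numbers, not something stated in this thread; the helper names are hypothetical.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def weighted_average(values, supports):
    """Average of per-class values, weighted by the number of gold examples per class."""
    return sum(v * s for v, s in zip(values, supports)) / float(sum(supports))

# (precision, recall, support) copied from the rows with non-zero support.
rows = {"Food": (0.654, 0.493, 887), "Staff": (0.406, 0.270, 352), "Ambience": (0.245, 0.143, 251)}

supports = [s for _, _, s in rows.values()]
for name, (p, r, _) in rows.items():
    print(name, round(f1(p, r), 3))
print("avg precision", round(weighted_average([p for p, _, _ in rows.values()], supports), 3))
print("avg recall", round(weighted_average([r for _, r, _ in rows.values()], supports), 3))
```

Rows with zero support (Anecdotes, Price, Miscellaneous) contribute nothing to such an average, so the overall numbers here are driven entirely by the three remaining classes.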
I ran run_script.sh on a GPU machine but got the error message below.
It seems the CUDA version is not suitable. Which CUDA version should be used?
Using Theano backend.
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29
2018-05-30 01:28:42,026 WARNING The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29
Using gpu device 0: GRID K520 (CNMeM is disabled, cuDNN not available)
Traceback (most recent call last):
File "train.py", line 50, in <module>
from keras.preprocessing import sequence
File "/home/ubuntu/miniconda3/envs/py27/lib/python2.7/site-packages/keras/__init__.py", line 2, in <module>
from . import backend
File "/home/ubuntu/miniconda3/envs/py27/lib/python2.7/site-packages/keras/backend/__init__.py", line 64, in <module>
from .theano_backend import *
File "/home/ubuntu/miniconda3/envs/py27/lib/python2.7/site-packages/keras/backend/theano_backend.py", line 1, in <module>
import theano
File "/home/ubuntu/miniconda3/envs/py27/lib/python2.7/site-packages/theano/__init__.py", line 116, in <module>
theano.sandbox.cuda.tests.test_driver.test_nvidia_driver1()
File "/home/ubuntu/miniconda3/envs/py27/lib/python2.7/site-packages/theano/sandbox/cuda/tests/test_driver.py", line 41, in test_nvidia_driver1
raise Exception("The nvidia driver version installed with this OS "
Exception: The nvidia driver version installed with this OS does not give good results for reduction.Installing the nvidia driver available on the same download page as the cuda package will fix the problem: http://developer.nvidia.com/cuda-downloads
And if I train on the CPU, the log also shows:
2018-05-30 10:43:44,656 INFO Pattern library is not installed, lemmatization won't be available.
2018-05-30 10:43:44,892 INFO 'pattern' package not found; tag filters are not available for English
What is the Pattern library? I also cannot find where these log messages are written in the Python files; it seems to be in utils.py.
Besides this, I also got errors saying there are no modules named pygpu and h5py.
Which versions of pygpu and h5py should be used?
I read this issue, but I still get the following error.
Traceback (most recent call last):
File "train.py", line 107, in <module>
model = create_model(args, overall_maxlen, vocab)
File "/home/hiroshimatsui/ruidan_abae/code/model.py", line 52, in create_model
emb_reader = EmbReader(args.emb_path, emb_dim=args.emb_dim)
File "/home/hiroshimatsui/ruidan_abae/code/w2vEmbReader.py", line 21, in __init__
model = gensim.models.Word2Vec.load(emb_path)
File "/home/hiroshimatsui/ruidan_abae/2abae/local/lib/python2.7/site-packages/gensim/models/word2vec.py", line 1485, in load
model = super(Word2Vec, cls).load(*args, **kwargs)
File "/home/hiroshimatsui/ruidan_abae/2abae/local/lib/python2.7/site-packages/gensim/utils.py", line 248, in load
obj = unpickle(fname)
File "/home/hiroshimatsui/ruidan_abae/2abae/local/lib/python2.7/site-packages/gensim/utils.py", line 912, in unpickle
return _pickle.loads(f.read())
AttributeError: 'module' object has no attribute 'call_on_class_only'
Which version of gensim should I install?
Thanks.
Hey,
I got the code a while ago and have been trying to run it. After setting everything up, I used to get a NaN loss after the first epoch. I tried many things: changing optimizers, checking for null values in the inputs, and changing learning rates. Things only improved after your latest change that hard-codes batches_per_epoch to 1000, but the loss still becomes NaN from epoch 6 onward. What could be the problem? Also, I could not reproduce the precision and recall values from the paper. The best I could get, after only 5 epochs, is:
This is the restaurant dataset
               precision    recall  f1-score   support
Food               0.786     0.183     0.296       887
Staff              0.527     0.281     0.367       352
Ambience           0.327     0.131     0.188       251
After switching to TensorFlow 1.13.1, I ran into some problems when running the code.
As in the title.
I would like to know which CUDA version you used. I am using version 2.0, and the aspect.log file I get is different from yours. I checked Theano's configuration file, and according to it the CUDA version can lead to different results.
Hi Ruidan,
Thanks for your great work. I have a question about the execution order of word2vec.py and preprocess.py when using other data. As mentioned in the README, word2vec should be run first, but after reading the code I found that preprocess reads the raw text in the dataset folder and writes its output to the preprocessed_data folder, and word2vec then reads the preprocessed data and generates the word embeddings. Is it correct to run preprocess first to clean the data and then run word2vec to generate the embeddings? Looking forward to your reply. Thanks!
Hi, how do I train it on a CPU instead of a GPU?
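One possible approach with the Theano backend, sketched under the assumption that train.py imports Keras in the usual way, is to select the device through environment variables before Keras or Theano is imported. KERAS_BACKEND is the variable mentioned earlier in this thread, and THEANO_FLAGS with device=cpu is Theano's standard device switch.

```python
import os

# These must be set before `import keras` / `import theano` runs anywhere in the process.
os.environ["KERAS_BACKEND"] = "theano"
os.environ["THEANO_FLAGS"] = "device=cpu,floatX=float32"

print(os.environ["THEANO_FLAGS"])
```

The same two variables can instead be exported in the shell that launches the training script.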
Hello Ruidan,
Why are the representative words listed in Table 2 of the paper different from those in the uploaded aspect.log?
Did you compile the representative words in the paper from aspect.log?
Also, I pulled your project and ran it, but my results differ from your aspect.log.
Could you please tell me how you produced the representative words listed in the paper?
Best regards.
I am using gensim 0.12.4 on Python 3 and still get the following error:
Traceback (most recent call last):
File "", line 1, in
model = create_model(ortho_reg, neg_size, emb_dim, aspect_size, emb_path, overall_maxlen, vocab)
File "/home/fractaluser/Projects/workspace/UnsupervisedAspectExtraction/code/model.py", line 63, in create_model
emb_reader = EmbReader(emb_path, emb_dim=emb_dim)
File "/home/fractaluser/Projects/workspace/UnsupervisedAspectExtraction/code/w2v_emb_reader.py", line 23, in __init__
model = gensim.models.Word2Vec.load(emb_path)
File "/home/fractaluser/anaconda3/envs/venv_keras/lib/python3.5/site-packages/gensim/models/word2vec.py", line 1485, in load
model = super(Word2Vec, cls).load(*args, **kwargs)
File "/home/fractaluser/anaconda3/envs/venv_keras/lib/python3.5/site-packages/gensim/utils.py", line 248, in load
obj = unpickle(fname)
File "/home/fractaluser/anaconda3/envs/venv_keras/lib/python3.5/site-packages/gensim/utils.py", line 912, in unpickle
return _pickle.loads(f.read())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc1 in position 0: ordinal not in range(128)
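This message is typical of a pickle written by Python 2 being read by Python 3, whose unpickler decodes old byte strings as ASCII by default. The sketch below reproduces the effect with a tiny hand-written Python 2 style pickle (it is not the embedding file from this issue) and shows the effect of the standard library's encoding argument. Whether the installed gensim version exposes that option is not assumed here; retraining the embeddings under the same Python version used for training is the other obvious route.

```python
import pickle

# Hand-written protocol-2 pickle of the Python 2 byte string '\xc1' (illustrative only).
PY2_PICKLE = b"\x80\x02U\x01\xc1q\x00."

try:
    pickle.loads(PY2_PICKLE)               # default encoding is ASCII
    default_ok = True
except UnicodeDecodeError as exc:
    default_ok = False
    print("default load failed:", exc)

# latin1 maps every byte to a code point, so old byte strings survive the round trip.
decoded = pickle.loads(PY2_PICKLE, encoding="latin1")
print(repr(decoded))
```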
Hello, this is really nice work, and I have just begun transferring the idea to another task. But I don't really understand the code in my_layers -> Average -> call: K.sum(x, axis=-2) / K.sum(mask, axis=-2). I know that this class is a layer that computes y_s by averaging, and I found the source of sum in tensorflow_backend, where axis must lie in [-rank(x), rank(x)), but I don't understand why axis=-2 is used. I'm also confused by the mask. Could you please explain these details? Thank you.
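A NumPy sketch of a masked average may make the axis choice concrete. It assumes x has shape (batch, steps, emb_dim) and that the mask marks real tokens with 1 and padding with 0; the layer's full source is not shown in this thread, so treat the details as an illustration. Negative axes count from the end, so axis=-2 is the steps axis: summing over it collapses the words of each sentence while keeping the embedding dimension.

```python
import numpy as np

def masked_average(x, mask):
    """Average word vectors over the steps axis, ignoring padded positions.

    x: (batch, steps, emb_dim), mask: (batch, steps) with 1 for real tokens.
    """
    mask = mask.astype(x.dtype)[..., np.newaxis]   # (batch, steps, 1) so it broadcasts
    x = x * mask                                    # zero out padded word vectors
    return x.sum(axis=-2) / mask.sum(axis=-2)       # sum over steps / number of real tokens

x = np.array([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]])   # one sentence, last step is padding
mask = np.array([[1, 1, 0]])

print(masked_average(x, mask))   # average of the first two word vectors only
```

Without the mask, the padded vector would be included in the sum and the divisor would be the padded length rather than the true sentence length. A sentence consisting only of padding would divide by zero in this sketch.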
@ruidan
I downloaded this code, and it does not seem to work. The error message:
Please help!
I'm facing this issue. Can anyone help me resolve it?
Traceback (most recent call last):
File "train.py", line 107, in <module>
model = create_model(args, overall_maxlen, vocab)
File "D:\Projects\Epilepsy project\Unsupervised-Aspect-Extraction-master\code\model.py", line 52, in create_model
emb_reader = EmbReader(args.emb_path, emb_dim=args.emb_dim)
File "D:\Projects\Epilepsy project\Unsupervised-Aspect-Extraction-master\code\w2vEmbReader.py", line 21, in __init__
model = gensim.models.Word2Vec.load(emb_path)
File "C:\Users\namei.conda\envs\py27\lib\site-packages\gensim\models\word2vec.py", line 1485, in load
model = super(Word2Vec, cls).load(*args, **kwargs)
File "C:\Users\namei.conda\envs\py27\lib\site-packages\gensim\utils.py", line 248, in load
obj = unpickle(fname)
File "C:\Users\namei.conda\envs\py27\lib\site-packages\gensim\utils.py", line 912, in unpickle
return _pickle.loads(f.read())
AttributeError: 'module' object has no attribute 'call_on_class_only'
Can you please help me with this error about self.name and name? Thank you!
Traceback (most recent call last):
File "/home/manaswini/adv_nlp/Unsupervised-Aspect-Extraction/code/train.py", line 107, in <module>
model = create_model(args, overall_maxlen, vocab)
File "/home/manaswini/adv_nlp/Unsupervised-Aspect-Extraction/code/model.py", line 34, in create_model
att_weights = Attention(name='att_weights')([e_w, y_s])
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 925, in __call__
return self._functional_construction_call(inputs, args, kwargs,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1098, in _functional_construction_call
self._maybe_build(inputs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 2643, in _maybe_build
self.build(input_shapes) # pylint:disable=not-callable
File "/home/manaswini/adv_nlp/Unsupervised-Aspect-Extraction/code/my_layers.py", line 36, in build
self.W = self.add_weight((input_shape[0][-1], input_shape[1][-1]),
TypeError: add_weight() got multiple values for argument 'name'
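The traceback shows the shape being passed as the first positional argument. In Keras 2 and tf.keras, the first parameter of add_weight is name rather than shape, so a Keras 1 style call binds the tuple to name and then collides with the name= keyword. The sketch below reproduces this with a simplified stand-in function (not the real Keras method) and shows the keyword-only call that avoids it; the weight name att_W is hypothetical.

```python
def add_weight(name, shape, initializer=None):
    """Simplified stand-in for a Keras 2 style signature, where `name` comes first."""
    return {"name": name, "shape": shape}

try:
    # Keras 1 style call: the shape tuple lands in the `name` slot.
    add_weight((200, 200), name="att_W")
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Passing everything by keyword works regardless of parameter order.
fixed = add_weight(shape=(200, 200), name="att_W")
print(fixed)
```

Applied to my_layers.py, this suggests changing calls of the form self.add_weight((rows, cols), ...) to self.add_weight(shape=(rows, cols), ...), or installing the older Keras version the code was written for.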
The code runs fine and reproduces the results in Table 6 of the paper, with slight variance. Is there a script to label the individual texts in the test dataset (possibly using predict_label)? If not, how does the evaluation script compute the F-scores?
I am trying to run the code with Keras 2 and get the following error:
Traceback (most recent call last):
File "", line 1, in
model = create_model(ortho_reg, neg_size, emb_dim, aspect_size, emb_path, overall_maxlen, vocab)
File "/home/fractaluser/Projects/workspace/UnsupervisedAspectExtraction/code/model.py", line 43, in create_model
r_s = WeightedAspectEmb(aspect_size, emb_dim, name='aspect_emb', W_regularizer=ortho_reg)(p_t)
File "/home/fractaluser/anaconda3/envs/venv_keras/lib/python3.5/site-packages/keras/engine/base_layer.py", line 433, in __call__
self.build(unpack_singleton(input_shapes))
File "/home/fractaluser/Projects/workspace/UnsupervisedAspectExtraction/code/my_layers.py", line 131, in build
constraint=self.W_constraint)
File "/home/fractaluser/anaconda3/envs/venv_keras/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/fractaluser/anaconda3/envs/venv_keras/lib/python3.5/site-packages/keras/engine/base_layer.py", line 257, in add_weight
self.add_loss(regularizer(weight))
File "/home/fractaluser/Projects/workspace/UnsupervisedAspectExtraction/code/model.py", line 19, in ortho_reg
return ortho_reg*reg
File "/home/fractaluser/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 893, in r_binary_op_wrapper
x = ops.convert_to_tensor(x, dtype=y.dtype.base_dtype, name="x")
File "/home/fractaluser/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1050, in convert_to_tensor
as_ref=False)
File "/home/fractaluser/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/fractaluser/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 229, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/fractaluser/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/home/fractaluser/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 442, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/home/fractaluser/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 353, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected float32, got <function create_model.<locals>.ortho_reg at 0x7fa3ee0d88c8> of type 'function' instead.
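The message says that the ortho_reg being multiplied is the nested function itself rather than a number. That fits a name collision: the traceback shows create_model being called with ortho_reg as a plain argument, and inside create_model a nested function with the same name then rebinds it. The pure-Python sketch below reproduces the pattern with simplified bodies; the renamed parameter ortho_reg_weight is hypothetical.

```python
def create_model_buggy(ortho_reg):
    def ortho_reg(weights):                 # rebinds the name; the float is now unreachable
        reg = sum(w * w for w in weights)
        return ortho_reg * reg              # multiplies the function itself -> TypeError
    return ortho_reg

def create_model_fixed(ortho_reg_weight):
    def ortho_reg(weights):
        reg = sum(w * w for w in weights)
        return ortho_reg_weight * reg       # uses the numeric coefficient
    return ortho_reg

try:
    create_model_buggy(0.1)([1.0, 2.0])
    buggy_error = None
except TypeError as exc:
    buggy_error = str(exc)
print(buggy_error)

print(create_model_fixed(0.1)([1.0, 2.0]))
```

Keeping the coefficient under a different name than the regularizer function (the original code appears to read it from args) avoids the collision.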
de/my_layers.py", line 39, in build
constraint=self.W_constraint)
TypeError: add_weight() got multiple values for argument 'name'
Hello, I noticed there is an <unk> entry in the dictionary and that only the most frequent words are kept. But I don't really understand what happens to new words that appear only in the test data and not in the training set. Are they all mapped to <unk>? Please tell me what I'm missing. I appreciate it.
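A minimal sketch of the usual scheme may help. The logs in this thread show "keep the top 9000 words", a vocabulary length of 9003, and hit rates for <num> and <unk>, which is consistent with three special entries plus the most frequent training words. The third special token (<pad>), the index order, and the digit rule below are assumptions, not taken from the repository.

```python
from collections import Counter

def build_vocab(train_sentences, vocab_size):
    """Keep the vocab_size most frequent training words, plus special tokens."""
    counts = Counter(word for sent in train_sentences for word in sent)
    vocab = {"<pad>": 0, "<unk>": 1, "<num>": 2}    # assumed special tokens
    for word, _ in counts.most_common(vocab_size):
        vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Map words to ids; anything outside the vocabulary becomes <unk>."""
    ids = []
    for word in sentence:
        if word.isdigit():
            ids.append(vocab["<num>"])
        else:
            ids.append(vocab.get(word, vocab["<unk>"]))
    return ids

train = [["good", "food", "good"], ["good", "service"]]
vocab = build_vocab(train, vocab_size=2)

# "sushi" never occurred in training, so it shares the <unk> id at test time.
print(encode(["good", "sushi", "42"], vocab))
```

Under this scheme, test-only words and rare training words are indistinguishable to the model: both are looked up as <unk>.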
Hi @ruidan
I was wondering where you got the beer dataset, since it does not seem to be publicly available any more. It seems to me that the beer dataset you provide already has some stopword removal applied to it. Do you know how this was done, and whether the original datasets can be found anywhere?
Kind regards,
Stéphan
Hi, what version of tensorflow was used?
I have tried TF 0.11.0, 1.4.1 and 1.8.0.
With TF 0.11.0 I get:
File "/notebooks/model.py", line 15, in ortho_reg
reg = K.sum(K.square(K.dot(w_n, K.transpose(w_n)) - K.eye(w_n.shape[0].eval())))
AttributeError: 'Tensor' object has no attribute 'shape'
and with 1.4.1 and above I get:
File "/notebooks/model.py", line 30, in create_model
att_weights = Attention(name='att_weights')([e_w, y_s])
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 572, in __call__
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 635, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 172, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors, mask=input_masks))
File "/notebooks/layers.py", line 57, in call
y = K.repeat_elements(y, self.steps, axis=1)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1525, in repeat_elements
return concatenate(x_rep, axis)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1427, in concatenate
return tf.concat(axis, [to_dense(x) for x in tensors])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1096, in concat
dtype=dtypes.int32).get_shape().assert_is_compatible_with(
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 836, in convert_to_tensor
as_ref=False)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 926, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 229, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 383, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 303, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
The first error suggests to me that you used something later than 0.11.0, since the 'shape' property was added to tensors after that, but 0.12.0 and above reversed the parameters of tf.concat, so I'm not yet sure how Keras 1.2.1 was used.
Any help would be appreciated :)