iliaschalkidis / elmo-keras
Re-implementation of ELMo on Keras
Hi, first of all, thanks for the implementation. My question: when I change token_encoding in the parameters from word to char, I get the error below:
File "/path/to/ELMo-keras/elmo/model.py", line 162, in compile_elmo
tied_to=embeddings if self.parameters['weight_tying'] else None)
UnboundLocalError: local variable 'embeddings' referenced before assignment
Doesn't your code support character embeddings yet?
for i in range(self.parameters['n_lstm_layers']):
    if self.parameters['cuDNN']:
        lstm = CuDNNLSTM(units=self.parameters['lstm_units_size'], return_sequences=True,
                         kernel_constraint=MinMaxNorm(-1*self.parameters['cell_clip'],
                                                      self.parameters['cell_clip']),
                         recurrent_constraint=MinMaxNorm(-1*self.parameters['cell_clip'],
                                                         self.parameters['cell_clip']))(lstm_inputs)
There seems to be a problem here: the model always ends up with one LSTM layer.
Can you provide the Keras and TensorFlow versions used in this repo?
My versions are, respectively:
keras='2.4.3'
tensorflow='2.3.1'
I am running into some problems.
I think it would be helpful to rewrite the evaluate and perplexity functions. They are really slow and memory-consuming at present.
Hi,
I am trying to load weights from the elmo_best_weights.hdf5 file and I am getting this error:
Traceback (most recent call last):
File ".../ELMo-keras/train_demo.py", line 93, in <module>
elmo_model.evaluate(test_generator)
File "...\ELMo-keras\elmo\model.py", line 225, in evaluate
print('Forward Langauge Model Perplexity: {}'.format(ELMo.perplexity(y_pred_forward, y_true_forward)))
File "...\ELMo-keras\elmo\model.py", line 297, in perplexity
y_true_seq = to_categorical(y_true_seq, y_pred_seq.shape[-1])
File "...\ELMo-keras\.venv\lib\site-packages\keras\utils\np_utils.py", line 34, in to_categorical
categorical[np.arange(n), y] = 1
IndexError: index 22246 is out of bounds for axis 1 with size 200
In model.py I added a load_weights method:
def load_weights(self):
    self._model.load_weights(os.path.join(MODELS_DIR, 'elmo_best_weights.hdf5'))
And my train_demo.py looks like this:
parameters = # ... unchanged
# Set-up Generators
# ... unchanged
# Compile ELMo
elmo_model = ELMo(parameters)
elmo_model.compile_elmo(print_summary=True)
elmo_model.load_weights()
# Evaluate Bidirectional Language Model
elmo_model.evaluate(test_generator)
# Build ELMo meta-model to deploy for production and persist in disk
elmo_model.wrap_multi_elmo_encoder(print_summary=True, save=True)
Any idea how to get this to work, or where the problem might be?
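The IndexError in the traceback above says that a label index (22246) was used as a column index into an array with only 200 columns, which is the num_classes value passed to to_categorical. The following is a minimal NumPy sketch that reproduces the same kind of failure and adds a check that could be run before one-hot encoding. The helper names one_hot and check_labels_fit are hypothetical and are not part of the repository or of Keras.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Mimic the indexing step shown in the traceback (hypothetical helper)."""
    labels = np.asarray(labels, dtype=int).ravel()
    categorical = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Raises IndexError when any label is >= num_classes.
    categorical[np.arange(labels.shape[0]), labels] = 1
    return categorical


def check_labels_fit(labels, num_classes):
    """Return True when every label is a valid column index (hypothetical helper)."""
    labels = np.asarray(labels)
    if labels.size == 0:
        return True
    return bool(labels.min() >= 0 and labels.max() < num_classes)


# Labels drawn from a large vocabulary do not fit a 200-wide prediction axis.
print(check_labels_fit([3, 22246], 200))
```

If the check fails, the last axis of the predictions is probably not the vocabulary axis, so comparing y_pred_seq.shape[-1] with the vocabulary size may be a reasonable first step.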
Hi,
I am having some problems evaluating a trained model. I think the perplexity algorithm is not memory efficient. What hardware did you use?
I first tried to evaluate on a PC with 32GB RAM and an NVidia 1080TI 11GB.
In a second test I tried to evaluate the trained model on a compute server with 64GB RAM and 3x NVidia 1080TI 11GB.
In both cases, I get a MemoryError at the evaluation step.
Any idea how to evaluate the trained model and compute its perplexity?
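One way to reduce memory use when computing perplexity is to avoid one-hot encoding the targets and to accumulate the negative log-likelihood batch by batch. The sketch below is not the repository's implementation. It assumes that each batch provides full-vocabulary probabilities of shape (batch, seq_len, vocab) and integer token ids of shape (batch, seq_len), and that id 0 marks padding. The function name streaming_perplexity and its parameters are hypothetical.

```python
import numpy as np


def streaming_perplexity(batches, pad_id=0, eps=1e-10):
    """Perplexity over an iterable of (y_pred, y_true) batches.

    Assumptions (not taken from the repository):
      y_pred: probabilities, shape (batch, seq_len, vocab)
      y_true: integer token ids, shape (batch, seq_len)
      pad_id: id used for padding positions, which are ignored
    """
    total_nll = 0.0
    total_tokens = 0
    for y_pred, y_true in batches:
        y_pred = np.asarray(y_pred)
        y_true = np.asarray(y_true, dtype=np.int64)
        # Gather only the probability assigned to the true token,
        # instead of building a (tokens, vocab) one-hot matrix.
        true_probs = np.take_along_axis(y_pred, y_true[..., None], axis=-1)[..., 0]
        mask = y_true != pad_id
        total_nll += float(-np.log(true_probs[mask] + eps).sum())
        total_tokens += int(mask.sum())
    if total_tokens == 0:
        raise ValueError("no non-padding tokens to evaluate")
    return float(np.exp(total_nll / total_tokens))


# Usage with a toy uniform model over 4 tokens.
toy_pred = np.full((2, 3, 4), 0.25)
toy_true = np.array([[1, 2, 3], [1, 0, 0]])
print(streaming_perplexity([(toy_pred, toy_true)]))
```

Only two running scalars are kept between batches, so peak memory is bounded by the size of a single batch of predictions. If the model is trained with a sampled softmax, its outputs may not be full-vocabulary probabilities, in which case this sketch does not apply directly.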
Great work!
I tried to run train_demo.py but got the error below:
"ValueError: The two structures don't have the same sequence length. Input structure has length 0, while shallow structure has length 2."
I used tensorflow==2.3.1 as suggested in the requirement.txt. Do you have any ideas or suggestions regarding this error? Thanks in advance for your help!
According to the documentation of the state parameter, 'last' returns the 2nd LSTM's outputs,
but in your function the code is:
if state == 'last':
    elmo_vectors = preds[0]
However, I think that to return the 2nd LSTM's outputs, the code should be:
if state == 'last':
    elmo_vectors = preds[-1]
I would be grateful if you could clear up my doubts.
Hello, when I ran test_generator.py, I found a bug in get_token_char_indices() in lm_generator.py.
At line 132, should 'if token_ids[1]' be changed to 'if token_ids[1].any()'?
First of all, my English is poor.
If any part of my message comes across as impolite, please forgive me.
This concerns lines 128~134 in ELMo-keras/elmo/model.py.
The re_lstm layer takes lstm_inputs as its input, but at that point lstm_inputs is the forward LSTM.
Isn't that odd? I think you intended to use the lstm_inputs from line 90, so I think the variable should be renamed.
Anyway, I am also implementing ELMo right now, and your code is very helpful. Thank you so much.
Hi,
Great work! I am wondering how to use the weight files after training. For example, how could I turn a sentence into a vector?
Best
Hello, when I use def load(self) in model.py, I get this error:
File "C:\Users\Anaconda3\lib\site-packages\keras\engine\saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
ValueError: Unknown layer: SampledSoftmax
So I added 'SampledSoftmax': SampledSoftmax to custom_objects, and then I got the following error:
File "ELMo-keras/train_demo.py", line 131, in test_eval
elmo_model.load()
File "ELMo-keras\elmo\model.py", line 299, in load
'SampledSoftmax': SampledSoftmax})
File "C:\Users\Anaconda3\lib\site-packages\keras\engine\saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
File "C:\Users\Anaconda3\lib\site-packages\keras\engine\saving.py", line 284, in _deserialize_model
str(len(weight_values)) +
TypeError: must be str, not bytes
ELMo-keras/elmo/lm_generator.py
Line 132 in 010bab9
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
File "/ELMo-keras/elmo/lm_generator.py", line 132, in get_token_char_indices
if token_ids[1]:
File "/ELMo-keras/elmo/lm_generator.py", line 51, in __getitem__
word_char_indices_batch[i] = self.get_token_char_indices(sent_id=batch_id)
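The message above is raised by NumPy whenever an array with more than one element is used directly as a condition. A minimal sketch of the behavior follows. The contents of token_ids here are a hypothetical stand-in and are not taken from lm_generator.py. Whether .any() or .all() is the right replacement depends on what the original check was meant to test.

```python
import numpy as np

# Hypothetical stand-in: a pair of arrays, the second with several elements.
token_ids = (np.array([5, 9, 2]), np.array([1, 0, 3]))

# Using a multi-element array as a condition raises ValueError.
try:
    if token_ids[1]:
        pass
    ambiguous = False
except ValueError:
    ambiguous = True

# Explicit reductions state the intent and return a single boolean.
has_nonzero = bool(token_ids[1].any())  # True if at least one element is non-zero
all_nonzero = bool(token_ids[1].all())  # True only if every element is non-zero

print(ambiguous, has_nonzero, all_nonzero)
```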
Hi, I tested the ELMo code and encountered build errors.
Could you post the requirements, specifically the TensorFlow and Keras versions?