
video-description-with-spatial-temporal-attention's People

Contributors

tuyunbin


video-description-with-spatial-temporal-attention's Issues

Last program gave error

This error occurred when I ran train_model.py.
I have tried many times; please help me resolve it.

args must be like a=X b.c=X
('current save dir ', './exp/test_non/')
./exp/test_non/ already exists!
preparing reload
('save dir ', './exp/test_non/')
('from_dir ', 'C:/Users/KARAN AGGARWAL/Downloads/workplace2/Video-Description-with-Spatial-Temporal-Attention-master/pre-trained_model/pre-trained_model/')
setting current model config with the old one
('erasing everything in ', './exp/test_non/')
saving model config into ./exp/test_non/model_config.pkl
Model Type: attention
Host: karan
Command: C:\Users\KARAN AGGARWAL\Anaconda3\lib\site-packages\ipykernel_launcher.py -f C:\Users\KARAN AGGARWAL\AppData\Roaming\jupyter\runtime\kernel-c84b856d-7eba-49a8-af4f-6bbb14f0c953.json
training an attention model
Loading data
loading youtube2text googlenet features
C:\Users\KARAN AGGARWAL\Downloads\workplace2\Video-Description-with-Spatial-Temporal-Attention-master\model_attention.py:37: UserWarning: Feeding context to output directly seems to hurt.
warnings.warn('Feeding context to output directly seems to hurt.')
uneven minibath chunking, overall 32, last one 12
uneven minibath chunking, overall 200, last one 91
uneven minibath chunking, overall 200, last one 168
init params
no lstm on ctx
Reloading model params...
buliding sampler
Building f_init... Done
building f_next...
Done
building f_log_probs
building f_alphal
building f_alphag
building f_alpham
building f_alphalt
compute grad
build train fns
compilation took 250.9914 sec
Optimization
loading history error...
Epoch 0
C:\Users\KARAN AGGARWAL\Anaconda3\lib\site-packages\theano\tensor\subtensor.py:2197: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.
rval = inputs[0].__getitem__(inputs[1:])
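Note: the FutureWarning above is raised inside Theano's own subtensor.py when it is used with a newer NumPy; it only warns and does not stop training, and it is not something to fix in this repository's code. For reference, the NumPy deprecation it refers to is the one sketched below (the array and index are made up for the example):

import numpy as np

arr = np.arange(12).reshape(3, 4)
seq = [slice(0, 2), [1, 3]]      # a non-tuple sequence describing a multidimensional index

# Deprecated form (what subtensor.py still does, triggering the FutureWarning):
# sub = arr[seq]

# Preferred form suggested by the warning text:
sub = arr[tuple(seq)]
print(sub)                       # rows 0-1, columns 1 and 3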

test

Hello! I would like to ask how this project takes a video file (such as an .mp4 file) as input and produces a video description file (such as a .txt file) as output. Could you write up a walkthrough of the whole process?

How to reproduce your Motion Features?

Hi, I would like to reproduce your motion features.
In your paper, you used C3D to extract the motion features. Can you provide a link to the C3D model/code you used?

Hi, I have some data that I don't know how to use, can you help me?

Hi. Your work is very good; thank you very much for releasing your code.
I have a few questions:

  1. I downloaded the links you provided (the pre-processed datasets, the global features, and the motion features). Where should I put these files, and how do I use them?
  2. How can I get the MSVD and MSR-VTT datasets? Can you provide the corresponding download links?

I look forward to your reply, and thank you very much for your help.

file

def _filter_googlenet(self, vidID):
    # Load the pre-extracted GoogLeNet (global) features for one video
    # from a per-video HDF5 file (requires h5py); the dataset is keyed by the video ID.
    f = h5py.File('/home/sdc/tuyunbin/msvd/msvd_google/msvd_%s_google.h5' % vidID, 'r')
    global_feat = f[vidID][:]
    f.close()
    # Sub-sample/pad the per-frame features to a fixed number of frames.
    global_feat = self.get_sub_frames(global_feat)
    return global_feat

Where are these .h5 files? The path above is hard-coded to the author's machine.
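For reference, the loader above expects one HDF5 file per video, named msvd_<vidID>_google.h5, containing a single dataset keyed by the video ID that holds the per-frame global features. As a rough sketch of that layout (the feature shape, dtype, output directory, and the helper name save_google_features are assumptions, not taken from the repository), a compatible file could be written like this:

import h5py
import numpy as np

def save_google_features(vid_id, features, out_dir='.'):
    # Write per-frame global features for one video in the layout that
    # _filter_googlenet() reads: one .h5 file per video, with a dataset
    # named after the video ID.
    path = '%s/msvd_%s_google.h5' % (out_dir, vid_id)
    with h5py.File(path, 'w') as f:
        f.create_dataset(vid_id, data=np.asarray(features, dtype=np.float32))

# Hypothetical usage with random features for video "vid1":
save_google_features('vid1', np.random.rand(26, 1024).astype('float32'))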

Theano error

Hello,

I got another error when I ran train_model.py; the following traceback was produced.
Please help me resolve it.

args must be like a=X b.c=X
('current save dir ', './exp/test_non/')
./exp/test_non/ already exists!
preparing reload
('save dir ', './exp/test_non/')
('from_dir ', 'C:/Users/KARAN AGGARWAL/Downloads/workplace2/Video-Description-with-Spatial-Temporal-Attention-master/pre-trained_model/pre-trained_model/')
setting current model config with the old one
('erasing everything in ', './exp/test_non/')
saving model config into ./exp/test_non/model_config.pkl
Model Type: attention
Host: karan
Command: C:\Users\KARAN AGGARWAL\Anaconda3\lib\site-packages\ipykernel_launcher.py -f C:\Users\KARAN AGGARWAL\AppData\Roaming\jupyter\runtime\kernel-d1b6b355-6bab-4566-aa86-fc28aff0a72c.json
training an attention model
Loading data
loading youtube2text googlenet features
C:\Users\KARAN AGGARWAL\Downloads\workplace2\Video-Description-with-Spatial-Temporal-Attention-master\model_attention.py:37: UserWarning: Feeding context to output directly seems to hurt.
warnings.warn('Feeding context to output directly seems to hurt.')
uneven minibath chunking, overall 32, last one 12
uneven minibath chunking, overall 200, last one 91
uneven minibath chunking, overall 200, last one 168
init params
no lstm on ctx
Reloading model params...
buliding sampler
Building f_init... Done
building f_next...
Done
building f_log_probs
building f_alphal
building f_alphag
building f_alpham
building f_alphalt
compute grad
build train fns

TypeError Traceback (most recent call last)
~\Anaconda3\lib\site-packages\theano\compile\pfunc.py in rebuild_collect_shared(outputs, inputs, replace, updates, rebuild_strict, copy_inputs_over, no_default_updates)
192 update_val = store_into.type.filter_variable(update_val,
--> 193 allow_convert=False)
194 except TypeError:

~\Anaconda3\lib\site-packages\theano\tensor\type.py in filter_variable(self, other, allow_convert)
233 other=other,
--> 234 self=self))
235

TypeError: Cannot convert Type TensorType(float64, matrix) (of Variable Elemwise{add,no_inplace}.0) into Type TensorType(float32, matrix). You can try to manually convert Elemwise{add,no_inplace}.0 into a TensorType(float32, matrix).

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)
in ()
102
103 state = expand(args)
--> 104 sys.exit(main(state))

in main(state, channel)
88 def main(state, channel=None):
89 set_config(config, state)
---> 90 train_from_scratch(config, state, channel)
91
92

in train_from_scratch(config, state, channel)
81 print(('Command: %s' % ' '.join(sys.argv)))
82 if config.model == 'attention':
---> 83 model_attention.train_from_scratch(state, channel)
84 else:
85 raise NotImplementedError()

~\Downloads\workplace2\Video-Description-with-Spatial-Temporal-Attention-master\model_attention.py in train_from_scratch(state, channel)
1560 print('training an attention model')
1561 model = Attention(channel)
-> 1562 model.train(**state.attention)
1563 print('training time in total %.4f sec'%(time.time()-t0))
1564

~\Downloads\workplace2\Video-Description-with-Spatial-Temporal-Attention-master\model_attention.py in train(self, random_seed, dim_word, ctxglm_dim, ctxg_dim, ctxl_dim, ctxm_dim, dim, n_layers_out, n_layers_init, encoder, encoder_dim, prev2out, ctx2out, patience, max_epochs, dispFreq, decay_c, alpha_c, alpha_entropy_r, lrate, selector, n_words, maxlen, optimizer, clip_c, batch_size, valid_batch_size, save_model_dir, validFreq, saveFreq, sampleFreq, metric, dataset, video_feature, use_dropout, reload_, from_dir, K, OutOf, verbose, debug)
1207 f_grad_shared, f_update = eval(optimizer)(lr, tparams, grads,
1208 [x, mask, ctxg, mask_ctxg,ctxl, mask_ctxl,ctxm,mask_ctxm], cost,
-> 1209 extra + grads)
1210
1211 print('compilation took %.4f sec'%(time.time()-t0))

~\Downloads\workplace2\Video-Description-with-Spatial-Temporal-Attention-master\common.py in adadelta(lr, tparams, grads, inp, cost, extra)
185
186 f_grad_shared = theano.function(inp, [cost] + extra, updates=zgup+rg2up,
--> 187 profile=False, on_unused_input='ignore')
188
189 updir = [-tensor.sqrt(ru2 + 1e-6) / tensor.sqrt(rg2 + 1e-6) * zg for zg, ru2, rg2 in zip(zipped_grads, running_up2, running_grads2)]

~\Anaconda3\lib\site-packages\theano\compile\function.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
315 on_unused_input=on_unused_input,
316 profile=profile,
--> 317 output_keys=output_keys)
318 return fn

~\Anaconda3\lib\site-packages\theano\compile\pfunc.py in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
447 rebuild_strict=rebuild_strict,
448 copy_inputs_over=True,
--> 449 no_default_updates=no_default_updates)
450 # extracting the arguments
451 input_variables, cloned_extended_outputs, other_stuff = output_vars

~\Anaconda3\lib\site-packages\theano\compile\pfunc.py in rebuild_collect_shared(outputs, inputs, replace, updates, rebuild_strict, copy_inputs_over, no_default_updates)
206 ' function to remove broadcastable dimensions.')
207
--> 208 raise TypeError(err_msg, err_sug)
209 assert update_val.type == store_into.type
210

TypeError: ('An update must have the same type as the original shared variable (shared_var=Wemb_rgrad2, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')
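This TypeError means the adadelta update computed for the shared variable Wemb_rgrad2 is float64 while the shared variable itself is float32, which typically happens when Theano's floatX is left at its float64 default or a float64 constant leaks into the graph. A common general workaround (not specific to this repository) is to force single precision before Theano is imported, for example:

import os
# Force single precision before theano is imported anywhere.
# The same effect can be achieved by putting
#   [global]
#   floatX = float32
# in ~/.theanorc (or .theanorc.txt on Windows).
os.environ['THEANO_FLAGS'] = 'floatX=float32'

import numpy as np
import theano

print(theano.config.floatX)   # expected: 'float32'

# When building constants by hand (e.g. the learning rate), cast them
# explicitly so the resulting updates stay in float32:
lrate = np.float32(0.0001)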

run model_attention.py

D:\Anaconda2\envs\Theano\python.exe D:/PythonCode/Video-Description-with-Spatial-Temporal-Attention-master/train_model.py
Traceback (most recent call last):
File "", line 1, in
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\forking.py", line 380, in main
prepare(preparation_data)
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\forking.py", line 510, in prepare
'parents_main', file, path_name, etc
File "D:\PythonCode\Video-Description-with-Spatial-Temporal-Attention-master\train_model.py", line 8, in
import model_attention
File "D:\PythonCode\Video-Description-with-Spatial-Temporal-Attention-master\model_attention.py", line 21, in
import metrics
File "D:\PythonCode\Video-Description-with-Spatial-Temporal-Attention-master\metrics.py", line 74, in
manager = Manager()
File "D:\Anaconda2\envs\Theano\lib\multiprocessing_init_.py", line 99, in Manager
m.start()
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\managers.py", line 524, in start
self._process.start()
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\forking.py", line 258, in init
cmd = get_command_line() + [rhandle]
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\forking.py", line 358, in get_command_line
is not going to be frozen to produce a Windows executable.''')
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.

        This probably means that you are on Windows and you have
        forgotten to use the proper idiom in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce a Windows executable.

Traceback (most recent call last):
File "D:/PythonCode/Video-Description-with-Spatial-Temporal-Attention-master/train_model.py", line 8, in <module>
import model_attention
File "D:\PythonCode\Video-Description-with-Spatial-Temporal-Attention-master\model_attention.py", line 21, in <module>
import metrics
File "D:\PythonCode\Video-Description-with-Spatial-Temporal-Attention-master\metrics.py", line 74, in <module>
manager = Manager()
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\__init__.py", line 99, in Manager
m.start()
File "D:\Anaconda2\envs\Theano\lib\multiprocessing\managers.py", line 528, in start
self._address = reader.recv()
EOFError

Process finished with exit code 1
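This is the standard Windows multiprocessing bootstrap problem that the RuntimeError describes: metrics.py creates a multiprocessing.Manager() at import time (line 74), so when the manager's child process re-imports the main script it spawns another manager before bootstrapping has finished. A minimal sketch of the idiom the error message asks for, plus a lazily created manager, is below; the names main() and get_manager() are illustrative, not the repository's actual API:

from multiprocessing import Manager, freeze_support

# In metrics.py: create the Manager lazily instead of at import time, so that
# re-importing the module during child-process bootstrap does not spawn new processes.
_manager = None

def get_manager():
    global _manager
    if _manager is None:
        _manager = Manager()
    return _manager

# In train_model.py: guard all executable code behind the main-module check.
def main():
    # Import model_attention and start training from here, e.g.:
    # import model_attention
    # model_attention.train_from_scratch(state, channel)
    pass

if __name__ == '__main__':
    freeze_support()   # only required when freezing into a Windows executable
    main()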
