zhengshou / autoloc
AutoLoc: Weakly-supervised Temporal Action Localization in Untrimmed Videos. ECCV'18.
License: MIT License
Hi, many thanks for your work! May I ask what format the video width has when it is used to clip [x1, x2]? I thought the video width was a single number giving the number of snippets, but it seems to be an array in the code.
Dear author:
Really great work!
Could you share the final caffemodel for both datasets so that we can generate the detection results without training the model ourselves?
That would be really nice of you!
Dear author:
When I tried to compile your BVLC_Caffe, the build failed with an error saying that "caffe/layers/log_layer.hpp" was missing. I downloaded Caffe from the BVLC GitHub repository and overwrote it with your version, after which Caffe compiled successfully.
I'm not sure whether this is a bug or whether I did something wrong.
Thanks a lot!
Hi,
Thanks for the nice work. I have thought about the following question a lot but can't figure out the answer. Could I take a few minutes of your time?
My impression is that the localization branch and the classification branch share no parameters, so the classification branch should be trained first and the localization branch afterwards, right? That is to say, the OIC layer (or OIC selection) serves as the supervision for the localization branch. If all of that is correct, I don't understand why the AutoLoc result is better than its supervision (OIC selection). How can it outperform the supervision if no parameters are shared?
This confuses me a lot, and I could not find the answer in the paper. I would really appreciate your help. I hope to hear back from you soon!
Thanks!
June
Hello,
Is there a scheduled release date for the code?
Thank you,
Which temporal annotation tool did you use? Thank you.
Hi, I tried to download the features from OneDrive. However, I got this message:
This item might not exist or is no longer available
This item might have been deleted, expired, or you might not have permission to view it. Contact the owner of this item for more information
Could you check and re-upload the features, please?
Thank you.
Dear author:
I got a shape mismatch error when loading the pretrained model from your autoloc_iter_200.caffemodel.
The error is:
F1227 23:08:33.047071 149335 net.cpp:759] Cannot copy param 0 weights from layer 'bbox_pred'; shape mismatch. Source param shape is 26 128 1 1 (3328); target param shape is 12 128 1 1 (1536). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
The num_output of layer 'bbox_pred' is calculated as 2 * len(cfg.TRAIN.ANCHOR_SCALES) and should therefore give a shape of 12*128*1*1, but the shape stored in the caffemodel is 26*128*1*1. Maybe you trained the model with 13 anchors rather than the 6 anchors given in the config.
How can I fix this problem?
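The arithmetic behind this report can be checked with a short sketch. It only restates the formula quoted in the issue (two outputs per anchor) and the two shapes from the error message; the function names and the scale values in config_scales are hypothetical placeholders, since the actual contents of cfg.TRAIN.ANCHOR_SCALES are not shown here.

```python
def bbox_pred_num_output(anchor_scales):
    """Two regression outputs per anchor, per the formula quoted in the issue."""
    return 2 * len(anchor_scales)


def anchors_in_checkpoint(param_shape):
    """Infer the anchor count from the first dimension of a bbox_pred weight shape."""
    num_output = param_shape[0]
    if num_output % 2 != 0:
        raise ValueError("num_output must be even (two values per anchor)")
    return num_output // 2


# Hypothetical placeholder values; only the number of scales (6) matters here.
config_scales = [1, 2, 4, 8, 16, 32]

# Shapes taken from the error message above.
source_shape = (26, 128, 1, 1)  # stored in the caffemodel
target_shape = (bbox_pred_num_output(config_scales), 128, 1, 1)

print("target shape from config:", target_shape)
print("anchors implied by checkpoint:", anchors_in_checkpoint(source_shape))
```

If the checkpoint really was trained with 13 anchors, the config would need 13 scales for the shapes to agree; which scales those are is something only the author can confirm.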
Dear author:
I read the code of the oic_loss layer. What is the purpose of _frame_prior when you select the most likely anchor per temporal position?
Why don't you use the original score instead of calculating ref_scores with the following code?
len_frame = inner_frames[1, :, x] - inner_frames[0, :, x]
ref_scores = cur_scores * norm.pdf(len_frame, *cur_prior)
a = np.argmax(ref_scores)
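To make the question concrete, here is a minimal pure-Python sketch of what the quoted snippet appears to compute: each anchor's score is multiplied by a Gaussian density evaluated at the anchor's inner-segment length, and the anchor with the largest product wins. It assumes that cur_prior holds a (mean, standard deviation) pair, which is how the positional arguments of scipy.stats.norm.pdf are interpreted; the helper names and the toy numbers are made up for illustration, and the author's actual motivation is not stated in this thread.

```python
import math


def gaussian_pdf(x, mean, std):
    """Normal density; stands in for scipy.stats.norm.pdf(x, mean, std)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def select_anchor(scores, lengths, prior):
    """Return the index of the anchor maximizing score * prior(length)."""
    mean, std = prior  # assumed layout of cur_prior
    ref_scores = [s * gaussian_pdf(l, mean, std) for s, l in zip(scores, lengths)]
    return max(range(len(ref_scores)), key=ref_scores.__getitem__)


# Toy values, not taken from the repository.
scores = [0.9, 0.8, 0.85]    # raw per-anchor scores at one temporal position
lengths = [2.0, 10.0, 40.0]  # inner-segment length of each anchor
prior = (10.0, 5.0)          # hypothetical (mean, std) of the length prior

raw_choice = max(range(len(scores)), key=scores.__getitem__)
prior_choice = select_anchor(scores, lengths, prior)
print("argmax of raw scores:", raw_choice)
print("argmax with length prior:", prior_choice)
```

In this toy case the raw scores pick the shortest anchor, while the weighted scores pick the anchor whose length is closest to the prior mean, so the two selection rules can disagree.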
Dear author:
Why are features provided for only 2304 validation videos, while your paper states that the validation set contains 2383 videos?
Looking forward to your answer! Thanks a lot!