Comments (9)
Hi takuyara, we extracted our features with I3D, InceptionResNetV2, and BUTD.
from rmn.
Thanks for your quick reply!
from rmn.
Hi tgc,
I studied the I3D code you provided above. I am wondering how to set max_interval and overlap to obtain 26 equally spaced features for each video. Or is there no need to set these two parameters, and can we just extract around 209 frames as the input to the I3D model?
from rmn.
We first set max_interval=64, overlap=8 to extract features and then sample 26 of them.
from rmn.
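The scheme described above (overlapping clips of max_interval frames, then 26 features sampled uniformly) can be sketched roughly as follows. This is my own illustrative reading, not the authors' actual script; the per-clip I3D call is replaced by a dummy function:

```python
import numpy as np

def split_into_clips(frames, max_interval=64, overlap=8):
    """Slide a window of max_interval frames; consecutive windows share
    `overlap` frames, so the stride is max_interval - overlap."""
    stride = max_interval - overlap
    return [frames[s:s + max_interval]
            for s in range(0, max(len(frames) - max_interval, 0) + 1, stride)]

def sample_uniform(features, k=26):
    """Pick k equally spaced items from the per-clip feature list."""
    idx = np.linspace(0, len(features) - 1, k).round().astype(int)
    return [features[i] for i in idx]

# Toy example: a "video" of 2000 dummy frames.
frames = list(range(2000))
clips = split_into_clips(frames)      # overlapping 64-frame clips
feats = [sum(c) for c in clips]       # stand-in for one I3D feature per clip
sampled = sample_uniform(feats)       # 26 equally spaced clip features
```

The exact boundary handling (what happens to a partial last window) is an assumption here; the real extraction code may pad or drop the tail differently.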
Hi tgc,
Thanks for your reply! Sorry to bother you again, but I still have two questions about this:
- Why do you need to set max_interval and overlap at all? If you just input 209 frames as the 'clip', you get exactly a 1x26x1024 feature from 'features = get_features(clip, i3d_rgb)'. So are these two parameters used to save computation cost? If not, how are they determined specifically?
- For each video, the 2D features (IRV2, 1x26x1536) are extracted from 26 frames, while the I3D features are extracted from 26 clips, which do not coincide with those 26 frames. Is it sound to concatenate the two feature tensors along this dimension (26)? For example, a 1x26x2560 tensor cannot be interpreted as "one video has 26 frames, and each frame has 2560 features".
from rmn.
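Mechanically, the concatenation asked about joins the two streams along the feature (last) axis, not along the 26 time steps; whether the 26 frames and 26 clips cover the same time spans is exactly the alignment caveat raised in this thread. A quick numpy sketch with random stand-in tensors:

```python
import numpy as np

feat_2d = np.random.rand(1, 26, 1536)   # IRV2 features from 26 sampled frames
feat_3d = np.random.rand(1, 26, 1024)   # I3D features from 26 sampled clips
fused = np.concatenate([feat_2d, feat_3d], axis=-1)
print(fused.shape)  # (1, 26, 2560)
```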
- Some videos have fewer than 209 frames, while for long videos with many more than 209 frames, taking only 209 may miss important information. The two parameters also help to save computation cost.
- It is difficult to perfectly align the 2D and 3D features; if you know how to do it, you are welcome to comment.
from rmn.
Thanks for your explanations!
from rmn.
Sorry to bother you again! I am still confused about the steps for determining 'max_interval' and 'overlap'. Could you please give an example, say with two videos, one 10 minutes long at 25 FPS and the other 8 minutes long at 30 FPS? Many thanks!
from rmn.
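For the two example videos in the question, and assuming the sliding-window reading of max_interval/overlap (stride = max_interval - overlap, counting only full windows), the clip counts work out as below. This arithmetic is my own illustration, not a confirmed answer from the authors:

```python
max_interval, overlap = 64, 8
stride = max_interval - overlap  # 56 frames between clip starts

def n_clips(minutes, fps):
    n_frames = minutes * 60 * fps
    return (n_frames - max_interval) // stride + 1  # full windows only

print(n_clips(10, 25))  # 10 min at 25 FPS -> 15000 frames -> 267 clips
print(n_clips(8, 30))   #  8 min at 30 FPS -> 14400 frames -> 257 clips
```

In both cases there are far more than 26 clip features, so 26 of them would then be sampled uniformly.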
Hi, I have tried to extract features using I3D, IRV2, and BUTD as you described, but I cannot obtain the same features as yours. The features I get seem to be very different from those in the provided h5 file...
How can I get the same features as in the h5 file?
How were the equally spaced frames selected? Is it by the following method: index = [int(ceil(i*len(l)/26)) for i in range(26)]
Are the equally spaced frames only needed for IRV2 and BUTD, while I3D takes the whole video as input to generate 43x1024 for MSVD videos, which is then subsampled by the same method as above?
May I know what other steps are required during extraction?
Thank you very much!
from rmn.
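The equally-spaced-index formula quoted in the question can be checked directly. Note that with ceil the spacing is only approximately uniform, and the last index falls short of the final frame; this is just a sanity check of the quoted expression, not a statement about what the authors actually used:

```python
from math import ceil

def equally_spaced(n, k=26):
    """Index formula quoted in the thread: int(ceil(i * n / k))."""
    return [int(ceil(i * n / k)) for i in range(k)]

idx = equally_spaced(209)             # e.g. a 209-frame video
print(idx[0], idx[-1], len(idx))      # 0 201 26
```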