Comments (5)
Oh, I read his code and found that he used the SVM binary features from this link: http://pr.cs.cornell.edu/humanactivities/data/features.tar
Once you have downloaded it, you need to edit and run readData.py to convert the SVM binary features into node and edge features (as described in his paper). After that, you can run his model :)
from rnnexp.
Thank you so much for the tip! :)
The code does not make it easy to figure out what is happening at times; some comments would be really useful! Nevertheless, it's already nice to have the source code to replicate the experiments!
Now I have the following directory structure:
/features_cad120_ground_truth_segmentation
- /features_binary_svm_format
- /segments_svm_format
Am I right to assume that this directory:
/scail/scratch/group/cvgl/ashesh/activity-anticipation/features_ground_truth
corresponds to the main directory
/features_cad120_ground_truth_segmentation
or to
/features_cad120_ground_truth_segmentation/features_binary_svm_format?
The author also mentions this cryptic file:
'/scail/scratch/group/cvgl/ashesh/activity-anticipation/activityids_fold{0}.txt'
Have you figured out where I can find or how to create this file?
Thank you for the help!
EDIT: I managed to generate the .pik files. In case anyone else has the same problem:
- Follow Ahn's link, download the features, and extract them into the folders features_binary_svm_format and segments_svm_format.
- Substitute the paths in readData.py: where it says s='', plug in s='<path to the feature folder>/features_binary_svm_format'.
- The fold files are just newline-separated lists of the activity IDs, divided into N sets. You can generate them by running `ls | sed 's/\.txt$//' | split -l 32 - fold` in the /segments_svm_format folder (note: `tr -d '.txt'` would delete every occurrence of the characters `.`, `t`, and `x` rather than stripping the extension, so `sed` is safer). This lists all the activities and splits the list into 4 files of 32 activities each.
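If you prefer Python, the same fold files can be generated with a short script. This is a sketch; it assumes the activity IDs are the .txt file names inside segments_svm_format, and that the outputs should be named activityids_fold0.txt, activityids_fold1.txt, ... to match the path the author references:

```python
import os

def make_folds(names, fold_size):
    """Strip .txt extensions and split the sorted activity IDs
    into consecutive folds of at most fold_size entries."""
    ids = sorted(n[:-4] if n.endswith(".txt") else n for n in names)
    return [ids[i:i + fold_size] for i in range(0, len(ids), fold_size)]

def write_folds(segments_dir, out_prefix="activityids_fold", fold_size=32):
    # One newline-separated ID list per fold file,
    # e.g. activityids_fold0.txt (file name is an assumption).
    folds = make_folds(os.listdir(segments_dir), fold_size)
    for i, fold in enumerate(folds):
        with open("%s%d.txt" % (out_prefix, i), "w") as f:
            f.write("\n".join(fold) + "\n")
```

With 128 activities and fold_size=32 this produces 4 fold files, matching the `split -l 32` command above.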
Good luck!
I'm having the same problem. Are you using the CAD-120 dataset directly? How do you preprocess it? Would it be possible to release the prepared dataset for training?
Thank you very much in advance!
After reading through the readData code, it's really hard to decipher what the feature arrays represent. Have you managed to figure it out?
For example, X_tr_human_disjoint is an array with dimensions 25x93x790, where 790 is the dimension of the feature vector. Do you know what the other two dimensions are?
The same goes for X_tr_objects_disjoint, whose dimensions are 25x226x620, where 620 is the object feature vector dimension.
In the human feature structure, as far as I understand, 25 stands for the maximum number of segments and 93 for the size of the training set (activities), but this is not consistent with the dimensions of the object structure. What does the 226 stand for?
Thanks in advance for your time and attention!
EDIT: In case anyone has the same question: the mysterious 226 is a dimension representing the concatenation of the objects. To avoid having a variable-sized frame, the author simply concatenates the objects along the activity dimension. The 93 corresponds to the activities, and 226 to the total number of objects across all activities (not distinct!): the average number of objects per activity is about 2.43, and 2.43 * 93 ≈ 226.
Since the author never stores which object corresponds to which activity, I now wonder how he is able to reconstruct the original structure in the end?
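That concatenation can be illustrated with a small NumPy sketch. The per-activity object counts here are hypothetical; in CAD-120 they vary per activity, and their total plays the role of the 226:

```python
import numpy as np

T, D = 25, 620  # max segments, object feature dimension (as in X_tr_objects_disjoint)
rng = np.random.default_rng(0)

# Hypothetical per-activity object counts (real activities have 2-3 objects).
objects_per_activity = [3, 2, 2]
per_activity = [rng.normal(size=(T, n, D)) for n in objects_per_activity]

# Concatenate every activity's objects along the sample axis (axis=1),
# so the array has a fixed size regardless of how many objects each activity has.
X_objects = np.concatenate(per_activity, axis=1)
print(X_objects.shape)  # (25, 7, 620), since 3 + 2 + 2 = 7
```

Scaling this up, 93 activities with ~2.43 objects each give the 226-wide sample axis reported above.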
- The dimensions of the features are T x N x D, where T is the number of time steps (segments), N is the number of training (testing) samples, and D is the dimension of the feature vector.
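A quick sanity check of that layout (a sketch using a zero array; the real arrays come from the .pik files):

```python
import numpy as np

# T x N x D layout: X[t, n] is the D-dimensional feature of
# sample n at time step (segment) t.
T, N, D = 25, 93, 790  # shapes reported above for X_tr_human_disjoint
X_tr_human_disjoint = np.zeros((T, N, D))

segment_feature = X_tr_human_disjoint[4, 10]  # segment 5 of sample 11
print(segment_feature.shape)  # (790,)
```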
- Why are the numbers of training samples for humans and objects different?
- Actually, CAD-120 is a human-object interaction dataset (in one activity, there can be 2-3 interacting objects). CAD-120 is not only used for activity recognition; the authors also used it for object affordance recognition. Briefly, an affordance is the possibility of an action on an object or environment, meaning one object has exactly one affordance at a time. Because X_tr_objects_disjoint is used for both object affordance detection and anticipation, it has many more training examples than X_tr_human_disjoint, which is only used for human activity labelling.
- So you just misunderstood the purpose of X_tr_objects_disjoint.
- How is the author able to reconstruct the original structure in the end?
- I think you have not fully understood the S-RNN paper. He did mention the parameter-sharing mechanism: he trains human activity recognition and affordance recognition at the same time.
loss_layer_1 = self.train_layer_1(X_shared_1_minibatch,X_1_minibatch,Y_1_minibatch)
loss_layer_2 = self.train_layer_2(X_shared_2_minibatch,X_2_minibatch,Y_2_minibatch)
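The effect of those two calls can be sketched with a toy NumPy example (not the repo's Theano code; the layer sizes are made up): two task-specific heads share one trunk weight matrix, and each task's gradient step also updates that shared trunk:

```python
import numpy as np

rng = np.random.default_rng(0)
W_shared = rng.normal(size=(4, 3))   # trunk weights used by BOTH tasks
W_task1 = rng.normal(size=(3, 2))    # human-activity head (hypothetical sizes)
W_task2 = rng.normal(size=(3, 5))    # affordance head

def train_step(x, y, W_head, lr=0.1):
    """One least-squares gradient step; gradients flow into the shared trunk too."""
    global W_shared
    h = x @ W_shared                  # shared representation
    err = h @ W_head - y              # task-specific linear head
    g_head = h.T @ err
    g_shared = x.T @ (err @ W_head.T)
    W_head -= lr * g_head
    W_shared -= lr * g_shared         # both tasks update the same trunk

x = rng.normal(size=(1, 4))
before = W_shared.copy()
train_step(x, rng.normal(size=(1, 2)), W_task1)  # analogue of train_layer_1
train_step(x, rng.normal(size=(1, 5)), W_task2)  # analogue of train_layer_2
assert not np.allclose(W_shared, before)  # trunk moved under both losses
```

In the repo, the trunk is the shared LSTM (shared_layers) rather than a linear map, but the principle is the same: both losses backpropagate into the shared parameters.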
- And the Human-Object relation features (for human activity recognition) and the Object-Human relation features (for object affordance recognition) are fed into the same RNN node (shared_layers), while X_tr_human_disjoint is fed into layer_1 and X_tr_objects_disjoint into layer_2.
self.X = shared_layers[0].input
self.X_1 = layer_1[0].input
self.X_2 = layer_2[0].input
- You can find the above code in the sharedRNN file. To see how he uses sharedRNN, you can read this code in activity-rnn-full-model.
shared_input_layer = TemporalInputFeatures(inputJointFeatures)
shared_hidden_layer = LSTM('tanh', 'sigmoid', lstm_init, 4, 128, rng=rng)
shared_layers = [shared_input_layer, shared_hidden_layer]
human_layers = [ConcatenateFeatures(inputHumanFeatures), LSTM('tanh', 'sigmoid', lstm_init, 4, 256, rng=rng),
softmax(num_sub_activities, softmax_init, rng=rng)]
object_layers = [ConcatenateFeatures(inputObjectFeatures), LSTM('tanh', 'sigmoid', lstm_init, 4, 256, rng=rng),
softmax(num_affordances, softmax_init, rng=rng)]
trY_1 = T.lmatrix()
trY_2 = T.lmatrix()
sharedrnn = SharedRNN(shared_layers, human_layers, object_layers, softmax_loss, trY_1, trY_2, 1e-3)
Good luck!!!