MichiganCOG / ViP
Video Platform for Action Recognition and Object Detection in PyTorch
License: MIT License
Include a feature extraction option, possibly from any desired layer in a selected model, along with citations and links.
Hi,
Great work, guys!
Could you also provide, in the wiki, the link for the MPII+NZSL hand dataset that is cropped to 2.2× the hand bounding box?
Initially add CC (linear correlation coefficient) and NSS (normalized scanpath saliency).
Create two new directories, metrics and losses, with each metric or loss self-contained in its own file. This avoids extremely long .py files.
If a network layer is frozen, the pseudo batch loop code crashes when it tries to scale the gradient.
Currently train.py (and possibly eval.py) asserts that the final_shape argument matches the shape of the image returned from the dataloader (line 149). Some architectures can handle multiple input shapes, so providing a way to skip this assertion (perhaps by setting final_shape to -1) would be helpful in some cases.
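A minimal sketch of how the assertion could be made skippable; the variable names mirror the check around train.py line 149 but are assumptions, not the repository's exact code:

```python
import torch

# Hypothetical: treat final_shape == -1 as "accept any input shape".
final_shape = [112, 112]                 # or -1 to skip the check entirely
data = torch.zeros(1, 3, 16, 112, 112)  # dummy clip: N, C, T, H, W

if final_shape != -1:
    # Original behavior: hard assertion on the dataloader output shape.
    assert list(data.size()[-2:]) == list(final_shape), \
        'dataloader output {} does not match final_shape {}'.format(
            list(data.size()[-2:]), final_shape)
```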
When I run python eval.py --cfg_file models/c3d/config_test.yaml:
Traceback (most recent call last):
File "eval.py", line 132, in
eval(**args)
File "eval.py", line 62, in eval
model = create_model_object(**args).to(device)
File "/home/byronnar/pyprojects/cv/video_re/models/models_import.py", line 30, in create_model_object
model = getattr(module, dir(module)[model_index])(**kwargs)
File "/home/byronnar/pyprojects/cv/video_re/models/c3d/c3d.py", line 56, in init
self.__load_pretrained_weights()
File "/home/byronnar/pyprojects/cv/video_re/models/c3d/c3d.py", line 128, in __load_pretrained_weights
p_dict = torch.load('weights/c3d-pretrained.pth')
File "/home/byronnar/anaconda3/envs/vip/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/home/byronnar/anaconda3/envs/vip/lib/python3.6/site-packages/torch/serialization.py", line 581, in _load
deserialized_objects[key].set_from_file(f, offset, f_should_read_directly)
RuntimeError: unexpected EOF, expected 25948574 more bytes. The file might be corrupted.
terminate called after throwing an instance of 'c10::Error'
what(): owning_ptr == NullType::singleton() || owning_ptr->refcount.load() > 0 ASSERT FAILED at /pytorch/c10/util/intrusive_ptr.h:350, please report a bug to PyTorch. intrusive_ptr: Can only intrusive_ptr::reclaim() owning pointers that were created using intrusive_ptr::release(). (reclaim at /pytorch/c10/util/intrusive_ptr.h:350)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f5186cc9441 in /home/byronnar/anaconda3/envs/vip/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f5186cc8d7a in /home/byronnar/anaconda3/envs/vip/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: THStorage_free + 0xca (0x7f510dab629a in /home/byronnar/anaconda3/envs/vip/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #3: + 0x149bbd (0x7f5187277bbd in /home/byronnar/anaconda3/envs/vip/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #21: __libc_start_main + 0xe7 (0x7f518be4ab97 in /lib/x86_64-linux-gnu/libc.so.6)
What should I do to solve this problem?
My environment:
OS: Ubuntu 18.04
CUDA: 9.0
cuDNN: 7.3.1
Python: 3.6
Remove self.features = kwargs['model_features'] in c3d.py.
Also remove features from the c3d config.
I am trying to train with shorter clips, e.g. a clip length of 15.
Relevant lines of code: https://github.com/MichiganCOG/ViP/blob/master/train.py#L114-L115
Loading saved weights using the pretrained argument also loads the last saved learning rate (after decaying per the config file). However, the learning rate is then decayed further by the lines above, because the scheduler "loops" through all of the epochs again.
Example: if I ended an experiment with a learning rate of 1e-6 after decaying twice from 1e-4, resuming that experiment gives me a starting learning rate of 1e-8.
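A minimal sketch of the compounding decay, assuming a MultiStepLR schedule with gamma=0.1 and two milestones (the actual schedule comes from the config file):

```python
import torch

model = torch.nn.Linear(10, 2)

# The checkpoint stores the already-decayed lr (1e-6), not the base lr (1e-4).
optimizer = torch.optim.SGD(model.parameters(), lr=1e-6)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10, 20], gamma=0.1)

# Resuming replays every past epoch, so both milestones fire again
# and decay the already-decayed lr a second time.
for epoch in range(25):
    scheduler.step()

print(optimizer.param_groups[0]['lr'])  # 1e-08, not the expected 1e-06
```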
In certain cases the input to the network is not raw frames but precomputed features. All of the processed frames can be loaded at once, so it would be useful to not specify the clip length and simply read all available data per video. This only works with batch_size = 1, which is where the pseudo batch loop can come in handy; see the sketch below.
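A hypothetical sketch of the requested behavior, treating clip_length = -1 as "load every frame"; the class, names, and on-disk layout are assumptions, not existing ViP code:

```python
import torch
from torch.utils.data import Dataset

class FeatureDataset(Dataset):
    """Hypothetical dataset returning all precomputed features per video."""

    def __init__(self, samples, clip_length=-1):
        self.samples = samples        # list of (feature_path, label) pairs
        self.clip_length = clip_length

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        feats = torch.load(path)      # shape: (num_frames, feat_dim)

        if self.clip_length > 0:
            feats = feats[:self.clip_length]
        # clip_length == -1: return every frame; clips then vary in length,
        # so batch_size must be 1 (or a custom collate_fn must pad).
        return feats, label

    def __len__(self):
        return len(self.samples)
```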
Sometimes it is necessary to produce networks seeded randomly (for showing robust performance, or for ensembling). It would be nice to be able to do this without changing the config at each launch, especially if there is a delay between sending the start command and actually launching the program.
Allow specifying, in the YAML file, the exact PyTorch class and related parameters for the optimizer and scheduler you want to run (e.g. torch.optim.lr_scheduler.MultiStepLR).
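A minimal sketch of one way this could work, assuming the parsed YAML produces a dict like the one below (all key names are assumptions):

```python
import torch

# As if parsed from the YAML config (the structure is an assumption):
cfg = {
    'optimizer': {'class': 'torch.optim.SGD',
                  'params': {'lr': 1e-4, 'momentum': 0.9}},
    'scheduler': {'class': 'torch.optim.lr_scheduler.MultiStepLR',
                  'params': {'milestones': [10, 20], 'gamma': 0.1}},
}

def build_from_config(spec, **extra):
    # Resolve the dotted class path, then instantiate with config params.
    module_path, class_name = spec['class'].rsplit('.', 1)
    module = __import__(module_path, fromlist=[class_name])
    return getattr(module, class_name)(**extra, **spec['params'])

model = torch.nn.Linear(10, 2)
optimizer = build_from_config(cfg['optimizer'], params=model.parameters())
scheduler = build_from_config(cfg['scheduler'], optimizer=optimizer)
```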
Could you please provide the gen_json_UCF101 file? I am using the UCF101 dataset for I3D training.
Specify different learning rates for the layers of a network
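PyTorch already supports this through per-parameter-group options; a minimal sketch with a toy two-layer model:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 2))

# One parameter group per layer, each with its own learning rate.
optimizer = torch.optim.SGD([
    {'params': model[0].parameters(), 'lr': 1e-3},  # first layer
    {'params': model[2].parameters(), 'lr': 1e-4},  # final layer
], momentum=0.9)
```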
Plot the validation accuracy and training loss on the same x-axis. Train and validation will then always be on the same scale.
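A minimal sketch using TensorBoard's SummaryWriter, logging both values against the same epoch step (tag names and placeholder values are just illustrative):

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('runs/experiment')

for epoch in range(10):
    train_loss = 1.0 / (epoch + 1)   # placeholder values for illustration
    val_acc = 0.5 + 0.04 * epoch

    # Logging both against the same step (epoch) puts the train and
    # validation curves on the same x-axis in TensorBoard.
    writer.add_scalar('Loss/train', train_loss, epoch)
    writer.add_scalar('Accuracy/val', val_acc, epoch)

writer.close()
```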
The detection_template.json file indicates that each individual frame should populate the frame_size parameter. However, when the file is read, "KeyError: 'frame_size'" is raised unless frame_size is a parameter nested directly under the video (on the same level as base_path).
My guess is that the behavior is correct (since videos can't dynamically change size), but the documentation is incorrect.
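A sketch of the layout the loader appears to expect, with frame_size at the video level; field names other than base_path and frame_size are assumptions based on the template:

```json
{
    "base_path": "path/to/video_frames",
    "frame_size": [1280, 720],
    "frames": [
        {"img_path": "00001.png"},
        {"img_path": "00002.png"}
    ]
}
```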
Currently numpy is unseeded, so all random functions using it are not repeatable. The expectation is that the seed will be applied to both Torch and NumPy so that experiments produce identical results with the same seed.
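A minimal sketch of seeding everything from the one config value (the set_seed helper is an assumption, not existing ViP code):

```python
import random
import numpy as np
import torch

def set_seed(seed):
    """Seed every RNG the platform touches so runs are repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op when CUDA is unavailable

set_seed(999)
```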
Add option for DataParallel training in PyTorch.
It's pretty straightforward; the only issues are in accessing the state_dict and other functions belonging to the model (for multi-GPU training). It becomes model.module.state_dict instead of model.state_dict.
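A minimal sketch of the wrapping and the resulting state_dict access difference:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

# DataParallel wraps the network, so its attributes live under .module.
if isinstance(model, nn.DataParallel):
    state_dict = model.module.state_dict()
else:
    state_dict = model.state_dict()
```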
So, I installed all the dependencies with "install.sh" on Python 3.6. Whenever I try to train or eval the example model with "python <eval.py/train.py> --cfg_file models/c3d/config_test.yaml" I get the following Python error: "FileNotFoundError: [Errno 2] No such file or directory: '/z/dat/HMDB51/train.json'". Can someone help me?
Using scientific notation in the config file (e.g., lr: 1e-4) causes the parser to read the value as a string, resulting in an error. Specifically, for the learning rate this raises an error at line 96 of train.py (during the optimizer init), but it is likely to cause errors elsewhere for other params.
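A minimal sketch of the quirk as it shows up with PyYAML (YAML 1.1 resolution rules), plus a defensive cast; whether ViP's config loader behaves identically is an assumption:

```python
import yaml

cfg = yaml.safe_load("lr: 1e-4")
print(type(cfg['lr']))    # <class 'str'>: no dot, so YAML 1.1 sees a string

cfg = yaml.safe_load("lr: 1.0e-4")
print(type(cfg['lr']))    # <class 'float'>

# Defensive cast before handing the value to the optimizer:
lr = float(cfg['lr'])
```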
Create a class object, passed through every model, loss, and metric, that has a method allowing you to add a plot to TensorBoard for any specified variable.
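A hypothetical sketch of such an object (the class name and methods are assumptions):

```python
from torch.utils.tensorboard import SummaryWriter

class Plotter:
    """Hypothetical shared object that any model, loss, or metric can use
    to log a named variable to TensorBoard."""

    def __init__(self, log_dir):
        self.writer = SummaryWriter(log_dir)
        self.step = 0

    def plot(self, name, value):
        self.writer.add_scalar(name, value, self.step)

    def advance(self):
        self.step += 1

# Usage: construct once, hand the same instance to models/losses/metrics,
# then call plotter.plot('loss/aux', value) from anywhere in the pipeline.
plotter = Plotter('runs/experiment')
```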
Add the video saliency dataset DHF1K.
Randomly translate the image along with its object bounding box and point coordinates. Include bounds for the translation distances.
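A hypothetical NumPy sketch of the transform (function name and argument conventions are assumptions; box clipping at the image border is omitted for brevity):

```python
import numpy as np

def random_translate(image, bboxes, points, max_dx, max_dy):
    """Shift an image and keep its annotations aligned.

    image:  (H, W, C) array; bboxes: (N, 4) as [x1, y1, x2, y2];
    points: (M, 2) as [x, y]; max_dx/max_dy bound the shift in pixels.
    """
    dx = np.random.randint(-max_dx, max_dx + 1)
    dy = np.random.randint(-max_dy, max_dy + 1)

    h, w = image.shape[:2]
    translated = np.zeros_like(image)
    # Paste the shifted image, cropping whatever falls outside the frame.
    translated[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        image[max(-dy, 0):h - max(dy, 0), max(-dx, 0):w - max(dx, 0)]

    bboxes = bboxes + np.array([dx, dy, dx, dy])  # same shift for boxes
    points = points + np.array([dx, dy])          # and for point labels
    return translated, bboxes, points

img = np.zeros((100, 100, 3), dtype=np.uint8)
boxes = np.array([[10, 10, 40, 40]])
pts = np.array([[25, 25]])
out_img, out_boxes, out_pts = random_translate(img, boxes, pts, 10, 10)
```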
In instances where a neuron doesn't factor into the loss (e.g., a component of the loss is disabled for a specific experiment, resulting in a neuron or set of neurons being unused), autograd returns None for the unused connections. This results in a crash at the line:
param.grad *= 1./float(args['psuedo_batch_loop']*args['batch_size'])
With the error:
TypeError: unsupported operand type(s) for *=: 'NoneType' and 'float'
This can be remedied by inserting:
if param.grad is not None:
prior to the line in question, but I'm unsure of any upstream consequences.
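Putting the suggested guard together with the quoted line, in a self-contained toy example (the args values and model are dummies):

```python
import torch

# Dummy stand-ins for the platform's argument dict and a partly frozen model.
args = {'psuedo_batch_loop': 2, 'batch_size': 8}
model = torch.nn.Linear(10, 2)
model.weight.requires_grad = False    # frozen: weight.grad will stay None

model(torch.randn(4, 10)).sum().backward()

# Scale accumulated gradients, skipping parameters autograd never touched.
for param in model.parameters():
    if param.grad is not None:
        param.grad *= 1./float(args['psuedo_batch_loop']*args['batch_size'])
```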
Currently validation accuracy is tracked throughout all of training.
I've been seeing this phenomenon in my experiments when restarting from already-loaded weights. Could it be related to #19? Note that logging is fine for training.
Attached are the relevant config files (in a tarball).
Currently the platform only supports continuing training from a selected checkpoint.
It does not allow the option of a warm start, i.e. initializing from a saved checkpoint but training with an alternative setup.
Update extract_clips so that clip stride, clip offset, and num clips work as specified in the config files; a sketch of the intended behavior follows.
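A hypothetical sketch of the intended semantics; the exact meanings of clip_stride, clip_offset, and num_clips in the config files are assumptions here:

```python
def clip_start_indices(num_frames, clip_length, clip_stride, clip_offset,
                       num_clips):
    """Hypothetical: starting frame index of each extracted clip."""
    starts = []
    start = clip_offset
    while start + clip_length <= num_frames:
        starts.append(start)
        start += clip_stride
        if num_clips > 0 and len(starts) == num_clips:
            break  # num_clips == -1 could mean "as many as fit"
    return starts

print(clip_start_indices(100, 16, 8, 0, -1))  # [0, 8, 16, ..., 80]
```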
Currently, for datasets with bounding boxes, we need to specify the maximum possible number of bounding boxes so that all output batches are the same size (see line 27 in commit 74776f2).
What we should do instead is use a custom collate function in the DataLoader, as in the PyTorch detection tutorial:
https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
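A minimal sketch of the collate function that tutorial uses, which keeps each sample's variable-length boxes intact instead of padding to a fixed maximum (pass it to torch.utils.data.DataLoader via the collate_fn argument):

```python
def collate_fn(batch):
    # Turn a list of (image, target) pairs into a tuple of lists, so each
    # target can keep its own number of bounding boxes (no padding needed).
    return tuple(zip(*batch))

# Ragged toy "annotations" standing in for per-frame boxes:
batch = [('img0', [[0, 0, 5, 5]]),
         ('img1', [[1, 1, 2, 2], [3, 3, 4, 4]])]
images, targets = collate_fn(batch)
```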