
multimodal_movie_analysis's Introduction

multimodal_movie_analysis

Audio

To analyze a movie in terms of its auditory content, do the following:

cd analyze_audio
python3 analyze_audio.py -f movie.wav

Note: You will need to create a folder in analyze_audio/segment_models where you will store your audio SVM segment classifiers. See analyze_audio/readme.md for instructions on how to train these audio classifiers. Currently the audio analysis module expects 5 audio classifiers: (1) a generic audio classifier (4 classes), (2) two speech emotion classifiers, and (3) two musical emotion classifiers.

Visual

To extract hand-crafted visual features, run the following:

python3 analyze_visual.py -f ../V236_915000__0.mp4

The features are saved in npy files. The main functionality is implemented in the process_video function, which extracts features from a specific file. See analyze_visual/Readme.md for more details.
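
For programmatic use, process_video can also be called directly from Python. The snippet below is a minimal sketch only: the call signature is copied from the issue reports further down this page, and the argument values and their meanings are assumptions rather than documented API.

from analyze_visual.analyze_visual import process_video

# Sketch: signature taken from the issue reports below; the meaning of each
# argument is an assumption, not the documented API.
process_video("movie.mp4",  # path to the video file
              2,            # process_mode: assumed analysis level
              True,         # print_flag: assumed progress printing
              False,        # online_display: assumed live visualization toggle
              True)         # save_results: assumed writing of the npy files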

You can also train a supervised model of video shots (e.g. types of shots):

python3 train.py -v data/class1 data/class2 -a SVM

The following files will be saved to disk:

  • shot_classifier_SVM.pkl: the classifier
  • shot_classifier_SVM_scaler.pkl: the scaler
  • shot_classifier_SVM_results.json: the cross-validation results
  • shot_classifier_conf_mat_SVC().jpg: the confusion matrix of the cross-validation procedure
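
These artifacts can also be reloaded outside of wrapper.py. A minimal sketch, assuming the .pkl files were written with pickle and contain standard scikit-learn estimator and scaler objects:

import json
import pickle

with open("shot_classifier_SVM.pkl", "rb") as f:
    classifier = pickle.load(f)
with open("shot_classifier_SVM_scaler.pkl", "rb") as f:
    scaler = pickle.load(f)
with open("shot_classifier_SVM_results.json") as f:
    cv_results = json.load(f)

# feature_vector: features of one shot, as produced by analyze_visual.py
# (its dimensionality is assumed to match the training features)
# prediction = classifier.predict(scaler.transform([feature_vector]))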

Once the supervised model is trained, you can classify an unknown shot (or shots organized in folders):

python3 wrapper.py -m SVM -i test.mp4

The following script detects shot changes in a video file and stores the respective shots in individual files. It can be combined with the wrapper.py script above to analyze a movie shot by shot, as sketched after the command below.

python3 shot_generator.py -f data/file.mp4
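
A rough sketch of chaining the two scripts from Python follows; the location and naming of the generated shot files are assumptions, not documented behaviour.

import glob
import subprocess

# Split the movie into shots, then classify each shot with the trained model.
subprocess.run(["python3", "shot_generator.py", "-f", "data/file.mp4"], check=True)

# Assumption: the generated shot clips are written next to the input video.
for shot_file in sorted(glob.glob("data/*_shot_*.mp4")):
    subprocess.run(["python3", "wrapper.py", "-m", "SVM", "-i", shot_file], check=True)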

multimodal_movie_analysis's People

Contributors

apoman38, kbogas, pakoromilas, theopsall, tyiannak


multimodal_movie_analysis's Issues

MINOR: Printing feature stats vector error

In analyze_visual/analyze_visual.py, at line 397 the shape of the feature matrix is printed instead of the shape of the feature_stats vector.
Change:

print('Shape of feature stats vector including'
      ' object features (after smoothing'
      ' object confidences): {}'.format(feature_matrix.shape))

to

print('Shape of feature stats vector including'
      ' object features (after smoothing'
      ' object confidences): {}'.format(feature_stats.shape))

TypeError with scikit-image==0.18.0

With scikit-image==0.18.0, I get the following error:
Using: cuda:0
Downloading: "https://github.com/NVIDIA/DeepLearningExamples/archive/torchhub.zip" to /home/theo/.cache/torch/hub/torchhub.zip
Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /home/theo/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 97.8M/97.8M [00:02<00:00, 37.3MB/s]
Using cache found in /home/theo/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Traceback (most recent call last):
  File "analyze_visual.py", line 35, in <module>
    generic_model = gmodel.SsdNvidia()
  File "/home/theo/Downloads/multimodal_movie_analysis/analyze_visual/object_detection/generic_model.py", line 107, in __init__
    self.utils = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub',
  File "/home/theo/.local/lib/python3.8/site-packages/torch/hub.py", line 370, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/home/theo/.local/lib/python3.8/site-packages/torch/hub.py", line 399, in _load_local
    model = entry(*args, **kwargs)
  File "/home/theo/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub/hubconf.py", line 230, in nvidia_ssd_processing_utils
    import skimage
  File "/home/theo/.local/lib/python3.8/site-packages/skimage/__init__.py", line 135, in <module>
    from .data import data_dir
  File "/home/theo/.local/lib/python3.8/site-packages/skimage/data/__init__.py", line 156, in <module>
    image_fetcher, data_dir = create_image_fetcher()
  File "/home/theo/.local/lib/python3.8/site-packages/skimage/data/__init__.py", line 136, in create_image_fetcher
    image_fetcher = pooch.create(
TypeError: create() got an unexpected keyword argument 'retry_if_failed'

By the way, with scikit-image==0.17.2 there is no error.
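
A possible workaround until this is fixed (assuming nothing else in the environment requires the newer release) is to pin the version that is reported to work:

pip3 install scikit-image==0.17.2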

Problem when trying to download SSD model endpoint

Getting this error @apoman38

analyze_visual|master⚡ ⇒ python3 shot_generator.py -d ~/Downloads/videos                         
Using: cpu
Downloading: "https://github.com/NVIDIA/DeepLearningExamples/archive/torchhub.zip" to /Users/tyiannak/.cache/torch/hub/torchhub.zip
Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /Users/tyiannak/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 97.8M/97.8M [00:18<00:00, 5.61MB/s]
Downloading checkpoint from https://api.ngc.nvidia.com/v2/models/nvidia/ssdpyt_fp32/versions/1/files/nvidia_ssdpyt_fp32_20190225.pt
Traceback (most recent call last):
  File "shot_generator.py", line 5, in <module>
    from analyze_visual import *
  File "/Users/tyiannak/Research/libraries/multimodal_movie_analysis/analyze_visual/analyze_visual.py", line 35, in <module>
    generic_model = gmodel.SsdNvidia()
  File "/Users/tyiannak/Research/libraries/multimodal_movie_analysis/analyze_visual/object_detection/generic_model.py", line 64, in __init__
    ckpt_file = _download_checkpoint(checkpoint_str, force_reload=False)
  File "/Users/tyiannak/Research/libraries/multimodal_movie_analysis/analyze_visual/object_detection/generic_model.py", line 17, in _download_checkpoint
    urllib.request.urlretrieve(checkpoint, ckpt_file)
  File "/Users/tyiannak/.pyenv/versions/3.7.3/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Users/tyiannak/.pyenv/versions/3.7.3/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Users/tyiannak/.pyenv/versions/3.7.3/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/Users/tyiannak/.pyenv/versions/3.7.3/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/Users/tyiannak/.pyenv/versions/3.7.3/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/Users/tyiannak/.pyenv/versions/3.7.3/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/Users/tyiannak/.pyenv/versions/3.7.3/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: 

importing problems

When opening the project in PyCharm (2020.2.3 Professional Edition, on Ubuntu 20.04.1 LTS) with the repository root as the project root, I get the following import error when running from analyze_visual.analyze_visual import process_video:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/snap/pycharm-professional/218/plugins/python/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "/home/zappatistas20/PycharmProjects/multimodal_movie_analysis/analyze_visual/analyze_visual.py", line 30, in <module>
    from object_detection import detection_utils as dutils
  File "/snap/pycharm-professional/218/plugins/python/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'object_detection'

I fixed that (and all the ModuleNotFoundError errors that followed) by using the absolute import path to the targeted modules, like so:

In analyze_visual.py:

from analyze_visual.object_detection import detection_utils as dutils
from analyze_visual.object_detection import generic_model as gmodel
from analyze_visual.utils import *

In detection_utils.py:

from analyze_visual.utils import rect_area
from analyze_visual.utils import intersect_rectangles

I am guessing the same should happen in the analyze_textual files as well.
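
An alternative, hypothetical workaround that avoids editing the repository is to put the analyze_visual directory on sys.path before importing, so that object_detection and utils resolve as top-level modules:

import os
import sys

# Hypothetical workaround: adjust repo_root to your checkout; adding the
# analyze_visual directory lets its sibling modules (object_detection, utils)
# be found without rewriting the package's own imports.
repo_root = "/home/user/multimodal_movie_analysis"
sys.path.insert(0, os.path.join(repo_root, "analyze_visual"))

from analyze_visual.analyze_visual import process_video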

analyze_visual problem with empty tensors

Hello!
I am trying to extract features from this video

I am using the following command: python analyze_visual.py -f ../data/V236_915000__0.mp4
and I am getting the following result:

Using: cpu
Using cache found in /home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Using cache found in /home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Began processing video : ../data/V236_915000__0.mp4
FPS      = 23.976023976023978
Duration = 40.04 - 00:00:40.03
[W NNPACK.cpp:80] Could not initialize NNPACK! Reason: Unsupported hardware.
Traceback (most recent call last):
  File "analyze_visual.py", line 476, in <module>
    main(sys.argv)
  File "analyze_visual.py", line 454, in main
    save_results)
  File "analyze_visual.py", line 303, in process_video
    objects = generic_model.detect(frame, 0.1)
  File "/home/zappatistas20/PycharmProjects/multimodal_movie_analysis/analyze_visual/object_detection/generic_model.py", line 132, in detect
    results = self.utils.decode_results(detections_batch)
  File "/home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub/hubconf.py", line 298, in decode_results
    results = encoder.decode_batch(ploc, plabel, criteria=0.5, max_output=20)
  File "/home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub/PyTorch/Detection/SSD/src/utils.py", line 154, in decode_batch
    output.append(self.decode_single(bbox, prob, criteria, max_output))
  File "/home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub/PyTorch/Detection/SSD/src/utils.py", line 197, in decode_single
    bboxes_out, labels_out, scores_out = torch.cat(bboxes_out, dim=0), \
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat.  This usually means that this function requires a non-empty list of Tensors.  Available functions are [CPU, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

CPU: registered at /pytorch/build/aten/src/ATen/CPUType.cpp:2127 [kernel]
QuantizedCPU: registered at /pytorch/build/aten/src/ATen/QuantizedCPUType.cpp:297 [kernel]
BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
AutogradOther: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradCPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradCUDA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradXLA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse1: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse2: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse3: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
Tracer: registered at /pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:9654 [kernel]
Autocast: registered at /pytorch/aten/src/ATen/autocast_mode.cpp:258 [kernel]
Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:511 [backend fallback]
VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]

Up to a point in the video, the process runs very smoothly. The same happens with a few other videos in my collection as well.

My guess is that there's an empty frame hidden somewhere in the file, and as a result an empty tensor is passed in the bboxes_out argument in decode_single. What would be the best way to try/catch this and return 0 or NaN, or just skip the frame completely, so that the process completes?

Thanks!
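
One possible guard, purely as a sketch (not the project's actual fix), is to catch the RuntimeError around the detection call that appears in the traceback above, inside the frame loop of process_video, and skip the offending frame:

# Inside the frame loop of process_video (around line 303 per the traceback);
# a sketch of skipping frames on which the SSD decoding raises:
try:
    objects = generic_model.detect(frame, 0.1)
except RuntimeError:
    # Assumed handling: ignore this frame entirely so downstream code never
    # sees a partial or empty detection result.
    continue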

UnboundLocalError

I just ran the visual extraction script and got the following error:

aV.process_video(video_path, process_mode, print_flag, online_display, save_results)
  File "/home/theo/Pictures/EnorasiDb/multimodal_movie_analysis/analyze_visual/analyze_visual.py", line 308, in process_video
    objects_boxes_all.append(objects[0])
UnboundLocalError: local variable 'objects_boxes_all' referenced before assignment
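
This typically means the list is only created inside a conditional branch of process_video. A minimal sketch of the kind of fix (the surrounding code is not shown in this report, so the exact placement is an assumption) is to initialize the list unconditionally before the frame loop:

# Before the frame loop in process_video (placement assumed):
objects_boxes_all = []

# ... later, inside the loop, the existing append keeps working:
# objects_boxes_all.append(objects[0])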
