
transportationdetection_yolo_imagedetection's Introduction

Project to detect transportation modes in video footage

Intro

This work investigates the potential of analysing user-generated street-view imagery from videos with deep neural networks to extract transportation modes, for example whether individuals are walking or cycling. It addresses existing data gaps in monitoring the sustainability of the transport mode mix in cities, and complements existing methods based on stationary count stations or Google Street View. Preliminary results show that the proposed workflow can automatically extract forms of transportation from videos as an alternative data source.

Workflow

[Figure: workflow]

The workflow uses YOLOv5 for object detection and DeepSort for object tracking to identify objects relevant to transportation modes, then tracks and counts them across frames. Relations between relevant objects are used to infer transportation modes; for example, a person in a certain relation to a bicycle suggests a cyclist: 🚴 = 🚶 + 🚲
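The person-plus-bicycle rule can be illustrated with a minimal sketch. It assumes that the "relation" is a bounding-box overlap and that a fixed threshold decides the pairing; the repository may use a different criterion. The names `overlap_ratio` and `infer_modes`, the box layout and the threshold of 0.3 are all hypothetical.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def overlap_ratio(person: Box, bicycle: Box) -> float:
    """Return the fraction of the bicycle box that the person box covers."""
    x1 = max(person[0], bicycle[0])
    y1 = max(person[1], bicycle[1])
    x2 = min(person[2], bicycle[2])
    y2 = min(person[3], bicycle[3])
    intersection = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    bicycle_area = (bicycle[2] - bicycle[0]) * (bicycle[3] - bicycle[1])
    return intersection / bicycle_area if bicycle_area > 0 else 0.0


def infer_modes(persons: List[Box], bicycles: List[Box],
                threshold: float = 0.3) -> List[str]:
    """Label each person as 'cyclist' or 'pedestrian'.

    A person becomes a cyclist when paired with the not-yet-used bicycle
    that overlaps most, provided the overlap reaches the (assumed) threshold.
    """
    used = set()
    modes = []
    for person in persons:
        best_index, best_ratio = None, threshold
        for index, bicycle in enumerate(bicycles):
            if index in used:
                continue
            ratio = overlap_ratio(person, bicycle)
            if ratio >= best_ratio:
                best_index, best_ratio = index, ratio
        if best_index is None:
            modes.append("pedestrian")
        else:
            used.add(best_index)  # each bicycle is assigned to one person only
            modes.append("cyclist")
    return modes


persons = [(100, 50, 160, 200), (400, 60, 450, 210)]
bicycles = [(90, 120, 180, 220)]
print(infer_modes(persons, bicycles))
```

Each bicycle is assigned to at most one person, so a pedestrian walking past a parked bicycle that already has a rider is not counted as a second cyclist.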

[Figure: visual]

Geolocation

In addition to transportation mode detection, the workflow performs text recognition, known in computer vision as Optical Character Recognition (OCR). The OCR is performed with the EasyOCR library. The extracted text is filtered for potential street names, which are matched using the Levenshtein distance (a measure of word similarity) against a compiled dataset of OpenStreetMap street names that serves as a gazetteer. Finally, the workflow compiles all geolocations detected in a video into a map as output.
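The matching step can be sketched as follows. This is an illustrative sketch rather than the repository's implementation: `match_street`, the normalised cut-off `max_ratio` and the sample street names are assumptions, and the OCR step itself is left out.

```python
from typing import Iterable, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_street(text: str, gazetteer: Iterable[str],
                 max_ratio: float = 0.2) -> Optional[Tuple[str, int]]:
    """Return (street name, distance) for the closest gazetteer entry.

    Returns None when even the closest entry differs by more than
    max_ratio (an assumed cut-off) of the longer string's length.
    """
    query = text.strip().lower()
    if not query:
        return None
    best = None
    for name in gazetteer:
        distance = levenshtein(query, name.lower())
        if best is None or distance < best[1]:
            best = (name, distance)
    if best is None:
        return None
    if best[1] / max(len(query), len(best[0])) <= max_ratio:
        return best
    return None


streets = ["Bahnhofstrasse", "Langstrasse"]  # hypothetical gazetteer entries
print(match_street("Bahnhofstrase", streets))  # OCR output with one missing letter
print(match_street("EXIT", streets))           # unrelated text is rejected
```

Normalising the distance by string length keeps short, unrelated OCR strings from matching long street names only because few edits are needed in absolute terms.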

[Figure: map]

Output

The workflow generates a project folder for each video with the following output files:

  • track log CSV of all object IDs and their respective object class across frames
  • OCR log CSV of all text strings extracted from all frames
  • classnames CSV, a statistical summary of all detected classnames
  • transportation mode CSV, a statistical summary of all detected transportation modes
  • video MP4, containing visual bounding boxes and object IDs
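As a sketch of how the track log could be reduced to the statistical summaries, the snippet below counts unique object IDs per class. The column names `frame`, `object_id` and `class_name` are hypothetical; the actual CSV header written by the workflow may differ.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, TextIO


def count_objects_per_class(log: TextIO,
                            id_col: str = "object_id",
                            class_col: str = "class_name") -> Dict[str, int]:
    """Count distinct tracked objects per class.

    Counting unique IDs instead of rows avoids counting the same object
    once for every frame in which it appears.
    """
    ids_per_class = defaultdict(set)
    for row in csv.DictReader(log):
        ids_per_class[row[class_col]].add(row[id_col])
    return {name: len(ids) for name, ids in ids_per_class.items()}


# Hypothetical track log: object 1 is seen in two frames but counted once.
example_log = io.StringIO(
    "frame,object_id,class_name\n"
    "1,1,person\n"
    "1,2,bicycle\n"
    "2,1,person\n"
    "2,3,person\n"
)
print(count_objects_per_class(example_log))
```

For a real run, the same function can be passed an open file handle, for example `open(path, newline="")`, where `path` points at the track log CSV in the project folder.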

Plots and figures

  • map PNG, showing all geolocations detected from potential street names, each with a bar graph displaying transportation mode counts (see figure above)
  • interactive plot HTML for each video (1), showing the distribution of detected objects and locations over the entire video
  • interactive plot HTML, one for all videos (2), showing the detected object distribution for all locations from all videos

[Figure: plots]

(Examples of each can be found in ./example_plots.)

Setup

To run workflow.py, complete the following steps:

  1. Create the necessary virtual environment with Anaconda
  • conda env create --file transport_env.yml
  • conda activate transport_env
  2. Download video material (preferably as MP4); for YouTube, one can use

git clone https://github.com/Bellador/YoutubeDownloader

  3. Place the videos in the folder input_videos
  4. Run nohup python -u workflow.py > mylogfile.log &

Presentation slides

The GISRUK 2022 short presentation slides can be found under ./related_documents/markdown_presentation_FINAL/marp_presentation.html

Future Work

  • Using the audio tracks of videos to infer traffic volumes from varying decibel levels

External resources

The workflow is based on (1) object detection and (2) object tracking, as well as (3) Optical Character Recognition (OCR). These resources were taken from:
