kcg2015 / vehicle-detection-and-tracking

Computer vision based vehicle detection and tracking using Tensorflow Object Detection API and Kalman-filtering

Python 100.00%
detection tracking kalman-filtering object-detection keras hungarian-algorithm tensorflow-object-detection-api single-shot-multibox-detector mobilenet-ssd linear-assignment-problem occlusion computer-vision bounding-boxes bayesian-filter

Vehicle Detection and Tracking

Overview

This repo illustrates the detection and tracking of multiple vehicles using a camera mounted inside a self-driving car. The aim is to give developers, researchers, and engineers a simple framework for quickly iterating on different detectors and tracking algorithms, so I focus on the simplicity and readability of the code. The detection and tracking pipeline is relatively straightforward. It first initializes a detector and a tracker. Next, the detector localizes the vehicles in each video frame. The tracker is then updated with the detection results. Finally, the tracking results are annotated and displayed in the video frame.

Key files in this repo

  • detector.py -- implements CarDetector class to output car detection results
  • tracker.py -- implements Kalman Filter-based prediction and update for tracking
  • main.py -- implements the detection and tracking pipeline, including detection-track assignment and track management
  • helpers.py -- helper functions
  • ssd_mobilenet_v1_coco_11_06_2017/frozen_inference_graph.pb -- pre-trained mobilenet-coco model

Detection

In the pipeline, vehicle (car) detection takes a captured image as input and produces the bounding boxes as output. We use the TensorFlow Object Detection API, an open source framework built on top of TensorFlow to construct, train, and deploy object detection models. The Object Detection API also comes with a collection of detection models pre-trained on the COCO dataset that are well suited for fast prototyping. Specifically, we use a lightweight model, ssd_mobilenet_v1_coco, which is based on the Single Shot MultiBox Detector (SSD) framework with minimal modification. Though this is a general-purpose detection model (not specifically optimized for vehicle detection), we find that it strikes a good balance between bounding box accuracy and inference time.

The detector is implemented in the CarDetector class in detector.py. The output is the coordinates of the bounding boxes (in the format [y_up, x_left, y_down, x_right]) of all the detected vehicles.

The COCO label map uses class IDs that run up to 90. Many of the first 14 IDs relate to transportation and street scenes, including bicycle, car, and bus (ID 12 is unused). The ID for car is 3.

category_index = {1: {'id': 1, 'name': u'person'},
                  2: {'id': 2, 'name': u'bicycle'},
                  3: {'id': 3, 'name': u'car'},
                  4: {'id': 4, 'name': u'motorcycle'},
                  5: {'id': 5, 'name': u'airplane'},
                  6: {'id': 6, 'name': u'bus'},
                  7: {'id': 7, 'name': u'train'},
                  8: {'id': 8, 'name': u'truck'},
                  9: {'id': 9, 'name': u'boat'},
                  10: {'id': 10, 'name': u'traffic light'},
                  11: {'id': 11, 'name': u'fire hydrant'},
                  13: {'id': 13, 'name': u'stop sign'},
                  14: {'id': 14, 'name': u'parking meter'}}

The following code snippet implements the actual detection using TensorFlow API.

(boxes, scores, classes, num_detections) = self.sess.run(
                  [self.boxes, self.scores, self.classes, self.num_detections],
                  feed_dict={self.image_tensor: image_expanded})

Here boxes, scores, and classes represent the bounding box, confidence level, and class ID of each detection, respectively. Next, we select the detections that are cars and have a confidence greater than a threshold (0.3 in this case).

idx_vec = [i for i, v in enumerate(cls) if ((v==3) and (scores[i]>0.3))]

To detect all kinds of vehicles, we also include the indices for bus and truck.

idx_vec = [i for i, v in enumerate(cls) if (((v==3) or (v==6) or (v==8)) and (scores[i]>0.3))]
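The same selection can be written as a small reusable function. The following is only a sketch: the function name select_vehicle_indices and the example arrays are made up for illustration, and only the class IDs (3, 6, 8) and the 0.3 threshold come from the text above.

```python
import numpy as np

# COCO class IDs taken from the category_index listing above.
VEHICLE_IDS = (3, 6, 8)  # car, bus, truck


def select_vehicle_indices(classes, scores, vehicle_ids=VEHICLE_IDS, threshold=0.3):
    """Return the indices of detections that are vehicles and exceed the threshold."""
    classes = np.asarray(classes).astype(int).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    keep = np.isin(classes, vehicle_ids) & (scores > threshold)
    return np.flatnonzero(keep).tolist()


# Made-up detector output, for illustration only.
example_classes = [3.0, 1.0, 8.0, 3.0, 6.0]
example_scores = [0.90, 0.95, 0.25, 0.31, 0.60]
example_idx = select_vehicle_indices(example_classes, example_scores)
print(example_idx)
```

Keeping the class IDs and the threshold as parameters makes it easy to switch the detector to other classes without touching the selection logic.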

To further reduce possible false positives, we include thresholds for bounding box width, height, and height-to-width ratio.

if ((ratio < 0.8) and (box_h>20) and (box_w>20)):
    tmp_car_boxes.append(box)
    print(box, ', confidence: ', scores[idx], 'ratio:', ratio)
else:
    print('wrong ratio or wrong size, ', box, ', confidence: ', scores[idx], 'ratio:', ratio)
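As a rough, self-contained sketch of this check, the helper below computes the height, width, and height-to-width ratio of a pixel-coordinate box and applies the same thresholds. The function names are hypothetical, and the small constant added to the width is an assumption of this sketch (a guard against division by zero), not something stated above.

```python
def box_size_and_ratio(box):
    """Return (height, width, height/width) for a [y_up, x_left, y_down, x_right] box in pixels."""
    box_h = box[2] - box[0]
    box_w = box[3] - box[1]
    # Assumption of this sketch: a small constant avoids division by zero.
    ratio = box_h / (box_w + 0.01)
    return box_h, box_w, ratio


def is_plausible_vehicle_box(box, max_ratio=0.8, min_size=20):
    """Apply the ratio and minimum-size thresholds from the snippet above."""
    box_h, box_w, ratio = box_size_and_ratio(box)
    return ratio < max_ratio and box_h > min_size and box_w > min_size


wide_box = [420, 4, 457, 68]      # 37 px tall, 64 px wide: accepted
tall_box = [100, 100, 200, 140]   # 100 px tall, 40 px wide: rejected by the ratio
tiny_box = [0, 0, 10, 30]         # 10 px tall: rejected by the size check

for candidate in (wide_box, tall_box, tiny_box):
    print(candidate, is_plausible_vehicle_box(candidate))
```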

Kalman Filter for Bounding Box Measurement

We use a Kalman filter for tracking objects. The Kalman filter has the following important features that tracking can benefit from:

  • Prediction of object's future location
  • Correction of the prediction based on new measurements
  • Reduction of noise introduced by inaccurate detections
  • Facilitating the process of association of multiple objects to their tracks

A Kalman filter consists of two steps: prediction and update. The first step uses the previous state to predict the current state. The second step uses the current measurement, such as the location of the detection bounding box, to correct the state. The equations are given in the following figures:

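Kalman Filter Equations

Prediction phase: notations

[Figure: notations for the prediction phase]

Prediction phase: equations

[Figure: equations for the prediction phase]

Update phase: notations

[Figure: notations for the update phase]

Update phase: equations

[Figure: equations for the update phase]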
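For reference, the prediction and update steps can be written in the usual textbook form. This is the standard linear Kalman filter without a control input, not a transcription of the original figures, so the notation there may differ. Here F, H, P, Q, and R are the process matrix, measurement matrix, state covariance, process covariance, and measurement covariance; z_k is the measurement, y_k the innovation, S_k the innovation covariance, and K_k the Kalman gain.

```latex
\begin{aligned}
\text{Prediction:}\quad
\hat{x}_{k|k-1} &= F\,\hat{x}_{k-1|k-1} \\
P_{k|k-1} &= F\,P_{k-1|k-1}\,F^{\top} + Q \\[4pt]
\text{Update:}\quad
y_k &= z_k - H\,\hat{x}_{k|k-1} \\
S_k &= H\,P_{k|k-1}\,H^{\top} + R \\
K_k &= P_{k|k-1}\,H^{\top} S_k^{-1} \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k\,y_k \\
P_{k|k} &= (I - K_k H)\,P_{k|k-1}
\end{aligned}
```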

Kalman Filter Implementation

In this section, we describe the implementation of the Kalman filter in detail.

The state vector has eight elements as follows:

[up, up_dot, left, left_dot, down, down_dot, right, right_dot]

That is, we use the coordinates of the upper-left and lower-right corners of the bounding box, together with their first-order derivatives.

The process matrix, assuming constant velocity (thus no acceleration), is:

self.F = np.array([[1, self.dt, 0,  0,  0,  0,  0, 0],
                    [0, 1,  0,  0,  0,  0,  0, 0],
                    [0, 0,  1,  self.dt, 0,  0,  0, 0],
                    [0, 0,  0,  1,  0,  0,  0, 0],
                    [0, 0,  0,  0,  1,  self.dt, 0, 0],
                    [0, 0,  0,  0,  0,  1,  0, 0],
                    [0, 0,  0,  0,  0,  0,  1, self.dt],
                    [0, 0,  0,  0,  0,  0,  0,  1]])

The measurement matrix, given that the detector only outputs the coordinates (not the velocities), is:

self.H = np.array([[1, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 1, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 1, 0, 0, 0], 
                   [0, 0, 0, 0, 0, 0, 1, 0]])
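Both matrices are block-structured, so they can also be built from a single 2x2 constant-velocity block. The sketch below does this with np.kron and applies one prediction step to a made-up state; the value dt = 1.0 and the numbers in the state vector are assumptions chosen for illustration.

```python
import numpy as np

dt = 1.0  # time step between frames (assumed value for this sketch)

# One constant-velocity block per coordinate: [position, velocity].
cv_block = np.array([[1.0, dt],
                     [0.0, 1.0]])
F = np.kron(np.eye(4), cv_block)                 # 8x8 process matrix

# Each measurement row picks the position entry of one coordinate.
H = np.kron(np.eye(4), np.array([[1.0, 0.0]]))   # 4x8 measurement matrix

# State: [up, up_dot, left, left_dot, down, down_dot, right, right_dot]
x = np.array([[100.0, 2.0, 50.0, -1.0, 180.0, 2.0, 150.0, -1.0]]).T

x_pred = F @ x                                   # constant-velocity prediction
box_pred = (H @ x_pred).ravel().tolist()         # predicted [up, left, down, right]
print(box_pred)
```

Each coordinate moves by its velocity times dt, and H drops the velocities so that the result can be compared directly with a detection box.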

The state, process, and measurement noise covariances are:

# Initialize the state covariance
self.L = 100.0
self.P = np.diag(self.L*np.ones(8))

# Initialize the process covariance
self.Q_comp_mat = np.array([[self.dt**4/2., self.dt**3/2.],
                            [self.dt**3/2., self.dt**2]])
self.Q = block_diag(self.Q_comp_mat, self.Q_comp_mat,
                    self.Q_comp_mat, self.Q_comp_mat)

# Initialize the measurement covariance
self.R_scaler = 1.0/16.0
self.R_diag_array = self.R_scaler * np.array([self.L, self.L, self.L, self.L])
self.R = np.diag(self.R_diag_array)

Here self.R_scaler represents the "magnitude" of the measurement noise relative to the state noise. A low self.R_scaler indicates a more reliable measurement. The following figures visualize the impact of measurement noise on the Kalman filter update. The green bounding box represents the predicted (initial) state. The red bounding box represents the measurement. If the measurement noise is low, the updated state (aqua bounding box) is very close to the measurement (the aqua bounding box completely overlaps the red bounding box).

[Figure: update with low measurement noise]

In contrast, if the measurement noise is high, the updated state is very close to the initial prediction (the aqua bounding box completely overlaps the green bounding box).

[Figure: update with high measurement noise]
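The effect of the measurement noise can also be checked numerically. The sketch below runs a single textbook Kalman update for two values of the scaler. It is a simplified illustration under stated assumptions: the predicted covariance is taken to be the initial P, the boxes are made-up numbers, and kalman_update and updated_box are hypothetical helper names.

```python
import numpy as np


def kalman_update(x_pred, P_pred, z, H, R):
    """One Kalman update step; returns the corrected state and covariance."""
    y = z - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred
    return x_new, P_new


L = 100.0
P = np.diag(L * np.ones(8))                      # state covariance
H = np.kron(np.eye(4), np.array([[1.0, 0.0]]))   # picks the four coordinates

# Made-up boxes: the prediction ("green") and the measurement ("red").
x_pred = np.array([[100, 0, 50, 0, 180, 0, 150, 0]], dtype=float).T
z = np.array([[110, 60, 190, 160]], dtype=float).T


def updated_box(R_scaler):
    """Return the updated box for a given measurement-noise scaler."""
    R = np.diag(R_scaler * L * np.ones(4))
    x_new, _ = kalman_update(x_pred, P, z, H, R)
    return (H @ x_new).ravel()


box_low_noise = updated_box(1.0 / 1000.0)    # trusts the measurement
box_high_noise = updated_box(1000.0)         # trusts the prediction
print(box_low_noise, box_high_noise)
```

With these settings the gain reduces to 1/(1 + R_scaler) per coordinate, so a small scaler pulls the box almost entirely onto the measurement and a large one leaves it near the prediction.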

Detection-to-Tracker Assignment

The function assign_detections_to_trackers(trackers, detections, iou_thrd = 0.3) takes the current list of trackers and the new detections, and outputs the matched detections, unmatched trackers, and unmatched detections.

[Figure: detection-to-tracker assignment]

Linear Assignment and Hungarian (Munkres) algorithm

If there are multiple detections, we need to match (assign) each of them to a tracker. We use the intersection over union (IOU) of a tracker bounding box and a detection bounding box as the metric. We solve the assignment problem of maximizing the sum of IOU using the Hungarian algorithm (also known as the Munkres algorithm). The machine learning package scikit-learn (in older releases) has a built-in utility function that implements the Hungarian algorithm.

matched_idx = linear_assignment(-IOU_mat)   

Note that linear_assignment minimizes an objective function by default, so we need to reverse the sign of IOU_mat for maximization.
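As far as I know, newer scikit-learn releases no longer include this helper, and SciPy's linear_sum_assignment is the usual replacement. It returns two index arrays instead of a two-column array, so the sketch below stacks them back into (tracker, detection) rows. The box_iou function here is a hypothetical stand-in written for this example, and the boxes are made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def box_iou(a, b):
    """IoU of two boxes given as [y_up, x_left, y_down, x_right]."""
    h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = h * w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up tracker and detection boxes; the detections arrive in swapped order.
trackers = [[0, 0, 10, 10], [20, 20, 30, 30]]
detections = [[21, 21, 31, 31], [1, 1, 11, 11]]

IOU_mat = np.array([[box_iou(t, d) for d in detections] for t in trackers])

# linear_sum_assignment minimizes cost, so negate the IoU matrix to maximize.
row_ind, col_ind = linear_sum_assignment(-IOU_mat)

# Stack into rows of (tracker index, detection index).
matched_idx = np.column_stack((row_ind, col_ind))
print(matched_idx)
```

Stacking the two arrays keeps code that slices columns, such as matched_idx[:, 1], working unchanged.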

Unmatched detections and trackers

Based on the linear assignment results, we keep two lists, one for unmatched detections and one for unmatched trackers. When a car enters the frame and is first detected, it is not matched with any existing track, so this detection is referred to as an unmatched detection, as shown in the following figure. When a car leaves the frame, the previously established track has no detection left to associate with, and the track is referred to as an unmatched track. In addition, any match with an overlap less than iou_thrd signifies the existence of an untracked object. In that case, the tracker and the detection in the match are added to the lists of unmatched trackers and unmatched detections, respectively.

[Figure: unmatched detection]
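The bookkeeping described above can be sketched as follows. This is an illustrative version under stated assumptions, not the repo's implementation: split_assignments is a hypothetical name, the IoU values are made up, and SciPy's linear_sum_assignment is used for the assignment step.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def split_assignments(IOU_mat, iou_thrd=0.3):
    """Return (matched, unmatched_dets, unmatched_trks) for a trackers x detections IoU matrix."""
    n_trk, n_det = IOU_mat.shape
    if n_trk == 0 or n_det == 0:
        # Nothing to pair up: every tracker and detection is unmatched.
        return np.empty((0, 2), dtype=int), list(range(n_det)), list(range(n_trk))

    rows, cols = linear_sum_assignment(-IOU_mat)   # maximize the total IoU
    matched, unmatched_trks, unmatched_dets = [], [], []
    for t, d in zip(rows, cols):
        if IOU_mat[t, d] < iou_thrd:
            # Weak overlap: treat both sides as unmatched.
            unmatched_trks.append(int(t))
            unmatched_dets.append(int(d))
        else:
            matched.append([int(t), int(d)])

    # Trackers and detections that the assignment left out entirely.
    unmatched_trks += [t for t in range(n_trk) if t not in rows]
    unmatched_dets += [d for d in range(n_det) if d not in cols]
    matched = np.array(matched, dtype=int).reshape(-1, 2)
    return matched, sorted(unmatched_dets), sorted(unmatched_trks)


# Two trackers and three detections, with made-up IoU values.
example_iou = np.array([[0.80, 0.05, 0.00],
                        [0.10, 0.20, 0.00]])
matched, unmatched_dets, unmatched_trks = split_assignments(example_iou)
print(matched.tolist(), unmatched_dets, unmatched_trks)
```

In this example tracker 0 keeps detection 0, while the pairing of tracker 1 with detection 1 falls below the threshold, so both are reported as unmatched together with the leftover detection 2.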

Pipeline

We include two important design parameters, min_hits and max_age, in the pipeline. The parameter min_hits is the number of consecutive matches needed to establish a track. The parameter max_age is the number of consecutive frames a track may go unmatched before it is deleted. Both parameters need to be tuned to improve the tracking and detection performance.

The pipeline deals with matched detections, unmatched detections, and unmatched trackers sequentially. We annotate the tracks that meet the min_hits and max_age conditions. Proper bookkeeping is also needed to delete stale tracks.
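A minimal sketch of this bookkeeping is shown below. The Track dataclass is a hypothetical stand-in that carries only the counters, manage_tracks is a made-up helper name, and the default values of min_hits and max_age are illustrative rather than recommended settings.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical stand-in for a tracker object; only the counters matter here."""
    id: str
    hits: int = 0        # matches accumulated so far
    no_losses: int = 0   # consecutive frames without a match


def manage_tracks(tracker_list, track_id_pool, min_hits=1, max_age=4):
    """Return (tracks to annotate, surviving tracks) and recycle the IDs of stale tracks."""
    good = [t for t in tracker_list
            if t.hits >= min_hits and t.no_losses <= max_age]
    for t in tracker_list:
        if t.no_losses > max_age:
            track_id_pool.append(t.id)   # make the ID available again
    alive = [t for t in tracker_list if t.no_losses <= max_age]
    return good, alive


id_pool = deque(["D", "E"])
tracks = [Track("A", hits=3, no_losses=0),   # established
          Track("B", hits=0, no_losses=0),   # still on probation
          Track("C", hits=5, no_losses=5)]   # stale, will be dropped
good, alive = manage_tracks(tracks, id_pool)
print([t.id for t in good], [t.id for t in alive], list(id_pool))
```

Only track A is annotated, track B stays alive without being drawn, and track C is removed with its ID returned to the pool.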

The following examples show the process of the pipeline. When the car is first detected in the first video frame, running the following line of code returns an empty list, a one-element list, and an empty list for matched, unmatched_dets, and unmatched_trks, respectively.

matched, unmatched_dets, unmatched_trks \
    = assign_detections_to_trackers(x_box, z_box, iou_thrd = 0.3) 

We thus have a situation of unmatched detections. Unmatched detections are processed by the following code block:

if len(unmatched_dets) > 0:
    for idx in unmatched_dets:
        z = z_box[idx]
        z = np.expand_dims(z, axis=0).T
        tmp_trk = Tracker()  # Create a new tracker
        # Initialize the state with the detected box and zero velocities
        x = np.array([[z[0], 0, z[1], 0, z[2], 0, z[3], 0]]).T
        tmp_trk.x_state = x
        tmp_trk.predict_only()
        xx = tmp_trk.x_state
        xx = xx.T[0].tolist()
        xx = [xx[0], xx[2], xx[4], xx[6]]  # Keep the four box coordinates
        tmp_trk.box = xx
        tmp_trk.id = track_id_list.popleft()  # Assign an ID for the tracker
        tracker_list.append(tmp_trk)
        x_box.append(xx)

This code block carries out two important tasks: 1) creating a new tracker tmp_trk for the detection; 2) carrying out the Kalman filter's predict stage with tmp_trk.predict_only(). Note that this newly created track is still in its probation period, i.e., trk.hits = 0, so it is not yet established at the end of the pipeline. The output image is the same as the input image: the detection bounding box is not annotated.

[Figure: output frame without annotation]

When the car is detected again in the second video frame, running assign_detections_to_trackers returns a one-element list, an empty list, and an empty list for matched, unmatched_dets, and unmatched_trks, respectively. As shown in the following figure, we have a matched detection, which is processed by the following code block:

if matched.size > 0:
    for trk_idx, det_idx in matched:
        z = z_box[det_idx]
        z = np.expand_dims(z, axis=0).T
        tmp_trk = tracker_list[trk_idx]
        tmp_trk.kalman_filter(z)  # Predict, then update with the detection
        xx = tmp_trk.x_state.T[0].tolist()
        xx = [xx[0], xx[2], xx[4], xx[6]]
        x_box[trk_idx] = xx
        tmp_trk.box = xx
        tmp_trk.hits += 1

This code block carries out two important tasks: 1) carrying out the Kalman filter's prediction and update stages with tmp_trk.kalman_filter(); 2) increasing the hits of the track by one with tmp_trk.hits += 1. With this update, the condition if ((trk.hits >= min_hits) and (trk.no_losses <= max_age)) is satisfied, so the track is fully established. As a result, the bounding box is annotated in the output image, as shown in the figure below.

[Figure: output frame with the annotated bounding box]

Issues

The main issue is occlusion. For example, when one car is passing another, the two cars can be very close to each other. This can fool the detector into outputting a single (and possibly bigger) bounding box instead of two separate bounding boxes. In addition, the tracking algorithm may treat this detection as a new detection and set up a new track. The tracking algorithm may fail again when one of the cars moves away from the other.

vehicle-detection-and-tracking's Issues

A question about video tracking

Hi, thanks for your amazing project. I have a question about video tracking: how can I see the different IDs for the cars in the final output video?

Assign Id to the tracked object

I am trying to assign an ID to each tracked object, but nothing gives me an output. Can anyone please give me some ideas about this? And can you please tell me about the tracker's id defined in tracker.py?

About state, process, measurement covariance

Hi Sir,

Thank you for your great work.
I'm curious about how you set the initial value of each covariance.
Were there any design considerations, or are they just arbitrary values?
Thank you in advance.

Best Regards,
Lai

moviepy.editor.VideoClip not working

Traceback (most recent call last):
File "/usr/lib/python3.6/code.py", line 91, in runcode
exec(code, self.locals)
File "", line 1, in
File "/snap/pycharm-professional/132/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "/home/tejanmehndiratta15/.local/lib/python3.6/site-packages/moviepy/editor.py", line 33, in
from .video.io.VideoFileClip import VideoFileClip
File "/snap/pycharm-professional/132/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "/home/tejanmehndiratta15/.local/lib/python3.6/site-packages/moviepy/video/io/VideoFileClip.py", line 3, in
from moviepy.video.VideoClip import VideoClip
File "/snap/pycharm-professional/132/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)

TypeError in main.py

Traceback (most recent call last):
  File "main.py", line 200, in <module>
    image_box = pipeline(image)
  File "main.py", line 112, in pipeline
    = assign_detections_to_trackers(x_box, z_box, iou_thrd = 0.3)
  File "main.py", line 56, in assign_detections_to_trackers
    if (d not in matched_idx[ : ,1]):
TypeError: tuple indices must be integers or slices, not tuple

Tracker Error

File "main.py", line 140, in pipeline
tmp_trk = Tracker() # Create a new tracker
NameError: name 'Tracker' is not defined

I just ran your code as it is, but there is an error in the main.

Error: 'Tracker' object has no attribute 'R_ratio'

Hello, this is an error when I run tracker.py.

File "tracker.py", line 58, in __init__
self.R_diag_array = self.R_ratio * np.array

AttributeError: 'Tracker' object has no attribute 'R_ratio'

So, I want to know, where is 'R_ratio' defined?

Thank you!

Error occur when using my own video

Dear Mr. Kcg2015, how are you?
Your project is very helpful and the explanation is very detailed, so thank you for your efforts in developing it.
When I run main.py to test my own video, something goes wrong (the video information is shown below; the test file is about 1.5 GB).

[Screenshot: video information]

When the first frame is detected, the program terminates. The results and errors of the first frame detection are as follows:

[Screenshot]
[Screenshot]

I would like to ask why this error occurs. Thank you for your help.

Best.

about matching detections and trackers

Hi, can I take the principle you used for matching detections and trackers and use dlib to create a new tracker for each unmatched detection?

Using this for person tracking on different videos

Hi,

First of all, thanks for sharing this great work!

I am using this on other videos of people walking, and I figured the following code needs to change a bit to accommodate bigger detections/bounding boxes, in addition to changing the ID to 1 (person) instead of 3 (car).

In detector.py I changed # if ((ratio < 0.8) and (box_h>10)): to if ((ratio < 2) and (box_h>20) and (box_w>20)):

But I am still not seeing the tracker work properly, i.e., following the detected object along as in your sample video.

I am wondering what's going on.

using infrared camera

Hello,
Thanks for your project, it's amazing.
I'm planning to use an infrared (thermal) camera instead of an RGB camera. What changes need to be considered?

A bug

I think you should add "self.car_boxes = []" after "print('no detection!')" at line 124 of "detector.py".
Otherwise it will keep showing the last position of the previous object until the detector detects a new object.

training mode

Can you share the code for training the model using my own dataset?
Thank you.

low frame rate issue

At low frame rates the tracker lags. Is there a way I can make it work for low-frame-rate videos,
for example by changing the dt value in tracker.py?

name 'CarDetector' is not defined

Hi, I don't know why I got this error message.
$ python main.py
Traceback (most recent call last):
  File "main.py", line 193, in <module>
    det = CarDetector()
NameError: name 'CarDetector' is not defined

not tracking

The tracker is not tracking the sequence of images in the "test_images" folder.
The position of the bounding boxes does not change. It will only track the "frame003" image.

module 'helpers' has no attribute 'draw_box_label'

[420 4 457 68] , confidence: 0.44781974 ratio: 0.5780346820809248
[424 225 451 263] , confidence: 0.32874542 ratio: 0.7103393843725336
Frame: 1

AttributeError Traceback (most recent call last)
in
198 for i in range(len(images))[0:7]:
199 image = images[i]
--> 200 image_box = pipeline(image)
201 plt.imshow(image_box)
202 plt.show()

in pipeline(img)
100 if debug:
101 for i in range(len(z_box)):
--> 102 img1= helpers.draw_box_label(img, z_box[i], box_color=(255, 0, 0))
103 plt.imshow(img1)
104 plt.show()

AttributeError: module 'helpers' has no attribute 'draw_box_label'

When no detections are present :

I am curious to know whether the Kalman tracker will work when no object is detected in the frame. I saw that a few repos (https://github.com/abewley/sort) require a detection in every frame, otherwise they lose track of the object (a car in this case).
So will the Kalman predict function give the next location even when no detection is received from the detector?

Thanks in advance.

Getting an error when executing main.py with videos

Hello,
Thank you for this amazing tutorial! I really appreciate it! I was trying to execute your code with videos and I'm getting this particular error:
In line 43 of main.py
in assign_detections_to_trackers
IOU_mat[t,d] = box_iou2(trk,det)
NameError: name 'box_iou2' is not defined

License?

Hey,
thank you for this awesome work! However, I couldn't find any license information. Is it MIT- or BSD-licensed?

NameError: name 'Tracker' is not defined

Hi!

When I run main.py, I get an error, "NameError: name 'Tracker' is not defined", along with the output "2.82it/s]wrong ratio or wrong size, [426 230 441 257] , confidence: 0.36141032 ratio: 0.5553498704183636". I don't know the cause. Could you please help me?

Thank you!

vehicle logo detect !!

Dear Mr. Kcg2015, how are you?
Your project is very helpful and the explanation is very detailed, so thank you for your efforts in developing it.
By the way, do you have any experience with vehicle logo detection using YOLO?
I am new to training with YOLO, so I would appreciate your kind help.

Thanks and Regards, RIchardMinh.

Question on Deal with matched detections in main.py?

Hi @kcg2015,
Thanks for your great work.
I have a question about the "Deal with matched detections" part of main.py.
Assume that in the 1st frame there are no matched detections, so we jump to if len(unmatched_dets) > 0:. It calls tmp_trk.predict_only() using the current detection (x = np.array([[z[0], 0, z[1], 0, z[2], 0, z[3], 0]]).T; tmp_trk.x_state = x).
In the 2nd frame, assume that all objects from the 1st frame are matched to detections in the 2nd frame, so we jump to if matched.size > 0 and run the filter. However, in tmp_trk.kalman_filter(z), I saw that the code does a predict and then an update with the current detection. I am confused, because we already did a predict when processing the detection in the 1st frame.
Can you clarify my confusion?
I suppose that we should do the update and then the predict.
Thank you so much.

how do you initialize the tracker? I have problems when I run main.py as shown below.

Traceback (most recent call last):
  File "E:/PycharmProject/Vehicle-Detection-and-Tracking/main.py", line 201, in <module>
    image_box = pipeline(image)
  File "E:/PycharmProject/Vehicle-Detection-and-Tracking/main.py", line 115, in pipeline
    = assign_detections_to_trackers(x_box, z_box, iou_thrd=0.3)
  File "E:/PycharmProject/Vehicle-Detection-and-Tracking/main.py", line 71, in assign_detections_to_trackers
    if IOU_mat[m[0], m[1]] < iou_thrd:
IndexError: index 0 is out of bounds for axis 0 with size 0
matched_idx:(array([], dtype=int32), array([], dtype=int64)),shape:(2, 0)
matched_idx:(array([], dtype=int32), array([], dtype=int64)),shape:(2, 0)

Can you tell me how to solve it? Thank you!

Tracking and counting

Hi Kyle,

First of all, I would like to thank you for your amazing work. I am a student conducting research for my thesis on a people counter using object detection and tracking, and this was very helpful.

I used the SSD MobileNet pre-trained model and trained it with a custom dataset I made from overhead images of people taken from an IP camera. After training, I was able to detect people's heads in real time using TensorFlow's Object Detection API. With the help of your code and some adjustments to it, I was also able to track people's movement and assign an ID.

My next step, and this is where I am struggling, is to count the people entering a room, i.e., after they cross the ROI (region of interest), so it is important that I know the direction of the movement (up or down) and when it crosses this ROI. I have spent a lot of time searching for something similar that could be useful to me, but with no luck. I was wondering if you had any feedback that could help me; it would be greatly appreciated. TIA.

Below is a sample image of my work.

[Image: sample output]

Tracker without constant velocity

Dear Author(kcg2015),

Your work is amazing, and it is great that you are sharing it with the research community.

Here you are considering the constant velocity model in the process matrix, as shown in the snapshot:
[Image: snapshot of the process matrix]

I have a few queries, listed below:

  1. If the ego vehicle is not moving at constant velocity, how should tracking of the target objects be handled?
  2. What is DELTA_T? In your code it is set to a constant value of 1.

I am able to track objects when I capture frames at 30 FPS, but unable to track them when frames are captured at 10 FPS, as the position changes vastly between frames. How should a lower frame rate of around 10 FPS be handled?

Thanks for your valuable time. I eagerly look forward to your response.

Thanks,
Anil.

tracking is lost

Hello,
The tracking gets lost and is not accurate. Sometimes the bounding box on a car disappears for a few frames, and then tracking continues again.
