2EL1730 Machine Learning Project

This repository contains the files for a machine learning project that analyzes deep learning techniques for estimating how many people appear in a given image. Several other machine learning and data processing algorithms are also used.

Requirements

All code is written in Python. You may need to install several libraries, but no GPU is required at any stage of the pipeline.

Usage

pipeline.py, together with the pipeline diagram in the report, indicates which programs to run for each approach.

Attention! Combining both approaches was tested but is not recommended: the results are worse than using the ResNet101 head detector alone, and the run takes a long time, especially for inference with the MCNN architecture and for removing the two most frequent values from the matrix.

Scripts

pipeline.py

This simple Python script calls all the other scripts in the right order to follow the pipeline.

params.py

Mainly a dictionary of global variables shared by all the scripts. It was created to ease debugging and production use: a variable can be changed once for the entire code base instead of being passed as a parameter through the whole workflow.
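A minimal sketch of what such a shared settings module could look like. The dictionary name, its keys, and the values are assumptions made for illustration; they are not taken from the repository.

```python
# Hypothetical contents of a shared settings module such as params.py.
# Every key and value below is an illustrative assumption.
PARAMS = {
    "input_dir": "data/frames",   # assumed folder holding the input images
    "output_dir": "data/output",  # assumed folder for generated files
    "dbscan_eps": 3.0,            # assumed clustering radius, in pixels
}


def get_param(name):
    """Return a shared setting, failing loudly on a misspelled key."""
    if name not in PARAMS:
        raise KeyError(f"unknown parameter: {name!r}")
    return PARAMS[name]
```

Other scripts would then read a setting with `get_param("dbscan_eps")` instead of receiving it as a function argument.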

crop.py

Crops an image at hardcoded positions.
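Cropping at fixed positions can be sketched with NumPy slicing. The crop window below is a made-up value, not the one hardcoded in crop.py.

```python
import numpy as np

# Hypothetical crop window (top, bottom, left, right) in pixels.
CROP_BOX = (100, 400, 50, 600)


def crop(image, box=CROP_BOX):
    """Return a copy of the rectangular region described by box."""
    top, bottom, left, right = box
    return image[top:bottom, left:right].copy()


# A blank 480x640 RGB frame stands in for a real image.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
patch = crop(frame)
```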

put_zero_image.py

Paints a region of an image with a defined color (black). The resulting image has a trapezoidal shape.
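One way to obtain a trapezoidal visible area is to black out every pixel outside a trapezoid whose left and right limits are interpolated linearly from the top row to the bottom row. This is a sketch under that assumption; the function name and its four corner parameters are hypothetical, and the real script may define the region differently.

```python
import numpy as np


def keep_trapezoid(image, top_left, top_right, bottom_left, bottom_right):
    """Black out all pixels outside a trapezoid spanning the full image height.

    The four arguments are column positions: the left and right limits on the
    first row and on the last row. Limits in between are interpolated linearly.
    """
    height, width = image.shape[:2]
    rows = np.arange(height).reshape(-1, 1)
    cols = np.arange(width).reshape(1, -1)
    t = rows / max(height - 1, 1)  # 0.0 on the first row, 1.0 on the last
    left = top_left + t * (bottom_left - top_left)
    right = top_right + t * (bottom_right - top_right)
    inside = (cols >= left) & (cols <= right)
    out = image.copy()
    out[~inside] = 0  # works for grayscale and color images alike
    return out
```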

test.py

A simple script that defines some variables and calls the core MCNN code.

put_zero_den.py

At one stage of debugging, it was not yet known whether the original image or the heatmap generated by the MCNN should be painted. This script paints the heatmap; its use is no longer recommended.

find_people.py

Removes the two values that occur most often in a matrix by painting their positions black.
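The idea can be sketched with NumPy as follows. This is an illustrative version, not the code in find_people.py; the function name is hypothetical, and "black" is assumed to mean the value 0.

```python
import numpy as np


def remove_two_most_common(matrix):
    """Set every occurrence of the two most frequent values to 0 (black)."""
    values, counts = np.unique(matrix, return_counts=True)
    top_two = values[np.argsort(counts)[-2:]]  # the two largest counts
    out = matrix.copy()
    out[np.isin(out, top_two)] = 0
    return out


heatmap = np.array([[5, 5, 5, 7],
                    [7, 9, 5, 7],
                    [1, 9, 2, 5]])
cleaned = remove_two_most_common(heatmap)
```

Here 5 and 7 are the most frequent values, so only the entries 9, 1, 9, and 2 survive in `cleaned`.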

position.py

Uses the DBSCAN algorithm to group the points of the heatmap into clusters, each representing a single person. It also stores the coordinates of these clusters.
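A minimal sketch of this clustering step, assuming scikit-learn's `DBSCAN` is used and that a cluster's coordinates are summarized by its mean position. The function name, the threshold, and the `eps` and `min_samples` defaults are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_heatmap(heatmap, threshold=0.0, eps=2.0, min_samples=3):
    """Cluster the active heatmap pixels and return one (row, col) center per cluster."""
    points = np.argwhere(heatmap > threshold)  # (row, col) of each active pixel
    if len(points) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    centers = []
    for label in sorted(set(labels) - {-1}):  # label -1 marks noise points
        centers.append(points[labels == label].mean(axis=0))
    return centers


# Two small blobs stand in for two people on a 20x20 heatmap.
demo = np.zeros((20, 20))
demo[2:4, 2:4] = 1.0
demo[15:17, 10:12] = 1.0
centers = cluster_heatmap(demo)
```

The number of returned centers is then the estimated person count for the image.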

tiny_face_eval.py

Uses a ResNet101 (a residual neural network, RNN) to spot the heads present in the image.

get_heads.py

Takes head coordinates as quadruples (x1, y1, x2, y2) giving the opposite vertices of a rectangle, extracts the image of each head, and saves these images in two sizes for further analysis.
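The extraction step can be sketched as below. The two target sizes and the nearest-neighbor resize are assumptions chosen to keep the example dependent on NumPy only; the real script may use an image library and different sizes, and writing the patches to disk is left out here.

```python
import numpy as np


def resize_nearest(image, size):
    """Nearest-neighbor resize to size = (height, width)."""
    height, width = image.shape[:2]
    rows = np.arange(size[0]) * height // size[0]
    cols = np.arange(size[1]) * width // size[1]
    return image[rows][:, cols]


def extract_heads(image, boxes, sizes=((32, 32), (64, 64))):
    """Cut each (x1, y1, x2, y2) box out of image and resize it to every size."""
    heads = []
    for x1, y1, x2, y2 in boxes:
        left, right = sorted((int(x1), int(x2)))
        top, bottom = sorted((int(y1), int(y2)))
        patch = image[top:bottom, left:right]
        if patch.size == 0:
            continue  # skip degenerate boxes with zero width or height
        heads.append(tuple(resize_nearest(patch, s) for s in sizes))
    return heads
```

Sorting each coordinate pair means the two vertices may be given in either order.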

track_video.py

Loads the images saved by get_heads.py and, for a sequence of images, tries to find the heads of the first image in the following images. The backend of this script is in track_heads.py.

track_heads.py

Computes the structural similarity index (SSIM) of two images and multiplies it by the inverse of the distance between them, then determines which images most plausibly correspond to the same person in two consecutive images.
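The scoring rule can be sketched as follows. To stay self-contained, this example uses a simplified SSIM computed once over the whole patch rather than the usual sliding-window version, and it clamps the distance to avoid division by zero. The function names, the `min_distance` parameter, and the greedy best-score matching are assumptions, not the repository's implementation.

```python
import numpy as np


def global_ssim(a, b, data_range=255.0):
    """Simplified SSIM over whole patches (no sliding window); shapes must match."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return numerator / denominator


def match_score(patch_a, center_a, patch_b, center_b, min_distance=1.0):
    """Similarity multiplied by the inverse of the distance between the centers."""
    distance = np.hypot(center_a[0] - center_b[0], center_a[1] - center_b[1])
    return global_ssim(patch_a, patch_b) / max(distance, min_distance)


def match_heads(prev_heads, next_heads):
    """For each (patch, center) in prev_heads, return the index of the best next head."""
    matches = []
    for patch_a, center_a in prev_heads:
        scores = [match_score(patch_a, center_a, p, c) for p, c in next_heads]
        matches.append(int(np.argmax(scores)) if scores else None)
    return matches
```

Because the patches must have equal shapes, this pairs naturally with head images saved at a fixed size.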

Disclaimer

The code has been tested only on CPUs, not GPUs.

References

[1] Reza Bahmanyar, Eleonora Vig, Peter Reinartz. 2019. MRCNet: Crowd Counting and Density Map Estimation in Aerial and Ground Imagery. https://arxiv.org/pdf/1909.12743.pdf

[2] Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, Yi Ma. 2016. Single-Image Crowd Counting via Multi-Column Convolutional Neural Network. https://zpascal.net/cvpr2016/Zhang_Single-Image_Crowd_Counting_CVPR_2016_paper.pdf

[3] Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu. 1996. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. https://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf

[4] Di Kang, Zheng Ma. 2018. Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking. http://visal.cs.cityu.edu.hk/static/pubs/journal/tcsvt18-denscnn.pdf

[5] Peiyun Hu, Deva Ramanan. Finding Tiny Faces. https://arxiv.org/abs/1612.04402

[6] Richard Dosselmann, Xue Dong Yang. A comprehensive assessment of the structural similarity index. https://link.springer.com/content/pdf/10.1007%2Fs11760-009-0144-1.pdf

[7] Karen Simonyan, Andrew Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. https://arxiv.org/abs/1409.1556

[8] P. Viola, M. J. Jones, and D. Snow. Detecting pedestrians using patterns of motion and appearance. International Journal of Computer Vision, 2005. http://luthuli.cs.uiuc.edu/~daf/courses/AppCV/Papers-2/t61k38u53j53134.pdf

[9] V. A. Sindagi and V. M. Patel. A survey of recent advances in CNN-based single image crowd counting and density estimation. Pattern Recognition Letters, 2018. https://arxiv.org/pdf/1707.01202.pdf

[10] G. J. Brostow and R. Cipolla. Unsupervised Bayesian detection of independent motion in crowds. IEEE, 2006. http://mi.eng.cam.ac.uk/~cipolla/publications/inproceedings/2006-CVPR-Brostow-motionincrowds.pdf

[11] Z. Lin and L. S. Davis. Shape-based human detection and segmentation via hierarchical part-template matching. Pattern Analysis and Machine Intelligence, 2010. https://www.researchgate.net/publication/262189064_Shape-Based_Human_Detection_and_Segmentation_via_Hierarchical_Part-Template_Matching

[12] M. Wang and X. Wang. Automatic adaptation of a generic pedestrian detector to a specific traffic scene. IEEE, 2011. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.459.3309&rep=rep1&type=pdf

[13] H. Idrees, K. Soomro, and M. Shah. Detecting humans in dense crowds using locally-consistent scale prior and global occlusion reasoning. Pattern Analysis and Machine Intelligence, 2015. https://www.researchgate.net/publication/273284676_Detecting_Humans_in_Dense_Crowds_Using_Locally-Consistent_Scale_Prior_and_Global_Occlusion_Reasoning

[14] H. Idrees, I. Saleemi, C. Seibert, and M. Shah. Multi-source multi-scale counting in extremely dense crowd images. In CVPR, IEEE, 2013. http://www.eecs.ucf.edu/~haroon/datafiles/Idrees_Counting_CVPR_2013.pdf
