crowdcount-mcnn's Introduction

Single Image Crowd Counting via Multi Column Convolutional Neural Network

This is an unofficial implementation of CVPR 2016 paper "Single Image Crowd Counting via Multi Column Convolutional Neural Network"

Installation

  1. Install PyTorch
  2. Clone this repository
git clone https://github.com/svishwa/crowdcount-mcnn.git

We'll call the directory that you cloned crowdcount-mcnn ROOT

Data Setup

  1. Download ShanghaiTech Dataset from
    Dropbox: https://www.dropbox.com/s/fipgjqxl7uj8hd5/ShanghaiTech.zip?dl=0

    Baidu Disk: http://pan.baidu.com/s/1nuAYslz

  2. Create directory

mkdir ROOT/data/original/shanghaitech/

  3. Save "part_A_final" under ROOT/data/original/shanghaitech/

  4. Save "part_B_final" under ROOT/data/original/shanghaitech/

  5. cd ROOT/data_preparation/

    Run create_gt_test_set_shtech.m in MATLAB to create the ground-truth files for the test data

  6. cd ROOT/data_preparation/

    Run create_training_set_shtech.m in MATLAB to create the training and validation sets along with their ground-truth files

Test

  1. Follow steps 1,2,3,4 and 5 from Data Setup

  2. Download pre-trained model files:

    [Shanghai Tech A]

    [Shanghai Tech B]

    Save the model files under ROOT/final_models

  3. Run test.py

    a. Set save_output = True to save output density maps

    b. Errors are saved in output directory

Training

  1. Follow steps 1,2,3,4 and 6 from Data Setup
  2. Run train.py

Training with TensorBoard

With the aid of Crayon, we can access the visualisation power of TensorBoard for any deep learning framework.

To use TensorBoard, install Crayon (https://github.com/torrvision/crayon) and set use_tensorboard = True in ROOT/train.py.

Other notes

  1. During training, the best model is chosen by its error on the validation set. (It is not clear how the authors of the original implementation chose the best model.)

  2. 10% of the training set is set aside for validation. The validation set is chosen randomly.

  3. The ground-truth density maps are obtained using simple Gaussian kernels, unlike the adaptive method described in the paper.

  4. Following are the results on the ShanghaiTech A and B datasets:

             | Dataset |  MAE  |  MSE  |
             |---------|-------|-------|
             |    A    |  110  |  169  |
             |    B    |   25  |   44  |
    
  5. Also, please take a look at our new work on crowd counting using a cascaded CNN and high-level prior (https://github.com/svishwa/crowdcount-cascaded-mtl), which improves on the results of this work.
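The 10% random validation split described in note 2 can be sketched as follows; the file names, seed, and split logic here are illustrative, not the repository's actual code:

```python
import random

# Illustrative file list and seed -- not the repository's actual code.
files = [f"img_{i}.jpg" for i in range(100)]
random.seed(0)
random.shuffle(files)

n_val = len(files) // 10                 # hold out 10% for validation
val_files, train_files = files[:n_val], files[n_val:]
print(len(val_files), len(train_files))  # -> 10 90
```

Shuffling before slicing is what makes the held-out 10% a random sample rather than the first files in directory order.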

crowdcount-mcnn's People

Contributors

svishwa

crowdcount-mcnn's Issues

test.py : Error

/ML/Crowd-count/crowdcount-mcnn$ python3 test.py
Pre-loading the data. This may take a while...
Traceback (most recent call last):
File "test.py", line 41, in <module>
data_loader = ImageDataLoader(data_path, gt_path, shuffle=False, gt_downsample=True, pre_load=True)
File "/home/tikam/ML/Crowd-count/crowdcount-mcnn/src/data_loader.py", line 35, in __init__
img = cv2.resize(img,(wd_1,ht_1))
TypeError: integer argument expected, got float
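This looks like a Python 2 to 3 porting issue: under Python 3, `/` returns a float, while cv2.resize expects integer output dimensions. A minimal sketch of a possible fix (variable names follow the traceback above; the actual statement in data_loader.py may differ):

```python
# Hypothetical values standing in for the image dimensions in data_loader.py.
wd, ht = 1024, 768
wd_1, ht_1 = wd / 4, ht / 4        # floats under Python 3's "/" operator
wd_1, ht_1 = int(wd_1), int(ht_1)  # cast before calling cv2.resize
# img = cv2.resize(img, (wd_1, ht_1))  # now receives the ints it expects
print(wd_1, ht_1)  # -> 256 192
```

Alternatively, using the floor-division operator `//` in the original expression gives ints directly.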

annotation about shanghaitech dataset

Do you know what the annotation information in the ShanghaiTech dataset refers to?
I would like to know what kind of software can be used to produce such labels.

UCF_CC_50 data prepare

Dear svishwa, I don't know how to divide UCF_CC_50 into train and test sets. Could you please provide the UCF_CC_50 data-preparation code, if available?

Crowd Counting

Dear svishwa, I cannot find the ShanghaiTech images publicly. I also emailed the authors but have not yet received any links to the dataset. I have the UCF dataset. Can you please provide the UCF_CC_50 data-processing MATLAB code, if available?

Is your test model trained by yourself?

I'm confused about what wrong with the code, I just use your code to train my model, and select the model which has lowest MAE and MSE as test model, but when I test my mode, I gets higher MAE(195.83) and MSE(265.86) than you provide

Issues,test.py

When trying to run test.py I get the following message:
File "C:\Users\GCZX\Desktop\crowdcount-mcnn-master\src\data_loader.py", line 35, in __init__
img = cv2.resize(img,(wd_1,ht_1))

TypeError: integer argument expected, got float

explaining dataset .mat files

Good evening,
I don't seem to understand the structure and content of the .mat files in the dataset.
I believe they contain the ground truth (head annotations) for the images; I know that at some point the CNN needs to compute the error between the network output and this ground truth.
Printing one file gives:
{'image_info': array([[array([[(array([[ 855.32345978, 590.49587357], [ 965.5908524 , 472.79472415], [ 937.09478464, 400.93507502], ..., [ 42.5852337 , 359.87860699], [1017.48233659, 8.99748811], [1017.48233659, 23.31916643]]), array([[920]], dtype=uint16))]], dtype=[('location', 'O'), ('number', 'O')])]], dtype=object), '__version__': '1.0', '__header__': 'MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Fri Nov 18 20:06:05 2016', '__globals__': []}

Can someone explain what is in these files and how that content is used in crowd estimation?
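For readers with the same question: each file appears to hold a struct 'image_info' whose 'location' field is an (N, 2) array of head (x, y) pixel coordinates and whose 'number' field is the head count N; the density-map generation code consumes the locations, and the count is the training/evaluation target. A hedged sketch, assuming SciPy is available; it round-trips a toy struct of the same shape rather than a real ShanghaiTech file, whose extra cell nesting makes the indexing there slightly deeper:

```python
from io import BytesIO

import numpy as np
from scipy.io import loadmat, savemat  # assumption: SciPy is installed

# Toy annotation in the same spirit as the ShanghaiTech ground-truth files:
# 'location' holds the (x, y) pixel coordinates of every annotated head,
# 'number' holds the head count for the image.
pts = np.array([[855.3, 590.5], [965.6, 472.8], [937.1, 400.9]])
buf = BytesIO()
savemat(buf, {"image_info": {"location": pts, "number": len(pts)}})
buf.seek(0)

mat = loadmat(buf)
location = mat["image_info"]["location"][0, 0]          # (N, 2) coordinates
number = int(mat["image_info"]["number"][0, 0].item())  # head count N
print(location.shape, number)  # -> (3, 2) 3
```

loadmat wraps MATLAB structs in (1, 1) object arrays, hence the `[0, 0]` indexing; `squeeze_me=True` is another way to strip those wrappers.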

test script

Hi svishwa,
Is there any script that takes an input image and returns the estimated density map or count?

Is this model accurate on sparse images?

About the creating_training_set_shtech.m

if( w <= 2*wn2 )
    im = imresize(im,[h,2*wn2+1]);
    annPoints(:,1) = annPoints(:,1)*2*wn2/w;
end
if( h <= 2*hn2)
    im = imresize(im,[2*hn2+1,w]);
    annPoints(:,2) = annPoints(:,2)*2*hn2/h;
end

Can you tell me what this snippet does?
Looking at the complete code, I think it has no effect when running.

train own dataset

Hello, I want to train on my own dataset, but I don't know how to generate the ground truth. Looking forward to your help. Thanks!

relu

Hi svishwa,
I wonder why there are no ReLU layers in your model? (The authors of "Single Image Crowd Counting via Multi Column Convolutional Neural Network" state that ReLU is adopted in their network.)

Where is the parameter self.is_training?

Thank you for providing the code; it works perfectly.
But I noticed a problem when checking crowd_count.py: it uses a parameter named self.training that is never defined, yet this causes no error. It really puzzles me.

Can the gt_density be saved as a .jpg file?

Hello, I saved the gt_density data as .jpg files, converted them to LMDB, and then trained in Caffe, but the loss is huge, like:
Iteration 9000 (5.64897 iter/s, 177.024s/1000 iters), loss = 3.87866e+06

How can I solve this problem? Or did I do something wrong?
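A likely cause, offered as an educated guess rather than a confirmed diagnosis: density-map values are tiny fractions that sum to the head count, so quantising them to 8-bit .jpg pixels destroys the training target before JPEG's lossy compression even starts. A NumPy sketch of the problem and a lossless alternative:

```python
from io import BytesIO

import numpy as np

# Toy density map for 15 people: each of the 32*32 pixels holds 15/1024.
den = np.full((32, 32), 15.0 / (32 * 32), dtype=np.float32)

# Saving as .jpg quantises values to 8-bit integers; every tiny density
# value truncates to zero and the target map becomes all zeros.
as_uint8 = den.astype(np.uint8)
print(as_uint8.sum())             # -> 0, the training target is destroyed

# Lossless alternative: keep the raw floats (here via an in-memory buffer).
buf = BytesIO()
np.save(buf, den)
buf.seek(0)
print(float(np.load(buf).sum()))  # ~15.0, the head count survives
```

Storing the maps as .npy (or another float format) instead of images avoids the problem.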

Dataset Pre-Processing Code Needed

Dear Concern,

It would be a great help if you could provide the UCF_CC_50 pre-processing MATLAB code for generating the test and train data, as well as the UCSD pre-processing code. I tried to modify your ShanghaiTech code, but I think I am making a mess: of the 50 images in the UCF dataset, 40 are used for training and the rest for testing, yet those 40 are converted into only 360 input images, and there are only 10 test images. Please provide the MATLAB code for both datasets.

Thank you very much for your active responses.
Asif : [email protected]

if statement in create_training_set_shtech.m

lines 49 -61:

`

wn2 = w/8; hn2 = h/8;
wn2 =8 * floor(wn2/8);
hn2 =8 * floor(hn2/8);

annPoints =  image_info{1}.location;
if( w <= 2*wn2 )
    im = imresize(im,[ h,2*wn2+1]);
    annPoints(:,1) = annPoints(:,1)*2*wn2/w;
end
if( h <= 2*hn2)
    im = imresize(im,[2*hn2+1,w]);
    annPoints(:,2) = annPoints(:,2)*2*hn2/h;
end

`
Why is the condition w <= 2*wn2 (or h <= 2*hn2) used? Lines 49-51 imply w >= 8*wn2, so I am quite confused here.

crowd counting from CCTV cameras

Hi @svishwa,

Can you please suggest a method to perform real-time crowd counting from a CCTV camera using Single Image Crowd Counting via MCNN?

Can I read the CCTV frames, continuously write each one as a single image (one .jpg file) at a fixed time interval, and have the crowd-counting algorithm count the people in that image?

Thanks
Guru

epoch value

Hello author, why can't I find the number of epochs in the training file?

UCSD Result

I am getting MAE and MSE much greater than 1.07 and 1.35 respectively (the values given in Zhang et al.'s paper) for the UCSD dataset with your code. Please suggest what the problem could be. What results did you get on the UCSD dataset?

Has anyone run this successfully with Python 3 on Windows?

If someone has run it successfully with Python 3 and can share their experience, it would help newcomers very much.

My operating system is Windows 10. It looks like I can't install PyTorch for Python 2.7, which means I have to port the .py files from Python 2.x to 3.x, but there are many errors.

Thanks for any help!

Dataset required

Dear,
You previously helped me by sending the ShanghaiTech dataset. Now I need the WorldExpo'10 and UCSD datasets.

Let me explain a scenario with the UCF_CC_50 dataset. I have 50 images. I put the first 40 images into a train folder (as with ShanghaiTech) and the remaining 10 into a test folder, and finally got 359 formatted images to train my model. I want to know how the authors tested their method on UCF_CC_50.

Can you please reply to these 3 queries? Thanks in advance.
[email protected] is my gmail id

I got an AttributeError when running the code. Can anyone help me with it? Many thanks.

Traceback (most recent call last):
File "train.py", line 106, in <module>
density_map = net(im_data, gt_data)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/Users/yinweicheng/Desktop/job/summer-project/crowdcount-mcnn/src/crowd_count.py", line 17, in forward
im_data = network.np_to_variable(im_data, is_cuda=False, is_training=self.training)
AttributeError: module 'network' has no attribute 'np_to_variable'

missing license

Hi Vishwanath, look forward to playing around with your code a bit but noticed there's no license listed. Would you mind adding one, ideally MIT or Apache 2.0? Thanks!

What size of image should I use to get good performance

As far as I can see there are only conv layers and no fully connected layers, so we could feed an image of any size. I don't have the dataset you trained the model on, so my question is as in the title ;) What image sizes were used to train the model?

Dataset

Hello,
I was wondering if the ShanghaiTech crowd-counting dataset is publicly available?

Thanks

Do I need to pretrain?

I rewrote it in Caffe, but I cannot get results that good. With your code, the epoch-0 model already reaches an MSE of 445 at test time, while my Caffe epoch-0 model gives 11535. Did you use pretraining?

Regarding input normalization and loss profile during training.

Is any normalization of the input done during training, i.e. mean subtraction or centering of the data? I see that you have implemented a batch-normalization layer in the network, which helps remove the need for input normalization; however, I also saw that you keep it switched off during training.

Also, what was the loss profile when you trained the network? Was it very small, i.e. on the order of 10^-3 or so?

About the number of people

Hi, I am a new student of deep learning. We are going to build a headcount function; how can I get the headcount for each picture? Thank you very much for your reply.
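As the MCNN paper describes, the predicted head count for an image is the integral (sum) of the estimated density map. A toy illustration:

```python
import numpy as np

# Toy density map whose values integrate to 10 "people".
density_map = np.full((24, 24), 10.0 / (24 * 24))
count = float(density_map.sum())  # the predicted head count for the image
print(round(count))  # -> 10
```

So after running the network on an image, summing the output map (e.g. `density_map.sum()` on the NumPy array) gives the count.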

about density map

Hello author, I want to make a dataset myself. I take 200 images from a video, use labelImg to get .xml files, and then use a MATLAB program to extract the head coordinates into .mat files. When I use your program to generate density maps, some of them come out empty; what could be the problem? Also, the paper says the spread parameter σ is obtained by k-nearest neighbors, so why is it set to 4 in your program?

about image size

Hi @svishwa, in the paper the author mentions down-sampling each training image by 1/4 before generating its density map (in the Remarks), but I didn't see this part in your code.

Besides, what does this code in data_loader.py mean?

den = cv2.resize(den,(wd_1,ht_1))                
den = den * ((wd*ht)/(wd_1*ht_1))

Looking forward to your reply, thanks!!
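Those two quoted lines appear to rescale the resized density map so that its sum (the head count) is preserved: interpolation averages input pixels into each output pixel, shrinking the sum by roughly (wd_1*ht_1)/(wd*ht), and the multiplier undoes that. A NumPy sketch of the idea, with 2x2 block averaging standing in for cv2.resize's interpolation:

```python
import numpy as np

rng = np.random.default_rng(0)
wd, ht, wd_1, ht_1 = 8, 8, 4, 4
den = rng.random((ht, wd))  # stand-in density map

# 2x2 block averaging shrinks the map's sum by (wd_1*ht_1)/(wd*ht);
# the multiplier from data_loader.py restores the original sum.
small = den.reshape(ht_1, 2, wd_1, 2).mean(axis=(1, 3))
small = small * ((wd * ht) / (wd_1 * ht_1))
print(np.isclose(small.sum(), den.sum()))  # -> True, head count preserved
```

Without the correction factor, every downsampled ground-truth map would understate the crowd count by the resize ratio.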
