crowdcount-mcnn's Introduction

Single Image Crowd Counting via Multi Column Convolutional Neural Network

This is an unofficial implementation of CVPR 2016 paper "Single Image Crowd Counting via Multi Column Convolutional Neural Network"

Installation

  1. Install PyTorch
  2. Clone this repository
git clone https://github.com/svishwa/crowdcount-mcnn.git

We'll call the directory that you cloned crowdcount-mcnn ROOT

Data Setup

  1. Download ShanghaiTech Dataset from
    Dropbox: https://www.dropbox.com/s/fipgjqxl7uj8hd5/ShanghaiTech.zip?dl=0

    Baidu Disk: http://pan.baidu.com/s/1nuAYslz

  2. Create directory

mkdir ROOT/data/original/shanghaitech/

  3. Save "part_A_final" under ROOT/data/original/shanghaitech/

  4. Save "part_B_final" under ROOT/data/original/shanghaitech/

  5. cd ROOT/data_preparation/

    Run create_gt_test_set_shtech.m in MATLAB to create the ground-truth files for the test data

  6. cd ROOT/data_preparation/

    Run create_training_set_shtech.m in MATLAB to create the training and validation sets along with their ground-truth files

Test

  1. Follow steps 1,2,3,4 and 5 from Data Setup

  2. Download pre-trained model files:

    [Shanghai Tech A]

    [Shanghai Tech B]

    Save the model files under ROOT/final_models

  3. Run test.py

    a. Set save_output = True to save output density maps

    b. Errors are saved in output directory

Training

  1. Follow steps 1,2,3,4 and 6 from Data Setup
  2. Run train.py

Training with TensorBoard

With the aid of Crayon, we can access the visualisation power of TensorBoard for any deep learning framework.

To use TensorBoard, install Crayon (https://github.com/torrvision/crayon) and set use_tensorboard = True in ROOT/train.py.

Other notes

  1. During training, the best model is chosen by its error on the validation set. (It is not clear how the authors of the original implementation chose the best model.)

  2. 10% of the training set is set aside for validation. The validation set is chosen randomly.

  3. The ground-truth density maps are obtained using simple Gaussian kernels, unlike the adaptive method described in the paper.

  4. Following are the results on the ShanghaiTech A and B datasets:

             | Dataset |  MAE  |  MSE  |
             |---------|-------|-------|
             |    A    |  110  |  169  |
             |    B    |   25  |   44  |
    
  5. Also, please take a look at our new work on crowd counting using a cascaded CNN and high-level prior (https://github.com/svishwa/crowdcount-cascaded-mtl), which improves on the results of this work.
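The 10% random validation split described in note 2 can be sketched as follows; the file names, seed, and split logic here are illustrative, not the repository's actual code:

```python
import random

# Illustrative file list and seed -- not the repository's actual code.
files = [f"img_{i}.jpg" for i in range(100)]
random.seed(0)
random.shuffle(files)

n_val = len(files) // 10                 # hold out 10% for validation
val_files, train_files = files[:n_val], files[n_val:]
print(len(val_files), len(train_files))  # -> 10 90
```

Shuffling before slicing is what makes the held-out 10% a random sample rather than the first files in directory order.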

crowdcount-mcnn's People

Contributors

svishwa

crowdcount-mcnn's Issues

test.py : Error

/ML/Crowd-count/crowdcount-mcnn$ python3 test.py
Pre-loading the data. This may take a while...
Traceback (most recent call last):
File "test.py", line 41, in <module>
data_loader = ImageDataLoader(data_path, gt_path, shuffle=False, gt_downsample=True, pre_load=True)
File "/home/tikam/ML/Crowd-count/crowdcount-mcnn/src/data_loader.py", line 35, in __init__
img = cv2.resize(img,(wd_1,ht_1))
TypeError: integer argument expected, got float
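This looks like a Python 2 to 3 porting issue: under Python 3, `/` returns a float, while cv2.resize expects integer output dimensions. A minimal sketch of a possible fix (variable names follow the traceback above; the actual statement in data_loader.py may differ):

```python
# Hypothetical values standing in for the image dimensions in data_loader.py.
wd, ht = 1024, 768
wd_1, ht_1 = wd / 4, ht / 4        # floats under Python 3's "/" operator
wd_1, ht_1 = int(wd_1), int(ht_1)  # cast before calling cv2.resize
# img = cv2.resize(img, (wd_1, ht_1))  # now receives the ints it expects
print(wd_1, ht_1)  # -> 256 192
```

Alternatively, using the floor-division operator `//` in the original expression gives ints directly.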

annotation about shanghaitech dataset

Do you know what the annotation information in the ShanghaiTech dataset refers to?
I would like to know what kind of software can be used to produce such labels.

UCF_CC_50 data prepare

Dear svishwa, I don't know how to divide UCF_CC_50 into train and test sets. Could you please provide the UCF_CC_50 data-preparation code, if available?

Crowd Counting

Dear svishwa, I cannot find the ShanghaiTech images publicly. I also emailed the authors but have not yet received any links to the dataset. I have the UCF dataset. Can you please provide the UCF_CC_50 data-processing MATLAB code, if available?

Is your test model trained by yourself?

I'm confused about what wrong with the code, I just use your code to train my model, and select the model which has lowest MAE and MSE as test model, but when I test my mode, I gets higher MAE(195.83) and MSE(265.86) than you provide

Issues,test.py

When trying to run test.py I get the following message:
File "C:\Users\GCZX\Desktop\crowdcount-mcnn-master\src\data_loader.py", line 35, in __init__
img = cv2.resize(img,(wd_1,ht_1))

TypeError: integer argument expected, got float

explaining dataset .mat files

Good evening,
I don't seem to understand the structure and content of the .mat files in the dataset.
I believe they contain the ground truth (head annotations) for the images; I know that at some point the CNN needs to compute the error between the network output and this ground truth.
Printing one file gives:
{'image_info': array([[array([[(array([[ 855.32345978, 590.49587357], [ 965.5908524 , 472.79472415], [ 937.09478464, 400.93507502], ..., [ 42.5852337 , 359.87860699], [1017.48233659, 8.99748811], [1017.48233659, 23.31916643]]), array([[920]], dtype=uint16))]], dtype=[('location', 'O'), ('number', 'O')])]], dtype=object), '__version__': '1.0', '__header__': 'MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Fri Nov 18 20:06:05 2016', '__globals__': []}

Can someone explain what is in these files and how that content is used in crowd estimation?
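For readers with the same question: each file appears to hold a struct 'image_info' whose 'location' field is an (N, 2) array of head (x, y) pixel coordinates and whose 'number' field is the head count N; the density-map generation code consumes the locations, and the count is the training/evaluation target. A hedged sketch, assuming SciPy is available; it round-trips a toy struct of the same shape rather than a real ShanghaiTech file, whose extra cell nesting makes the indexing there slightly deeper:

```python
from io import BytesIO

import numpy as np
from scipy.io import loadmat, savemat  # assumption: SciPy is installed

# Toy annotation in the same spirit as the ShanghaiTech ground-truth files:
# 'location' holds the (x, y) pixel coordinates of every annotated head,
# 'number' holds the head count for the image.
pts = np.array([[855.3, 590.5], [965.6, 472.8], [937.1, 400.9]])
buf = BytesIO()
savemat(buf, {"image_info": {"location": pts, "number": len(pts)}})
buf.seek(0)

mat = loadmat(buf)
location = mat["image_info"]["location"][0, 0]          # (N, 2) coordinates
number = int(mat["image_info"]["number"][0, 0].item())  # head count N
print(location.shape, number)  # -> (3, 2) 3
```

loadmat wraps MATLAB structs in (1, 1) object arrays, hence the `[0, 0]` indexing; `squeeze_me=True` is another way to strip those wrappers.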

test script

Hi svishwa,
Is there any script that takes an input image and returns the estimated density map or count?

Is this model accurate on sparse images?

About the creating_training_set_shtech.m

if( w <= 2*wn2 )
    im = imresize(im,[h,2*wn2+1]);
    annPoints(:,1) = annPoints(:,1)*2*wn2/w;
end
if( h <= 2*hn2)
    im = imresize(im,[2*hn2+1,w]);
    annPoints(:,2) = annPoints(:,2)*2*hn2/h;
end

Can you tell me what this snippet does?
Looking at the complete code, I think it has no effect when running.

train own dataset

Hello, I want to train on my own dataset, but I don't know how to generate the ground truth. Looking forward to your help. Thanks!

relu

Hi svishwa,
I wonder why there are no ReLU layers in your model? (The authors of "Single Image Crowd Counting via Multi Column Convolutional Neural Network" state that ReLU is adopted in their network.)

Where is the parameter self.is_training?

Thank you for providing the code; it works perfectly.
But I noticed a problem when checking crowd_count.py: it uses a parameter named self.training that is never defined, yet this causes no error. It really puzzles me.

Can the gt_density be saved as a .jpg file?

Hello, I saved the gt_density data as .jpg files, converted them to LMDB, and then trained in Caffe, but the loss is huge, like:
Iteration 9000 (5.64897 iter/s, 177.024s/1000 iters), loss = 3.87866e+06

How can I solve this problem? Or did I do something wrong?
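A likely cause, offered as an educated guess rather than a confirmed diagnosis: density-map values are tiny fractions that sum to the head count, so quantising them to 8-bit .jpg pixels destroys the training target before JPEG's lossy compression even starts. A NumPy sketch of the problem and a lossless alternative:

```python
from io import BytesIO

import numpy as np

# Toy density map for 15 people: each of the 32*32 pixels holds 15/1024.
den = np.full((32, 32), 15.0 / (32 * 32), dtype=np.float32)

# Saving as .jpg quantises values to 8-bit integers; every tiny density
# value truncates to zero and the target map becomes all zeros.
as_uint8 = den.astype(np.uint8)
print(as_uint8.sum())             # -> 0, the training target is destroyed

# Lossless alternative: keep the raw floats (here via an in-memory buffer).
buf = BytesIO()
np.save(buf, den)
buf.seek(0)
print(float(np.load(buf).sum()))  # ~15.0, the head count survives
```

Storing the maps as .npy (or another float format) instead of images avoids the problem.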

Dataset Pre-Processing Code Needed

Dear Concern,

It would be a great help if you could provide the UCF_CC_50 pre-processing MATLAB code for generating the test and train data, as well as the UCSD pre-processing code. I tried to modify your ShanghaiTech code, but I think I am making a mess: of the 50 images in the UCF dataset, 40 are used for training and the rest for testing, yet those 40 are converted into only 360 input images, and there are only 10 test images. Please provide the MATLAB code for both datasets.

Thank you very much for your active responses.
Asif : [email protected]

if statement in create_training_set_shtech.m

lines 49 -61:

`

wn2 = w/8; hn2 = h/8;
wn2 =8 * floor(wn2/8);
hn2 =8 * floor(hn2/8);

annPoints =  image_info{1}.location;
if( w <= 2*wn2 )
    im = imresize(im,[ h,2*wn2+1]);
    annPoints(:,1) = annPoints(:,1)*2*wn2/w;
end
if( h <= 2*hn2)
    im = imresize(im,[2*hn2+1,w]);
    annPoints(:,2) = annPoints(:,2)*2*hn2/h;
end

`
Why is the condition w <= 2*wn2 (or h <= 2*hn2) used? Lines 49-51 imply w >= 8*wn2, so I am quite confused here.

crowd counting from CCTV cameras

Hi @svishwa,

Can you please suggest a method to perform real-time crowd counting from a CCTV camera using Single Image Crowd Counting via MCNN?

Can I read the CCTV frames, continuously write each one as a single image (one .jpg file) at a fixed time interval, and have the crowd-counting algorithm count the people in that image?

Thanks
Guru

epoch value

Hello author, why can't I find the number of epochs in the training file?

UCSD Result

I am getting MAE and MSE much greater than 1.07 and 1.35 respectively (the values given in Zhang et al.'s paper) for the UCSD dataset with your code. Please suggest what the problem could be. What results did you get on the UCSD dataset?

Has anyone run this successfully with Python 3 on Windows?

If someone has run it successfully with Python 3 and can share their experience, it would help newcomers very much.

My operating system is Windows 10. It looks like I can't install PyTorch for Python 2.7, which means I have to port the .py files from Python 2.x to 3.x, but there are many errors.

Thanks for any help!

Dataset required

Dear,
You previously helped me by sending the ShanghaiTech dataset. Now I need the WorldExpo'10 and UCSD datasets.

Let me explain a scenario with the UCF_CC_50 dataset. I have 50 images. I put the first 40 images into a train folder (as with ShanghaiTech) and the remaining 10 into a test folder, and finally got 359 formatted images to train my model. I want to know how the authors tested their method on UCF_CC_50.

Can you please reply to these 3 queries? Thanks in advance.
[email protected] is my gmail id

I got an AttributeError when running the code. Can anyone help me with it? Many thanks.

Traceback (most recent call last):
File "train.py", line 106, in <module>
density_map = net(im_data, gt_data)
File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/Users/yinweicheng/Desktop/job/summer-project/crowdcount-mcnn/src/crowd_count.py", line 17, in forward
im_data = network.np_to_variable(im_data, is_cuda=False, is_training=self.training)
AttributeError: module 'network' has no attribute 'np_to_variable'

missing license

Hi Vishwanath, look forward to playing around with your code a bit but noticed there's no license listed. Would you mind adding one, ideally MIT or Apache 2.0? Thanks!

What size of image should I use to get good performance

As far as I can see there are only conv layers and no fully connected layers, so we could feed an image of any size. I don't have the dataset you trained the model on, so my question is as in the title ;) What image sizes were used to train the model?

Dataset

Hello,
I was wondering if the ShanghaiTech crowd-counting dataset is publicly available?

Thanks

Do I need to pretrain?

I rewrote it in Caffe, but I cannot get results that good. With your code, the epoch-0 model already reaches an MSE of 445 at test time, while my Caffe epoch-0 model gives 11535. Did you use pretraining?

Regarding input normalization and loss profile during training.

Is any normalization of the input done during training, i.e. mean subtraction or centering of the data? I see that you have implemented a batch-normalization layer in the network, which helps remove the need for input normalization; however, I also saw that you keep it switched off during training.

Also, what was the loss profile when you trained the network? Was it very small, i.e. on the order of 10^-3 or so?

About the number of people

Hi, I am a new student of deep learning. We are going to build a headcount function; how can I get the headcount for each picture? Thank you very much for your reply.
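As the MCNN paper describes, the predicted head count for an image is the integral (sum) of the estimated density map. A toy illustration:

```python
import numpy as np

# Toy density map whose values integrate to 10 "people".
density_map = np.full((24, 24), 10.0 / (24 * 24))
count = float(density_map.sum())  # the predicted head count for the image
print(round(count))  # -> 10
```

So after running the network on an image, summing the output map (e.g. `density_map.sum()` on the NumPy array) gives the count.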

about density map

Hello author, I want to make a dataset myself. I take 200 images from a video, use labelImg to get .xml files, and then use a MATLAB program to extract the head coordinates into .mat files. When I use your program to generate density maps, some of them come out empty; what could be the problem? Also, the paper says the spread parameter σ is obtained by k-nearest neighbors, so why is it set to 4 in your program?

about image size

Hi @svishwa, in the paper the author mentions down-sampling each training image by 1/4 before generating its density map (in the Remarks), but I didn't see this part in your code.

Besides, what does this code in data_loader.py mean?

den = cv2.resize(den,(wd_1,ht_1))                
den = den * ((wd*ht)/(wd_1*ht_1))

Looking forward to your reply, thanks!!
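Those two quoted lines appear to rescale the resized density map so that its sum (the head count) is preserved: interpolation averages input pixels into each output pixel, shrinking the sum by roughly (wd_1*ht_1)/(wd*ht), and the multiplier undoes that. A NumPy sketch of the idea, with 2x2 block averaging standing in for cv2.resize's interpolation:

```python
import numpy as np

rng = np.random.default_rng(0)
wd, ht, wd_1, ht_1 = 8, 8, 4, 4
den = rng.random((ht, wd))  # stand-in density map

# 2x2 block averaging shrinks the map's sum by (wd_1*ht_1)/(wd*ht);
# the multiplier from data_loader.py restores the original sum.
small = den.reshape(ht_1, 2, wd_1, 2).mean(axis=(1, 3))
small = small * ((wd * ht) / (wd_1 * ht_1))
print(np.isclose(small.sum(), den.sum()))  # -> True, head count preserved
```

Without the correction factor, every downsampled ground-truth map would understate the crowd count by the resize ratio.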
