zhengthomastang / 2018aicity_teamuw

The winning method in Track 1 and Track 3 at the 2nd AI City Challenge Workshop in CVPR 2018 - Official Implementation

Home Page: http://openaccess.thecvf.com/content_cvpr_2018_workshops/w3/html/Tang_Single-Camera_and_Inter-Camera_CVPR_2018_paper.html


2018aicity_teamuw's Introduction

Single-Camera and Inter-Camera Vehicle Tracking and 3D Speed Estimation Based on Fusion of Visual and Semantic Features (Winner of Track 1 and Track 3 at the 2nd AI City Challenge Workshop in CVPR 2018)

This repository contains our source code for Track 1 and Track 3 at the 2nd AI City Challenge Workshop in CVPR 2018. Our team won both tracks of the challenge.

The source code of Track 1 is built in MATLAB and C++, with our trained YOLOv2 model provided.

The source code of Track 3 is developed in Python and C++, with our trained YOLOv2 model provided.

The code has been tested on Linux and Windows. Dependencies include CUDA, cuDNN and OpenCV.

The team members include Zheng (Thomas) Tang, Gaoang Wang, Hao (Alex) Xiao, and Aotian Zheng.

[Paper], [Slides], [Poster], [The 2nd AI City Challenge @ CVPR 2018]

Important Notice

The datasets for the 2nd AI City Challenge in CVPR 2018 are no longer available to the public. However, with the launch of the 3rd AI City Challenge Workshop at CVPR 2019, a new city-scale dataset was provided for multi-camera vehicle tracking as well as image-based re-identification, along with a new dataset for traffic anomaly detection. The scale of these datasets and the number of vehicles used for evaluation are both unprecedented.

To access the new datasets, please follow the data access instructions at the AI City Challenge website. You may forward your inquiries to [email protected].

Introduction

NVIDIA AI City Challenge Workshop at CVPR 2018

The NVIDIA AI City Challenge Workshop at CVPR 2018 specifically focused on intelligent transportation system (ITS) problems such as

  1. Estimating traffic flow characteristics, such as speed
  2. Leveraging unsupervised approaches to detect anomalies caused by crashes, stalled vehicles, etc.
  3. Multi-camera tracking and object re-identification in urban environments

Our team participated in 2 out of 3 tracks:

  1. Track 1 (Traffic Flow Analysis) - Participating teams submit individual vehicle speeds for a test set containing 27 one-minute videos. Performance is evaluated against ground truth generated by a fleet of control vehicles driven during the recording. Evaluation for Challenge Track 1 is based on the detection rate of the control vehicles and the root mean square error of the predicted control-vehicle speeds (a minimal RMSE sketch follows this list).
  2. Track 3 (Multi-camera Vehicle Detection and Reidentification) - Participating teams identify all vehicles that pass through all 4 different locations at least once in a set of 15 videos. Evaluation for Challenge Track 3 is based on detection accuracy and localization sensitivity for a set of ground-truth vehicles that were driven through all camera locations at least once.
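As a reading aid, the speed-error term of the Track 1 metric is a plain root mean square error over the control vehicles; below is a minimal sketch (the function name and containers are ours, not challenge code):

#include <cmath>
#include <cstddef>
#include <vector>

// Root mean square error between predicted and ground-truth control-vehicle
// speeds; assumes both vectors are non-empty and aligned by vehicle.
double speedRmse(const std::vector<double>& vdblPredSpd,
                 const std::vector<double>& vdblGtSpd)
{
    double dblSumSq = 0.0;
    for (std::size_t i = 0; i < vdblPredSpd.size(); i++)
    {
        double dblErr = vdblPredSpd[i] - vdblGtSpd[i];
        dblSumSq += dblErr * dblErr;
    }
    return std::sqrt(dblSumSq / vdblPredSpd.size());
}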

Detailed information on this challenge can be found here.

Our team achieved rank #1 in both Track 1 and Track 3. The demo video for Track 1 can be viewed here. The demo video for Track 3 can be viewed here.

Single-camera Tracking (SCT)

In SCT, the loss function of our data association algorithm combines motion, temporal, and appearance attributes. In particular, a histogram-based adaptive appearance model is designed to encode long-term appearance change. The change of loss is incorporated into a bottom-up clustering strategy for the association of tracklets. Robust 2D-to-3D projection is achieved by applying EDA optimization to camera calibration for speed estimation.
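As a rough illustration only (the actual loss is defined in the paper; the names, features, and weights below are hypothetical), the fused association loss has the following shape:

#include <cmath>

// Hedged sketch of a tracklet association loss combining motion, temporal
// and appearance terms; all names, features and weights are hypothetical.
struct TrackletEnd
{
    double dblVelX, dblVelY;  // velocity at the tracklet boundary
    double dblFrm;            // frame index at the tracklet boundary
};

double assocLoss(const TrackletEnd& oTail, const TrackletEnd& oHead,
                 double dblAppDist,  // adaptive appearance-model distance
                 double dblWgtMot, double dblWgtTmp, double dblWgtApp)
{
    // motion term: velocity mismatch across the join
    double dblMot = std::hypot(oTail.dblVelX - oHead.dblVelX,
                               oTail.dblVelY - oHead.dblVelY);
    // temporal term: penalize long gaps between the two tracklets
    double dblTmp = std::fabs(oHead.dblFrm - oTail.dblFrm);
    return (dblWgtMot * dblMot) + (dblWgtTmp * dblTmp) + (dblWgtApp * dblAppDist);
}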

Inter-camera Tracking (ICT)

The proposed appearance model, together with DCNN features, license plates, detected car types, and traveling-time information, is combined to compute the cost function in ICT.
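To make the fusion concrete, here is a hedged sketch (the cue names, weights, and gating thresholds are hypothetical, not the released implementation):

// Hedged sketch of an inter-camera matching cost fusing the cues listed
// above; all names, weights and thresholds are hypothetical.
struct IctCue
{
    double dblAppDist;    // histogram-based adaptive appearance-model distance
    double dblDcnnDist;   // distance between DCNN feature embeddings
    double dblPlateDist;  // license-plate comparison distance (cf. 3_LP_COMP)
    bool   bSameCarType;  // detected car types agree
    double dblTrvlTm;     // traveling time between the two cameras (seconds)
};

double ictCost(const IctCue& oCue, double dblTrvlTmMin, double dblTrvlTmMax)
{
    // gate out physically implausible matches first
    if (!oCue.bSameCarType || oCue.dblTrvlTm < dblTrvlTmMin || oCue.dblTrvlTm > dblTrvlTmMax)
        return 1e9;  // effectively forbid the association
    return (0.5 * oCue.dblAppDist) + (0.3 * oCue.dblDcnnDist) + (0.2 * oCue.dblPlateDist);
}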

Code Structure

Track 1

Under the Track1 folder, there are 6 components:

  1. 1_VDO2IMG: Converting each video file to a folder of frame images (a minimal OpenCV sketch follows this list)
  2. 2_CAM_CAL: Semi-automatic camera calibration based on minimization of reprojection error by EDA optimization
    If you have access to GPS coordinates (from Google Maps or other tools), we suggest using our newly developed PnP-based calibration tool here instead.
  3. 3_YOLO_VEH: Extension of the YOLOv2 object detector with our trained model for vehicle detection/classification provided
    We strongly encourage users to use the latest YOLOv4 object detector instead.
  4. 4_TC_tracker: Proposed tracklet-clustering-based tracking method
    Note that this SCT method has been upgraded to TrackletNet Tracker (TNT). The corresponding paper on arXiv is here. The source code (training + testing) is provided here.
  5. 5_APP_MDL (optional): Extraction of histogram-based adaptive appearance models and their comparison
  6. 6_SPD_EST: Speed estimation based on input of tracking results and camera parameters
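For orientation, the first component (1_VDO2IMG) boils down to a few lines of OpenCV; here is a minimal sketch (the paths are placeholders, and the %06d naming follows the convention used elsewhere in this repository):

#include <cstdio>
#include <opencv2/opencv.hpp>

// Minimal sketch of what 1_VDO2IMG does: dump every frame of a video into a
// folder of numbered images.
int main()
{
    cv::VideoCapture oVdo("Loc1_1.mp4");  // placeholder input path
    if (!oVdo.isOpened()) return -1;

    cv::Mat oImgFrm;
    char acOutFrmPth[256];
    for (int f = 0; oVdo.read(oImgFrm); f++)
    {
        std::sprintf(acOutFrmPth, "img1/%06d.jpg", (f + 1));  // frames start at 1
        cv::imwrite(acOutFrmPth, oImgFrm);
    }
    return 0;
}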

Detailed description of each package is given in each subfolder.

Track 3

Under the Track3 folder, there are 3 components:

  1. 1_Multi-Camera Vehicle Tracking and Re-identification: Multi-camera vehicle tracking based on a fusion of histogram-based adaptive appearance models, DCNN features, detected car types and traveling time information
  2. 2_YOLO_LP: Detection of license plates from each cropped vehicle image based on YOLOv2, with our trained model provided
    We strongly encourage users to use the latest YOLOv4 object detector instead.
  3. 3_LP_COMP: Comparison of license plates under low resolution

Detailed description of each package is given in each subfolder.

The output of 1_Multi-Camera Vehicle Tracking and Re-identification is the similarity score between each pair of vehicles under comparison. It can be converted into a distance score by inverse proportion. The output of 3_LP_COMP is the distance score between each pair of license plates. The final distance score between two vehicles is the product of these two distance scores.
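In code form, this fusion takes only a couple of lines; a hedged sketch (the function names are ours, and the epsilon guard is an arbitrary choice):

// Convert a similarity score into a distance score by inverse proportion.
double simToDist(double dblSim)
{
    const double dblEps = 1e-9;  // guard against division by zero
    return 1.0 / (dblSim + dblEps);
}

// Final distance between two vehicles: the product of the appearance-based
// distance (similarity from 1_Multi-Camera Vehicle Tracking and
// Re-identification, inverted) and the license-plate distance (3_LP_COMP).
double finalDist(double dblAppSim, double dblLpDist)
{
    return simToDist(dblAppSim) * dblLpDist;
}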

References

Please consider citing these papers in your publications if they help your research:

@inproceedings{Tang18AIC,
  author = {Zheng Tang and Gaoang Wang and Hao Xiao and Aotian Zheng and Jenq-Neng Hwang},
  title = {Single-camera and inter-camera vehicle tracking and {3D} speed estimation based on fusion of visual and semantic features},
  booktitle = {Proc. CVPR Workshops},
  pages = {108--115}, 
  year = {2018}
}

@misc{Tang17AIC,
  author = {Zheng Tang and Gaoang Wang and Tao Liu and Young-Gun Lee and Adwin Jahn and Xu Liu and Xiaodong He and Jenq-Neng Hwang},
  title = {Multiple-kernel based vehicle tracking using {3D} deformable model and camera self-calibration},
  howpublished = {arXiv:1708.06831},
  year = {2017}
}

Disclaimer

For any questions, you can contact Zheng (Thomas) Tang.

2018aicity_teamuw's People

Contributors

alexxiao95, zhengthomastang


2018aicity_teamuw's Issues

Back projection matrix construction

Hello.

I'm asking how you construct the back-projection matrix. Do you extract the homography and then use it, and if so, why do you use x and y inside the first matrix oMatA? I'm a little confused about this part, and I would be pleased if you could share the reference for this mathematical equation.
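For context, the standard ground-plane back-projection I am comparing against is the following (a sketch, not code from this repository):

#include <opencv2/core.hpp>

// For P = K [r1 r2 r3 t] and points on the ground plane z = 0, only columns
// 1, 2 and 4 of P matter, so H = [p1 p2 p4] is a homography from the ground
// plane to the image, and a pixel back-projects via its inverse.
cv::Point2f backProjToGround(const cv::Matx34d& P, const cv::Point2f& oImgPt)
{
    cv::Matx33d H(P(0, 0), P(0, 1), P(0, 3),
                  P(1, 0), P(1, 1), P(1, 3),
                  P(2, 0), P(2, 1), P(2, 3));
    cv::Vec3d vec3dWrld = H.inv() * cv::Vec3d(oImgPt.x, oImgPt.y, 1.0);
    return cv::Point2f(vec3dWrld[0] / vec3dWrld[2],  // (x, y) on z = 0
                       vec3dWrld[1] / vec3dWrld[2]);
}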

Thanks in advance

Tracklet clustering

I am trying to understand the tracklet clustering algorithm (section 3.5) of your CVPR paper. I have two questions to start with:


For the sake of simplicity, assume that I am just considering the smoothness loss for figuring out whether two tracklets in their individual sets can be merged.

  1. Is it right to say that if each tracklet is its own set (let's say that is where I start my optimization), the loss of each set will be zero? I am trying to see the value of the 2nd term in Equation 18 of the paper.

  2. Now if I try to merge them, I may get a minor increase in the loss of the union set (1st term of Equation 18). So the overall loss change will be positive, which encourages not performing any merging operation.

Can you help me with understanding the optimization scheme?

Thanks in advance!

Arguments 3_YOLO_VEH in file track1.sh

./darknet procimgflr
cfg/aicity.data
cfg/yolo-voc.cfg
yolo-voc_final.weights
/home/ipl_gpu/Thomas/aicity18/Track1/Loc1_1/img1/ - video converted to pictures?
/home/ipl_gpu/Thomas/aicity18/Track1/Loc1_1/detimg1/ - ?????
/home/ipl_gpu/Thomas/aicity18/Track1/Loc1_1/det.txt - path for outputting recognition results to a file?
.1
.5
0
1799 - number of pictures in the folder?
After recognizing the pictures, an empty file turned out. I do not know what the path /detimg1/ is and what should be there; it is not described.

Track1/6_SPD_EST nullptr issue

Hi,

I am trying to use the SPD_EST_IPL tool with Visual Studio (Windows 10), but I keep getting the following error for line 24 of main.cpp (frame count):

Exception thrown: read access violation.
this was nullptr.

I input the tracking and camera parameters in the specified format and reference the frame images directory.
Any idea how to deal with this issue?

Thanks,
Dan

Confidence and class lost in Track1/4_TC_tracker output

Hi.

First of all, thanks for making this code public. Now, I'm trying to use the tracker to get input for running speed estimation on a specific dataset, but after I run the code, the outputs seem to have the value "-1" in the confidence field and no value where the class should be, even though the inputs had valid values.

Here is an example of the input:
000000,-1,703,370,29,21,41.0,-1,-1,-1,car
000000,-1,692,417,46,38,67.0,-1,-1,-1,car

And here is the output for those entries:
1,1,636,364,25,20,-1,-1,-1,-1
1,2,701,369,32,21,-1,-1,-1,-1

Do you have any idea why this is happening?
Thanks in advance.

Code in C or C++

Hi
Has anybody implemented the tracking part of the algorithm in C or C++?

How to evaluate the accuracy of tracking?

I have tried performing cell tracking.
In some cases, the tracker failed (wrongly assigned object IDs).
It also sometimes creates new IDs for reappearing objects.
What I need to know is how to evaluate the accuracy of tracking, so that we can compare this algorithm with other algorithms.

Any help would be appreciated.

How to run the yolo detection with the model provided?

Hi, I want to know how I can run YOLO detection with the model you provided. I don't understand the given example in your bash file track1.sh:
./darknet procimgflr cfg/aicity.data cfg/yolo-voc.cfg yolo-voc_final.weights /home/ipl_gpu/Thomas/aicity18/Track1/Loc1_1/img1/ /home/ipl_gpu/Thomas/aicity18/Track1/Loc1_1/detimg1/ /home/ipl_gpu/Thomas/aicity18/Track1/Loc1_1/det.txt .1 .5 0 1799

What are the images in '/home/ipl_gpu/Thomas/aicity18/Track1/Loc1_1/detimg1/', and what are they used for? Is /Loc1_1/det.txt the file that saves the results (in the MOTChallenge format)? And what is the 'procimgflr' option? When I try to run this command, my computer gives me an error (Not an option: procimgflr). Sorry, I'm really new to this; right now I just want to use your model to successfully detect and get the MOT-format output. Your help would mean a lot to me, thanks :)


Manually labeling - how to input?

Hi!
We are trying to reproduce/reuse the scripts. However, we get stuck at the manual labelling. In the darknet aicity.data, a reference is made to:
train = /home/ipl_gpu/Aotian/darknet2/data/aicity/train.txt
valid = /home/ipl_gpu/Aotian/darknet2/data/aicity/validation.txt

Are these the class (sedan, etc.) labels? How should we format those files?
Thanks

Track1/1_YOLO_VEH issues

In the second component of Track 1, YOLO_VEH_IPL, the function void get_region_boxes() is slightly different from the original definition in darknet, while its implementation cannot be found in this repository.

In the header file, the function get_region_boxes is declared as follows:
void get_region_boxes(layer l, int w, int h, int netw, int neth, float thresh, float **probs, box *boxes, float **masks, int only_objectness, int *map, float tree_thresh, int relative);
The original function get_region_boxes somehow cannot be found in https://github.com/pjreddie/darknet either. However, I found an implementation in the repository https://github.com/hgpvision/darknet, and the definition of that get_region_boxes function is slightly different from yours:
void get_region_boxes(layer l, int w, int h, int netw, int neth, float thresh, float **probs, box *boxes, int only_objectness, int *map, float tree_thresh, int relative):
The only difference is the float **masks parameter.

Considering that your other two customised functions, draw_detections and output_detections, both use float **masks, I wonder whether the customised function get_region_boxes is missing from this repository?

Thanks a lot!

Track1/2_CAM_CAL problem

For some images (resolution: 4096x2160), after choosing 8 points in the image, the vanishing points are given as right: (60970, 1488) and left: (2142, 139).
The next step is "calcStGrdPt", but I find that after 10 minutes the "calcStGrdPt" function is still running.
Sorry, I can't upload my own images.

Height of vehicles effect on real distance between 2 frames

Hi, thank you for the tutorial.
I have a problem estimating the real distance between 2 objects (the same object in 2 consecutive frames). The height of the vehicle (from the ground) can influence the estimate; how can I handle it?
In step 2, you use camera calibration and then perspective projection; however, these assume that all objects are on the ground and flat.

I really appreciate your guidance

FPS

What was the average FPS you were able to get for the detection and tracking? Also, what was the hardware you got this result on?

SPD_EST_IPL and APP_MDL_IPL compile error

My environment:
Hardware: NVIDIA Jetson TX1
OS: Ubuntu 16.04 for Tegra
OpenCV: 2.4.13 (OpenCV4Tegra)
CUDA: 8.0

When I compile SPD_EST_IPL and APP_MDL_IPL, I get the following errors
(the error report is too long; I have only pasted part of it):

ubuntu@tegra-ubuntu:~/2018AICity_TeamUW-master_1/Track1/SPD_EST_IPL/SPD_EST_IPL/src$ g++ main.cpp 
main.cpp: In member function ‘void CTrkNd::setDetCls(char*)’:
main.cpp:37:75: warning: format not a string literal and no format arguments [-Wformat-security]
  inline void setDetCls(char* acDetCls) { std::sprintf(m_acDetCls, acDetCls); }
                                                                           ^
main.cpp: In function ‘cv::Point3f bkproj2d23d(cv::Point2f, float*, int)’:
main.cpp:123:76: error: no match for ‘operator/’ (operand types are ‘cv::Point3f {aka cv::Point3_<float>}’ and ‘int’)
  o3dPt = cv::Point3f(oMatM.at<double>(0, 0), oMatM.at<double>(1, 0), 0.0f) / nLenUnit;
                                                                            ^
ubuntu@tegra-ubuntu:~/2018AICity_TeamUW-master_1/Track1/APP_MDL_IPL/src$ g++ main.cpp 
...
main.cpp:487:2: error: ‘Rect2f’ is not a member of ‘cv’
  cv::Rect2f oBBoxf;
  ^
main.cpp:533:174: error: no matching function for call to ‘cv::RotatedRect::RotatedRect(cv::Point, cv::Point, cv::Point)’
 cv::Point((APP_MDL_NORM_SZ.width - 1), (APP_MDL_NORM_SZ.height - 1))), cv::Scal
                                                                     ^
...

Maybe I should update my OpenCV to 3.1?
I know little about C++, so if it's a stupid question, please forgive me.

Complete error report: tmp_error.txt

Some questions about 2_CAM_CAL

I'm sorry if this bothers you. When I started working on camera calibration (2_CAM_CAL), I ran into a problem.

  • In the window of "selector of vanishing lines", I select a pair of vanishing lines first and then another pair, like in this picture.

Then I got calVr and calVl: Vanishing point (right): (1093, 38); Vanishing point (left): (-13993, 10).

  • But the 3dgrid.jpg that I got (attached) does not look right. Where am I going wrong, and how can I improve it?

Thanks

Core dumped

Core dumped when saving results to disk. Execution does not enter the branch if ((f >= vvoTrkNd[i][0].getFrmCnt()) && (f <= vvoTrkNd[i][nTrajLen - 1].getFrmCnt())).

        for (int i = 0; i < vvoTrkNd.size(); i++)
        {
            nTrajLen = vvoTrkNd[i].size();
            // Core dumped!!! not go in IF
            if ((f >= vvoTrkNd[i][0].getFrmCnt()) && (f <= vvoTrkNd[i][nTrajLen - 1].getFrmCnt()))
            {
                for (int j = 0; j < nTrajLen; j++)
                {
                    if (f == vvoTrkNd[i][j].getFrmCnt())
                    {
                        cout << "output tracking results in NVIDIA AI City Challenge format" << endl;
                        std::fprintf(pfOutTrkTxt, "%d %d %d %d %d %d %d %.3f %.5f\n",
                            viVdo[v], (f + 1), -1, vvoTrkNd[i][j].getBBox().x, vvoTrkNd[i][j].getBBox().y,
                            (vvoTrkNd[i][j].getBBox().x + vvoTrkNd[i][j].getBBox().width - 1),
                            (vvoTrkNd[i][j].getBBox().y + vvoTrkNd[i][j].getBBox().height - 1),
                            vvoTrkNd[i][j].getSpd(), (vvoTrkNd[i][j].getDetScr() / 100));

                        // output submission results
                        std::fprintf(pfOutSubmTxt, "%d %d %d %d %d %d %d %.3f %.5f\n",
                            viVdo[v], (f + 1), -1, vvoTrkNd[i][j].getBBox().x, vvoTrkNd[i][j].getBBox().y,
                            (vvoTrkNd[i][j].getBBox().x + vvoTrkNd[i][j].getBBox().width - 1),
                            (vvoTrkNd[i][j].getBBox().y + vvoTrkNd[i][j].getBBox().height - 1),
                            vvoTrkNd[i][j].getSpd(), (vvoTrkNd[i][j].getDetScr() / 100));

                        if (bOutTrk3dImgFlg || bOutVdoFlg)
                        {
                            // plot bounding box
                            cv::rectangle(oImgFrm, vvoTrkNd[i][j].getBBox(), voBBoxClr[i % voBBoxClr.size()], 2);
                            // plot vehicle ID
                            std::sprintf(acId, "%d", (i + 1));
                            cv::putText(oImgFrm, acId, vvoTrkNd[i][j].get2dFtPt(), cv::FONT_HERSHEY_SIMPLEX, 1, voBBoxClr[i % voBBoxClr.size()], 2);
                            // plot speed
                            std::sprintf(acSpd, "%.3f", vvoTrkNd[i][j].getSpd());
                            cv::putText(oImgFrm, acSpd, cv::Point(vvoTrkNd[i][j].getBBox().x, (vvoTrkNd[i][j].getBBox().y - 20)),
                                cv::FONT_HERSHEY_SIMPLEX, 1, voBBoxClr[i % voBBoxClr.size()], 2);
                            // plot past trajectory
                            nPltTrajLen = std::min(nPltTrajLenMax, (j + 1));
                            for (int k = j; k > (j - nPltTrajLen + 1); k--)
                                cv::line(oImgFrm, vvoTrkNd[i][k].get2dFtPt(), vvoTrkNd[i][k - 1].get2dFtPt(), voBBoxClr[i % voBBoxClr.size()], 2);
                        }

                        break;
                    }
                }

                // plot video ID
                if (bOutVdoFlg)
                    cv::putText(oImgFrm, vstrCam[v].c_str(), cv::Point(100, 100),
                        cv::FONT_HERSHEY_SIMPLEX, 2, cv::Scalar(255, 255, 255), 2);
            }
        }

        // output plotted frames
        if (bOutTrk3dImgFlg)
        {
            std::sprintf(acOutFrmNm, "%06d.jpeg", (f + 1));
            std::strcpy(acOutFrmPth, acOutTrk3dImgFlrPth);
            std::strcat(acOutFrmPth, acOutFrmNm);
            cv::imwrite(acOutFrmPth, oImgFrm);
        }

        // output video
        if (bOutVdoFlg)
            oVdoWrt.write(oImgFrm);
    }

    std::fclose(pfOutTrkTxt);
}

std::fclose(pfOutSubmTxt);

cv::namedWindow("empty");
cv::waitKey(0);

CUDA installation

Is it necessary to have CUDA installed to run YOLO? Unfortunately, my GPU does not support CUDA!

dataset

Do you have the NVIDIA AI City dataset? I found that the download on the official website requires a password.

Track1/6_SPD_EST issues

Thank you very much for sharing. I want to ask: is the speed estimation implemented in Python?

Coordinate Systems Ambiguity

Hello,

I'm a little bit confused about the coordinate systems [camera and world] and your assumptions:

  • originally CCS parallel with WCS
  • translate upwards by t
  • rotate yaw(pan) degrees around Y axis
  • rotate pitch(tilt) degrees around X axis
  • rotate roll degrees around Z axis

I chose X-Y to be the ground plane with the Z-axis pointing upward [right-handed rule], then I started to apply your assumptions and ended up with the CCS Z-axis pointing upward, which means that the camera is looking at the sky.

You can see the image below for the steps I took when applying your assumptions [right-hand rule rotation].
[Image: coordinate frames]

I think that the Z-axis should point to the right [swapped with the X-axis] for this to be reasonable.

Could you advise whether I'm working with the correct approach or not?
Changing the coordinate frames' axes will make [roll, pitch and yaw] mixed up.
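For concreteness, this is the composition I applied, as a minimal sketch (the helper names are mine; right-handed rotations, angles in radians, about the fixed world axes in the listed order):

#include <cmath>
#include <opencv2/core.hpp>

// Rotation helpers I used when applying the listed steps (names are mine).
cv::Matx33d rotY(double a)  // yaw / pan
{
    return cv::Matx33d( std::cos(a), 0.0, std::sin(a),
                        0.0,         1.0, 0.0,
                       -std::sin(a), 0.0, std::cos(a));
}
cv::Matx33d rotX(double a)  // pitch / tilt
{
    return cv::Matx33d(1.0, 0.0,          0.0,
                       0.0, std::cos(a), -std::sin(a),
                       0.0, std::sin(a),  std::cos(a));
}
cv::Matx33d rotZ(double a)  // roll
{
    return cv::Matx33d(std::cos(a), -std::sin(a), 0.0,
                       std::sin(a),  std::cos(a), 0.0,
                       0.0,          0.0,         1.0);
}

// CCS orientation after the three steps, applied in the stated order.
cv::Matx33d camRot(double pan, double tilt, double roll)
{
    return rotZ(roll) * rotX(tilt) * rotY(pan);
}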

Note:

  1. X-Axis [Red]
  2. Y-Axis [Green]
  3. Z-Axis [Blue]

Thanks in advance.

Inquiries Regarding the Integration of Semantic Features for Enhanced Vehicle Tracking

Dear Zheng (Thomas) Tang and Team,

I hope this message finds you well. I am reaching out to you after a thorough examination of your repository, which contains the source code for the winning entries of Track 1 and Track 3 at the 2nd AI City Challenge Workshop in CVPR 2018. The fusion of visual and semantic features for single-camera and inter-camera vehicle tracking, as well as 3D speed estimation, is indeed a remarkable achievement.

As I delve deeper into the intricacies of your approach, I am particularly intrigued by the methodology employed in the semantic feature integration for vehicle tracking. The utilisation of DCNN features, license plates, detected car types, and travelling time information to compute the cost function in ICT is a testament to the sophistication of your system.

I am considering adapting your system for a project that requires high-accuracy vehicle tracking in a multi-camera setup. However, I am curious about the following aspects:

  1. Robustness in Diverse Conditions: How does the system perform under varying weather conditions and during different times of the day? Are there any pre-processing steps or model adjustments that you would recommend to maintain high accuracy?

  2. Scalability to Larger Camera Networks: What are the limitations when scaling the system to a larger network of cameras, and how might one address potential challenges in data association across an expanded camera network?

  3. Real-time Processing Capabilities: Could you provide insights into the real-time processing capabilities of your system? Are there any particular hardware requirements or optimisations that are critical for achieving real-time performance?

  4. Adaptation to Newer Models: With the advent of YOLOv4 and other advanced object detectors, what would be the recommended approach to integrate these newer models into your existing framework?

  5. Semantic Feature Enhancement: Are there additional semantic features or data sources that you believe could further enhance the tracking accuracy or speed estimation performance?

I would greatly appreciate your insights on these matters. Your expertise and experience would be invaluable in guiding the adaptation of your system to meet the specific requirements of my project.

Thank you for your time and consideration. I look forward to your response.

Best regards,
yihong1120

draw direction of vehicle with colors

Could you please explain how to add the direction of the vehicle in this project, in these lines:

// plot past trajectory
nPltTrajLen = std::min(nPltTrajLenMax, (j + 1));
for (int k = j; k > (j - nPltTrajLen + 1); k--)
    cv::line(oImgFrm, vvoTrkNd[i][k].get2dFtPt(), vvoTrkNd[i][k - 1].get2dFtPt(), voBBoxClr[i % voBBoxClr.size()], 2);

one difference between the provided code and the paper

In your paper [1], you said:

The proposed camera self-calibration framework mainly depends on reliable human body segmentation and EDA to search for optimal locations of vanishing points and optimize the camera parameters, so that we can exploit the availability of human tracking and segmentation data for robust calibration

However, in this repository, I only found CAM_CAL, which uses manually set or default vanishing points.

May I ask about the implementation details of how the vanishing points are found?

[1] Tang, Zheng, et al. "Multiple-kernel based vehicle tracking using 3D deformable model and camera self-calibration." arXiv preprint arXiv:1708.06831 (2017).

Unable to mark vanishing lines

I was trying to set my own vanishing lines for my video, but when I move my mouse over the image, the lines are plotted without even clicking. Unfortunately, I cannot use the new method as I am unaware of the location of the video. Please let me know how to fix this.

collect2: error: ld returned 1 exit status

hi @zhengthomastang,
thank you for sharing your source code and making it open source.
I have a problem when I compile any main.cpp file:


(base) root@ZakariaUbunto:/home/zakaria/Public/2018AICity_TeamUW-master/Track1/1_VDO2IMG/src# g++ main.cpp
/usr/bin/ld: /tmp/ccvDSGHa.o: in function main': main.cpp:(.text+0x33): undefined reference to cv::VideoCapture::VideoCapture()'
/usr/bin/ld: main.cpp:(.text+0x6c): undefined reference to cv::VideoCapture::VideoCapture(cv::String const&)' /usr/bin/ld: main.cpp:(.text+0x94): undefined reference to cv::VideoCapture::~VideoCapture()'
/usr/bin/ld: main.cpp:(.text+0xb2): undefined reference to cv::VideoCapture::isOpened() const' /usr/bin/ld: main.cpp:(.text+0x131): undefined reference to vtable for cv::VideoCapture'
/usr/bin/ld: main.cpp:(.text+0x256): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator<int> > const&)' /usr/bin/ld: main.cpp:(.text+0x2b2): undefined reference to cv::VideoCapture::~VideoCapture()'
/usr/bin/ld: main.cpp:(.text+0x2de): undefined reference to cv::VideoCapture::~VideoCapture()' /usr/bin/ld: main.cpp:(.text+0x351): undefined reference to cv::VideoCapture::~VideoCapture()'
/usr/bin/ld: /tmp/ccvDSGHa.o: in function cv::String::String(char const*)': main.cpp:(.text._ZN2cv6StringC2EPKc[_ZN2cv6StringC5EPKc]+0x4f): undefined reference to cv::String::allocate(unsigned long)'
/usr/bin/ld: /tmp/ccvDSGHa.o: in function cv::String::~String()': main.cpp:(.text._ZN2cv6StringD2Ev[_ZN2cv6StringD5Ev]+0x14): undefined reference to cv::String::deallocate()'
/usr/bin/ld: /tmp/ccvDSGHa.o: in function cv::Mat::~Mat()': main.cpp:(.text._ZN2cv3MatD2Ev[_ZN2cv3MatD5Ev]+0x39): undefined reference to cv::fastFree(void*)'
/usr/bin/ld: /tmp/ccvDSGHa.o: in function cv::Mat::release()': main.cpp:(.text._ZN2cv3Mat7releaseEv[_ZN2cv3Mat7releaseEv]+0x4b): undefined reference to cv::Mat::deallocate()'
collect2: error: ld returned 1 exit status

Can I use it to train for Multiple camera Multiple person tracking problem?

Hello organizers,

Thank you for the code. It's great work.
I want to do research for my thesis to create a system to track multiple people in a multi-camera scenario. I believe your code could be extended, or your models retrained, to do that.
Could you please share some insights on whether this is possible?

output of speed : Error: camera parameters not loaded

Hi sir, I am using your code. When I launch ./bin in ../6_SPD_EST, the command-line output is: "output of speed: Error: camera parameters not loaded". How can I solve this, and how can I add the path of the video in your code? Thanks.

Step 2- Camera calibration old method vs new method

Hi, thank you so much for providing us with your code.
I am a bit confused about the output of step 2: the old method produces a projection matrix (3x4), while the newer version produces a homography matrix (3x3).
Since the projection matrix is fed into the speed estimation step, I was wondering how the homography matrix could be converted into the projection matrix.
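For reference, this is the relation I have been assuming (a sketch, not code from this repository): for the ground plane z = 0 and known intrinsics K, H = K [r1 r2 t] up to scale, so the missing rotation column can be recovered as r3 = r1 x r2:

#include <opencv2/core.hpp>

// Sketch of the relation I am assuming: H maps ground-plane coordinates
// (X, Y, 1) to the image, H = K [r1 r2 t] up to scale, and the full
// projection is P = K [r1 r2 (r1 x r2) t].
cv::Matx34d homographyToProjection(const cv::Matx33d& H, const cv::Matx33d& K)
{
    cv::Matx33d oRt = K.inv() * H;  // [r1 r2 t] up to a common scale
    cv::Vec3d r1(oRt(0, 0), oRt(1, 0), oRt(2, 0));
    cv::Vec3d r2(oRt(0, 1), oRt(1, 1), oRt(2, 1));
    cv::Vec3d t (oRt(0, 2), oRt(1, 2), oRt(2, 2));

    double s = 1.0 / cv::norm(r1);  // fix the scale so that |r1| = 1
    r1 *= s; r2 *= s; t *= s;
    cv::Vec3d r3 = r1.cross(r2);    // complete the rotation matrix

    cv::Matx34d oRt34;
    for (int i = 0; i < 3; i++)
    {
        oRt34(i, 0) = r1[i]; oRt34(i, 1) = r2[i];
        oRt34(i, 2) = r3[i]; oRt34(i, 3) = t[i];
    }
    return K * oRt34;  // the 3x4 projection matrix P
}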

I really appreciate your response,
Bita

sequence naming of images in TC_Tracker

Thank you again for your interesting code. How should the images in the img_folder be named (in sequence) for TC_Tracker?
I get the following error in my implementation. What is the problem?

Index exceeds the number of array elements (0).

Error in TC_tracker (line 8)
temp_img = imread([img_folder,'',img_list(1).name]);

Error in demo (line 30)
TC_tracker(img_path,det_path,ROI_path,param,img_save_path,seq_name,...

how we compile in linux

Hi Zheng Thomas,
For speed estimation: how do we compile the C++ speed estimation code on Linux?

When I compile with gcc:
gcc -std=c++98 -o main main.cpp
gcc -std=c++14 -o main main.cpp
gcc -std=c++11 -o main main.cpp

the output is the same:

main.cpp:37:75: warning: format not a string literal and no format arguments [-Wformat-security]
inline void setDetCls(char* acDetCls) { std::sprintf(m_acDetCls, acDetCls); }
^
main.cpp: In function ‘int main(int, char**)’:
main.cpp:134:22: error: in C++98 ‘viVdo’ must be initialized by constructor, not by ‘{...}’
23, 24, 25, 26, 27 };
^
main.cpp:134:22: error: could not convert ‘{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27}’ from ‘’ to ‘std::vector’
main.cpp:136:118: error: in C++98 ‘vstrCam’ must be initialized by constructor, not by ‘{...}’
std::vectorstd::string vstrCam = { "Loc1_1", "Loc1_2", "Loc1_3", "Loc1_4", "Loc1_5", "Loc1_6", "Loc1_7", "Loc1_8" };
^
main.cpp:136:118: error: could not convert ‘{"Loc1_1", "Loc1_2", "Loc1_3", "Loc1_4", "Loc1_5", "Loc1_6", "Loc1_7", "Loc1_8"}’ from ‘’ to ‘std::vector<std::__cxx11::basic_string >’
main.cpp:144:22: error: in C++98 ‘vnSpdWinSz’ must be initialized by constructor, not by ‘{...}’
31, 31, 31, 31, 31 };
^
main.cpp:144:22: error: could not convert ‘{15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31}’ from ‘’ to ‘std::vector’
main.cpp:149:37: error: in C++98 ‘vfSpdScl’ must be initialized by constructor, not by ‘{...}’
0.89f, 0.89f, 0.89f, 0.89f, 0.89f };
^
main.cpp:149:37: error: could not convert ‘{1.25e+0f, 1.25e+0f, 1.25e+0f, 1.25e+0f, 1.25e+0f, 1.25e+0f, 1.25e+0f, 1.25e+0f, 1.04999995e+0f, 1.04999995e+0f, 1.04999995e+0f, 1.04999995e+0f, 1.04999995e+0f, 1.04999995e+0f, 1.04999995e+0f, 1.04999995e+0f, 8.00000012e-1f, 8.00000012e-1f, 8.00000012e-1f, 8.00000012e-1f, 8.00000012e-1f, 8.00000012e-1f, 8.89999986e-1f, 8.89999986e-1f, 8.89999986e-1f, 8.89999986e-1f, 8.89999986e-1f}’ from ‘’ to ‘std::vector’
main.cpp:154:32: error: in C++98 ‘vfSpdStdThld’ must be initialized by constructor, not by ‘{...}’
5.0f, 5.0f, 5.0f, 5.0f, 5.0f };
^
main.cpp:154:32: error: could not convert ‘{7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 7.0e+1f, 1.5e+1f, 1.5e+1f, 1.5e+1f, 1.5e+1f, 1.5e+1f, 1.5e+1f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f}’ from ‘’ to ‘std::vector’
main.cpp:159:37: error: in C++98 ‘vfSpdLowThld’ must be initialized by constructor, not by ‘{...}’
18.0f, 18.0f, 18.0f, 18.0f, 18.0f };
^
main.cpp:159:37: error: could not convert ‘{1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 1.0e+1f, 2.8e+1f, 2.8e+1f, 2.8e+1f, 2.8e+1f, 2.8e+1f, 2.8e+1f, 1.8e+1f, 1.8e+1f, 1.8e+1f, 1.8e+1f, 1.8e+1f}’ from ‘’ to ‘std::vector’
main.cpp:164:32: error: in C++98 ‘vfSpdStpThld’ must be initialized by constructor, not by ‘{...}’
5.0f, 5.0f, 5.0f, 5.0f, 5.0f };
^
main.cpp:164:32: error: could not convert ‘{2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 2.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f, 5.0e+0f}’ from ‘’ to ‘std::vector’
main.cpp:169:32: error: in C++98 ‘vfSpdPropFNThld’ must be initialized by constructor, not by ‘{...}’
0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
^
main.cpp:169:32: error: could not convert ‘{0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 3.0e+1f, 3.0e+1f, 3.0e+1f, 3.0e+1f, 3.0e+1f, 3.0e+1f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f}’ from ‘’ to ‘std::vector’
main.cpp:186:41: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acOutSubmPth, acTrk1FlrPth);
^
main.cpp:195:41: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acOutVdoPth, acTrk1FlrPth);
^
main.cpp:205:41: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acCamFlrPth, acTrk1FlrPth);
^
main.cpp:210:44: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acInCamParamPth, acCamFlrPth);
^
main.cpp:216:41: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acInTrk2dPth, acCamFlrPth);
^
main.cpp:222:42: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acInFrmFlrPth, acCamFlrPth);
^
main.cpp:228:43: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acOutTrkFlrPth, acCamFlrPth);
^
main.cpp:234:43: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acOutTrkPth, acOutTrkFlrPth);
^
main.cpp:240:51: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acOutTrk3dImgFlrPth, acOutTrkFlrPth);
^
main.cpp:519:43: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acInFrmPth, acInFrmFlrPth);
^
main.cpp:579:50: warning: format not a string literal and no format arguments [-Wformat-security]
std::sprintf(acOutFrmPth, acOutTrk3dImgFlrPth);

Any recommendation for this?
