
human-path-prediction's People

Contributors

arcaneemergence, harshayugirase, karttikeya, ksachdeva


human-path-prediction's Issues

[SOLVED] Three potential problems: (A) 1.86 factor (B) initial position needed? (C) ADE/FDE calculation

Hello,
I am experimenting with the PECNet code for my research. It is well written, and thank you for sharing! However, I am struggling with three specific problems and hope you can help out.

  1. Why is all the data scaled up by a factor of 1.86 in the optimal settings?

  2. Why are the initial positions needed as prediction features? I do not think this is mentioned in the paper.

  3. In the test function, the Euclidean distances between the guessed end points and the true end point are calculated, and the best guess is then used to generate the future trajectory. The problem is that we do not have the true end point at test time, right? So in theory we cannot know which guess is the best one. What the code actually does is pick the best guess using ground-truth information, which cannot be done without knowing the true destination. Is my understanding correct? (See the sketch below for the selection I am referring to.)
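
For reference, this is the best-of-K selection I mean (a minimal NumPy sketch of the standard evaluation protocol, not the repository's code; the names are illustrative):

```python
import numpy as np

def best_of_k_metrics(pred_trajs, gt_traj):
    """pred_trajs: (K, T, 2) candidate futures; gt_traj: (T, 2) ground truth.

    Picks the candidate whose final point is closest to the true end point,
    which requires access to the ground truth at test time.
    """
    fde = np.linalg.norm(pred_trajs[:, -1, :] - gt_traj[-1, :], axis=-1)  # (K,)
    best = int(np.argmin(fde))                       # index of the "best" guess
    ade = np.linalg.norm(pred_trajs[best] - gt_traj, axis=-1).mean()
    return ade, fde[best]
```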

Best,
Zifeng

YNet: More details about the final training on ETH/UCY Dataset

Hi, could you provide more experimental details about the final training of Y-Net on the ETH/UCY dataset that produced the scores in the paper? Like #29, I am also unable to reproduce the paper's results. Specifically, I have the following questions:

  • Did you train for the full number of epochs? I notice that training stops after the counter reaches 30; otherwise, ~300 hours would be required.
  • How did you apply the deformable convolution in the model?
  • Did you train directly on the segmentation masks, or as on SDD: pre-train a segmentation model and fine-tune after 150 epochs?
  • I ran into very severe overfitting during training, and the model simply ignores the scene information. Did you have the same problem? If so, how did you handle it?
  • Which scene image did you use for uni_examples?

Or can you provide the pretrained weights and the pre-processed datasets?

Thanks!

Questions about the dataset split

Hi, first of all, congratulations! This (along with Y-net) is very impressive work. Also thank you for making the code available.

I have been working on a related model and have also used the Stanford Drone Dataset for evaluation. I had a few questions regarding the dataset split used and reported in the PECNet and Y-Net papers, so as to conduct a fair comparison.

  1. Are the results reported on just pedestrian trajectories in both papers, or are all agents, such as bicyclists, skateboarders, and vehicles, considered? Non-pedestrians tend to be fast-moving and can potentially have higher ADE/FDE values. The Y-Net paper suggests that only pedestrians were considered, while prior work (including mine, P2TIRL) seems to consider all agents.

  2. SDD has a lot of tracks that temporarily go missing or get occluded. The raw annotation files have a flag indicating this, and they record the last known location of a missing track for all timestamps during which it was missing. From the standpoint of a prediction model, such a track is effectively a stationary agent (which should be trivial for the model). Were these agents filtered during evaluation? (See the sketch after this list.)

  3. Finally, the Y-Net paper suggests that all short trajectories (shorter than np + nf) were filtered. However, np + nf is 35 seconds for training Y-Net, while the baselines all report results for np + nf = 8 seconds. Are trajectories of lengths 8 to 35 seconds in the test set discarded?
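
For the second question, this is the filtering I have in mind (a minimal pandas sketch, assuming the standard SDD annotations.txt column layout):

```python
import pandas as pd

# Column layout as documented with the SDD release.
cols = ["track_id", "xmin", "ymin", "xmax", "ymax",
        "frame", "lost", "occluded", "generated", "label"]
df = pd.read_csv("annotations.txt", sep=" ", names=cols)

# Drop timestamps where the annotation only repeats the last known
# location of a missing track (lost == 1).
df = df[df["lost"] == 0]

# Optionally keep only pedestrians, as the Y-Net paper suggests.
df = df[df["label"] == "Pedestrian"]
```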

Looking forward to your response!

-- Nachiket

Input data for the model

Hi, I have a question about the input data. In the function collect_data, the input is [person_id, frame_id, x, y]; what do "x, y" refer to here?
Meanwhile, in your pickled data, the input is bounding-box coordinates [xmin, ymin, xmax, ymax].
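
If "x, y" are meant to be the bounding-box center (my assumption, not confirmed by the repository), the conversion would simply be:

```python
def bbox_center(xmin, ymin, xmax, ymax):
    """Center of an axis-aligned bounding box (assumed meaning of x, y)."""
    return (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
```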

Time cost

Hello, thanks for sharing your code!
I am wondering how much time it took you to train the Y-Net model, and how many GPUs you used. I use one Titan Xp and training seems to take me about two days. Is that normal?

ETH original dataset

Thank you very much for your work. I have a request that I hope you can grant:
could you share the raw data that your ETH processing script takes as input? Thank you very much!

Problem in the prediction result

In the prediction process, you use all the pedestrians in the scene to predict their futures. Therefore, you may be using the future trajectories of the same or neighboring pedestrians for the prediction. I think this would invalidate your results. Is that correct?

About ETH-UCY [student_001.txt], [univ_examples_train.txt], [crowds_zara03_val] datasets

Hi, @HarshayuGirase
thank you for your nice work! :)

You seem to have used the same data split as SGAN.
However, as far as I can confirm, the three datasets named in the title (student_001.txt, univ_examples_train.txt, crowds_zara03) have no scene image available on the Internet.

Nevertheless, why do you include these three datasets in the train/val splits? Since there is no scene image, map segmentation will not be possible, and if an inaccurate segmentation is included in the data, it may become noise.

Could you please clarify?

How does the social mask get created?

Hi @karttikeya ,

First of all, very nice work for PECNet! The idea of path planning conditioned on endpoints is really interesting. I have two questions regarding the training procedures:

  1. How does the social mask correspond to the social IDs?
    I understand that, intuitively, the social masks indicate whether two objects interact with each other in the spatial and temporal domains, and that due to GPU memory limits you manually create each batch and its social masks. But why do you use current_size to represent the index within a mask, given that it always increments by 1? Some explanation of current_size and its purpose would help (see the sketch after this list for my current understanding).
    https://github.com/HarshayuGirase/PECNet/blob/4c342597ca4310ef4976412dd8c829ad60a5465a/utils/social_utils.py#L121-L128

  2. Why are the initial poses downscaled by 1000 while the past trajectories are upscaled by 1.86?
    Are these just magic numbers, or is there a rationale for how they were selected?
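
For the first question, here is my current understanding of such a mask (a minimal NumPy sketch, not the repository's code): trajectories from the same scene share a block on the diagonal, marking which pairs may interact.

```python
import numpy as np

def build_social_mask(scene_sizes):
    """Block-diagonal mask: mask[i, j] = 1 iff trajectories i and j belong
    to the same scene (my assumed semantics of the social mask)."""
    total = sum(scene_sizes)
    mask = np.zeros((total, total))
    start = 0
    for n in scene_sizes:
        mask[start:start + n, start:start + n] = 1
        start += n
    return mask

# e.g. a batch containing scenes with 3, 2 and 4 pedestrians
mask = build_social_mask([3, 2, 4])
```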

Best,
Yijun

Model corresponding to Ours-S-TT

Hi,

First thanks for this very well written paper and an innovative approach.

Your readme mentions that one can reproduce the results from Table 2 of the paper (page 11). The table mentions two models, Ours-S-TT and PECNet; however, your repo contains three models.

I would appreciate it if you could indicate which model corresponds to Ours-S-TT.

Regards & thanks
Kapil

Data processing

I am wondering whether the collect_data() function in social_utils.py was used to create the .pickle files. When I try to use that function to process a different dataset, the output shapes of the trajectories do not make sense for the dataloader, and it is unclear how to modify the function. It would also be helpful to have access to the original ETH and SDD dataset files in order to understand how the data was pre-processed.

Conceptual Understanding

Hi all,

I wish to use this model to predict the trajectories of all objects (static and dynamic) captured by my camera, which is itself moving. Can the model be used in such a case to obtain the trajectories of the objects? Thank you.

Trajectory Heatmaps for ETH/UCY

Hi, thank you for your contribution and for providing the code!
I have a question regarding the computation of the trajectory heatmaps for the ETH/UCY datasets.
The resize parameter from the config file and the value 4200 set the resolution of the large distance-matrix template:

size = int(4200 * params['resize'])

Could you kindly provide the values used for these datasets?

Many thanks in advance.

Able to get good/comparable results for SDD in Ours-S-TT configuration

Hi @karttikeya

You will find this interesting.

Here is a Colab notebook that contains the code from this repository, with one change: I removed everything related to the social component and to initial_pos.

https://colab.research.google.com/drive/1p2NsydskxlJpN9FQjsIuBYt023SK03P9?usp=sharing

You will see that during training it reports the best values as

Test Min FDE 16.265123988001477
Test Best ADE Loss So Far (N = 20) 10.08790206860359
Test Best Min FDE (N = 20) 15.723517020260925

and in independent testing of the model it shows

Average ADE: 10.166523163287923
Average FDE: 16.0149660701936

(I was confused earlier and did not pay attention to "average" ... the numbers during training looked better because at test time you average over 150 runs, so it makes sense.)

Regards
Kapil

Step interval in ETH

The original time-step interval is 6, but in the provided data it is 10. Why? Could you please provide the code for the data processing? Thank you very much.

Questions about ETH and UCY data processing

Hi,
Thank you for your excellent work. I have a question about conducting experiments on the ETH/UCY data.
I would like to know whether your data processing methods for ETH and UCY, and the required data format, are consistent with those in SGAN.
Thanks in advance,

ETH-UCY results, train/ test/ val sets, best point picked up on the test set?

Hi,

Thanks for your work. I was looking into the ETH-UCY notebooks/code provided in the repository. What I see is that the test set was used to pick the best checkpoint, and the results were then reported on that same test set. Yet all prior works I am aware of use the validation set to pick the best training point and then report the ADE/FDE metrics on the test set.

I just want to confirm this, because the results change dramatically as a result. For example, the ETH validation set does not contain cases that appear in the test set (e.g., a pedestrian covering more than 1 meter between frames at 2.5 fps), so it is technically much harder to generalize from the training set to those situations. (See the sketch below for the protocol I mean.)
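
For clarity, the protocol used by prior work, as I understand it (a sketch with hypothetical helper names — train_one_epoch, evaluate, the loaders, and num_epochs are placeholders, not the repository's API):

```python
import copy

best_val_ade, best_state = float("inf"), None
for epoch in range(num_epochs):
    train_one_epoch(model)                      # hypothetical training helper
    val_ade = evaluate(model, val_loader)       # checkpoint selection on val only
    if val_ade < best_val_ade:
        best_val_ade = val_ade
        best_state = copy.deepcopy(model.state_dict())

model.load_state_dict(best_state)
test_ade = evaluate(model, test_loader)         # test metrics reported once, at the end
```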

Best,
Abduallah

ETH-UCY result without TTST

Hello,

Could you provide the ETH-UCY results without the TTST trick? I assume the results reported in your Y-Net paper use TTST, and I wonder how much TTST contributes on the ETH-UCY dataset. Thank you very much!

SDD dataset

Thank you very much for your work. I recently had some problems reproducing PECNet. The original SDD dataset, converted with the dataset-transformation code provided by the authors, yielded worse results (ADE/FDE = 12.03/21.94) than the preprocessed data in the source code. Looking forward to your response!

About visualization

Sorry for bothering you.
Could you provide the visualization method? I am new to this field, so I hope you can do me a favor. Thank you!
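
In the meantime, here is a minimal matplotlib sketch for visualizing predictions (not the authors' script; the array shapes are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_trajectories(past, preds, gt=None):
    """past: (Tp, 2) observed; preds: (K, Tf, 2) samples; gt: optional (Tf, 2)."""
    plt.plot(past[:, 0], past[:, 1], "k-o", label="observed")
    for k, p in enumerate(preds):
        plt.plot(p[:, 0], p[:, 1], "b--", alpha=0.4,
                 label="predicted" if k == 0 else None)
    if gt is not None:
        plt.plot(gt[:, 0], gt[:, 1], "g-", label="ground truth")
    plt.legend()
    plt.axis("equal")
    plt.show()
```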

YNet: Data augmentation for training on ETH/UCY Dataset

Hi, I've read #29 and #35, but I am still confused about the data augmentation for the ETH/UCY dataset.
The original code uses the function augment_data(train_data, image_path, image_file, seg_mask), but when I try to use augment_eth_ucy_social(train_batches, train_scenes, train_masks, train_images), I cannot find the corresponding variables, such as train_masks.
I wonder when and how you call augment_eth_ucy_social(), or whether you just use augment_data() (the same as for the other datasets).

Thanks!

Question - Prior - Training vs Testing

Hi @karttikeya

I noticed that during training a standard normal prior is assumed when computing the KL loss. However, in testing/evaluation I see this hyperparameter for sigma (value set to 1.3): https://github.com/HarshayuGirase/PECNet/blob/4c342597ca4310ef4976412dd8c829ad60a5465a/utils/models.py#L127

Two questions -

  a) Why this difference? I.e., why not use 1.3 during training as well to compute the KL?
  b) How did you find 1.3? (A sketch of the pattern I mean follows.)
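
For context, this is the pattern I believe the code follows (a minimal PyTorch sketch, not the repository's code; the latent dimension and K are illustrative):

```python
import torch

mu, logvar = torch.zeros(16), torch.zeros(16)   # encoder outputs (illustrative)

# Training: KL divergence against a standard normal prior N(0, I).
kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

# Testing: sample latents from a widened prior N(0, sigma^2 I), which
# spreads the K guesses and tends to improve best-of-K metrics.
sigma = 1.3
z = torch.randn(20, 16) * sigma                 # K = 20 samples
```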

Regards & thanks
Kapil

Question about SDD dataset split

Hello, I'd like to reproduce your method on the SDD dataset. I notice that you follow the TrajNet train/val/test split. However, the TrajNet website appears to be down (http://trajnet.stanford.edu/), and the paper "TrajNet: Towards a benchmark for human trajectory prediction" is no longer on arXiv. May I ask how we can obtain the TrajNet split of SDD in order to reproduce your results? Thanks.

Some questions

Hi,

Thanks for the Y-Net code. We have a question about the Gaussian heatmap template used to represent the ground truth.
In the paper, you mention that the ground truth is represented as a heatmap with Gaussian components of variance 4 pixels, centered at the observed points. However, in the implementation you take patches from a ground-truth template, which is essentially a large heatmap built with a Gaussian kernel of size 31. We do not think this is mentioned in the paper, so we were wondering what the reason for this discrepancy is. (A sketch of our reading of the paper's description follows.)
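
For reference, this is how we would build the per-point heatmap the paper describes (a minimal NumPy sketch under our reading of "variance of 4 pixels", i.e. sigma = 2):

```python
import numpy as np

def gaussian_heatmap(h, w, center, sigma=2.0):
    """Heatmap with a Gaussian (variance = sigma**2 pixels) at center = (x, y)."""
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

hm = gaussian_heatmap(64, 64, center=(20, 30), sigma=2.0)  # variance = 4 px
```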

SDD dataset reproduction

My approach was to process the SDD dataset following the ETH-UCY method: sample data every 10 frames, use a pedestrian's first 8 coordinates to predict the coordinates of the next 12 points, and test on the O-S-TT model. The results are as follows:
################## BEST PERFORMANCE 0.22 ########
Saved model to: ../content/trained.pt
Epoch: 1
Train Loss 10908859.606394859 RCL 4337671.330541314 KLD 3969445.6071976754 ADL 2601742.6686558584
Test ADE 0.21815821937219357
Test Average FDE (Across all samples) 1.1173672070395309
Test Min FDE 0.4298632619006239
Test Best ADE Loss So Far (N = 20) 0.21815821937219357
Test Best Min FDE (N = 20) 0.4298632619006239
Obviously this is not in the same order of magnitude as the results in the paper.
I would like to ask whether the problem is my dataset split (I used the split from Y-Net), a parameter setting, or my data processing approach itself.

Is initial_pos required if not doing social pooling ?

Hi @karttikeya

This is a follow-up question based on your answer to previous queries.

Your answer:

Scaled initial pos is used to allow the network to learn relative positions between pedestrians while matching element magnitudes.

Based on this, I assume that if one is not doing social pooling, it is okay not to take initial_pos into account. Is that a correct understanding?

Regards & thanks
Kapil

Unable to get the results in table1

Hi @karttikeya,
I tried the "python training_loop.py" command without changing any parameters, but when I tested the model I had just trained, I could not get the best result in Table 1; the smallest ADE/FDE I got was 10.35/16.23. I am confused as to why the result is similar to O-TT (in fact, sigma = 1.3, and I did not change any parameters) rather than PECNet.
Moreover, I am also curious about l1loss. When I run test_pretrained_model.py with PECNET_social_model1.pt, the printed hyper_params contain "l1loss: False", but I cannot find this parameter anywhere in the code.
Third, why is nonlocal_pools = 3, which means the data goes through social pooling three times? Can I change it to 1?
Could you give me some suggestions on these questions? Thanks.

Regards
LXY

Script for GIFs / best trajectory / loss function

Thank you for your wonderful code and paper!
Would it be possible for you to also provide the script for generating the GIFs for Y-Net?
In addition, I am unfortunately not quite clear on how to determine which of the 20 predicted trajectories is the best (also for Y-Net).

Config of the ETH-UCY

Hi,

Thanks for sharing the code of your great work.

Could you please share the optimal config file for ETH-UCY? If I use a config file similar to the SDD one, I get much worse results (the errors are almost twice as large).

I used the same training script and pickle files provided for ETH-UCY.

Or, even better, could you share the pretrained model you obtained for ETH-UCY, so we can reproduce the results on this dataset?

Best,
Osama

Ynet ETH/UCY config file

Hello!
Can you provide the config file for ETH/UCY? I have tried the SDD trajnet and longterm configs, but I do not know the exact settings for the ETH/UCY dataset. Could you release the config and training script for ETH/UCY, as you did for SDD? Thanks very much!

Input to Future Trajectory Decoder (Ut)

Hi,
I'm currently using your Y-Net on Simulated Data and it works very well. Thank you for the great code :)

My question regards the input for the Trajectory Decoder (Ut). In your paper it seems, that you use the feature output from the Encoder(Ue) together with the predicted goal from the Ug Decoder. But going over your code in "ynet/train.py" in line 83 it seems, that you use the ground-truth information for the goal.
Am I missing something, or are both approaches considered ok?
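
For clarity, this is the distinction I am asking about (a sketch with hypothetical names — trajectory_decoder, goal_decoder, and the tensors are placeholders, not the repository's API):

```python
# Training (teacher forcing, my reading of train.py): the trajectory
# decoder is conditioned on the ground-truth goal heatmap.
traj_logits = trajectory_decoder(encoder_features, gt_goal_heatmap)

# Inference: the ground truth is unavailable, so the decoder must be
# conditioned on a goal sampled from the goal decoder Ug instead.
pred_goal_heatmap = goal_decoder(encoder_features)
traj_logits = trajectory_decoder(encoder_features, pred_goal_heatmap)
```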

Best and thank you,

André

Pretrained Models for ETH-UCY

Hi,
Thanks for the nice work. I have a question regarding reproducing the numbers in Table 2 (ETH-UCY).

Do you have the pretrained models, and also the testing script?

Thanks in advance,

YNet ETH Training

Hello,
I am trying the ETH dataset with Y-Net, but it does not work.
My code just follows [train_SDD_trajnet.ipynb]!
Could you please release the training code?

Best,
FunLiu

I can't find the yaml file for ynet evaluate_SDD_longterm.

I am trying to use the evaluate_SDD_longterm.ipynb file, but I cannot seem to find the config/yaml file for the step "Load config file and print hyperparameters".

Do I have to create it myself, and if so, what would it need to contain? Thanks.


Reproduce the results

Hi, thank you very much for your great work!
I have some problems reproducing your results on ETH/UCY. I use the *.pickle files that you provided (the data scale is 170 when computing init_position), but I cannot get the same results as PECNet (the errors are even twice as large).
1. Is this result due to the dataset you processed?
2. Could you please provide the code that produces the paper's results using the *.pickle files you provided?

Hope to get your help, thanks!

Question regarding Y-Net's ETH/UCY experiments

Hi,

First of all, thank you for this amazing work!

I am currently trying to reproduce the results for ETH/UCY, but cannot seem to achieve this. I believe the reason is that some of my inputs differ from yours, e.g., the ETH/UCY dataset in image coordinates, the segmentation maps, the reference images, etc.
Could you provide us with the inputs you used for ETH/UCY, and if possible, the configuration for this training?

Thank you

YNet ETH Dataset Moving Camera

Hello,

In your Y-Net paper you discuss using ETH/UCY, and I was wondering whether you also tested your network on the moving ego perspective of ETH (the camera rig mounted on the car).
If so, I wanted to ask whether you normalized the pedestrian coordinates to a certain frame, and which frame of the video sequence you used as input to your Y-Net model, i.e., the initial image of the sequence or the last image of the observation.

Thank you very much in advance.

André Henkel

Homography Matrix for UCY used in Y-Net

Hi, I am building the UCY and ETH datasets. To conduct a fair experiment, can you provide the homography matrix used for UCY? Also, do you use the same scene image for uni_examples and UNIV (student00*), as well as across the Zara* datasets? Thanks. (A sketch of the conversion I have in mind follows.)
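
For context, this is the standard pixel-to-world mapping a homography provides (a minimal NumPy sketch; the actual H matrices are what I am asking for):

```python
import numpy as np

def pixel_to_world(H, px):
    """Map (N, 2) pixel coordinates to world coordinates with a 3x3 homography H."""
    pts = np.hstack([px, np.ones((len(px), 1))])   # to homogeneous coordinates
    world = (H @ pts.T).T
    return world[:, :2] / world[:, 2:3]            # divide out the scale
```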

Difference between pretrained models

Hi,

I am wondering whether you can explain the differences between the pretrained SDD models. When running the test script on these models, I get different results (the best are for the first model, PECNET_social_model1.pt).

Thanks in advance,

Best,

[Solved] Reproduced ETH Result

Hi @karttikeya,

This is to confirm that the factor of 170 does indeed help.

Here is a new notebook in which I train on ETH:

https://colab.research.google.com/drive/1OdVwL3CM-_f-T3HlHyY3B2IaTDiKmprs?usp=sharing

This notebook differs somewhat from the code in your repository because of the data preparation part. Everything is in the notebook itself, including the downloading of the datasets.

The above reproduces results similar to the paper's; some runs produced even better numbers.

I would appreciate it if you could take a cursory look to check that I am not doing anything wrong; even though I managed to reproduce your numbers, with machine learning it is easy to get fooled.

Again, I want to thank you for the time you spent answering my questions. I am very happy that the results of your paper can be reproduced; that is an achievement in itself, and a good job on your part and your co-authors'.

Regards
Kapil

Originally posted by @ksachdeva in #7 (comment)

Implementation clarification appreciated

Thanks for sharing your code. Could you please explain the rationale behind the following implementation choices? Thanks.

  1. Why is initial_pos downscaled by 1000? Also, the number 7 below should probably be replaced by a parameter in the yaml; otherwise it can easily lead to ground-truth leakage, e.g., when the observation horizon is adjusted to 6. (A sketch of the fix I mean follows this list.)
    https://github.com/HarshayuGirase/PECNet/blob/4c342597ca4310ef4976412dd8c829ad60a5465a/utils/social_utils.py#L155
  2. At test time, your model first selects the best destination estimate by comparing against the ground-truth destination, then predicts the trajectory conditioned on that best guess. Since the model already has access to the ground truth at test time, why not condition on the ground-truth destination directly?
    https://github.com/HarshayuGirase/PECNet/blob/4c342597ca4310ef4976412dd8c829ad60a5465a/scripts/test_pretrained_model.py#L66
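
For the first point, this is the parameterization I have in mind (a sketch with illustrative names; traj and hyper_params stand in for the repository's variables):

```python
past_length = hyper_params["past_length"]        # e.g. 8 observed steps

# Last observed position, instead of the hard-coded traj[:, 7, :]:
# indexing with past_length - 1 stays correct if the observation
# horizon changes, avoiding accidental ground-truth leakage.
initial_pos = traj[:, past_length - 1, :] / 1000.0
```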

Normalize the gt template

Hi, thanks for your excellent work!

I notice that your template values lie within (0, 0.0099) and that you do not normalize the template to (0, 1), although that option exists. Would this affect the sigmoid output and the BCE loss? Usually we use binary labels for BCE. May I ask the reason behind this choice? (A small sketch of what I mean is below.)

I just want to make sure I am not missing important details! It would be helpful if you could answer this. Thanks!
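
For concreteness, a small PyTorch sketch of the question (the tensors are illustrative, not the repository's code):

```python
import torch
import torch.nn.functional as F

template = torch.rand(64, 64) * 0.0099   # unnormalized ground-truth heatmap
logits = torch.randn(64, 64)             # raw network output

# BCE-with-logits accepts soft targets in [0, 1], so the unnormalized
# template is valid, but a peak of ~0.0099 pushes the optimal sigmoid
# output close to zero everywhere.
loss_raw = F.binary_cross_entropy_with_logits(logits, template)

# Normalizing to (0, 1) would instead make the peak a hard positive:
loss_norm = F.binary_cross_entropy_with_logits(logits, template / template.max())
```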

Question about data preprocessing: how to convert the raw dataset into the pickle files?

Thank you for sharing your code; it is wonderful work.
I am wondering how to generate the pickle files used in this code.
As far as I know, the raw ETH dataset is provided in a data format something like [frame number, pedestrian ID, y, x].
Although the processed dataset is provided (https://drive.google.com/drive/folders/1ee9h_WtoXZhXZPT0H55uDFZaSmoArbf0?usp=sharing), the pre-processing script is not.
Could you share your work on how to convert the raw dataset into the pickle files? (A sketch of the kind of conversion I mean is below.)
Thank you very much!
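
For reference, a minimal sketch of the conversion I am asking about, assuming whitespace-separated rows of [frame, ped_id, y, x] as described above (the repository's actual preprocessing is exactly what I cannot find):

```python
import pickle
import numpy as np

raw = np.loadtxt("eth_raw.txt")                  # columns: frame, ped_id, y, x
frames, ped_ids, ys, xs = raw.T

# Reorder into [person_id, frame_id, x, y] as collect_data expects.
data = np.stack([ped_ids, frames, xs, ys], axis=1)

with open("eth_processed.pickle", "wb") as f:
    pickle.dump(data, f)
```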

Question about the preprocessed data for ETH/UCY

Hi,
thank you very much for your nice work and sharing it!

I have a question regarding the preprocessed data you provided (through Google Drive).
Among the provided datasets, I found that some samples in the ETH/UCY pickle files are identical except for their metaIds.
For example, hotel_train.pkl contains samples with different metaIds but everything else the same (trackId = 1.0, sceneId = 'students001', and the same frameIds as well as x, y coordinates). Other files contain similar duplicates.
I believe I did not modify the pickle files, but please let me know if that is not the case!

If it is, was this intended, or should I remove the duplicates for further trials? (The check I ran is sketched below.)

Sorry in advance for bothering you if I misunderstood the purpose of metaId and this is totally fine!
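
For reference, the duplicate check I ran (a pandas sketch; the column names are what I observed in the pickle):

```python
import pandas as pd

df = pd.read_pickle("hotel_train.pkl")

# Count rows that are identical in everything but metaId.
dup_cols = [c for c in df.columns if c != "metaId"]
n_dups = df.duplicated(subset=dup_cols).sum()
print(f"{n_dups} duplicated rows ignoring metaId")
```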
