
Comments (16)

zqh0253 commented on June 24, 2024

The camera setting I use is dict(size=[320, 180], position=[2.0, 0.0, 1.4], rotation=[0, 0, 0], fov=100). I use the default FullTown01-v1 suite to collect data and FullTown02-v2 for evaluation.
Note that these settings are based on an old version of DI-drive (commit f532c9e9a6b26386a933049c1754ca5262d76e0a).
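For illustration, a camera entry with these values might look like the following in a DI-drive observation config. The surrounding field names are assumptions rather than a verified excerpt from that old commit:

```python
# A sketch of the camera observation entry, using the values above.
# The config schema is an assumption based on typical DI-drive configs,
# not a verified excerpt from commit f532c9e9.
camera_obs = dict(
    name='rgb',                  # assumed sensor name
    type='rgb',
    size=[320, 180],             # width x height of the rendered frames
    position=[2.0, 0.0, 1.4],    # x, y, z offset in the ego-vehicle frame
    rotation=[0, 0, 0],          # pitch, yaw, roll in degrees
    fov=100,                     # horizontal field of view in degrees
)
```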

zqh0253 commented on June 24, 2024

Why do we have a rindex 6510787 in the dataset?

This number is used to fix a bug in the index. The data part is not ready yet and I am still working on this.

There are 8,642,040 frames in total, and if we sample them at an interval of 10, there would be 0.8M samples in total, which does not match 1.3M.

The current 0.8M frames are a small portion of the whole set of YouTube frames I experimented with. I picked the video clips whose visual appearance is close to Carla to form these 0.8M frames. This does not affect the pretraining quality on Carla downstream tasks. I will consider uploading the whole dataset in the future.

For pre-training, do you take the checkpoint with the best accuracy on the 30% test set, or the one from the last epoch?

I use the last epoch. I found the test accuracy stable during training, so I simply pick the last checkpoint.

For imitation learning training, do you use the last-epoch (epoch 100) checkpoint for evaluation?

This one is a little tricky. Due to IL's distribution-shift problem, I found that the test performance varies greatly between epochs. It is also hard to decide which epoch to pick, since we do not have a validation environment. So for each pretrained weight, I evaluate the (i*10)-th checkpoints for i = 3, ..., 10 and report the highest success rate.
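In other words, the selection protocol could be sketched like this; evaluate_success_rate and the checkpoint naming are hypothetical placeholders, not code from the repo:

```python
# Hypothetical sketch of the checkpoint-selection protocol described above:
# evaluate the 30th, 40th, ..., 100th epoch checkpoints and report the best.
best_sr, best_ckpt = -1.0, None
for epoch in range(30, 101, 10):
    ckpt = f"checkpoints/epoch_{epoch}.pth"   # assumed naming scheme
    sr = evaluate_success_rate(ckpt)          # hypothetical evaluation helper
    if sr > best_sr:
        best_sr, best_ckpt = sr, ckpt
print(f"reported checkpoint: {best_ckpt} (success rate {best_sr:.1f})")
```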

zqh0253 commented on June 24, 2024

Hi, thanks for your interest in our work. You can find the frames here: OneDrive link.
After downloading, you can run

cat sega* > frames.zip

to get the zip file.
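As an optional sanity check (an assumption on my side, not part of the release instructions), the reassembled archive can be verified and extracted with Python's standard zipfile module:

```python
# Optional: verify the reassembled archive before extracting it.
import zipfile

with zipfile.ZipFile("frames.zip") as zf:
    # testzip() returns the name of the first corrupted member, or None.
    assert zf.testzip() is None, "frames.zip appears to be corrupted"
    zf.extractall("frames/")
```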

penghao-wu commented on June 24, 2024

Thank you very much! Another question about the downstream tasks: do you also use the DI-drive engine for imitation learning and collect data using the default settings? Also, which Carla version is used for training and evaluation?

zqh0253 commented on June 24, 2024

Yes, I use the DI-drive engine for imitation learning, and the Carla version I use is 0.9.9.4.

Starting from the default settings, I change several things, including the camera setting (to match the pretrained image resolution).

penghao-wu commented on June 24, 2024

Thanks. Could you provide the camera settings you used, including the size, position, and fov? Also, could you provide more details about the collection and evaluation suites? For example, do you use the default FullTown01-v1 suite and weather to collect data, and which evaluation suite and weather are used (straight / one turn / navigation / navigation with dynamics)? That would be very helpful.
Thanks a lot.

penghao-wu commented on June 24, 2024

Thanks for your help. I appreciate it a lot.

penghao-wu commented on June 24, 2024

Sorry to bother you again. I have a few questions about the training that I'd like to confirm.

  • Why do we have a rindex 6510787 in the dataset? Is it used to fix the index problem in dir-65?
  • There are 8,642,040 frames in total, and if we sample them at an interval of 10, there would be 0.8M samples in total, which does not match 1.3M.
  • For pre-training, do you take the checkpoint with the best accuracy on the 30% test set, or the one from the last epoch?
  • For imitation learning training, do you use the last-epoch (epoch 100) checkpoint for evaluation?

Thanks in advance.

penghao-wu commented on June 24, 2024

Hi, I follow your instructions and train an agent (pretrained from ImageNet) using 4K data. However, its SR evaluated on the FullTown02-v2 suite is $37.3 \pm 3.1$, which is higher than the $21.3 \pm 7.5$ reported in the paper. Do I miss any details, or are other modifications needed? What are the possible reasons in your opinion? I am using Carla 0.9.9.4 and the same DI-drive version as you. I sample the 10% data uniformly (e.g. data_4K = data_40K[::10]); is this the same way you do it, or should I choose data_4K = data_40K[:4000]?

Besides, do you plan to release the pre-calculated steer values for the uploaded 80K frames, or the code for the inverse dynamics model? If not, could you please share more details about the model structure so that I can implement and train it myself?

Also, as DI-drive only contains a PPO model with BEV input, could you provide your model file or model details for PPO training?

Thanks a lot!

zqh0253 commented on June 24, 2024

Hi,

I sample the 10% data uniformly (e.g. data_4K = data_40K[::10]); is this the same way you do it, or should I choose data_4K = data_40K[:4000]?

I reduce the dataset size at the trajectory level, i.e. data_40K[:4000]. Since redundancy exists in adjacent frames, reducing at the trajectory level creates a harder problem than reducing at the frame level (data_40K[::10]). So I think the performance gap you reported is expected.
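The two strategies can be contrasted in a short sketch; `trajectories` here is a hypothetical list of per-trajectory frame lists, not a structure from the repo:

```python
# Hypothetical illustration of the two subsampling strategies discussed above.
# `trajectories` is assumed to be a list of per-trajectory frame lists.

# Frame-level: every 10th frame of the flattened dataset (data_40K[::10]).
all_frames = [frame for traj in trajectories for frame in traj]
frame_level_10pct = all_frames[::10]

# Trajectory-level (what the author used, data_40K[:4000]): keep the first
# 10% of trajectories with all of their frames. Adjacent frames are largely
# redundant, so this variant covers fewer distinct scenes and is harder.
trajectory_level_10pct = trajectories[: len(trajectories) // 10]
```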

Do you plan to release the pre-calculated steer values?

Yes, I am working on this part and will release it in the future. Stay tuned.

PPO training.

I did not experiment much with the PPO model design. A ResNet-34 backbone is used to extract the visual feature. The feature then goes through an MLP and is concatenated with the velocity to form the output of the encoder.
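A minimal PyTorch sketch of that encoder might look as follows; the hidden size and layer layout are assumptions, since only the backbone and the velocity concatenation are specified above:

```python
import torch
import torch.nn as nn
import torchvision.models as models

class PPOEncoder(nn.Module):
    """Sketch of the described encoder: ResNet-34 features -> MLP, then
    concatenated with velocity. The 256-d hidden size is an assumption."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        backbone = models.resnet34(weights=None)
        backbone.fc = nn.Identity()   # keep the 512-d pooled feature
        self.backbone = backbone
        self.mlp = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU(inplace=True))

    def forward(self, image: torch.Tensor, velocity: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); velocity: (B, 1)
        feat = self.mlp(self.backbone(image))
        return torch.cat([feat, velocity], dim=1)   # (B, feat_dim + 1)
```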

zqh0253 commented on June 24, 2024

Let's keep this issue dataset-relevant only. If you have any further questions about training, feel free to open a new one.

SiyuanHuang95 commented on June 24, 2024

Hi, thanks for sharing your great work!

  • In this issue, you mentioned that you picked the clips with a close visual appearance to Carla. What are the criteria for visual-appearance similarity? Does picking these 0.8M frames bring better performance, or does it just save training cost?
  • Did you follow the default suite of DI-Drive for the IL Carla dataset generation?

Best,

zqh0253 commented on June 24, 2024

Hi, thanks for your interest in our work.

  1. There are no clear criteria; I simply removed some driving videos with extreme weather to save training cost. More carefully designed measures would help, for example computing the feature distance between Carla frames and the frames of each video, then sorting all the videos by that distance (see the sketch after this list).
  2. Yes, I follow the default suite of DI-Drive for IL dataset generation, except for the several settings mentioned here.
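A rough sketch of that feature-distance idea, with all names hypothetical (the `videos` structure and the source of the embeddings are assumptions):

```python
# Hypothetical sketch of the filtering idea in (1): rank YouTube videos by
# how close their frames are to Carla frames in some feature space.
import numpy as np

def video_distance(video_feats: np.ndarray, carla_feats: np.ndarray) -> float:
    """Mean distance from each video frame to its nearest Carla frame.
    Inputs are (N, D) and (M, D) L2-normalized feature matrices."""
    sims = video_feats @ carla_feats.T        # cosine similarities, (N, M)
    return float((1.0 - sims.max(axis=1)).mean())

# `videos` maps video id -> feature matrix; keep the most Carla-like videos.
ranked_ids = sorted(videos, key=lambda vid: video_distance(videos[vid], carla_feats))
kept = ranked_ids[: int(0.1 * len(ranked_ids))]   # e.g. keep the closest 10%
```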

SiyuanHuang95 commented on June 24, 2024

zqh0253 commented on June 24, 2024

  1. We didn't conduct experiments comparing different dataset sizes. Indeed, with more diversity, the pre-trained model could be even stronger.
  2. From the very start, DI-drive did not support Carla 0.9.9; that is why we used an old version.

SiyuanHuang95 commented on June 24, 2024

  1. Okay, thanks.
  2. Thanks for the information.
