Comments (16)
The camera setting I use is dict(size=[320, 180], position=[2.0, 0.0, 1.4], rotation=[0, 0, 0], fov=100). I use the default FullTown01-v1 suite to collect data and FullTown02-v2 for evaluation.
Note that these settings are based on an old version of DI-drive (commit f532c9e9a6b26386a933049c1754ca5262d76e0a).
from aco.
Why do we have a rindex of 6510787 in the dataset?
This number is used to fix a bug in the index. The data part is not ready yet, and I am still working on it.
There are 8,642,040 frames in total, and if we sample them at an interval of 10, there would be roughly 0.8M samples in total, which does not match 1.3M.
The current 0.8M frames are a small portion of the whole set of YouTube frames that I experimented with. I picked the video clips whose visual appearance is close to Carla's to form these 0.8M frames. This does not affect the pretraining quality on Carla downstream tasks. I will consider uploading the whole dataset in the future.
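The arithmetic behind the mismatch can be checked directly; the frame count below is the one quoted in the thread, and the trailing comment restates the answer above:

```python
# Frames in the uploaded dataset and the sampling interval
# quoted in the question above.
total_frames = 8_642_040
interval = 10

samples = total_frames // interval
print(samples)  # 864204, i.e. roughly 0.8M
# The 1.3M figure refers to the larger pool of YouTube frames
# experimented with; the uploaded 0.8M is a filtered subset of it.
```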
For pre-training, do you take the checkpoint with the best accuracy on the 30% test set, or the last-epoch one?
I use the last epoch. I found the test accuracy stable during training, so I simply pick the last checkpoint.
For imitation learning training, do you use the last-epoch (100th epoch) checkpoint for evaluation?
This one is a little tricky. Due to IL's distribution-shift problem, I found that the test performance varies greatly between epochs. It is also hard to decide which epoch to pick, since we do not have a validation environment. So for each pretrained weight, I evaluate the (i*10)-th checkpoints for i from 3 to 10 and report the highest success rate.
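Read as i ranging from 3 to 10 inclusive, the selection rule above evaluates these epochs per pretrained weight:

```python
# Checkpoints evaluated for each pretrained weight; the highest
# success rate among them is the number reported.
epochs = [i * 10 for i in range(3, 11)]  # i = 3 .. 10 inclusive
print(epochs)  # [30, 40, 50, 60, 70, 80, 90, 100]
```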
Hi, thanks for your interest in our work. You can find the frames here: OneDrive link.
After downloading, you can run
cat sega* > frames.zip
to get the zip file.
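The cat step works because the shell expands sega* in lexicographic order, so the parts concatenate back in sequence. A small self-contained sanity check of the same idea (the file names here are illustrative, not the actual part names on OneDrive):

```shell
# Split a file into 4-byte parts with the prefix "sega"
# (creates segaaa, segaab, segaac), then reassemble with cat
# exactly as in the download instructions above.
printf 'hello world' > original.bin
split -b 4 original.bin sega
cat sega* > rebuilt.bin          # glob expands in sorted order
cmp original.bin rebuilt.bin && echo OK
```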
Thank you very much! Another question about the downstream tasks: do you also use the DI-drive engine for imitation learning and collect data using the default settings? Also, which Carla version is used for training and evaluation?
Yes, I use the DI-drive engine for imitation learning, and the Carla version I use is 0.9.9.4.
Starting from the default settings, I change several things, including the camera setting (to match the pretrained image resolution).
Thanks. Could you provide the camera settings you used, including the size, position, and fov? Also, could you provide more details about the collection and evaluation suites? For example, do you use the default FullTown01-v1 suite and weather to collect data, and which evaluation suite and weather are used (straight / one turn / navigation / navigation with dynamics)? That would be very helpful.
Thanks a lot.
Thanks for your help. I appreciate it a lot.
Sorry to bother you again. I have a few questions about the training to confirm.
- Why do we have a rindex of 6510787 in the dataset? Is it used to fix the index problem in dir-65?
- There are 8,642,040 frames in total, and if we sample them at an interval of 10, there would be roughly 0.8M samples in total, which does not match 1.3M.
- For pre-training, do you take the checkpoint with the best accuracy on the 30% test set, or the last-epoch one?
- For imitation learning training, do you use the last-epoch (100th epoch) checkpoint for evaluation?
Thanks in advance.
Hi, I follow your instructions and train an agent (pretrained from ImageNet) using 4K data. However, the SR it achieves on the FullTown02-v2 suite is
Besides, do you plan to release the pre-calculated steer values for the uploaded 80K frames, or the code for the inverse dynamics model? If not, could you please share more details about the model structure so that I can implement and train it myself?
Also, as DI-drive only contains a PPO model with BEV input, could you provide your model file or model details for PPO training?
Thanks a lot!
Hi,
I sample the 10% data uniformly (e.g., data_4K = data_40K[::10]). Is this the same way you do it, or should I choose data_4K = data_40K[:4000]?
I reduce the dataset size at the trajectory level, i.e., data_40K[:4000]. Since redundancy exists between adjacent frames, reducing at the trajectory level creates a harder problem than reducing at the frame level (data_40K[::10]). So I think the performance gap you reported is expected.
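The two reductions can be contrasted with a toy stand-in (400 trajectories of 100 frames each; this grouping is illustrative, not the real dataset layout):

```python
# Toy stand-in: 40,000 "frames" tagged with their trajectory id.
data_40K = [(traj, frame) for traj in range(400) for frame in range(100)]

# Frame-level reduction: every 10th frame, all trajectories kept.
frame_level = data_40K[::10]

# Trajectory-level reduction: the first 4,000 frames, i.e. a few
# whole trajectories kept and the rest dropped entirely.
traj_level = data_40K[:4000]

print(len(frame_level), len(traj_level))   # 4000 4000
print(len({t for t, _ in frame_level}))    # 400 trajectories covered
print(len({t for t, _ in traj_level}))     # 40 trajectories covered
```

Both cuts keep 4,000 frames, but the trajectory-level one sees far fewer distinct scenes, which is why it makes the harder training problem described above.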
Do you plan to release the pre-calculated steer values?
Yes, I am working on this part and will release it in the future. Stay tuned.
PPO training.
I did not experiment much with the PPO model design. A ResNet-34 backbone is used to extract the visual feature. The feature then goes through an MLP and is concatenated with the velocity to form the encoder output.
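A shape-level sketch of that encoder wiring, in plain Python. The 512-dim feature matches ResNet-34's pooled output; the MLP width and the random stand-in weights are assumptions for illustration, not the trained model:

```python
import random

FEAT_DIM = 512   # ResNet-34 global-pooled feature size
MLP_DIM = 256    # hidden width: an assumption, not stated in the thread

random.seed(0)
# Stand-in MLP weights; in the real model these are learned.
W = [[random.gauss(0, 0.01) for _ in range(MLP_DIM)] for _ in range(FEAT_DIM)]

def encode(visual_feat, velocity):
    """visual_feat: FEAT_DIM floats from the backbone; velocity: a float."""
    hidden = [
        max(sum(visual_feat[i] * W[i][j] for i in range(FEAT_DIM)), 0.0)
        for j in range(MLP_DIM)
    ]                            # one MLP layer with ReLU
    return hidden + [velocity]   # concatenate velocity -> encoder output

feat = [random.gauss(0, 1) for _ in range(FEAT_DIM)]
print(len(encode(feat, 3.2)))    # 257 = MLP_DIM + 1
```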
Let's keep this issue only dataset-relevant. If you have any further questions about training, feel free to open a new one.
Hi, thanks for sharing your great work!
- In this issue, you mentioned that you picked the clips with a visual appearance close to Carla's, so what are the criteria for visual-appearance similarity? Does picking 0.8M frames bring better performance, or does it just save training cost?
- Did you follow the default suite of DI-Drive for IL Carla dataset generation?
Best,
Hi, thanks for your interest in our work.
- There are no clear criteria; I simply removed some driving videos with extreme weather to save training cost. More carefully designed measures would help, for example, calculating the feature distance between Carla frames and the frames of a particular video, and then sorting all the videos by that distance.
- Yes, I follow the default suite of DI-Drive for IL dataset generation, except for several settings mentioned here.
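The more careful measure suggested in the first bullet could be sketched like this; the toy 2-D features and video names are purely illustrative, and real per-frame features would come from a pretrained image encoder:

```python
import math

def mean_feat(frames):
    """Average a list of per-frame feature vectors."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_videos(carla_frames, videos):
    """Sort video names by feature distance to the Carla frame average."""
    carla_mean = mean_feat(carla_frames)
    return sorted(videos, key=lambda v: dist(mean_feat(videos[v]), carla_mean))

# Toy 2-D features: 'sunny_drive' sits closer to the Carla average.
carla = [[1.0, 0.0], [0.8, 0.2]]
videos = {"sunny_drive": [[0.9, 0.1]], "snow_storm": [[-1.0, 2.0]]}
print(rank_videos(carla, videos))  # ['sunny_drive', 'snow_storm']
```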
- We didn't conduct experiments comparing different dataset sizes. Indeed, with more diversity, the pre-trained large model could be even stronger.
- From the very start, DI-drive did not support Carla 0.9.9, and that is why we used an old version.
- Okay, thanks.
- Thanks for your information.