
Comments (4)

kpertsch commented on August 21, 2024

Hi! Thanks for your question!
Please note that validation accuracy (unfortunately) is usually a very poor metric for policy rollout success. In fact, we often find that policies with high training accuracy and potentially low validation accuracy work best when rolled out on the robot.
This is also the reason we are not logging validation stats in our train loop, since we didn't find them to be informative in the past.

So I would encourage you to still give your policy a try in rollouts and see whether it works.

As an aside: 85% action token accuracy still seems a bit low -- with small datasets I would expect 95%+ accuracy after finetuning, so you may want to check whether you can further improve accuracy by training longer or cleaning your training data.
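
For reference, the action token accuracy discussed here is the fraction of predicted action tokens that exactly match the ground-truth discretized action tokens, evaluated only at action positions. Below is a minimal PyTorch sketch of that computation; the function name, and the assumption that action tokens occupy vocabulary indices above some `action_token_begin_idx`, are illustrative rather than OpenVLA's exact code:

```python
import torch

def action_token_accuracy(logits: torch.Tensor,
                          labels: torch.Tensor,
                          action_token_begin_idx: int) -> float:
    """Fraction of predicted action tokens matching the ground truth.

    Assumes logits of shape (batch, seq_len, vocab) and labels of shape
    (batch, seq_len). Only positions whose label falls in the action-token
    region of the vocabulary (index > action_token_begin_idx) count; text
    and padding positions are masked out.
    """
    preds = logits.argmax(dim=-1)            # (batch, seq_len)
    mask = labels > action_token_begin_idx   # True only at action positions
    correct = (preds == labels) & mask
    return (correct.sum() / mask.sum().clamp(min=1)).item()
```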


JianJianHeng commented on August 21, 2024


Thank you for your reply. We will continue training to achieve higher accuracy and then give it a try.


object814 commented on August 21, 2024

Hello, Author,

I have a question I would like to ask you. We tried fine-tuning OpenVLA on our own dataset and added a validation step following the training procedure, but we found that while performance on the training set is excellent (an action accuracy of 0.85), performance on the validation set is very poor, only around 0.1. Have you encountered a similar situation during your training? Do you have any plans to add validation in the future?

Thank you again for your work.

Hi there,
We also split our custom dataset and converted it to RLDS, but, as the author pointed out, the training loop does not log any validation results. What changes did you make to visualize validation performance? We would like to check whether the same thing happens in our case. Thank you.


JianJianHeng commented on August 21, 2024


Hello,

I apologize for the delayed response; I have been attending to some personal matters recently. The original OpenVLA code contains no validation step. We created a new validation set ourselves, based on the dataset setting that marks examples as not belonging to the training split: per our TFDS configuration, 95% of the data is allocated to the training split and 5% to the validation split. We then wrote code to load this validation split and compute metrics such as loss and accuracy on it, mirroring the metrics from the training loop. The validation metrics were indeed very low, but this did not hurt effectiveness in real-world experiments.
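
For anyone wanting to reproduce this setup, here is a minimal sketch of carving a 95/5 split out of an RLDS-formatted dataset with TFDS's split-slicing API; the dataset name and data_dir are placeholders, and none of this ships with OpenVLA:

```python
import tensorflow_datasets as tfds

# Placeholder name/path for a custom dataset converted to RLDS.
train_ds = tfds.load("my_rlds_dataset", split="train[:95%]",
                     data_dir="/path/to/rlds")
val_ds = tfds.load("my_rlds_dataset", split="train[95%:]",
                   data_dir="/path/to/rlds")

# During training, periodically run the same loss / action-accuracy
# computation used for the train-step metrics over val_ds, with
# gradients disabled, e.g.:
#   model.eval()
#   with torch.no_grad():
#       for batch in val_loader: ...
```

Percentage slicing happens at load time, so the same on-disk RLDS conversion can serve both splits without duplicating data.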

