
Comments (4)

kpertsch commented on August 21, 2024

Hi! Thanks for your question!
Please note that validation accuracy (unfortunately) is usually a very poor metric for policy rollout success. In fact, we often find that policies with high training accuracy and potentially low validation accuracy work best when rolled out on the robot.
This is also the reason we are not logging validation stats in our train loop, since we didn't find them to be informative in the past.

So I would encourage you to still give your policy a try in rollouts and see whether it works.

As an aside: 85% action token accuracy still seems a bit low -- with small datasets I would expect 95%+ accuracy after finetuning, so you may want to check whether you can further improve accuracy by training longer or cleaning your training data.
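
For reference, the action token accuracy discussed here is the fraction of predicted action tokens that exactly match the ground-truth discretized action tokens, evaluated only at action positions. Below is a minimal PyTorch sketch of that computation; the function name, and the assumption that action tokens occupy vocabulary indices above some `action_token_begin_idx`, are illustrative rather than OpenVLA's exact code:

```python
import torch

def action_token_accuracy(logits: torch.Tensor,
                          labels: torch.Tensor,
                          action_token_begin_idx: int) -> float:
    """Fraction of predicted action tokens matching the ground truth.

    Assumes logits of shape (batch, seq_len, vocab) and labels of shape
    (batch, seq_len). Only positions whose label falls in the action-token
    region of the vocabulary (index > action_token_begin_idx) count; text
    and padding positions are masked out.
    """
    preds = logits.argmax(dim=-1)            # (batch, seq_len)
    mask = labels > action_token_begin_idx   # True only at action positions
    correct = (preds == labels) & mask
    return (correct.sum() / mask.sum().clamp(min=1)).item()
```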


JianJianHeng commented on August 21, 2024


Thank you for your reply. We will continue training to achieve higher accuracy and then give it a try.


object814 commented on August 21, 2024

Hello, Author,

I have a question I would like to ask you. We tried fine-tuning OpenVLA on our own dataset and added a validation step following the training procedure, but we found that while performance on the training set is excellent (an action accuracy of 0.85), performance on the validation set is very poor, only around 0.1. Have you encountered a similar situation during your training? Do you have any plans to add validation in the future?

Thank you again for your work.

Hi there,
We also split our custom dataset and converted it to RLDS, but, as the author pointed out, the training loop does not log any validation results. What changes did you make to visualize validation performance? We would like to check whether the same thing happens in our case. Thank you.


JianJianHeng commented on August 21, 2024


Hello,

I apologize for the delayed response; I have been attending to some personal matters recently. The original OpenVLA code contains no validation step. We created a new validation set ourselves, based on the dataset setting that marks examples as not belonging to the training split: per our TFDS configuration, 95% of the data is allocated to the training split and 5% to the validation split. We then wrote code to load this validation split and compute metrics such as loss and accuracy on it, mirroring the metrics from the training loop. The validation metrics were indeed very low, but this did not hurt effectiveness in real-world experiments.
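
For anyone wanting to reproduce this setup, here is a minimal sketch of carving a 95/5 split out of an RLDS-formatted dataset with TFDS's split-slicing API; the dataset name and data_dir are placeholders, and none of this ships with OpenVLA:

```python
import tensorflow_datasets as tfds

# Placeholder name/path for a custom dataset converted to RLDS.
train_ds = tfds.load("my_rlds_dataset", split="train[:95%]",
                     data_dir="/path/to/rlds")
val_ds = tfds.load("my_rlds_dataset", split="train[95%:]",
                   data_dir="/path/to/rlds")

# During training, periodically run the same loss / action-accuracy
# computation used for the train-step metrics over val_ds, with
# gradients disabled, e.g.:
#   model.eval()
#   with torch.no_grad():
#       for batch in val_loader: ...
```

Percentage slicing happens at load time, so the same on-disk RLDS conversion can serve both splits without duplicating data.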

