
ml-testing-accelerators's Introduction

IMPORTANT: This repository is being deprecated. Please migrate or onboard your ML tests to our new repository here.

ML Testing Accelerators

A set of tools and examples to run machine learning tests on ML hardware accelerators (TPUs or GPUs) using Google Cloud Platform.

This is not an officially supported Google product.

Getting Started

In this setup, your tests and/or models run on an automated schedule in GKE. Results are collected by the "Metrics Handler" and written to BigQuery.

  1. Install all of our development prerequisites.
  2. Follow the instructions in the deployments directory to set up a Kubernetes cluster.
  3. Follow the instructions in the images directory to set up the Docker image that your tests will run in.
  4. Deploy the metrics handler to Google Cloud Functions.
  5. Deploy the event publisher to your GKE cluster.
  6. See the templates directory for a Jsonnet template library used to generate test config files.
  7. (Optional) Set up a dashboard to view test results. See the dashboard directory for instructions.

Are you interested in using ML Testing Accelerators? E-mail [email protected] and tell us about your use case. We're happy to help you get started.

ml-testing-accelerators's People

Contributors

a9isha, aireenmei, allenwang28, aman2930, chandrasekhard2, darisoy, dependabot[bot], ericlefort, gagika, gkroiz, hyeygit, jackcaog, jonb377, jysohn23, khshah6, manfeibai, miaoshasha, rissyran, skye, sshahrokhi, ssusie, steventk-g, suexu1025, taylanbil, vanbasten23, will-cromar, wonjoolee95, yeounoh, zcain117, zpcore


ml-testing-accelerators's Issues

Direct links to dashboard tabs

Right now, I have to go to the dashboard URL first and then click on the pytorch-nightly tab. Being able to pin that tab's direct URL in my browser would be awesome.

pass/fail dashboard: boxes are different sizes

Different tabs have different-sized boxes representing pass/fail status, because the plot height and width scale with the number of tests and the number of days of data. Try to find some way to scale such that:

  1. boxes are square
  2. boxes are same size between tabs
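One way to satisfy both constraints is to fix a single cell edge length and derive each plot's dimensions from its data counts. A minimal sketch, assuming a pixel-based renderer (the `BOX_PX` constant and `plot_size` helper are hypothetical, not part of the dashboard code):

```python
# Hypothetical sizing helper: fix one box edge length so pass/fail cells
# render square and identical across tabs, regardless of test/day counts.
BOX_PX = 18  # assumed fixed cell size in pixels

def plot_size(num_tests, num_days, box_px=BOX_PX):
    """Return (width, height) in pixels for a pass/fail grid."""
    return num_days * box_px, num_tests * box_px
```

Because every tab derives its size from the same constant, boxes stay square and consistent even when one tab shows 5 tests and another shows 50.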

Add support for tagging failures in dashboard

It would be great if we could tag failures with causes such as infrastructure flakiness, model regressions, or erroneous test configs. It may also be useful to link to bugs or github issues. This would be useful both for record-keeping (e.g. tracking flakiness over time) and to communicate with teammates if an error has been triaged.

Sorting graphs in metrics history

I think we should continue to show any currently out-of-bounds metrics first in the page.

Aside from that, some kind of sorting would be nice for all the in-bounds metrics.

One idea is to sort by the standard deviation of the metric over the last N runs. This would give us a sense of which metrics are becoming increasingly unstable.
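The stddev idea above could be sketched as follows, assuming metric histories are available as lists of recent values (the `metric_histories` shape and function name are hypothetical):

```python
import statistics

# Hypothetical sketch: order metrics so the least stable ones (highest
# standard deviation over the last `n` runs) appear first on the page.
# `metric_histories` maps metric name -> list of values, oldest first.
def sort_by_instability(metric_histories, n=10):
    return sorted(
        metric_histories,
        key=lambda name: statistics.stdev(metric_histories[name][-n:]),
        reverse=True,
    )
```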

Hide inactive tests (option to toggle hide)

FR: The current dashboard seems to display every test stored in the metrics table. We should either have a toggle to hide inactive tests, or hide them by default and give an option to reveal them (this was a pretty nice feature of our legacy dashboard).
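A minimal sketch of the hide-by-default behavior, assuming each test's most recent run timestamp can be read from the metrics table (the `visible_tests` name, `max_age_days` cutoff, and `show_all` flag are all hypothetical):

```python
import datetime

# Hypothetical filter: a test is "inactive" if its most recent run is older
# than `max_age_days`; inactive tests are hidden unless `show_all` is set.
def visible_tests(last_run, now, max_age_days=14, show_all=False):
    """last_run maps test name -> datetime of its most recent run."""
    if show_all:
        return set(last_run)
    cutoff = now - datetime.timedelta(days=max_age_days)
    return {name for name, ts in last_run.items() if ts >= cutoff}
```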
