Comments (8)

jlewi commented on July 21, 2024

This is a great idea.

Would it be possible to merge the aspects proposed in this example with one of our existing examples, so that we can collectively work to produce a set of high quality examples?

We have two main examples in progress

  • GitHub issue summarization

    • This is already using Tensor2Tensor
    • For this example we'd like to illustrate a variety of aspects such as hyperparameter tuning, continuous training, etc.
    • But this is text data
  • For images we have #42 in progress

    • This is an E2E example using the mnist data
    • Could we combine that example with this example?

/cc @elsonrodriguez @yupbank

from examples.

cwbeitel commented on July 21, 2024

I think people should communicate about what their barriers are and, where there is a bit of tooling to be shared, break that out into tools/. I pointed @ankushagarwal to my launcher code yesterday and also got in touch with @texasmichelle; we're meeting up Friday to look through it and talk about the various issues.

My opinion is that examples should stay separated by application (and not be grouped by input or output modality or model type).

I'm not clear on how the issue summarization example is using t2t currently (it looks like it isn't).

Also, I'll note that the above examples, as far as I understand, are meant to be e2e in the sense of ending with serving that is accessible via a web page, which is a different sense of e2e from the batch data pipeline I'm demonstrating here.

If the launcher, job models, and utils for shipping workspace code were broken out into tools/, what would remain in this example would be only the docs and the t2t_usr_dir containing the data download utilities, the t2t Problem definition, and a convenience wrapper for t2t-decoder. This level of simplification could be shared by other examples that go the t2t route.

jlewi commented on July 21, 2024

Good solutions are a lot of work.

  • The one PR you mailed out is 30 files and covers only a subset of the items you listed
  • The tasks you listed in the comment are fairly substantial
  • Maintaining solutions over time to avoid bit rot is substantial work
  • We also need to ensure examples aren't specific to different Clouds/K8s distributions

All of which leads me to conclude that we will be much more successful if we can build a community of folks building and maintaining various examples.

The two examples I mentioned above have momentum, and I think they could be used to accomplish at least some of the goals of this issue.

For example, why couldn't we add a batch prediction component to either of those two examples?

cwbeitel commented on July 21, 2024

I think the best course of action is for me to continue hacking around with this example separately and communicate with @texasmichelle and @ankushagarwal about how much of the strategy I'm advocating here they're interested in incorporating. If they're interested in using enough of it then I'll just contribute to that example without loss of benefit to myself and it sounds like with increased benefit to this project. If it's something in between then we'll figure something out.

jlewi commented on July 21, 2024

I think it would be great if we could find a way to have more people working together to produce a small subset of high quality samples that can be used to highlight Kubeflow.

The core value of Kubeflow is that we make it easy to deploy and manage all the components needed to do ML.

So having a small set of samples that each highlight a bunch of those components allows us to tell a much better story than trying to build a new example to highlight each component.

Furthermore, incorporating new components into the samples should be much easier. For example, if you want to highlight inference you don't need to first create a sample to train the model because it already exists.

For example @elsonrodriguez is developing an example based on mnist to highlight the KVC Intel has been developing.

@yixinshi needs an example suitable for large scale batch inference especially with GPUs (kubeflow/kubeflow#251)

We have kubeflow/example-seldon to highlight serving with Seldon (GitHub Issue summarization is also using Seldon).

So right now we are missing samples that demonstrate a lot of things:

  • Distributed training with TFJob (GitHub Issue Summarization will probably do this)
  • Batch inference
  • Serving with TFServing with/without GPUs
  • A classification model that can be used for model analysis (see #56)
  • Combining multiple steps using a pipeline tool like Argo or Pachyderm

There is a list of possible examples here

Some other examples that might allow us to check a lot of boxes are

cwbeitel commented on July 21, 2024

I hear you. I'll talk to @texasmichelle tomorrow about how I can contribute there.

The video labeling and next-frame prediction problems are very interesting to me and I would like to work on that at some point.

Toward getting people more integrated we could start having weekly example developer meetings for ~45min or 1h where people go around and present progress, challenges, prototypes, etc.

jlewi commented on July 21, 2024

Website for object detection including tools and models
https://github.com/openimages/dataset

cwbeitel commented on July 21, 2024

Cool, so given the above and after the discussion today, it's clear to me that the above exercise is of value to the project but not currently as an additional example in this repository. I'm closing this issue and the related PR, and we'll have separate discussions about what individual pieces might be incorporated into existing examples, if at all. The air-tight front-facing kubeflow examples can always link to additional applications elsewhere.
