Comments (6)

tharvik commented on September 25, 2024

a custom task requires users to choose parameter values of learning rate, DP sensitivity, gradient clipping

that's clearly an issue: creating a Task is way too complex. there are so many fields, some very technical, some only relevant to image or tabular data, some even unused. it should be as easy as Task { id: 'titanic', model: ... } for the basic cases, with as many fields as possible having sane defaults.
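
To make the "sane defaults" idea concrete, here is a minimal sketch of what such a constructor could look like. Every name in it (`TrainingOptions`, `DEFAULT_TRAINING`, `defineTask`) and every default value is hypothetical and chosen for illustration; none of it is the current discojs-core API.

```typescript
// Hypothetical sketch: these names and default values are NOT taken from discojs-core.
interface TrainingOptions {
  learningRate: number;
  epochs: number;
  batchSize: number;
  // Privacy-related knobs stay unset unless the task author opts in.
  clippingRadius?: number;
  noiseScale?: number;
}

// Illustrative defaults only; real values would need to be chosen per model family.
const DEFAULT_TRAINING: TrainingOptions = {
  learningRate: 0.01,
  epochs: 10,
  batchSize: 32,
};

// What a task author has to write: only `id` and `model` are mandatory.
interface TaskInit<M> {
  id: string;
  model: M;
  training?: Partial<TrainingOptions>;
}

// What the rest of the code sees: every training option is resolved.
interface ResolvedTask<M> {
  id: string;
  model: M;
  training: TrainingOptions;
}

function defineTask<M>(init: TaskInit<M>): ResolvedTask<M> {
  return {
    id: init.id,
    model: init.model,
    // Explicit values override the defaults; spreading `undefined` is a no-op.
    training: { ...DEFAULT_TRAINING, ...init.training },
  };
}

// Basic case: nothing but an id and a model (a string stands in for a real model here).
const titanic = defineTask({ id: "titanic", model: "placeholder-model" });

// Advanced case: override a single field and keep the other defaults.
const tuned = defineTask({
  id: "titanic-tuned",
  model: "placeholder-model",
  training: { learningRate: 0.001 },
});
```

The basic case stays a one-liner, while technical fields such as gradient clipping only show up for users who actually need them.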

Furthermore, users have to upload their own model which requires coding to some extent.

I don't think we can really avoid that but a guide will indeed help a lot to do so.

from disco.

peacefulotter commented on September 25, 2024

Thanks for tagging me, I would love to brainstorm on / contribute to the task refactoring if needed.

tharvik commented on September 25, 2024

confusing is that the word "task"

yeah, that's way too generic. I think we're hitting the hardest problem in computer science.

I'm also realizing that there is probably some confusion in our discussion between the concept of a Task in the user interface, and the actual Task object in the code base.

oh right, thanks for noticing it. I was indeed only looking at it through discojs-core's Task class.

Talking about the user interface concepts, what do you think of:

  • Re-defining "Pre-defined tasks" as "Examples" or "Demo" or "Showcase"

"examples" sounds good, "demo" feels like a guided/tutorial experience, which is not the case, "showcase" is nice too. having a plural name (such as "examples") will probably help us a bit more (how can one talk about a specific element of a "showcase"?).

Talking about the user interface concepts, what do you think of:

  • Re-defining "Custom task" as "Training Session" or "Session" or "DisCollaborative" (this term is used in the homepage and I quite like it even though it's not used anywhere else). These terms feel more like a problem instance of a task and more specific

I really like "DisCollaborative"! "session" makes it feel temporary, which is not the case for most of these models (especially with continuous learning).

As such, the word task is not used anymore in the UI (but still available if needed). And from there we can define programming objects that match the higher-level concepts and that are coherent. What do you think?

yep, makes total sense. I'm keeping Task for now in the code (as I haven't really found another name yet), hopefully cleaning up will help get a clearer picture/name.

JulienVig commented on September 25, 2024

Related to #647

tharvik commented on September 25, 2024

Related to #647

indeed, moving my (now deleted) response from there

what's the purpose of tasks in Disco and what they are,

clearly agree, it's currently used as a global context object everywhere, with way too many fields. you can see a very early draft of a split of TrainingInformation @ https://github.com/epfml/disco/blob/647-split-tasks-tharvik/discojs/discojs-core/src/task/training_information.ts
I'll try a definition of what I think it should be, discussion very welcome: a Task is a Model with a description. it's what users of disco will participate in, by adding data (training) or predicting with.
for now, it also contains model-specific config (to move to Model init), dataset config (to move to the Model input types and dataset types) and network config (maybe showing which networks are currently running).
I also think that the TaskProvider functionality should be renamed to Task, and the old Task would more precisely be Config.
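
As a rough illustration of the definition above ("a Task is a Model with a description"), the split could look something like the following. All type and field names are hypothetical and only meant to show where each kind of configuration would end up; they are not the types from the linked draft or from discojs-core.

```typescript
// Hypothetical shapes illustrating the proposed split; not the actual discojs-core types.

// The model carries its own input/output types, so dataset-related
// configuration no longer needs to live on the task.
interface Model<Input, Output> {
  predict(input: Input): Output;
}

// "a Task is a Model with a description"
interface Task<Input, Output> {
  id: string;
  description: string;
  model: Model<Input, Output>;
}

// Network settings live in their own object instead of on the task.
// The field names and scheme values here are illustrative assumptions.
interface NetworkConfig {
  scheme: "federated" | "decentralized";
  serverUrl: string;
}

// Toy stand-in for a real model, only here to make the sketch runnable.
interface Passenger {
  age: number;
}

const toyModel: Model<Passenger, boolean> = {
  predict: (passenger) => passenger.age < 18,
};

const titanicTask: Task<Passenger, boolean> = {
  id: "titanic",
  description: "Predict whether a passenger survived",
  model: toyModel,
};

const network: NetworkConfig = {
  scheme: "federated",
  serverUrl: "https://example.org", // placeholder URL
};

const prediction = titanicTask.model.predict({ age: 10 });
```

With this shape, the task object itself only answers "what is this and which model does it use", and everything else is passed separately to whatever runs the training.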

especially what are the use cases for pre-defined tasks

pre-defined tasks are there to show off the possibilities of disco itself, but there is clearly no need to encumber discojs-core with them. they should only be available for trying out/demo purposes. once moved outside, they will effectively be the same as adding custom tasks.

and custom tasks

custom tasks would be for specific uses of disco (such as the various bilateral projects that are being developed at MLO).

JulienVig commented on September 25, 2024

I think what I find confusing is that the word "task" is usually used at a higher level in machine learning. For example, the first Google result for "machine learning tasks" talks about "binary classification", "regression", "clustering" etc. Similarly, for LLMs, tasks refer to problems like "question answering" or "summarizing".
In contrast, a Disco task is much more specific: as you said, it specifies the model, the config, the dataset, the distributed learning scheme etc.
Similarly,

a Task is a Model with a description. it's what users of disco will participate in, by adding data (training) or predicting with.

still feels too specific for the word "Task".

I'm also realizing that there is probably some confusion in our discussion between the concept of a Task in the user interface, and the actual Task object in the code base.

Talking about the user interface concepts, what do you think of:

  • Re-defining "Pre-defined tasks" as "Examples" or "Demo" or "Showcase"
  • Re-defining "Custom task" as "Training Session" or "Session" or "DisCollaborative" (this term is used in the homepage and I quite like it even though it's not used anywhere else). These terms feel more like a problem instance of a task and more specific

As such, the word task is not used anymore in the UI (but still available if needed). And from there we can define programming objects that match the higher-level concepts and that are coherent. What do you think?
