rust-ml / discussion

A space to discuss the future of the ML ecosystem in Rust.

discussion's Issues

Looking for PoC project using Video stream from a Drone

Hello from AeroRust WG

We are working on a Drone SDK for Parrot drones and we thought it would be nice to make a small PoC project with the video stream coming from the Drone.

We don't have anything particular in mind, however we do want to tie it with the SDK and have something working even if it's not something spectacular.

There is even a simulated environment built with Sphinx (in which I work) and one of our other members (cc @o0Ignition0o) has some hardware that we can test on.

We are looking into a VR headset for displaying only (for now) the video stream and it might be cool to have what the drone sees from the camera or something similar.

Future work

We would like to build a cool demo, starting with controlling the drone and going all the way to a person supervising the drone through the VR headset and taking control if they see fit.

For example: Drone flies towards something and the person intervenes and takes control of the drone before crashing.

Links

Taking off & landing inside of simulation: https://www.youtube.com/watch?v=t5ftrR-uZPE&ab_channel=LachezarLechev
Discord: https://discord.gg/RXNsMXc
More about AeroRust: AeroRust/WorkingGroup#6
SDK Arsdk-rs: https://github.com/AeroRust/arsdk-rs
Simulation Parrot-Sphinx: https://developer.parrot.com/docs/sphinx
Parrot: https://parrot.com

Disclaimer

We are in no way sponsored or backed (at least not yet) by Parrot. We are just hacking on this drone for fun and to get people interested in Aerospace and Rust (AeroRust 😉 )

[data-viz] What do we want to build ?

Data visualization (data-viz) is used for exploration, explanation, illustration, ...

This is a place to discuss:

what we want to build
what we have
what we need
what are the constraints
- inputs (csv, dataframe (arrow?), array (ndarray ?),...)
- output (jupyter notebook, console, file, web browser, ...)
- ...

Default and optional parameters

Related to the goal of defining a set of common ML traits as discussed in #1, another question is how to define default and optional parameters in ML models.

For instance, let's take as an example the LogisticRegression estimator, which takes the following parameters as input (with their default values),

regularization penalty: penalty="l2"
Tolerance for stopping criteria: tol=0.0001
inverse of the regularization strength: C=1.0
etc.

There are then multiple choices for initializing the model with some of these parameters,

1. Passing parameters explicitly + default method

// initialize the model with default parameters
let model = LogisticRegression::default();
// otherwise initialize all parameters explicitly
let model = LogisticRegression::new("l2", 0.001, 0.1);

From a quick look, this pattern (or something similar) appears to be used in the rusty-machine crate (e.g. here). For models with 10+ parameters passing all of them explicitly seems hardly practical.

2. Builder pattern

One could also use the builder pattern and either construct a LogisticRegressionBuilder or maybe even use it directly with LogisticRegression,

use some_crate::linear_model::logistic_regression::Hyperparameters;

// Other parameters are left at their default
let params = Hyperparameters::new()
                .penalty("l1")
                .tol(0.01);
// build a model with these parameters
let model = params.build();

This pattern is used by rustlearn as far as I can tell (e.g. here).

The advantage is that hyperparameters are already in a struct, which helps if you want to pass them between estimators or serialize them. The disadvantage is that it requires creating a builder for each model. Also, I find having multiple objects called Hyperparameters in the code base somewhat confusing (and it will definitely be an issue when searching the code for something).
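To make the trade-off concrete, here is a minimal, self-contained sketch of the builder approach. It is not the rustlearn API: the struct layout, the `c` setter and the `params` accessor are assumptions made for illustration, and the defaults are the ones listed earlier in this post.

```rust
/// Hypothetical hyperparameter container (names chosen for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct Hyperparameters {
    penalty: String,
    tol: f64,
    c: f64,
}

impl Hyperparameters {
    /// Start from the defaults: penalty="l2", tol=0.0001, C=1.0.
    pub fn new() -> Self {
        Hyperparameters { penalty: "l2".to_string(), tol: 0.0001, c: 1.0 }
    }

    /// Each setter consumes the builder and returns it, so calls can be chained.
    pub fn penalty(mut self, penalty: &str) -> Self {
        self.penalty = penalty.to_string();
        self
    }

    pub fn tol(mut self, tol: f64) -> Self {
        self.tol = tol;
        self
    }

    pub fn c(mut self, c: f64) -> Self {
        self.c = c;
        self
    }

    /// Consume the hyperparameters and produce an (untrained) model.
    pub fn build(self) -> LogisticRegression {
        LogisticRegression { params: self }
    }
}

/// Hypothetical model type that simply stores its hyperparameters.
#[derive(Debug)]
pub struct LogisticRegression {
    params: Hyperparameters,
}

impl LogisticRegression {
    pub fn params(&self) -> &Hyperparameters {
        &self.params
    }
}

fn main() {
    // Only the parameters we care about are set; `c` keeps its default.
    let model = Hyperparameters::new().penalty("l1").tol(0.01).build();
    println!("{:?}", model.params());
}
```

Because the setters take `self` by value, a partially configured `Hyperparameters` can also be cloned and reused across several estimators before calling `build`.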

3. Using struct update syntax

Possibly something around the struct update syntax, though I have not explored this topic too much.

struct LogisticRegression {
   penalty: String,
   tol: f64,
   C: f64
}

// implement the standard Default trait so that `..Default::default()` also works
impl Default for LogisticRegression {
    fn default() -> Self {
        LogisticRegression {
            penalty: "l2".to_string(),
            // same default as listed above
            tol: 0.0001,
            C: 1.0
        }
    }
}

// update one parameter, keep others as defaults
let model = LogisticRegression {tol: 0.1, .. LogisticRegression::default()};

(Note: I have not tried to compile it to see if this actually works)
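For reference, a self-contained variant of the snippet above that should compile on stable Rust. It assumes two small changes that are not in the original: the field `C` is renamed to `c` to follow Rust naming conventions, and the defaults come from the standard `Default` trait.

```rust
#[derive(Debug, Clone, PartialEq)]
struct LogisticRegression {
    penalty: String,
    tol: f64,
    c: f64, // `C` in the text; lowercased here (assumption, for naming conventions)
}

impl Default for LogisticRegression {
    fn default() -> Self {
        LogisticRegression {
            penalty: "l2".to_string(),
            tol: 0.0001,
            c: 1.0,
        }
    }
}

fn main() {
    // Override one field; the rest are taken from the default value.
    let model = LogisticRegression {
        tol: 0.1,
        ..LogisticRegression::default()
    };
    println!("{:?}", model);
}
```

One design caveat: struct update syntax only works where every field is visible to the caller, so for users of a library the hyperparameter fields would have to be `pub`.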

4. Other approaches

This topic was discussed at length in rust-lang/rfcs#323 and related RFCs, but I'm not sure what has been accepted so far or could be used on stable Rust now (or in the near future).

Comments would be very welcome. Please let me know if I missed something.

What do we want to build?

Welcome!

I created this repository as a discussion hub for the ML ecosystem in Rust, "following" a talk I gave at the Rust meetup in London (slides).

I do believe that Rust has great potential in this area, but to fully realize this potential we need to provide building blocks: we need to tackle those shared challenges that, once removed, will enable more and more people to just come to Rust and build what they want to build.

The three building blocks I do see as fundamental for an ML ecosystem are:

n-dimensional arrays;
dataframes;
an ML model interface.

I have spent the last year, when it comes to open-source contributions, enhancing n-dimensional arrays: direct contributions to ndarray, statistical routines on top of it (ndarray-stats) and tutorials to help people get into the Rust scientific ecosystem from Python, Julia or R. I do believe that ndarray is in more than good shape when it comes to fulfilling NumPy's role in the Rust ecosystem.

There is now movement as well when it comes to dataframes - a discussion is taking place at rust-dataframe/discussion#1 to explore use cases and potential designs. (The idea of opening this repository comes directly from this experiment of community-led design for dataframes).

Given that one of the two data structures that are usually consumed by ML models is ready (n-dimensional arrays) and the other one is baking (dataframes) I think it's time to start thinking about what to do with the ML-specific piece.

I don't want to steer the debate too much with the opening post (I'll chip in once the discussion starts), but the questions I'd like to see tackled are:

what use-cases could make Rust shine in the ML ecosystem?
what are the basic capabilities that have to be built to enable the usage of Rust for ML workloads?
how should we structure such a project? A core library with a few traits and a set of separate crates tackling different aspects? A large batteries-included scikit-learn equivalent?
why do you want to use Rust for ML?

Group organization and logistics

Hello!

Per some discussion in #1 What do we want to build?, I'm opening this thread as a place to start a discussion about what a Rust ML working group (which, at least for the moment, is unofficial) could look like, and as a place to coordinate tasks and structures, etc.

Based on some of the other WG structures, I think there are a few things in particular that might be worth some discussion. After there's some consensus built around those (and any other topics that might come up), either opening a PR with the new information on this repository's README or building a new rust-ml / wg coordination repository with that information might be in order. If there is a topic missed below that you would like to discuss, feel free to chime in! With any luck, once things here are worked out a little more, we can start breaking things down and begin digging into designing and writing code.

a) What is the initial plan for scope of work, and how can we break this down into smaller projects that can make decisions and coordinate within themselves? Machine learning is a really broad domain, so a well-stated initial scope of work and direction seems important. Scope creep or trying to start with an overwhelming number of goals rarely leads to success. There was some discussion in the aforementioned thread about putting together an outline for a Book, and then developing the ecosystem to match the work required to finish that.

Not to get too far ahead of ourselves in terms of breaking down a new group into even smaller groups, but considering how broad the field of ML is, having an idea of how to structure the different parts of that in a reasonable way seems important. Groups like Embedded seem to break things down into formal project teams, while others like Async and Secure Code seem to have kept things a little more informal in terms of structure. Personally, I think based on the breadth of the ML domain and scale we're currently at, a little bit of both is probably in order.

b) How can new contributors get involved? Based on the discussion thread, it seems like there's a decent amount of community interest, so making it clear that their contributions are appreciated and providing a clear path for making them seems like it would be valuable. This might include some guidance on marking issues with certain flags so that new contributors can easily find them, and guidance on what's expected in terms of submission content.

c) In what way and how often should the group expect to meet? Across existing WGs, it seems like there's a tendency towards weekly or biweekly meetings on Matrix, Zulip, or Discord servers at a specified time. Outside of that, GH issues like these seem standard for tracking discussions. I've personally found a quick progress update or summary of work posted at regular intervals (perhaps after meetings) to be really helpful in keeping track of things. That's a task I'd be willing to volunteer for, if others think it might be helpful.

d) What pre-existing components should be used, and what do we think should be built from scratch? There are a fairly large number of early-stage or abandoned Rust ML crates and projects. The ecosystem has seemed to center around ndarray, but there are certainly alternatives for other things where, instead of building something from scratch, an abandoned or external project could be forked, or included by building a shim layer/interop functions. On the other hand, some other projects (particularly ones with poor documentation or strange design choices) might not be worth the time and effort spent trying to grok them, and building things from the ground up might make more sense. Many Rust projects do a really good job at providing well thought-out interfaces for both people and code, and I know that's something that I'd like to see in any Rust ML project as well.

[neural-net-api] What do we want to build?

Following on from @LukeMathWalker's post here, I'm starting this issue to discuss and refine the details of a high level machine learning interface (API).

From an API perspective, I propose that the interface is modelled similarly to Keras - though not necessarily with the same naming convention.
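As a starting point for discussion, here is a rough sketch of what a Keras-style "stack of layers" interface could look like in plain Rust. Everything in it is hypothetical: the `Sequential` and `Dense` names are borrowed from Keras, the `constant` constructor exists only to keep the example deterministic, and there is no training, GPU support or real tensor type (a `Vec<f64>` stands in for one).

```rust
/// A fully connected layer: output = activation(W * input + b).
struct Dense {
    weights: Vec<Vec<f64>>, // shape: [units][inputs]
    bias: Vec<f64>,         // shape: [units]
    relu: bool,             // apply ReLU if true, identity otherwise
}

impl Dense {
    /// Hypothetical constructor: every weight is `w`, every bias is zero.
    fn constant(inputs: usize, units: usize, w: f64, relu: bool) -> Self {
        Dense {
            weights: vec![vec![w; inputs]; units],
            bias: vec![0.0; units],
            relu,
        }
    }

    /// Forward pass for a single input vector.
    fn forward(&self, input: &[f64]) -> Vec<f64> {
        self.weights
            .iter()
            .zip(&self.bias)
            .map(|(row, b)| {
                let z = row.iter().zip(input).map(|(w, x)| w * x).sum::<f64>() + b;
                if self.relu { z.max(0.0) } else { z }
            })
            .collect()
    }
}

/// A model defined as an ordered stack of layers, as in Keras.
struct Sequential {
    layers: Vec<Dense>,
}

impl Sequential {
    fn new() -> Self {
        Sequential { layers: Vec::new() }
    }

    /// Append a layer and return the model so calls can be chained.
    fn add(mut self, layer: Dense) -> Self {
        self.layers.push(layer);
        self
    }

    /// Feed the input through every layer in order.
    fn predict(&self, input: &[f64]) -> Vec<f64> {
        self.layers
            .iter()
            .fold(input.to_vec(), |x, layer| layer.forward(&x))
    }
}

fn main() {
    let model = Sequential::new()
        .add(Dense::constant(2, 3, 0.5, true))
        .add(Dense::constant(3, 1, 1.0, false));
    // Each hidden unit computes 0.5*1 + 0.5*2 = 1.5; the output sums three of them.
    println!("{:?}", model.predict(&[1.0, 2.0])); // prints [4.5]
}
```

A real design would more likely put layers behind a trait so that different layer types and backends can be mixed; a concrete `Dense` is used here only to keep the sketch short.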
When it comes to GPUs, from what I've seen there is some vendor segmentation within the ML space - for example, TensorFlow only works on CUDA-enabled video cards (see here). I'd be keen to focus on industry-standard and cross-platform solutions for neural networks - specifically the standards defined by the Khronos group - though obviously that would require some community discussion to determine the right path here.

Edit 6Apr2020:
Further to the above it looks like there is already some progress on creating a cross platform, web based method for accessing the GPU:

GFRS WebGPU - Native Rust (over Rust/C bindings for DirectX/Vulkan/Metal) implementation of the W3C WebGPU specification and W3C WebGPU Shader Language

EMU - A (soon to be) pure Rust based GPGPU abstraction over WebGPU for running SPIR-V compute kernels.

I'm also keen to understand what use cases for neural-net building others have at the minute and assist the community in moving this forwards.

What should we build?
