
Comments (6)

teucer commented on August 19, 2024

Extracting factors/variables would be very useful for our use case. As said, for prediction purposes we need to know them. Obviously they would be context-dependent, but it should be doable (?)

from formulaic.

matthewwardrop commented on August 19, 2024

Hi @teucer. Thanks for reaching out.

Extracting the different parts of the formula is pretty straightforward in Formulaic 0.2.x; e.g.: y, X = model_matrix("y ~ x + y + z", data). The ~ operator acts as a separator between the left- and right-hand sides.

As of the current main branch, with changes I merged a few minutes ago, the above will still work; but so will mm = model_matrix("y ~ x + y", data); mm.lhs; mm.rhs.

As for extracting the variables used, it's an interesting idea, and I've thought about doing it in the past. The only tricky thing is that to formulaic, in a term like log(y), log and y are both just variables. We could do some sleuthing to figure out which ones come from the dataframe/etc, if that is useful.
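
A minimal sketch of that kind of sleuthing, using only Python's standard library and not Formulaic itself: parse a single term as a Python expression, collect every name in it, and keep only the names that are also dataset columns. The helper name referenced_columns and the example column set are hypothetical, and the approach assumes the term string is valid Python (so it handles "log(y)" but not formula operators such as ":").

```python
import ast


def referenced_columns(term, columns):
    """Return the sorted names used in `term` that are also dataset columns.

    `term` is one factor expression, e.g. "log(y)"; `columns` is the
    collection of column names available in the dataset (an assumption here).
    """
    tree = ast.parse(term, mode="eval")
    # Every bare identifier in the expression, including function names like `log`.
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    # Keep only the identifiers that actually exist in the dataset.
    return sorted(names & set(columns))


if __name__ == "__main__":
    dataset_columns = {"x", "y", "z"}  # hypothetical dataframe columns
    print(referenced_columns("log(y)", dataset_columns))
    print(referenced_columns("x + log(y) * z", dataset_columns))
```

Here log is dropped because it is not a column, while y is kept. A dataset that really did have a column named log would be ambiguous under this approach, which is the tricky part described above.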

seanv507 commented on August 19, 2024

I would also request the ability to extract variable names. The particular issue I have is how factors are encoded as columns. Given y ~ factor1 + factor2 + factor1:factor2, it is not clear how to infer the columns used, and therefore how to extract the coefficient "names".
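
To make the question concrete, here is a hand-rolled sketch of how such a formula could expand into columns under treatment (reference-level) coding with an intercept. It is not Formulaic's implementation: the helper treatment_columns, the factor levels, and the "factor[T.level]" naming and ordering are all assumptions chosen for illustration.

```python
from itertools import product


def treatment_columns(levels):
    """List illustrative column names for y ~ f1 + f2 + f1:f2.

    `levels` maps exactly two factor names to their ordered levels; the
    first level of each factor is treated as the reference and dropped.
    """
    f1, f2 = list(levels)
    # One dummy column per non-reference level of each factor.
    dummies = {
        f: [f"{f}[T.{level}]" for level in lvls[1:]] for f, lvls in levels.items()
    }
    names = ["Intercept"] + dummies[f1] + dummies[f2]
    # The interaction contributes one column per pair of dummy columns.
    names += [f"{a}:{b}" for a, b in product(dummies[f1], dummies[f2])]
    return names


if __name__ == "__main__":
    example_levels = {"factor1": ["a", "b", "c"], "factor2": ["x", "y"]}
    for name in treatment_columns(example_levels):
        print(name)
```

The point of the sketch is that the number and names of the columns depend on the levels observed in the data, so they cannot be read off the formula string alone.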

matthewwardrop commented on August 19, 2024

Hi @seanv507, thanks for reaching out. There is already support for mapping features in the formula to columns and vice versa; see for example <output>.model_spec.[feature_names|feature_indices|structure] etc. This API needs to be ratified and documented, but support does exist. What was new in this issue was the request to peek inside the string name of features to extract the columns used in the dataset (e.g. "log(y)" should indicate that "y" was used). This is not (yet) implemented.

matthewwardrop commented on August 19, 2024

@teucer Do you have any thoughts on PR #145, which addresses this issue? I will likely merge it before the end of the week (after cleanups).

teucer commented on August 19, 2024

This is really cool, thx!
