Add model() and estimate() generics (open issue in generics)

r-lib commented on June 9, 2024

Add model() and estimate() generics

from generics.

Comments (8)

topepo commented on June 9, 2024

Those seem pretty reasonable but I think that it would be good to have the generic for estimate only involve the second argument to be less restrictive.

We also have the convention to make the first argument x whenever possible. That might seem like we are being uptight, but these generics are meant to be used in a broad context. For example:

estimate <- function(.data, .model){
  UseMethod("estimate")
}

I would want to use this on an existing model object (where the data have already been consumed). For example, I wrote an S package for Shewhart charts a long time ago and wanted a generic that would return estimates of the process mean and variance (mean and var weren't generic back then). This signature would preclude something like that.

Are you doing double dispatch on these two objects? If not, you wouldn't lose anything by using a single argument.
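A minimal sketch of the single-argument form, assuming a hypothetical shewhart_chart class (the class, its constructor, and the field name values are made up for illustration and do not come from any package mentioned in this thread). Because the generic dispatches only on x, a method can work on an object that already holds its data:

```r
# Generic with a single dispatch argument, as suggested in this thread.
estimate <- function(x, ...) {
  UseMethod("estimate")
}

# Hypothetical constructor for an object that has already consumed its data.
new_shewhart_chart <- function(values) {
  structure(list(values = values), class = "shewhart_chart")
}

# Method returning estimates of the process mean and variance.
estimate.shewhart_chart <- function(x, ...) {
  c(mean = mean(x$values), variance = var(x$values))
}

chart <- new_shewhart_chart(c(9.8, 10.1, 10.0, 10.3, 9.8))
chart_estimates <- estimate(chart)
```

No second argument is needed here, which is the case a two-argument signature such as estimate(.data, .model) would rule out.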

mitchelloharawild commented on June 9, 2024

Completely agree with the usage of estimate(). Actually I changed this for the same reason last week in tidyverts/fabletools@d34f1c7.

The only reason it was so restrictive was that I had originally planned to use this functionality only internally.

I don't feel too strongly about the name of the first argument; however, I think using .data will restrict the context of the generic in a beneficial way. In what scenarios do you anticipate estimate() to be used without data as the first argument? Consistent usage of this generic may result in less cognitive load for users.

topepo commented on June 9, 2024

"In what scenarios do you anticipate estimate() to be used without data as the first argument?"

One thing that I'd use it for is unsupervised methods. So if I have an object with PCA loadings, I'd use estimate(object, new_data = df) to get projections for new data points.
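As a rough illustration of the PCA case, here is a sketch that assumes the object is a base R prcomp fit and that the method simply delegates to predict(). The estimate.prcomp method is hypothetical and is not part of generics or stats:

```r
# Generic with x as the only dispatch argument.
estimate <- function(x, ...) {
  UseMethod("estimate")
}

# Hypothetical method: project new observations onto stored PCA loadings.
estimate.prcomp <- function(x, new_data, ...) {
  predict(x, newdata = new_data)
}

vars <- c("mpg", "disp", "hp")
pca <- prcomp(mtcars[, vars], center = TRUE, scale. = TRUE)

# Projections for "new" points (here, the first three training rows).
scores <- estimate(pca, new_data = mtcars[1:3, vars])
```

The first argument is the fitted object and the data arrive through a named argument, which matches the estimate(object, new_data = df) call described above.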

mitchelloharawild commented on June 9, 2024

Sure, x or object is fine. So generics for model(x, ...) and estimate(x, ...)?

How general do you think the documentation of their functionality should be? Should it distinguish functionality between these generics, or should the methods have the flexibility to use them inconsistently?

For example, in fable these generics would be used as follows:
model.tbl_ts(x, ...) trains multiple model definitions to data, where x is a tsibble, and ... are the model definitions.
estimate.tbl_ts(x, .model, ...) trains a single model definition to data, where x is a tsibble and .model is the model definition. ... is unused.
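The relationship between the two verbs can be sketched without fable. The version below is an assumption-laden stand-in: it uses plain data frames instead of tsibbles, treats a formula as the "model definition", and fits with lm(). None of this is the fabletools implementation.

```r
model <- function(x, ...) {
  UseMethod("model")
}

estimate <- function(x, ...) {
  UseMethod("estimate")
}

# estimate(): train a single model definition on the data.
# In this sketch a model definition is just a formula.
estimate.data.frame <- function(x, .model, ...) {
  lm(.model, data = x)
}

# model(): train every named model definition passed through `...`
# by calling estimate() once per definition.
model.data.frame <- function(x, ...) {
  definitions <- list(...)
  lapply(definitions, function(definition) estimate(x, definition))
}

fits <- model(mtcars, linear = mpg ~ wt, quadratic = mpg ~ wt + I(wt^2))
```

Here model() is a thin loop over estimate(), so the two generics differ only in how many definitions they accept.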

Having some recommended usage of these verbs would make it easier for users to learn their functionality, but it would also make them less flexible.

topepo commented on June 9, 2024

I'd suggest x for both. I wouldn't really get too specific about how we think that these should be used. I think that the doc files can give examples of what existing methods do.

mitchelloharawild commented on June 9, 2024

Sounds reasonable. I'll work on this a bit and make a PR.

hadley commented on June 9, 2024

I think it's most important that you give some thought to the type signature of the generic — i.e. what does it return? Does it return a data frame? A tibble? An object of the same type as x?

mitchelloharawild commented on June 9, 2024

The implementation in the fable series of packages is:

model(.data, ...)

Returns a mable object (a tibble with model attributes).
Rows of models are identified by groups of the input .data (keys+groups in a tsibble).
Columns of models are specified in the ....
Cells are the result from a call to estimate() with the appropriate data split and model definition.
summarise-esque semantics. Respects groups and reduces data into summary statistics (model fit parameters).
Input is a tsibble (tibble), output is a mable (tibble)
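The row/column/cell layout described above can be approximated in base R. This is only a sketch of the shape of the result: model_by_group() and estimate_split() are hypothetical helpers, a plain data frame with list-columns stands in for a mable, and a formula fitted with lm() stands in for a model definition.

```r
# Hypothetical stand-in for estimate(): one definition, one data split.
estimate_split <- function(data, definition) {
  lm(definition, data = data)
}

# Hypothetical stand-in for model(): one row per group,
# one list-column per model definition supplied in `...`.
model_by_group <- function(data, group, ...) {
  definitions <- list(...)
  splits <- split(data, data[[group]])
  out <- data.frame(key = names(splits), stringsAsFactors = FALSE)
  names(out) <- group
  for (name in names(definitions)) {
    # Each cell holds the fit for one (group, definition) pair.
    out[[name]] <- unname(
      lapply(splits, estimate_split, definition = definitions[[name]])
    )
  }
  out
}

fits <- model_by_group(mtcars, "cyl", linear = mpg ~ wt, null = mpg ~ 1)
```

The input has one row per observation and the output has one row per group, which is the summarise-like reduction mentioned in the list.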

estimate(.data, .model, ...)

Returns a model object (a list containing the model specification, response, transformation, and the fit object, i.e. the result from the model training method).
Input is a tsibble (tibble), output is a model (list)
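For illustration, a list with those four components could look like the sketch below. The function name estimate_sketch(), the class name model_sketch, and the use of a formula plus lm() are all assumptions made here; the actual fable objects are built differently.

```r
# Hypothetical constructor mirroring the four components named above.
estimate_sketch <- function(data, definition, transformation = identity) {
  # Take the left-hand-side variable of the formula as the response.
  response <- all.vars(definition)[1]
  # Apply the transformation to the response before fitting.
  data[[response]] <- transformation(data[[response]])
  structure(
    list(
      specification = definition,
      response = response,
      transformation = transformation,
      fit = lm(definition, data = data)
    ),
    class = "model_sketch"
  )
}

mdl <- estimate_sketch(mtcars, mpg ~ wt, transformation = log)
```

The return value is a plain list rather than a tibble, which matches the "output is a model (list)" line above.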

edit: Using estimate() is discouraged, but is exported to allow users access to the lower level objects if they're particularly inquisitive. It also makes the nest-map-unnest workflow better if they're uncomfortable with using model().

These functions dispatch on a data object, so if a similar approach is supported for cross-sectional modelling, not many more methods would be required. I therefore think the purpose of this generic may be less about consistent functionality and more about avoiding namespace conflicts.

You could also argue that estimate() should dispatch on .model rather than .data, which could make it easier to define model training methods. Currently fable keeps the model's training method in the R6 class for the model definition.
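Dispatching on .model is possible in S3 because UseMethod() accepts the object to dispatch on as its second argument. A minimal sketch, assuming two made-up definition classes (mean_definition and median_definition are hypothetical and unrelated to fable's R6 classes):

```r
# Generic that dispatches on the model definition instead of the data.
estimate <- function(.data, .model, ...) {
  UseMethod("estimate", .model)
}

# Hypothetical model definitions, identified only by their class.
mean_def <- structure(list(), class = "mean_definition")
median_def <- structure(list(), class = "median_definition")

# Each definition class supplies its own training method.
estimate.mean_definition <- function(.data, .model, ...) {
  mean(.data)
}

estimate.median_definition <- function(.data, .model, ...) {
  median(.data)
}

y <- c(1, 2, 10)
fit_mean <- estimate(y, mean_def)
fit_median <- estimate(y, median_def)
```

With this arrangement, adding a new model type means adding one method for its definition class, while the data argument can stay the same across methods.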
