Comments (8)
Those seem pretty reasonable, but I think that it would be good to have the generic for `estimate` only involve the second argument to be less restrictive. We also have the convention to make the first argument `x` whenever possible.
That might seem like we are being uptight, but these generics are meant to be used in a broad context. For example:
estimate <- function(.data, .model) {
  UseMethod("estimate")
}
I would want to use this on an existing model object (where the data have already been consumed). For example, I wrote an S package for Shewhart charts a long time ago and wanted a generic that would return estimates of the process mean and variance (`mean` and `var` weren't generic back then). This signature would preclude something like that.
Are you doing double dispatch on these two objects? If not, you wouldn't lose anything by using a single argument.
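To illustrate the single-dispatch point, here is a minimal sketch in R. The `estimate(x, ...)` signature, the `shewhart` class, and `new_shewhart()` are hypothetical names made up for illustration; none of them come from generics or from the S package mentioned above.

```r
# S3 generics dispatch on the class of the first argument only, so a
# one-argument signature loses nothing when double dispatch is not needed.
estimate <- function(x, ...) {
  UseMethod("estimate")
}

# Hypothetical constructor: a chart object that has already consumed its data.
new_shewhart <- function(values) {
  structure(list(values = values), class = "shewhart")
}

# Method for an existing model-like object; no separate data argument needed.
estimate.shewhart <- function(x, ...) {
  list(mean = mean(x$values), var = var(x$values))
}

chart <- new_shewhart(c(9.8, 10.1, 10.0, 10.3, 9.8))
est <- estimate(chart)
```

With a fixed `(.data, .model)` signature, a method like `estimate.shewhart()` would have to accept a data argument it has no use for.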
from generics.
Completely agree with the usage of `estimate()`. Actually, I changed this for the same reason last week in tidyverts/fabletools@d34f1c7.
The only reason it was so restrictive was that I had originally planned to use this functionality only internally.
I don't feel too strongly about the name of the first argument; however, I think using `.data` will restrict the context of the generic in a beneficial way. In what scenarios do you anticipate `estimate()` to be used without data as the first argument? Consistency in the usage of this generic may result in less cognitive load for the users.
from generics.
"In what scenarios do you anticipate `estimate()` to be used without data as the first argument?"
One thing that I'd use it for is unsupervised methods. So if I have an object with PCA loadings, I'd use `estimate(object, new_data = df)` to get projections for new data points.
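As a rough sketch of that PCA use case: the generic and the `estimate.prcomp()` method below are hypothetical and defined locally for illustration. Only `prcomp()` and its `predict()` method from the stats package are existing functions.

```r
estimate <- function(x, ...) {
  UseMethod("estimate")
}

# Hypothetical method: project new observations onto stored PCA loadings.
# predict() for prcomp objects centers and scales `newdata` with the stored
# values and then applies the rotation matrix.
estimate.prcomp <- function(x, new_data, ...) {
  predict(x, newdata = new_data)
}

train <- iris[1:100, 1:4]
new_points <- iris[101:150, 1:4]

pca <- prcomp(train, center = TRUE, scale. = TRUE)
projections <- estimate(pca, new_data = new_points)
```

Here the first argument is the fitted object, and the data only appear as `new_data`.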
from generics.
Sure, `x` or `object` is fine. So generics for `model(x, ...)` and `estimate(x, ...)`?
How general do you think the documentation of their functionality should be? Should it distinguish functionality between these generics, or should the methods have the flexibility to use them inconsistently?
For example, in fable these generics would be used as follows:
- `model.tbl_ts(x, ...)` trains multiple model definitions to data, where `x` is a tsibble and `...` are the model definitions.
- `estimate.tbl_ts(x, .model, ...)` trains a single model definition to data, where `x` is a tsibble and `.model` is the model definition. `...` is unused.
Having some recommended usage of these verbs would make it easier for users to learn their functionality, but it would also make them less flexible.
from generics.
I'd suggest `x` for both. I wouldn't really get too specific about how we think that these should be used. I think that the doc files can give examples of what existing methods do.
from generics.
Sounds reasonable. I'll work on this a bit and make a PR.
from generics.
I think it's most important that you give some thought to the type signature of the generic, i.e. what does it return? Does it return a data frame? A tibble? An object of the same type as `x`?
from generics.
The implementation in the fable series of packages is:
`model(.data, ...)`
- Returns a mable object (a tibble with model attributes).
- Rows of models are identified by groups of the input `.data` (keys + groups in a tsibble).
- Columns of models are specified in the `...`.
- Cells are the result from a call to `estimate()` with the appropriate data split and model definition.
- `summarise`-esque semantics. Respects groups and reduces data into summary statistics (model fit parameters).
- Input is a tsibble (tibble), output is a mable (tibble).
`estimate(.data, .model, ...)`
- Returns a model object (a list containing the model specification, response, transformation, and the fit object, which is the result from the model training method).
- Input is a tsibble (tibble), output is a model (list).
edit: Using `estimate()` is discouraged, but it is exported to allow users access to the lower-level objects if they're particularly inquisitive. It also makes the nest-map-unnest workflow better if they're uncomfortable with using `model()`.
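The rows-by-columns semantics described above can be sketched with base R alone. This is not fable's implementation: it uses a plain `data.frame` instead of a tsibble, formulas fitted with `lm()` instead of fable model definitions, and a hypothetical `.by` argument to name the grouping column.

```r
estimate <- function(x, ...) UseMethod("estimate")
model <- function(x, ...) UseMethod("model")

# Fit a single model definition (here simply a formula) to one data split.
estimate.data.frame <- function(x, .model, ...) {
  fit <- lm(.model, data = x)
  list(spec = .model, response = all.vars(.model)[1], fit = fit)
}

# One row per group, one list-column per named model definition in `...`.
# Each cell is the result of estimate() on that group's data split.
model.data.frame <- function(x, ..., .by) {
  defs <- list(...)
  splits <- split(x, x[[.by]], drop = TRUE)
  out <- data.frame(group = names(splits), stringsAsFactors = FALSE)
  for (nm in names(defs)) {
    out[[nm]] <- lapply(splits, function(d) estimate(d, .model = defs[[nm]]))
  }
  out
}

fits <- model(
  mtcars,
  linear = mpg ~ wt,
  quadratic = mpg ~ wt + I(wt^2),
  .by = "cyl"
)
```

`fits` has one row per value of `cyl` and one list-column per model definition, which mirrors the reduction of data into model fits described in the list above.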
These functions dispatch on a data object, so if a similar approach is supported for cross-sectional modelling, not many more methods would be required. So I think the purpose of this generic may be less about consistent functionality and more about avoiding namespace conflicts.
You could also argue that `estimate()` should dispatch on `.model` rather than `.data`, which could make it easier to define model training methods. Currently `fable` keeps the model's training method in the R6 class for the model definition.
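For what dispatching on `.model` could look like, here is a small sketch. `UseMethod()` accepts an optional second argument naming the object to dispatch on; the `mean_model` and `drift_model` classes are hypothetical plain S3 stand-ins, not fable's R6 model definitions.

```r
# Keep `.data` first in the signature, but dispatch on the class of `.model`.
estimate <- function(.data, .model, ...) {
  UseMethod("estimate", .model)
}

# Hypothetical model definition constructors.
mean_model <- function() structure(list(), class = "mean_model")
drift_model <- function() structure(list(), class = "drift_model")

# Each model definition class carries its own training method.
estimate.mean_model <- function(.data, .model, ...) {
  list(model = "mean", fit = mean(.data$y))
}

estimate.drift_model <- function(.data, .model, ...) {
  n <- nrow(.data)
  list(model = "drift", fit = (.data$y[n] - .data$y[1]) / (n - 1))
}

df <- data.frame(y = c(1, 3, 5, 7))
fit_mean <- estimate(df, mean_model())
fit_drift <- estimate(df, drift_model())
```

Under this design, adding a new model type means adding one `estimate` method for its class, rather than extending a method that dispatches on the data.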
from generics.