Comments (9)
One possible point of (approximate) alignment with the Xarray API is pydata/xarray#3894, which is about selecting with an iterable of variable names. This seems analogous to selecting nodes using subset.
from datatree.
I've routinely wanted something that says select these variable names from all nodes.
This is way too much typing for that:
dailies.map_over_subtree(lambda n: n[["KT", "eps", "chi"]])
Perhaps a DataTree.subset_nodes?
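As a rough sketch of what such a helper might do, here is a version that uses plain dicts in place of real tree nodes. The name subset_nodes comes from the suggestion above; the dict-of-dicts layout, the example values, and the choice to silently skip names a node does not have are all assumptions made for illustration, not datatree API.

```python
def subset_nodes(tree, names):
    """Keep only the requested variable names in every node.

    `tree` is a stand-in for a DataTree: a dict mapping a node path to a
    dict of {variable name: values}. Names that a node lacks are skipped
    (an assumption; a real implementation might raise instead).
    """
    wanted = list(names)
    return {
        path: {name: variables[name] for name in wanted if name in variables}
        for path, variables in tree.items()
    }


# Hypothetical data: two nodes with partly overlapping variables.
dailies = {
    "/": {"KT": [1.0], "eps": [2.0], "chi": [3.0], "depth": [4.0]},
    "/mooring": {"KT": [5.0], "salt": [6.0]},
}
subset = subset_nodes(dailies, ["KT", "eps", "chi"])
```

The point is only that the per-node lambda disappears from the call site; whether missing names should be skipped or raise is a design question left open here.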
from datatree.
@OriolAbril do you think these types of functions would be sufficient for ArviZ's use cases? From arviz-devs/arviz#2015 (comment):
dt[["posterior", "posterior_predictive"]] is not possible
getting a subset of the datatree that consists of multiple groups
This is what I'm suggesting subset do, or __getitem__.
applying a function to the variable x that is present in 3 out of 5 groups of the datatree.
I'm imagining enabling that via
dt.filter(lambda node: 'x' in node.variables).map_over_subtree(func)
Or we could potentially add an optional filterfunc argument to map_over_subtree.
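To make the second option concrete, here is a minimal sketch of a map_over_subtree that takes an optional filterfunc. It uses a toy Node class rather than the real DataTree, and the behaviour for non-matching nodes (kept unchanged so the tree structure survives) is an assumption, not something decided in this thread.

```python
class Node:
    """Toy stand-in for a tree node: a name, a dict of variables, children."""

    def __init__(self, name, variables=None, children=()):
        self.name = name
        self.variables = dict(variables or {})
        self.children = list(children)

    def subtree(self):
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.subtree()


def map_over_subtree(root, func, filterfunc=None):
    """Apply func to each node that passes filterfunc.

    Nodes that fail the filter are copied unchanged (assumed behaviour),
    so the returned tree has the same shape as the input.
    """
    if filterfunc is None or filterfunc(root):
        variables = func(root)
    else:
        variables = root.variables
    children = [map_over_subtree(c, func, filterfunc) for c in root.children]
    return Node(root.name, variables, children)


# Hypothetical tree: only "posterior" contains the variable "x".
dt = Node("root", {}, [
    Node("posterior", {"x": 1, "y": 2}),
    Node("prior", {"y": 3}),
])
doubled = map_over_subtree(
    dt,
    lambda node: {k: 2 * v for k, v in node.variables.items()},
    filterfunc=lambda node: "x" in node.variables,
)
```

Here func only runs on "posterior"; "root" and "prior" pass through untouched.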
from datatree.
I had not seen that issue, thanks @dcherian
from datatree.
I think that would cover everything, but I'll try to think of examples so that we can also have things to test on.
We could also provide functions in datatree/xarray/arviz to act as filterfunc for common cases. My main question when thinking about using filter is storing the results back. I guess a merge would do it? Maybe with some renaming happening in the process. It will probably be best to discuss this with some examples.
from datatree.
My main question when thinking about using filter is storing the results back.
Yes, that's the tricky bit, because if you want to return a tree then you might need to retain nodes for which filterfunc(node) is False in order to still have a valid tree structure afterwards...
For example:
def name_is_lowercase(node):
    return node.name == node.name.lower()

root = DataTree(name="a")
child = DataTree(parent=root, name="B")
grandchild = DataTree(parent=child, name="c")
root.filter(name_is_lowercase)
This would return nodes "a" and "c", but it couldn't automatically reconstruct them into a tree without also preserving node "B".
If .filter just returned an iterator of nodes then you wouldn't need to be able to rebuild a tree, but this might not be the most convenient for the user. This is why I would like to build these functions with some desired usage patterns in mind.
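The two options can be compared side by side in a small sketch. This uses a toy Node class, not the real DataTree, and the function names filter_nodes and filter_tree are hypothetical. The extra node "D" is added to the example only to show a branch that gets dropped entirely.

```python
class Node:
    """Toy stand-in for a tree node with a name and children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def subtree(self):
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.subtree()


def filter_nodes(root, filterfunc):
    """Option 1: a flat iterator of matching nodes, no tree to rebuild."""
    return (node for node in root.subtree() if filterfunc(node))


def filter_tree(root, filterfunc):
    """Option 2: a new tree with the matches plus any ancestors needed to reach them.

    Returns None when neither the node nor any descendant matches.
    """
    filtered = (filter_tree(child, filterfunc) for child in root.children)
    kept = [child for child in filtered if child is not None]
    if filterfunc(root) or kept:
        return Node(root.name, kept)
    return None


def name_is_lowercase(node):
    return node.name == node.name.lower()


root = Node("a", [Node("B", [Node("c")]), Node("D")])
flat_names = [n.name for n in filter_nodes(root, name_is_lowercase)]
tree_names = [n.name for n in filter_tree(root, name_is_lowercase).subtree()]
```

With option 1 the result is just "a" and "c". With option 2 node "B" is retained even though it fails the filter, because "c" is only reachable through it, while "D" disappears.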
from datatree.
I added a method to filter nodes based on some condition in #185
from datatree.
Finally started using DataTree intensively. I also find I am using map_over_subtree more often than I would like, and not only to subset some variables but also for use with .sel, .mean, ...
How would you feel about an accessor or something of the sort (.tree or .treemap, for example) that exposes all the methods (or a subset of commonly used ones) via map_over_subtree?
dt.map_over_subtree(lambda node: node.sel(dim="label"))
# would become
dt.tree.sel(dim="label")
# and the same for .map, .drop_sel, .mean and others
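One way such an accessor could work is by forwarding any method name to every node's data through map_over_subtree. The sketch below is an assumption about the mechanism, built on a toy Node class whose data is a plain Python string; TreeAccessor and the string payload are hypothetical and are not part of datatree.

```python
class Node:
    """Toy stand-in for a tree node holding one data object."""

    def __init__(self, name, data, children=()):
        self.name = name
        self.data = data
        self.children = list(children)

    def map_over_subtree(self, func):
        """Build a new tree whose data is func(node) for every node."""
        children = [child.map_over_subtree(func) for child in self.children]
        return Node(self.name, func(self), children)

    @property
    def tree(self):
        return TreeAccessor(self)


class TreeAccessor:
    """Forward any method call to the data of every node in the subtree."""

    def __init__(self, node):
        self._node = node

    def __getattr__(self, method_name):
        # Only reached for names not found normally; refuse private names
        # so that lookups of internals never recurse through here.
        if method_name.startswith("_"):
            raise AttributeError(method_name)

        def mapped(*args, **kwargs):
            return self._node.map_over_subtree(
                lambda node: getattr(node.data, method_name)(*args, **kwargs)
            )

        return mapped


# Strings stand in for datasets; .upper() stands in for .sel, .mean, etc.
dt = Node("root", "ab", [Node("child", "cd")])
upper = dt.tree.upper()
```

A real version would probably expose an explicit list of methods rather than forwarding everything, so that tab completion and docstrings keep working.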
from datatree.
Closing in favour of pydata/xarray#9342
from datatree.