Comments (4)
This is something I would love to have! Manually writing and updating TOML feels hackish and unreproducible at the moment. The `create`, `delete` and `update` syntax seems the best to me - I'd rather these operations be explicit.
from datasets.jl.
Love the questions being asked here, but I would add another related to:
> When creating a dataset it needs to be created within "some" data project. Presumably this would be the topmost project in the data project stack, or within a provided project if the project is supplied as the first argument.

Should data projects be made more transparent as well?
While I know the functions `DataSets.ActiveDataProject` & `DataSets.DataProject` are provided, I honestly did not think about the concept of a Data Project when first using this package. Maybe something in the Data REPL to show the active project, such as `(My Data Project) data>`, would make this more obvious. Maybe we could also provide a Data REPL command to list the ones `DataSets.jl` knows are available.
| command | alias | description |
|---|---|---|
| `projects` | `proj` | list all available data projects |
| `project $name` | `proj $name` | switch to `$name` data project |
from datasets.jl.
> Maybe something in the Data REPL to show the active project

We have this — I guess it's just badly named:
data> stack list
DataSets.StackedDataProject:
DataSets.ActiveDataProject:
(empty)
DataSets.TomlFileDataProject [/home/chris/.julia/datasets/Data.toml]:
📁 SomeDir => 302a6dd6-d9e1-4487-8919-c520f08165be
📄 SomeFile => 97633d9c-afa8-4437-abd9-320cb4fdb270
📁 TrueFX => aa21c966-563e-42fb-ac3d-edaa3bdf3652
📁 imagenet => e73ae172-eeb0-4417-b3e1-007d42918752
Alternatively, we could make `data> ls` just show the full stack in this format by default? (The downside there is that duplicate names can occur, with the topmost data project taking precedence. That is why I used the current format for `ls`, where deduplication has already happened.)
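
The precedence rule described here can be sketched in plain Julia. This is a toy model only: the `stack` vector, the project labels, the placeholder UUID strings, and the `resolve` helper are all hypothetical and do not use the DataSets.jl API.

```julia
# Toy model of a data project stack (plain Julia, NOT the DataSets.jl API).
# Projects are listed front-first; each maps dataset names to a placeholder UUID string.
stack = [
    "front" => Dict("SomeFile" => "uuid-front"),
    "back"  => Dict("SomeFile" => "uuid-back", "TrueFX" => "uuid-truefx"),
]

# Build a deduplicated name listing: the first project in the stack
# that defines a name wins, and later duplicates are ignored.
function resolve(stack)
    resolved = Dict{String,String}()
    for (_, datasets) in stack
        for (name, uuid) in datasets
            get!(resolved, name, uuid)  # only inserts if `name` is not already present
        end
    end
    return resolved
end

resolved = resolve(stack)
```

In this sketch `SomeFile` appears in both projects, but `resolved["SomeFile"]` comes from the front project, so a deduplicated listing hides the entry in the project further back. Showing the full stack instead would list both.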
Current data REPL docs do mention this, and the `stack` command is findable via tab completion. But clearly it should be more discoverable, somehow.
data> ?
DataSets Data REPL
====================
Press > to enter the data repl. Press TAB to complete commands.
Command            Alias     Action
–––––––––––––––––  ––––––––  –––––––––––––––––––––––––––––––––––––––––––––––––––
help               ?         Show this message
list               ls        List all datasets by name
show $name                   Preview the content of dataset $name
stack              st        Manipulate the global data search stack
stack list         st ls     List all projects in the global data search stack
stack push $path   st push   Add data project $path to front of the search stack
stack pop          st pop    Remove data project from front of the search stack
from datasets.jl.
This works perfectly. My tired eyes / brain just looked right over it. Thanks for clarifying!
from datasets.jl.