Comments (5)
These are obvious points, but I just want to confirm that we understand and are talking about the same things. We need to be able to set and get version metadata. It would be set at archive creation and archive update, and read at download (for example). Right now we want at least the dependencies for each version, but ultimately there could conceivably be arbitrary metadata for each version.
from datafs.
I am thinking this could be one implementation:
metadata = {
    'archive_name': 'Big Climate Data',
    'source': 'a big data climate instrument',
    'description': 'waves and how you can get stoked and barreled',
    'versions': [
        ('0.0.1', 'NOAA Charts from some time period'),
        ('0.1.0', 'Later NOAA Charts'),
        ('0.2.0', 'The most reliable NOAA Charts'),
    ],
}
from datafs.
Is this for the dependencies? I was thinking more like
{
    '_id': 'big_climate_data',
    'authority_name': 'osdc',
    'archive_path': '/big/climate/data',
    'versioned': True,
    'metadata': {
        'source': 'a big data climate instrument',
        'description': 'waves and how you can get stoked and barreled'},
    'versions': [
        # etc... stands for the other per-version fields
        {'version': '1.0', etc..., 'dependencies': [('arch2', '1.3'), ('arch7', '1.6.2a1')]},
        {'version': '0.8', etc..., 'dependencies': [('arch2', '1.2'), ('arch7', '1.4')]}
    ]
}
Does that make sense? Each archive depends on other archives. The version metadata might be extensible beyond version, checksum, algorithm, timestamp, author, and dependencies, but this seems like the minimum requirement.
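To make the proposed layout concrete, here is a minimal sketch of how dependencies could be read back from such a record. The helper name `get_dependencies` is hypothetical and not part of datafs, version identifiers are stored as strings, and the sketch assumes version entries are kept oldest-first so that the last entry is the latest.

```python
# Hypothetical archive record following the layout proposed above.
# Only the fields needed for the lookup are included.
record = {
    '_id': 'big_climate_data',
    'versions': [
        {'version': '0.8',
         'dependencies': [('arch2', '1.2'), ('arch7', '1.4')]},
        {'version': '1.0',
         'dependencies': [('arch2', '1.3'), ('arch7', '1.6.2a1')]},
    ],
}


def get_dependencies(record, version=None):
    """Return the dependency list for ``version`` (latest entry if None).

    Hypothetical helper, not an existing datafs function.
    """
    versions = record.get('versions', [])
    if not versions:
        return []
    if version is None:
        # Assumption: entries are stored oldest-first.
        return list(versions[-1].get('dependencies', []))
    for entry in versions:
        if entry['version'] == version:
            return list(entry.get('dependencies', []))
    raise KeyError('no such version: {}'.format(version))
```

Calling `get_dependencies(record)` returns the dependencies of the last entry, while `get_dependencies(record, '0.8')` returns those pinned for that specific version.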
from datafs.
On DataArchive, we'll need the following updates:
DataArchive.update
def update(
        self,
        filepath,
        cache=False,
        remove=False,
        bumpversion='patch',
        prerelease=None,
        dependencies=None,
        **kwargs):
    ...
    self._update_manager(checksum, kwargs, version=next_version, dependencies=dependencies)
DataArchive.open
def open(
        self,
        mode='r',
        version=None,
        bumpversion='patch',
        prerelease=None,
        dependencies=None,
        *args,
        **kwargs):
    ...
    updater = lambda *args, **kwargs: self._update_manager(
        *args, version=next_version, dependencies=dependencies, **kwargs)
    ...
DataArchive.get_local_path
similar to DataArchive.open
DataArchive._update_manager
def _update_manager(self, checksum, metadata=None, version=None, dependencies=None):
    # avoid sharing one mutable default dict across calls
    if metadata is None:
        metadata = {}
    # by default, reuse the dependencies of the most recent version
    if dependencies is None:
        history = self.history
        if len(history) == 0:
            dependencies = []
        else:
            dependencies = history[-1]['dependencies']
    ...
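The fallback rule in _update_manager can be tried out in isolation. The sketch below is only an illustration: `resolve_dependencies` is a hypothetical standalone function, `history` is assumed to be a list of per-version dicts ordered oldest-first, and it uses `.get` so that older entries without a 'dependencies' key fall back to an empty list (the snippet above indexes the key directly).

```python
def resolve_dependencies(history, dependencies=None):
    """Pick the dependencies to record for a new version.

    Hypothetical helper mirroring the fallback rule sketched above.
    """
    if dependencies is not None:
        # Explicit dependencies always win, even an empty list.
        return list(dependencies)
    if len(history) == 0:
        # First version of the archive: nothing to inherit.
        return []
    # Inherit from the most recent version. Entries written before
    # dependency tracking existed are treated as having none (assumption).
    return list(history[-1].get('dependencies', []))


history = [
    {'version': '0.8', 'dependencies': [('arch2', '1.2')]},
    {'version': '1.0', 'dependencies': [('arch2', '1.3')]},
]

inherited = resolve_dependencies(history)
explicit = resolve_dependencies(history, [('arch7', '1.4')])
cleared = resolve_dependencies(history, [])
```

Note that passing an empty list clears the dependencies, while passing None (the default) inherits them from the previous version. That distinction is why the check is `is None` rather than a truthiness test.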
from datafs.