zen - A Python Library for Interacting with Zenodo

Overview

zen is a Python library that provides a simple and intuitive way to interact with Zenodo, a popular research data repository. With zen, you can automate and streamline various tasks related to creating, managing, and exploring Zenodo depositions, all within your Python environment.

Features

  • Deposition Management: Easily create, retrieve, update, and delete Zenodo depositions from your Python code.

  • Local File Management: Handle local files and datasets, with built-in support for templating.

  • File Handling: Upload, download, and manage files associated with your Zenodo depositions.

  • Deposition Listings: Retrieve a list of depositions from your Zenodo account with various filtering options.

  • Integrity Checking: Automatically calculate checksums for the files in your depositions so their contents can be verified.

  • Interactivity with Zenodo API: Communicate with the Zenodo API seamlessly to access and manipulate your deposition data.
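
As an illustration of the integrity-checking idea, the sketch below computes a file checksum with Python's standard hashlib module. It is not zen's implementation: the function name md5_checksum and the chunk size are hypothetical, and MD5 is assumed here because it is the algorithm the Zenodo API reports for deposited files.

```python
import hashlib
from pathlib import Path


def md5_checksum(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing this value for a local file with the checksum recorded for the deposited copy is one way to confirm that an upload or download completed intact.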

Installation

You can install zen using pip:

pip install -e 'git+https://github.com/Open-Earth-Monitor/zen#egg=zen[full]'

Getting Started

Using zen in your Python project is straightforward. Here's a quick example of how to create a new Zenodo deposition:

from zen import Zenodo

# Initialize Zenodo with your API token
zen = Zenodo(url=Zenodo.sandbox_url, token="your_api_token")

# Create a deposition
dep = zen.depositions.create()

# Upload a file
dep.files.create('examples/file1.csv')

# Print the deposition ID
print(f"Deposition ID: {dep.id}")

# Discard the example deposition
dep.discard()

Managing local files

To associate a set of files with a Zenodo deposition, you can set up a local dataset. A local dataset makes it easy to track local changes and to manage uploads of large datasets. If the files are stored on a remote machine, zen downloads them temporarily just before uploading.

from zen import LocalFiles, Zenodo

# Create a dataset
ds = LocalFiles(['examples/file1.csv', 'examples/file2.csv'])
ds.save('examples/dataset.json')

# Load a saved dataset
ds = LocalFiles.from_file('examples/dataset.json')

# Get the dataset's deposition, creating one if none is defined yet
dep = ds.get_deposition(url=Zenodo.sandbox_url, token='your_api_token')

# Upload files to Zenodo
ds.upload(dep)

# Add more files to local dataset
ds.add(['examples/file3.csv'])
ds.save()

# Upload only new or modified files to Zenodo
ds.upload(dep)
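
The README does not say how zen decides that a file is new or modified. One common approach, sketched below with the standard library only, is to record a content digest per file in a manifest and upload only the files whose digest has changed. The names file_digest, files_to_upload, and record_uploaded, and the manifest layout, are hypothetical and not part of zen.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def _load_manifest(manifest_path):
    """Read the recorded digests, or return an empty dict if no manifest exists."""
    manifest_file = Path(manifest_path)
    return json.loads(manifest_file.read_text()) if manifest_file.exists() else {}


def files_to_upload(paths, manifest_path):
    """Return the paths whose content differs from the digest in the manifest."""
    recorded = _load_manifest(manifest_path)
    return [p for p in paths if recorded.get(str(p)) != file_digest(p)]


def record_uploaded(paths, manifest_path):
    """Store the current digests so later calls skip unchanged files."""
    recorded = _load_manifest(manifest_path)
    recorded.update({str(p): file_digest(p) for p in paths})
    Path(manifest_path).write_text(json.dumps(recorded, indent=2))
```

With this layout, a file missing from the manifest counts as new, and a file whose digest no longer matches counts as modified; both are returned by files_to_upload.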

Managing metadata

Metadata management is easy with zen. The package provides helper classes that fill in metadata fields and document every Zenodo metadata tag. zen also supports basic templating, so dataset descriptions can be generated automatically from templated metadata.

from zen.metadata import Dataset

# Create metadata for a dataset
meta = Dataset(
    title='My first dataset',
    description='The dataset description. Files from index {index_min} to {index_max}.'
)

# Add a creator
meta.creators.add('My Name')

# Create replacement values for the metadata placeholders
replacements = {'index_min': 1, 'index_max': 3}

# Render the template and update the metadata on Zenodo
dep.update(meta.render(replacements))
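
To make the placeholder mechanism concrete, here is a minimal sketch of the same idea using plain dictionaries and str.format. It is not how zen's render() is implemented; render_metadata is a hypothetical helper, and it assumes placeholders appear only in top-level string values.

```python
def render_metadata(metadata, replacements):
    """Return a copy of metadata with {placeholders} in string values filled in."""
    return {
        key: value.format(**replacements) if isinstance(value, str) else value
        for key, value in metadata.items()
    }


template = {
    "title": "My first dataset",
    "description": "Files from index {index_min} to {index_max}.",
}
rendered = render_metadata(template, {"index_min": 1, "index_max": 3})
print(rendered["description"])  # Files from index 1 to 3.
```

In this sketch a placeholder with no matching key raises KeyError, which surfaces an incomplete replacements dictionary before anything is sent to Zenodo.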

The replacements dictionary used to render the metadata can be derived from the local dataset itself, for example by extracting values from the filenames. There are two ways to do this: (1) provide a filename template that zen uses to parse the filenames, storing the extracted values as file properties; or (2) generate the filenames from that template.

  1. Providing a filename template

In this example, file properties will be extracted from filenames using the placeholder as a pattern.

# Create a template with 'index' placeholder
filename_template = 'file{index}.csv'
ds = LocalFiles(['examples/file1.csv', 'examples/file2.csv', 'examples/file3.csv'], 
                 template=filename_template)

print(ds.summary())
#... {'index_min': '1', 'index_max': '3'}

# Reuse the metadata template defined earlier and render it
replacements = ds.summary()
meta.render(replacements)
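
The parsing step can be pictured as turning the template into a regular expression. The sketch below is an assumption about the general technique, not zen's code: template_to_regex and summarize are hypothetical names, each placeholder is assumed to appear once in the template, and values are compared as strings.

```python
import re
from pathlib import PurePosixPath


def template_to_regex(template):
    """Turn 'file{index}.csv' into a regex with one named group per placeholder."""
    parts = re.split(r"\{(\w+)\}", template)
    # Even positions hold literal text, odd positions hold placeholder names.
    pattern = "".join(
        f"(?P<{part}>.+?)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    return re.compile(pattern)


def summarize(paths, template):
    """Collect each placeholder's values and report '<name>_min' and '<name>_max'."""
    regex = template_to_regex(template)
    values = {}
    for path in paths:
        match = regex.fullmatch(PurePosixPath(path).name)
        if match is None:
            continue  # the filename does not follow the template
        for name, value in match.groupdict().items():
            values.setdefault(name, []).append(value)
    summary = {}
    for name, found in values.items():
        # String comparison: '10' sorts before '9', so convert if numeric order matters.
        summary[f"{name}_min"] = min(found)
        summary[f"{name}_max"] = max(found)
    return summary


paths = ["examples/file1.csv", "examples/file2.csv", "examples/file3.csv"]
print(summarize(paths, "file{index}.csv"))
# {'index_min': '1', 'index_max': '3'}
```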

  2. Generating local files' filenames

In this example, file properties are generated along with the filenames by calling the expand() method. Successive calls to expand() combine all placeholder values as a Cartesian product.

# Create a template with 'index' placeholder
filename_template = 'file{index}.csv'
ds = LocalFiles.from_template(filename_template)

# Expand the index placeholder
ds.expand(index=[1,2,3])
ds.modify_url(prefix='examples/')

print([f.url for f in ds])
#... ['examples/file1.csv', 'examples/file2.csv', 'examples/file3.csv']

# Reuse the metadata template defined earlier and render it
replacements = ds.summary()
meta.render(replacements)
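
The Cartesian-product behaviour can be sketched with itertools.product. This is an illustration of the technique rather than zen's implementation: expand_template is a hypothetical function that takes all placeholders in a single call, whereas zen builds the combinations across successive expand() calls, and the second template and its values are made up for the example.

```python
from itertools import product


def expand_template(template, **placeholders):
    """Generate one filename per combination of the placeholder values."""
    names = list(placeholders)
    combos = product(*(placeholders[name] for name in names))
    return [template.format(**dict(zip(names, combo))) for combo in combos]


print(expand_template("file{index}.csv", index=[1, 2, 3]))
# ['file1.csv', 'file2.csv', 'file3.csv']

print(expand_template("{var}_{year}.tif", var=["ndvi", "evi"], year=[2020, 2021]))
# ['ndvi_2020.tif', 'ndvi_2021.tif', 'evi_2020.tif', 'evi_2021.tif']
```

The number of generated filenames is the product of the list lengths, so two placeholders with two values each yield four files.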

Documentation

For detailed usage and additional examples, please refer to the zen documentation.

Contributing

We welcome contributions! If you would like to contribute to the zen library, please see our Contributing Guide for more information.

License

© OpenGeoHub Foundation, 2023-2024. Licensed under the MIT License.

Acknowledgements & Funding

This work is supported by OpenGeoHub Foundation and has received funding from the European Commission (EC) through the projects:
