Librarian

Librarian is a simple Python script that reads a YAML catalog of items, scans a source directory for files whose names match the regexes in the catalog, and copies each match to a destination. It is useful for curating a directory and automatically copying files off to a different location.

Usage:

Basic usage is as follows:

$ ./librarian.py -h
Usage:
    librarian.py --source=<source> --destination=<destination> [options]

Options:
    --catalog=<catalog>     Catalog file [Default: catalog.yaml]
    --dryrun                Don't actually do anything
    -h --help               Show help

Features

  • YAML catalog with full regex support
  • Support for regex group matching (see examples)
  • Support for file deletions after completion

Examples

Basic usage

Sample catalog.yaml

---
test1:
  target: 'test_folder'
  regex: 'testing_\d'
different:
  target: 'diff/sub'
  regex: '.*different.+'

And the directory structure

.
├── after
│   ├── diff
│   │   └── sub
│   └── test_folder
└── before
    ├── dasdf
    ├── different_file
    ├── testing_1
    ├── testing_2
    └── testing3

And the script output in dryrun mode, which shows what it would do. Notice that the files testing3 and dasdf are ignored because they do not match anything in the catalog.

$ ./librarian.py --source ~/tmp/before --destination ~/tmp/after --dryrun
Matched testing_1 copying to /home/pgmcneil/tmp/after/test_folder/testing_1
Matched testing_2 copying to /home/pgmcneil/tmp/after/test_folder/testing_2
Matched different_file copying to /home/pgmcneil/tmp/after/diff/sub/different_file
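
The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not the script's actual code: the function name `plan_copies` is hypothetical, the catalog is written as a plain dict instead of being loaded from catalog.yaml, and it assumes the regex is anchored at the start of the file name and that the first matching catalog entry wins.

```python
import re
from pathlib import PurePosixPath

# Catalog from the example above, written as a plain dict
# (the real script reads it from catalog.yaml).
CATALOG = {
    "test1": {"target": "test_folder", "regex": r"testing_\d"},
    "different": {"target": "diff/sub", "regex": r".*different.+"},
}


def plan_copies(catalog, filenames, destination):
    """Return (filename, destination_path) pairs for every catalog match."""
    plan = []
    for name in filenames:
        for entry in catalog.values():
            # Assumption: the pattern is matched from the start of the name.
            if re.match(entry["regex"], name):
                dest = PurePosixPath(destination) / entry["target"] / name
                plan.append((name, str(dest)))
                break  # Assumption: the first matching entry wins.
    return plan


files = ["dasdf", "different_file", "testing_1", "testing_2", "testing3"]
for name, dest in plan_copies(CATALOG, files, "/tmp/after"):
    print(f"Matched {name} copying to {dest}")
```

Nothing is copied here; the sketch only builds the list of planned copies, which is roughly what a dry run reports.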

Regex group matching

Using the same example as above, but let's say we want finer control over the destination folder.

Sample catalog.yaml

---
test1:
  target: 'test\1_folder'
  regex: 'testing_(\d)'

And the directory structure:

.
├── after
│   ├── test1_folder
│   └── test2_folder
└── before
    ├── testing_1
    └── testing_2

And the dryrun output of the script. Notice how the digit captured from the file name is substituted into the folder name using the group matching notation \1.

$ ./librarian.py --source ~/tmp/before --destination ~/tmp/after --dryrun
Matched testing_1 copying to /home/pgmcneil/tmp/after/test1_folder/testing_1
Matched testing_2 copying to /home/pgmcneil/tmp/after/test2_folder/testing_2
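
One way to get this behaviour in Python is `Match.expand` from the standard `re` module, which substitutes captured groups into a template string. The sketch below only illustrates the idea: `resolve_target` is a hypothetical helper, and the script itself may perform the substitution differently.

```python
import re


def resolve_target(regex, target, filename):
    """Expand backreferences such as \\1 in target using groups from filename."""
    match = re.match(regex, filename)
    if match is None:
        return None  # The file does not match this catalog entry.
    return match.expand(target)


for name in ["testing_1", "testing_2", "testing3"]:
    print(name, "->", resolve_target(r"testing_(\d)", r"test\1_folder", name))
```

With the catalog entry above, testing_1 resolves to test1_folder and testing_2 to test2_folder, while testing3 does not match and yields None.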

ToDo:

  1. Add support for moving files
  2. Add support for entire directories, most likely a flag in the catalog too
  3. MD5/SHA1/SHA256 checksum before deleting?
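
For the checksum idea in item 3, a possible approach is to hash the source file and its copy with `hashlib` and delete the source only if the digests agree. This is a sketch of one option, not something the script currently does; the names `file_digest` and `copies_match` are hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source, copy, algorithm="sha256"):
    """Return True when source and copy have identical digests."""
    return file_digest(source, algorithm) == file_digest(copy, algorithm)
```

A caller would check `copies_match(source, copy)` after copying and skip the deletion when it returns False. The `algorithm` argument accepts names such as "md5", "sha1", or "sha256".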
