
ifiscripts's Introduction

This repo is now archived and is no longer maintained. The maintained, live repo is at https://github.com/Irish-Film-Institute/IFIscripts and the full documentation is at http://ifiscripts.readthedocs.io/en/latest/index.html


Introduction

Summary

These scripts facilitate collections management workflows within the IFI Irish Film Archive. They have been tested on OSX, Windows 7 and 10, and Ubuntu 14.04, 16.04 and 18.04. They are hosted on GitHub: https://github.com/Irish-Film-Institute/IFIscripts

Most of the scripts are compatible with Python 3.7, but some still run only on Python 2.7. Most scripts take either a file or a directory as their input, for example makeffv1.py filename.mov or premis.py path/to/folder_of_stuff. (It's best to just drag and drop the folder or file into the terminal, as this provides the absolute path.)

We want the project to be as reusable as possible in different institutions and contexts. Some scripts, particularly anything to do with Object Entry or Accessioning, will be quite IFI specific, but other scripts such as makeffv1.py, dcpaccess.py and many others have been used in a variety of contexts in several different countries.

The project uses the MIT license, and we encourage the reuse, modification and study of the scripts. It's always nice to hear when the scripts have been reused in some way, but it's not necessary to let us know.

Purpose

These Python scripts facilitate many of our collections management procedures for digitised and born-digital objects in the Irish Film Institute. We use a lot of open source tools, so we wanted to make these scripts as open as possible, which is why this project uses the MIT License.

The Irish Film Institute has followed the SPECTRUM museum collections management standard for several years. These scripts attempt to follow SPECTRUM procedures while also utilising some of the concepts of the Open Archival Information System (OAIS). Initially the scripts only handled single video files, but they are now capable of handling:

  • Digital Cinema Packages
  • XDCAM cards
  • DPX/TIFF image sequences
  • Documents (.doc, .pdf etc)
  • Images (.jpg, .TIFF etc)

An example workflow might be:

  • A digital object is created or acquired by the IFI, and ingest begins.
  • sipcreator.py is run on the object. This script:
    • generates an Object Entry identifier (eg OE-1234)
    • creates a folder structure for logs, metadata, objects
    • generates a UUID, extracts technical metadata
    • generates an MD5 checksum manifest
    • performs further steps, which are described in the usage section.
  • All of these preservation events are logged in a log file located in the logs directory. This log file tries to use PREMIS (PREservation Metadata Implementation Strategies) terminology as much as possible.
  • Even though the package has yet to be accessioned, temporary backups are required. copyit.py will generate backups, and it will use the checksum manifest generated by sipcreator.py to verify the integrity of the file transfer.
  • If the package contains FFV1 or Matroska files, ffv1mkvvalidate.py can be run. It uses MediaConch to check the compliance of the files and stores the results in the log file.
  • If the package passes our Quality Control Procedures, then it will be accessioned. accession.py will generate an accession number, rename the OE number with the accession number, generate a SHA-512 manifest and update the log file to document these new preservation events.
  • A large batch of items can be accessioned using batchaccession.py. If the -pbcore command line argument is used with the accessioning scripts, technical metadata based on the PBCore standard will be generated in CSV format. This process can be run separately by using makepbcore.py. CSV was chosen instead of XML because it lets us import the metadata straight into our database system, giving us item-level records.
  • Access copies may be needed, so low-res watermarked proxies can be generated with bitc.py, or high res mezzanines with prores.py.
  • The accessioned package can then be written to preservation storage, again using the copyit.py command.

So this is just one way of using the scripts from acquisition to preservation storage, but there are many other scripts for specific workflows, which you can investigate further down in the documentation.
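
As a rough illustration of the first steps in the workflow above, here is a minimal sketch that creates a package folder named after a newly generated UUID, with logs, metadata and objects subfolders. The function name create_package and the exact nesting (Object Entry folder, then UUID folder) are assumptions made for this example. This is not the implementation used by sipcreator.py.

```python
import uuid
from pathlib import Path


def create_package(parent_dir, oe_number):
    """Create <parent_dir>/<oe_number>/<uuid>/{logs,metadata,objects}.

    The nesting shown here is an assumption for illustration only.
    """
    package_uuid = str(uuid.uuid4())
    package_root = Path(parent_dir) / oe_number / package_uuid
    for subfolder in ("logs", "metadata", "objects"):
        # parents=True also creates the OE and UUID folders on first use.
        (package_root / subfolder).mkdir(parents=True, exist_ok=True)
    return package_root
```

Calling create_package("/path/to/output", "OE-1234") returns the path of the new UUID folder, which the later steps could then fill with objects, metadata and logs.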

Table of Contents

installation contributing usage credits

ifiscripts's People

Contributors

ablwr, anjamahler, aoifefitz2016, briancash, kieranjol, mcampos-quinn, raecasey, wendriftwood

ifiscripts's Issues

revtmd - audio tracks

implement track count and generate appropriate number of audio tracks in the revtmd technical section

makeffv1 - specify ffmpeg log names

If ffmpeg decides the names, the log from a first process can be overwritten if a second process starts in the same second as the first.
Use the FFREPORT environment variable to set the name, which should be easy on Unix, not so much on Windows :{
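
A minimal sketch of one way to do this from Python, assuming ffmpeg is on the PATH. FFREPORT accepts colon-separated key=value pairs such as file= and level=, so the colon in a Windows drive letter has to be escaped. The helper names ffreport_env and run_ffmpeg_with_log are hypothetical, and the escaping has not been checked against every ffmpeg build.

```python
import os
import subprocess


def ffreport_env(log_path, level=32):
    """Return a copy of the environment with FFREPORT pointing at log_path."""
    # ':' separates FFREPORT options, so a drive-letter colon must be escaped.
    # Forward slashes avoid having to escape backslashes as well.
    escaped = str(log_path).replace("\\", "/").replace(":", "\\:")
    env = os.environ.copy()
    env["FFREPORT"] = "file={}:level={}".format(escaped, level)
    return env


def run_ffmpeg_with_log(ffmpeg_args, log_path):
    """Run ffmpeg with its report written to a caller-chosen log file."""
    command = ["ffmpeg"] + list(ffmpeg_args)
    return subprocess.call(command, env=ffreport_env(log_path))
```

Because the caller chooses the log name, it can include something unique such as the input filename, so two processes starting in the same second no longer collide.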

revtmd: More specific xml parsing

Right now I'm using indexes[1], so something XPath-based, like '//16mm Telecine' or '//Quality Assesment', might be better.
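
A hedged sketch of what selecting by content rather than by position could look like with the standard library. The XML below is an invented stand-in, not the real revtmd structure, and the element names process, role and manufacturer are assumptions. Element names cannot contain spaces, so the label is matched as the text of a child element instead.

```python
import xml.etree.ElementTree as ET

# Invented sample document, used only to demonstrate the lookup.
SAMPLE = """
<codingProcessHistory>
  <process><role>16mm Telecine</role><manufacturer>ExampleCo</manufacturer></process>
  <process><role>Quality Assessment</role><manufacturer>OtherCo</manufacturer></process>
</codingProcessHistory>
"""


def find_process(root, role):
    """Return the first <process> whose <role> child has the given text."""
    # ElementTree supports the [child='text'] predicate from XPath.
    return root.find(".//process[role='{}']".format(role))
```

With this approach the lookup keeps working if the order of the process elements changes, which an index such as indexes[1] would not.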

revtmd - date of last cleaning

Part of the cleaning/maintenance protocol should be recording the date of deck cleaning somewhere in the revtmd coding process history. Not sure where; settings sounds wonky, but I'm not sure where else it could go.

bitc.py - broken on windows


PS C:\Users\kieran.oleary> python .\bitc.py C:\Users\kieran.oleary\Documents\inter.mov
C:\Users\kieran.oleary\Documents\inter.mov
inter.mov
True
single file found
['inter.mov']
inter.mov
576.0
786.0
Windows
25/1
drawtext='fontfile=C\:\\Windows\\Fonts\\'arial.ttf':fontcolor=white:fontsize=48.0:timecode=01\:00\:00\:00:        rate=2
5/1:x=(w-text_w)/2:y=h/1.2:boxcolor=0x000000AA:box=1,        drawtext='fontfile=C\:\\Windows\\Fonts\\'arial.ttf':fontcol
or=white:text='INSERT WATERMARK TEXT HERE':        x=(w-text_w)/2:y=(h-text_h)/2:fontsize=41.1428571429:alpha=0.4
01\:00\:00\:00
25/1

ffmpeg version N-76347-gdd36749 Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 5.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfi
g --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --
enable-libdcadec --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libm
p3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --en
able-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --ena
ble-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enabl
e-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-lzma --enable-decklink --enable-zlib
  libavutil      55.  5.100 / 55.  5.100
  libavcodec     57. 12.100 / 57. 12.100
  libavformat    57. 11.100 / 57. 11.100
  libavdevice    57.  0.100 / 57.  0.100
  libavfilter     6. 14.101 /  6. 14.101
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.100 /  2.  0.100
  libpostproc    54.  0.100 / 54.  0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'inter.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 512
    compatible_brands: qt
    encoder         : Lavf57.11.100
  Duration: 00:00:01.00, start: 0.000000, bitrate: 251669 kb/s
    Stream #0:0(eng): Video: v210 (v210 / 0x30313276), yuv422p10le, 786x576, 250675 kb/s, 25 fps, 25 tbr, 12800 tbn, 128
00 tbc (default)
    Metadata:
      handler_name    : DataHandler
      encoder         : Lavc57.12.100 v210
    Stream #0:1(eng): Audio: pcm_s16le (sowt / 0x74776F73), 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      handler_name    : DataHandler
[drawtext @ 00000000004e0ee0] Unable to parse option value "1,        drawtext=fontfile=C" as boolean
    Last message repeated 1 times
[drawtext @ 00000000004e0ee0] Error setting option box to value 1,        drawtext=fontfile=C.
[Parsed_drawtext_0 @ 00000000004db420] Error applying options to the filter.
[AVFilterGraph @ 00000000004e36e0] Error initializing filter 'drawtext' with args 'fontfile=C\:\\Windows\\Fonts\\arial.t
tf:fontcolor=white:fontsize=48.0:timecode=01\:00\:00\:00:        rate=25/1:x=(w-text_w)/2:y=h/1.2:boxcolor=0x000000AA:bo
x=1,        drawtext=fontfile=C:\Windows\Fonts\arial.ttf:fontcolor=white:text=INSERT WATERMARK TEXT HERE:        x=(w-te
xt_w)/2:y=(h-text_h)/2:fontsize=41.1428571429:alpha=0.4'
Error opening filters!
PS C:\Users\kieran.oleary>


revtmd - FPS and TODO

Add multiple possibilities for FPS.
Exposure pushed to the point of noise
condition info - how to represent?
Specify how much audio level was adjusted by.

as11fixity - Wishlist - Add emailer

As these fixity checks can take a long time, it should email the csv once it has completed.
This also means that we can open the reports on our office computers.
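
One possible shape for the emailer, sketched with the standard library only. The function names, the subject line and the default SMTP host and port are all assumptions. A real deployment would need the institution's own mail server details and probably authentication.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_report_email(csv_path, sender, recipient):
    """Build an email with the CSV report attached (nothing is sent here)."""
    csv_path = Path(csv_path)
    message = EmailMessage()
    message["Subject"] = "Fixity report: {}".format(csv_path.name)
    message["From"] = sender
    message["To"] = recipient
    message.set_content("The fixity check has completed. The CSV report is attached.")
    message.add_attachment(
        csv_path.read_bytes(),
        maintype="text",
        subtype="csv",
        filename=csv_path.name,
    )
    return message


def send_report(message, host="localhost", port=25):
    """Send the message through an SMTP server (host and port are placeholders)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(message)
```

Keeping the message construction separate from the sending step means the report email can be inspected or tested without a mail server.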

openssl for windows

Tried dcpfixity on ingest1 (Windows). The error states: 'System cannot find file system'. It seems that openssl needs to be installed on Windows for the script to work.

revtmd - codec

Codec is currently broken - requires codec.type children rather than just the codec name.

move.py - move to hashlib from md5deep

md5deep could still work, but I think a counter that shows the progress of manifest generation would be very helpful, especially when running a 24 hour move with 150k files.
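
A minimal sketch of what a hashlib-based replacement with a progress counter might look like. The function names md5_file and hash_tree are hypothetical, and the counter is per file rather than per byte, which is an assumption about what would be most useful.

```python
import hashlib
import os


def md5_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(source_dir):
    """Return (relative_path, md5) pairs, printing a running file counter."""
    paths = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            paths.append(os.path.join(root, name))
    paths.sort()
    results = []
    for count, path in enumerate(paths, start=1):
        # '\r' rewrites the same terminal line instead of scrolling.
        print("\rHashing file {} of {}".format(count, len(paths)), end="", flush=True)
        results.append((os.path.relpath(path, source_dir), md5_file(path)))
    if paths:
        print()
    return results
```

Listing the files before hashing costs one extra directory walk, but it is what makes a "file N of M" counter possible.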

move.py - file transfer log

check if cp/gcp/robocopy have log modes - otherwise, find a tradeoff between capturing stdout and having terminal updates for the user.
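
If the copy tool has no suitable log mode, one tradeoff is to read its output line by line and write each line to both the terminal and a log file. The sketch below assumes the tool emits line-oriented output. The function name run_and_log is hypothetical, and progress bars that redraw a single line would not be captured cleanly this way.

```python
import subprocess
import sys


def run_and_log(command, log_path):
    """Run command, echoing each output line to the terminal and to log_path."""
    with open(log_path, "w") as log:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep errors in the same log
            universal_newlines=True,
        )
        for line in process.stdout:
            sys.stdout.write(line)
            log.write(line)
        process.stdout.close()
        return process.wait()
```

The return value is the exit code of the copy command, so the calling script can still decide whether the transfer succeeded.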

Can't select directory only as source

DCDMs are often just in audio and video folders without a subfolder, e.g. E:/video/ (tiff, tiff, tiff) and E:/audio/ (wav, wav). It would be great to be able to select E:/ as the source.

Remove copy all .log files to /log

This was a terrible idea. Remove as soon as possible. Find a better way. Intermediate workaround is to just take in ffmpeg logs but even that is pretty bad.

Engineers report

What to do?

  1. Possibly make new fields based around revtmd/videoMD.
  2. Create temporary xml file in submission package, this is harvested and added to the final inmagic xml during the accessions phase?
  3. Easygui might be the only user friendly way to enter user info?

as11fixity - Create md5 manifest

Seeing as we already have the filenames and the md5s, we should generate a manifest using the md5sum standard, something like:
path/to/file md5checksum43890jksdjikloj49s
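
For reference, the md5sum tool writes the checksum first, followed by two spaces and then the path. A small sketch of writing already-known filename and checksum pairs in that layout follows. The function name write_md5_manifest and the choice to sort entries and normalise path separators are assumptions.

```python
def write_md5_manifest(entries, manifest_path):
    """Write (relative_path, md5_hex) pairs as 'checksum  path' lines."""
    with open(manifest_path, "w", encoding="utf-8") as manifest:
        for relative_path, checksum in sorted(entries):
            # Forward slashes keep the manifest portable between platforms.
            portable_path = relative_path.replace("\\", "/")
            manifest.write("{}  {}\n".format(checksum, portable_path))
```

A manifest in this layout can be checked later with md5sum -c on systems that provide it.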

move.py - Ubuntu creates .fuse files on external drives

Figure out if these are created as a result of the script, in particular the actual .tmp file creation. I don't think that this is the issue as these are created in the parent directory, while the .fuses are in the main file dir.

ltopers just deletes all hidden files, so maybe we could delete all files that .startswith(.fuse) . Hopefully if we delete them before the manifest check then all will be well. Further tests required.
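
A cautious sketch of the clean-up idea, with a dry-run mode so the matches can be reviewed before anything is deleted. The function name remove_fuse_files is hypothetical, and whether removing these files is safe while a transfer is still in progress is exactly what the further tests would need to establish.

```python
import os


def remove_fuse_files(directory, dry_run=True):
    """Return files whose names start with '.fuse'; delete them unless dry_run."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.startswith(".fuse"):
                path = os.path.join(root, name)
                matches.append(path)
                if not dry_run:
                    os.remove(path)
    return sorted(matches)
```

Running it with dry_run=True before the manifest check shows what would be removed. Passing dry_run=False performs the deletion.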
