Extract Metadata

Extract embedded metadata from files. Once installed and active, this module has the following features:

  • When configuring the module, the user can:
    • View and enable/disable extractors;
    • View and enable/disable mappers;
    • Configure the metadata crosswalk for the JSON Pointer mapper (if enabled).
  • When adding a media, the module will automatically:
    • Extract metadata from the file using the enabled extractors;
    • Save the metadata alongside the media;
    • Map metadata to resource values.
  • When editing a media/item or batch editing media/item, the user can choose to perform a number of actions:
    • Refresh metadata: (re)extract metadata from files;
    • Refresh and map metadata: (re)extract metadata from files and map metadata to resource values;
    • Map metadata: Map extracted metadata to resource values;
    • Delete metadata: Delete extracted metadata.
  • When viewing and editing a media, the user can see the extracted metadata in the "Extract metadata" section.

Extractors

Extractors extract metadata from files. Note that an extractor must be enabled on the module configuration page before it is used. This module comes with four extractors, but more can be added depending on your needs.

ExifTool

Used to extract many types of metadata from many types of files. Requires the ExifTool command-line application.

Exif

Used to extract EXIF metadata that is commonly found in JPEG and TIFF files. Requires PHP's exif extension.

getID3

Used to extract many types of metadata from many types of files. Uses the getID3 PHP library, which comes with this module.

Tika

Used to extract many types of metadata from many types of files. Requires the Apache Tika content analysis toolkit. Java must be installed and the path to the tika-app-*.jar file must be configured in config/module.config.php under [extract_metadata_extractor_config][tika][jar_path].
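
Two of the extractors above depend on software outside PHP, so it can help to confirm those prerequisites on the server before enabling them. The following is a minimal sketch in Python, not part of this module. It assumes that `exiftool` and `java` are expected on the PATH, which may not match how the module locates them, and the function name and the jar path are hypothetical placeholders.

```python
import shutil
from pathlib import Path


def check_extractor_prerequisites(tika_jar_path: str) -> dict:
    """Report which external extractor dependencies appear to be available."""
    java_found = shutil.which("java") is not None
    return {
        # ExifTool extractor: assumes the exiftool executable is on PATH.
        "exiftool": shutil.which("exiftool") is not None,
        # Tika extractor: needs Java plus an existing tika-app jar file.
        "tika": java_found and Path(tika_jar_path).is_file(),
    }


# Hypothetical jar location; use the path you configured as jar_path.
status = check_extractor_prerequisites("/path/to/tika-app.jar")
for name, available in status.items():
    print(f"{name}: {'ok' if available else 'missing'}")
```

A "missing" result only means the dependency was not found where this sketch looks; the Exif and getID3 extractors are not covered because they rely on a PHP extension and a bundled PHP library.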

Mappers

Mappers map extracted metadata to resource values. Note that a mapper must be enabled on the module configuration page before it is used. This module comes with one mapper, but more can be added depending on your needs.

JSON Pointer

Used to map metadata to resource values using JSON pointers. You must define your own metadata crosswalk in the module configuration page under "JSON Pointer crosswalk".

One common example is to map a JPEG file's creation date to Dublin Core's "Date Created" property:

  • Resource: [Media or Item]
  • Extractor: "Exif"
  • Pointer: /EXIF/DateTimeOriginal
  • Property: "Dublin Core : Date Created"
  • Replace values: [checked or unchecked]

Note that the pointer points to the DateTimeOriginal value in the Exif metadata output, which you can view in a JPEG media's "Extract metadata" section. Once you've saved this map, perform the "Map metadata" action as described above and, if your JPEG file includes DateTimeOriginal, the media/item should now have a "Date Created" value.
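
To illustrate how a pointer such as /EXIF/DateTimeOriginal selects a value, here is a minimal sketch of JSON Pointer (RFC 6901) resolution in Python. It is not the module's implementation. The `resolve_pointer` function and the sample `exif_output` structure are assumptions made for illustration, and real Exif output will vary from file to file.

```python
from typing import Any


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape in the order the RFC requires: "~1" -> "/", then "~0" -> "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array tokens are numeric indexes
        else:
            current = current[token]  # object tokens are member names
    return current


# Hypothetical Exif extractor output; real output depends on the file.
exif_output = {
    "IFD0": {"Make": "ExampleCam"},
    "EXIF": {"DateTimeOriginal": "2019:06:01 12:30:00"},
}

print(resolve_pointer(exif_output, "/EXIF/DateTimeOriginal"))
# prints: 2019:06:01 12:30:00
```

In this sketch a pointer that does not match the metadata raises a KeyError or IndexError, which corresponds to the case above where the file has no DateTimeOriginal and so no value is mapped.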

Copyright

ExtractMetadata is Copyright © 2019-present Corporation for Digital Scholarship, Vienna, Virginia, USA http://digitalscholar.org

The Corporation for Digital Scholarship distributes the Omeka source code under the GNU General Public License, version 3 (GPLv3). The full text of this license is given in the license file.

The Omeka name is a registered trademark of the Corporation for Digital Scholarship.

Third-party copyright in this distribution is noted where applicable.

All rights not expressly granted are reserved.

Contributors

jimsafley, kimisgold
