
camel-oaipmh's Introduction

For more details about OAI-PMH see the documentation: http://www.openarchives.org/pmh/

OAI-PMH Component

The oaipmh component is used for polling OAI-PMH data providers. By default, Camel polls the provider every 60 seconds.

Maven users will need to add the following dependency to their pom.xml for this component:

<dependency>
    <groupId>es.upm.oeg.camel</groupId>
    <artifactId>camel-oaipmh</artifactId>
    <version>x.x.x</version>
</dependency>

Note: The component currently supports only polling (consuming) from data providers.

Note: You must include this repository in your pom.xml:

<repositories>
    <!-- GitHub Repository -->
    <repository>
        <id>camel-oaipmh-mvn-repo</id>
        <url>https://raw.github.com/cbadenes/camel-oaipmh/mvn-repo/</url>
        <snapshots>
            <enabled>true</enabled>
            <updatePolicy>always</updatePolicy>
        </snapshots>
    </repository>
</repositories>

URI format

oaipmh:oaipmhURI

Where oaipmhURI is the URI to the OAI-PMH data provider to poll.

You can append query options to the URI in the following format, ?option=value&option=value&...
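As a rough illustration of the option syntax, the sketch below assembles an endpoint URI from a provider address and a map of options. The class and method names (OaipmhUriSketch, endpointUri) are hypothetical and not part of the component; the oaipmh:// prefix follows the route examples in this document, and values are appended as-is without URL encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class OaipmhUriSketch {

    // Hypothetical helper: joins options as ?option=value&option=value&...
    // Values are not URL-encoded here; this is an illustration only.
    public static String endpointUri(String providerUri, Map<String, String> options) {
        String base = "oaipmh://" + providerUri;
        if (options.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> option : options.entrySet()) {
            query.add(option.getKey() + "=" + option.getValue());
        }
        return base + query;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("delay", "60000");
        options.put("metadataPrefix", "oai_dc");
        System.out.println(endpointUri(
                "aprendeenlinea.udea.edu.co/revistas/index.php/ingenieria/oai", options));
    }
}
```

The resulting string can be passed directly to from(...) in a route.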

Options

Property       | Default     | Description
delay          | 60000       | Delay in milliseconds between each poll.
initialDelay   | 1000        | Milliseconds before polling starts.
userFixedDelay | false       | Set to true to use a fixed delay between polls; otherwise a fixed rate is used. See ScheduledExecutorService in the JDK for details.
verb           | ListRecords | OAI-PMH verb to request. Future versions will handle ListIdentifiers, Identify, GetRecord, ListSets and ListMetadataFormats.
metadataPrefix | oai_dc      | Specifies the metadataPrefix of the format that should be included in the metadata part of the returned records.
from           |             | Specifies a lower bound for datestamp-based selective harvesting, as a UTC DateTime value. After the first request, this value is updated to the current time if no upper bound is defined.
until          |             | Specifies an upper bound for datestamp-based selective harvesting, as a UTC DateTime value.
set            |             | Specifies set membership as a criterion for set-based selective harvesting.
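The interplay of from and until can be sketched as follows. This is not the component's actual code: HarvestWindowSketch and its methods are hypothetical names, and truncating the poll time to whole seconds is an assumption made so the value fits the OAI-PMH UTC datestamp form (YYYY-MM-DDThh:mm:ssZ). The sketch only mirrors the documented rule that the lower bound moves to the current time after a request when no upper bound is set.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class HarvestWindowSketch {
    private Instant from;         // lower bound, null when not set
    private final Instant until;  // upper bound, null when not set

    public HarvestWindowSketch(Instant from, Instant until) {
        this.from = from;
        this.until = until;
    }

    // Builds the OAI-PMH query string for one ListRecords request.
    public String query(String metadataPrefix) {
        StringBuilder q = new StringBuilder("verb=ListRecords&metadataPrefix=" + metadataPrefix);
        if (from != null) {
            q.append("&from=").append(from);
        }
        if (until != null) {
            q.append("&until=").append(until);
        }
        return q.toString();
    }

    // After a poll, advance the lower bound to the poll time,
    // but only when no upper bound was configured.
    public void afterPoll(Instant pollTime) {
        if (until == null) {
            from = pollTime.truncatedTo(ChronoUnit.SECONDS);
        }
    }
}
```

With an open-ended window, each poll therefore asks only for records changed since the previous poll; with until set, the same fixed window is requested every time.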

Exchange data types

Camel initializes the IN body on the Exchange with a response message in XML format. For list requests (the ListXX verbs, such as ListRecords), Camel returns one message for each element of the list received.
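To make the "one message per list element" idea concrete, here is a minimal sketch that splits a ListRecords response into one XML string per record using only JDK classes. RecordSplitterSketch is a hypothetical name, and emitting each bare record element is an assumption; the component may wrap each element differently (for example, in a full OAI-PMH envelope).

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RecordSplitterSketch {
    // Namespace of the OAI-PMH 2.0 response schema.
    static final String OAI_NS = "http://www.openarchives.org/OAI/2.0/";

    // Returns one serialized <record> element per entry in the response.
    public static List<String> splitRecords(String responseXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(responseXml)));
            NodeList records = doc.getElementsByTagNameNS(OAI_NS, "record");

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            List<String> messages = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                StringWriter out = new StringWriter();
                serializer.transform(new DOMSource(records.item(i)), new StreamResult(out));
                messages.add(out.toString());
            }
            return messages;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not split OAI-PMH response", e);
        }
    }
}
```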

OAI-PMH Data Format

The oaipmh component ships with an OAIPMH data format that can be used to convert between String (XML) and the OAIPMHType model object (JAXB).

  • marshal = from OAIPMHType to XML String
  • unmarshal = from XML String to OAIPMHType

More details about the underlying XSD schemas can be found in the OAI-PMH documentation.

A route using this would look something like this:

from("oaipmh://aprendeenlinea.udea.edu.co/revistas/index.php/ingenieria/oai?delay=60000")
    .unmarshal().jaxb("es.upm.oeg.camel.oaipmh.model")
    .to("mock:result");

The purpose of this feature is to make it possible to use Camel's lovely built-in expressions for manipulating OAI-PMH messages. As shown below, an XPath expression can be used to filter the OAI-PMH message:

from("oaipmh://aprendeenlinea.udea.edu.co/revistas/index.php/ingenieria/oai?delay=60000")
    .unmarshal().jaxb("es.upm.oeg.camel.oaipmh.model")
    .filter().xpath("//item/request/set[contains(.,'physics')]")
    .to("mock:result");
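For readers unfamiliar with XPath predicates, the following standalone sketch shows how such an expression evaluates to true or false against an XML string, using the JDK's javax.xml.xpath API rather than Camel itself. XPathFilterSketch and matches are hypothetical names, and the element structure fed to it simply mirrors the expression above; it is not taken from a real OAI-PMH response.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathFilterSketch {

    // Evaluates the expression as a boolean: a node-set result is
    // true when at least one node matches, false when it is empty.
    public static boolean matches(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Boolean) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath expression or XML input", e);
        }
    }
}
```

A message whose set element contains the text "physics" would pass a filter of this kind; any other message would be dropped. Note that unprefixed names in XPath 1.0 match only elements without a namespace, so expressions against namespaced documents need prefixes bound to the right namespace.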

This work is funded by the EC project DrInventor (www.drinventor.eu).
