I've been watching the experimental RSS feed for a few days and find it very interesting. It can serve as a reasonable model for our thinking about the kind of notification we want to see from SHARE.
Right now the items in the RSS feed look something like this:
<item>
<title>Symmetric Designs for Helical Spin Rotators at RHIC (12/94)</title>
<link>http://173.255.232.219/archive/SciTech/1149788/2014-08-21 17:08:41.531480/normalized.json</link>
<description>Ptitsin V.<br/>Retrieved from SciTech Connect at 2014-08-21 17:08:41.531480<br/><br/>.</description>
<guid>1149788</guid>
<pubDate>2014-08-21 17:08:41.531480</pubDate>
</item>
The title is conveyed from the acquired metadata for the resource. That makes sense.
The link looks like it is attempting to be a link to the RESTful API of the notification service for the record in question, but I find it often does not work. However, I believe the link element should be a link directly back to the source. This is the most critical element of the notification service: every item we harvest or have reported to us must have an actionable link back to the originating source. In one way or another, we have to construct those links from the acquired metadata, most likely by combining some identifier with some known base URL. In the case of DOIs, these might be set to resolve via CrossRef; in the case of OAI harvests, they should point back to the originating repository.
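As a rough sketch of how such a link might be constructed from a normalized record, something like the following could work. The field names follow the JSON record shown further down; the SOURCE_BASE_URLS table, the build_source_link name, and the use of the doi.org resolver are my assumptions, not anything the service currently does.

```python
# Hypothetical table of base URLs per source. Only the SciTech entry
# is taken from the example item in this post.
SOURCE_BASE_URLS = {
    "SciTech": "http://www.osti.gov/scitech/servlets/purl/",
}

# Assumption: DOIs are resolved through the public doi.org resolver.
DOI_RESOLVER = "https://doi.org/"


def build_source_link(record):
    """Return an actionable URL back to the originating source, or None."""
    props = record.get("properties", {})
    # Prefer a DOI when the source supplied one.
    if props.get("doi"):
        return DOI_RESOLVER + props["doi"]
    # Otherwise use a URL the source reported directly.
    if props.get("url"):
        return props["url"]
    # Fall back to combining the source identifier with a known base URL.
    base = SOURCE_BASE_URLS.get(record.get("source"))
    if base and record.get("id"):
        return base + str(record["id"])
    return None
```

Returning None when no link can be built makes it easy to spot records that would violate the "every item must have an actionable link" rule before they go out in the feed.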
The guid looks like it is trying to pass along the identifier supplied by the source for this resource. According to the RSS standard, the guid must uniquely identify the item, and we cannot guarantee that these identifiers will be unique once stripped of their context. If we make the link a URL to the source, then we can use the guid for a URL back to our notification service, since we can guarantee that would be unique. Every outgoing RSS item is based on a record we have stored in our system, so this guid should be an actionable URL that uses the notification system's RESTful API to point back to that record: pretty much what is currently in the link field, but actually resolving consistently.
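A minimal sketch of building that guid, assuming the archive URL pattern visible in the current feed (base, source, id, timestamp). The build_guid name is hypothetical, and percent-encoding the space in the timestamp is my suggestion; I have not checked whether the service accepts the encoded form.

```python
from urllib.parse import quote

# Base of the notification service's archive, as it appears in the current feed.
ARCHIVE_BASE = "http://173.255.232.219/archive"


def build_guid(record):
    """Build a URL identifying one stored record in the notification service."""
    parts = [record["source"], record["id"], record["timestamp"]]
    # Percent-encode each path segment so the space in the timestamp
    # does not leave a raw space in the URL; colons are left as they are.
    encoded = [quote(str(part), safe=":") for part in parts]
    return ARCHIVE_BASE + "/" + "/".join(encoded)
```

Because the source name and retrieval timestamp are part of the path, two sources that happen to reuse the same identifier would still get distinct guids.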
The description is somewhat cryptic, providing a variety of information depending on the source, but not really labeling that information in a useful way. The description in RSS is actually capable of being pretty much anything. Since this is an RSS feed, I believe that, for now, we should make it a human-readable expression of the normalized JSON record we are keeping for this item. It could be expressed as a pretty version of the JSON record itself.
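For illustration, a pretty version of the record could be produced along these lines. The build_description name is hypothetical; the explicit XML escaping is only needed if the feed is written out as text by hand rather than through an XML library that escapes for us.

```python
import json
from xml.sax.saxutils import escape


def build_description(record):
    """Render the normalized record as indented JSON, safe to embed in XML."""
    # sort_keys gives a stable, predictable field order for readers.
    pretty = json.dumps(record, indent=4, sort_keys=True)
    # Escape &, < and > so the JSON cannot break the surrounding XML.
    return escape(pretty)
```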
While I would like the pubDate to be the date of the actual emergence of the resource (its publication date or the date of its deposit in the holding repository), I can see how this might be somewhat of an abuse of RSS. I think it is fair to keep the date as is. But the actual publication date should be clear in the description.
The result, for a record similar to the one above, would be something like:
<item>
<title>Los Alamos National Laboratory Overview</title>
<link>http://www.osti.gov/scitech/servlets/purl/1140136</link>
<description>{
"contributors": [
{
"email": null,
"full_name": "Neu, Mary"
}
],
"id": "1140136",
"meta": {},
"properties": {
"article_type": "Conference",
"date_entered": "2014-07-21",
"date_published": "2010-06-02",
"date_retrieved": "2014-07-21",
"description": "Mary Neu, Associate Director for Chemistry, Life and Earth Sciences at Los Alamos National Laboratory, delivers opening remarks at the \"Sequencing, Finishing, Analysis in the Future\" meeting in Santa Fe, NM",
"doi": null,
"research_org": "Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)",
"research_sponsor": "USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)",
"tags": [
"59 BASIC BIOLOGICAL SCIENCES"
],
"url": "http://www.osti.gov/scitech/servlets/purl/1140136"
},
"source": "SciTech",
"timestamp": "2014-07-21 15:03:21.096378",
"title": "Los Alamos National Laboratory Overview"
}</description>
<guid>http://173.255.232.219/archive/SciTech/1140136/2014-07-21 15:03:21.096378</guid>
<pubDate>2014-08-21 17:08:41.531480</pubDate>
</item>
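An item like the one above could be assembled roughly as follows. This is only a sketch using Python's standard xml.etree.ElementTree; the build_item name and its parameters are hypothetical, and I do not know how the feed is actually generated today.

```python
import xml.etree.ElementTree as ET


def build_item(title, link, description, guid, pub_date):
    """Serialize one RSS item; ElementTree escapes the text content for us."""
    item = ET.Element("item")
    fields = [
        ("title", title),
        ("link", link),                # URL back to the originating source
        ("description", description),  # pretty-printed normalized record
        ("guid", guid),                # URL back to the notification service
        ("pubDate", pub_date),         # retrieval date, kept as it is now
    ]
    for tag, text in fields:
        ET.SubElement(item, tag).text = text
    return ET.tostring(item, encoding="unicode")
```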
I think these changes would make for a much more useful RSS feed. How doable would they be?