
jsonfeed's People

Contributors

alexdebril, andrewheiss, apike, bcomnes, brentsimmons, chilts, chobeat, chockenberry, danrigby, devilgate, dougeverly, frjo, genehack, gramgibson, gugod, jdecool, jonathanpike, joshuatbrown, karlshea, kr, manton, mateusjatenee, nmdias, sonicdoe, williamjacksn

jsonfeed's Issues

Author email field

I mapped my Atom feed to JSON Feed, which was very straightforward. Thanks for the spec! I hope it becomes a success.

I think it would be useful to allow an author email alongside the author URL, so I'd propose adding an optional email string field to every author dictionary. I checked my blogroll, and while this feature is not used extensively, it is used occasionally.

consider not using _ to prefix extensions

From RFC 6648 ‘Deprecating the "X-" Prefix and Similar Constructs in Application Protocols’:

   Historically, designers and implementers of application protocols
   have often distinguished between standardized and unstandardized
   parameters by prefixing the names of unstandardized parameters with
   the string "X-" or similar constructs.  In practice, that convention
   causes more problems than it solves.  Therefore, this document
   deprecates the convention for newly defined parameters with textual
   (as opposed to numerical) names in application protocols.

This spec's _ prefix seems to qualify as a "similar construct".

Maybe it's too late to do anything about this, and even if not, maybe the arguments in that RFC are not persuasive enough. But I wanted to raise the issue anyway, just in case.

Allow items to be a map object

Several JSON-focused technologies, including Firebase, natively serialise collections of entities as objects keyed by each contained entity's id.

My proposal would be to allow this behaviour, optionally, for the items field. This would allow many applications to export JSON feeds almost automatically, by merely wrapping or just being moderately careful about their data schema.

Thanks for all your hard work, JSON Feed is great.
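To make the proposal concrete, here is a minimal sketch of how a reader might accept both shapes. The function name normalize_items is hypothetical, and the rule that a map key stands in for a missing id is an assumption, not something the spec defines. Note that JSON objects carry no guaranteed order, so a reader would likely need to sort the result by date.

```python
from typing import Any


def normalize_items(items: Any) -> list[dict]:
    """Return feed items as a list, accepting a list or an id-keyed map."""
    if isinstance(items, list):
        return items
    if isinstance(items, dict):
        normalized = []
        for key, item in items.items():
            entry = dict(item)
            # Assumption: the map key is the item's id when the item omits one.
            entry.setdefault("id", key)
            normalized.append(entry)
        return normalized
    raise TypeError("items must be a list or an object keyed by id")
```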

Are URLs URLs?

The spec mentions the data type URL, which is a string in JSON terms. I'm wondering about the syntactic requirements for these strings. Are these URLs normalized URIs per RFC 3986? Or are they IRIs per RFC 3987? Or are they „URLs“ per the WHATWG's URL spec?

Title-only items?

I really like the aims of this spec but I wanted to raise an issue/question about "title-only items"...

As I read it at the moment, the mandatory fields for an item are:

  • id
  • content_text OR content_html

Have you considered replacing that second requirement with:

  • content_text OR content_html OR title

The reason for this suggestion is that it would allow lists of links to fall within the spec, without mandating a dummy content_text="" field. Examples for this use-case include a minimalist feed for a link aggregation site like Hacker News or Reddit which is just "Link Text" (title) and "Link" (id).

Alternately...

If there is an expectation, in the absence of a title, that content_text may be contextually the "Link Text", perhaps this should be stated explicitly somewhere in the goals, examples or other discussion to reassure users like myself that it wouldn't be perverting the spec to use content_text in this way.

I personally would prefer not requiring a content tag since sometimes title is a more appropriate description of the semantic meaning than "content" (there are even cases where an item that is purely an id might be appropriate). I understand though if you'd rather downplay certain content structures.
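As a rough illustration of the difference between the current rule and the suggested one, the check could be sketched like this. The function name and the allow_title_only flag are hypothetical and only exist to show both variants side by side.

```python
def has_required_item_fields(item: dict, allow_title_only: bool = False) -> bool:
    """Check the mandatory item fields as read from the spec in this issue."""
    if "id" not in item:
        return False
    content_keys = ["content_text", "content_html"]
    if allow_title_only:
        # Proposed relaxation: a title alone would satisfy the second requirement.
        content_keys.append("title")
    return any(key in item for key in content_keys)
```

With the flag off, a link-list item such as {"id": "https://example.org/a", "title": "Link Text"} fails the check; with it on, the same item passes.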

Basic Question about JSON Feed Format?

Sorry for the basic question about the JSON Feed Format:

Will this also help with the implementation of the Apple News Format (since it's also JSON)?

Thanks

Support link relations

Many link relations have already been defined, and they can be used to attach additional metadata to links and link contexts. Link relations have proven to be a powerful way to extend formats easily and independently. Through a shared basic semantic, they let a format interact with other specifications without having to know any specifics about them.

Link relations are supported in HTML, Atom, HTTP (through Link-header), RSS (by using the Atom namespace) etc

The HTML spec points to the Microformats link relation registry for its values: http://microformats.org/wiki/existing-rel-values

Historically there has also been a Link Relations spec that unified Link Relations across standards: https://tools.ietf.org/html/rfc5988

I would suggest supporting this at both feed level and item level through something like:

"links": [
  {
    "rel": "payment",
    "href": "https://flattr.com/submit/auto?url=https%3A%2F%2Fvoxpelli.com%2F2016%2F07%2Fbetter-handle-npm-modules%2F&user_id=voxpelli&title=3+tricks+to+better+handle+npm+modules&category=text&tags=blog&language=en"
  },
  {
    "rel": "webmention",
    "href": "https://webmention.herokuapp.com/api/webmention"
  }
]

The rel-payment mentioned there gained pretty good traction among mobile podcast clients that wanted to automate sending donations to Flattr.com after someone had listened through an entire podcast. It even got Instacast rejected from the app store. It truly showed the power of link relations: working at Flattr at the time, I could simply decide to use an existing link relation in places that allowed link relations. That way, without having to convince any specific feed format or feed parser to add support for payment links, we got a standards-conformant format for donation links, one that others such as Gittip could later pick up and support, and around which more generic donation mechanisms could be built.

Allowing link-relations enables others to extend this format in similar ways in the future – adding capabilities that we either can't imagine right now or that are so obscure that we never will really care about it, but a minor sub-group of the community will feel that it adds a lot for them.
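A consumer of the proposed structure could look up links by relation with very little code. This is only a sketch: links, rel and href come from the example above, while the function name links_with_rel and the choice to treat rel as a space-separated list (as HTML does) are assumptions.

```python
def links_with_rel(node: dict, rel: str) -> list[str]:
    """Return the href of every link on a feed or item that carries the given rel."""
    hrefs = []
    for link in node.get("links", []):
        # Assumption: rel may hold several space-separated relation names.
        relations = str(link.get("rel", "")).lower().split()
        if rel.lower() in relations and "href" in link:
            hrefs.append(link["href"])
    return hrefs
```

A reader that knows nothing about payments or Webmention can still ignore unknown relations safely, which is the extensibility point made above.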

What about JSON Activity Streams?

If the main thing here is JSON, then why not use Activity Streams? That JSON-based spec covers some semantics that describe new things we've done with social media since the era of title/date/link/content blog feed.

If anything, consider this a FAQ/doco request. Would like to see it as a right-tool-for-the-job scenario and not a new RSS-vs-Atom scenario.

Related specs and comparisons

i always find it useful to see a format/spec put itself into context. it seems that JSON feed could do that in relation to at least two existing "JSON feed formats", which are activity streams and Collection+JSON. i don't mean for this to be a competition, but many people reading the spec will wonder what the differences are, and giving them something to read will be very helpful.

Require RFC 3339 datetimes

ISO 8601 actually allows a pretty wide variety of formats. RFC 3339 offers a more specific ISO 8601 profile that's easier to parse correctly. (If I had my way, I'd require all timestamps to be UTC, as well.)
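To show how much narrower the RFC 3339 profile is, here is a minimal sketch of a strict parser using only the Python standard library. The name parse_rfc3339 is hypothetical, and the regular expression is my reading of the RFC 3339 date-time production rather than an official grammar. Fractions beyond microseconds are truncated.

```python
import re
from datetime import datetime

# Full date, "T", full time, optional fraction, then "Z" or a numeric offset.
RFC3339 = re.compile(
    r"(\d{4}-\d{2}-\d{2})[Tt](\d{2}:\d{2}:\d{2})(\.\d+)?([Zz]|[+-]\d{2}:\d{2})"
)


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, rejecting other ISO 8601 forms."""
    match = RFC3339.fullmatch(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 date-time: {value!r}")
    date, time, fraction, offset = match.groups()
    # Pad or truncate the fraction to the six digits datetime understands.
    micros = (fraction or ".0")[1:7].ljust(6, "0")
    if offset in ("Z", "z"):
        offset = "+00:00"
    return datetime.fromisoformat(f"{date}T{time}.{micros}{offset}")
```

Forms that ISO 8601 allows but this profile does not, such as a bare date or a week date, raise ValueError here.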

Representing alternate post types

I'm very excited by what I'm seeing in JSONFeed, as it greatly simplifies both the generation of and the consumption of feeds. On my website, I have implemented initial support for JSONFeed:

That said, in my implementation process, I discovered that I would have liked to have a way to express post types like Photos, Likes, Reposts, Bookmarks, Checkins, and other post types that are common in IndieWeb sites that leverage Microformats2, etc.

Currently, I'm doing this using an _indieweb extension in the JSON. Here are a few samples:

For a Checkin:

{
  "_indieweb": {
    "address": "310 Vista del Mar Ste C, Redondo Beach, CA, United States, 90277",
    "lat": 33.818167005599,
    "long": -118.38519045213,
    "placename": "Riviera Barber Shop",
    "type": "checkin"
  },
  "author": {
    "avatar": "https://cleverdevil.io/file/2fa19f964fb8970faaf20b909c69d6cb/thumb.png",
    "name": "Jonathan LaCour",
    "url": "https://cleverdevil.io/profile/cleverdevil"
  },
  "content_text": "Checked into Riviera Barber Shop",
  "date_published": "2017-05-19T18:26:35+00:00",
  "id": "https://cleverdevil.io/view/340d808fc59ec35418ca18c66ab0087f",
  "title": "Checked into Riviera Barber Shop",
  "url": "https://cleverdevil.io/2017/checked-into-riviera-barber-shop-1"
}

Here is an example for a "Like" of some content elsewhere on the web:

{
  "_indieweb": {
    "like-of": "https://twitter.com/cashlock/status/864999780178477056",
    "type": "like"
  },
  "author": {
    "avatar": "https://cleverdevil.io/file/2fa19f964fb8970faaf20b909c69d6cb/thumb.png",
    "name": "Jonathan LaCour",
    "url": "https://cleverdevil.io/profile/cleverdevil"
  },
  "content_html": "...",
  "date_published": "2017-05-18T00:24:55+00:00",
  "external_url": "https://twitter.com/cashlock/status/864999780178477056",
  "id": "https://cleverdevil.io/view/34e874491f8b60b1ae8273f3b9bec08f",
  "title": "Like: Christian Ashlock on Twitter: \"@cleverdevil https://t.co/RW3ddHzeIG\"",
  "url": "https://twitter.com/cashlock/status/864999780178477056"
}

I am not sure it makes sense for every different type of post to be represented, but it would be lovely if at least the basics were represented, or if there was an optional guideline or spec for extensions to enable feed readers to be able to represent and differentiate long-form posts, status updates, photos, likes, bookmarks, and reposts.

JSON Feed Validator Plans

Hi,

JSON Feed looks great!

I was wondering if there were any plans for an official validator tool/web service? I think this would be very useful to help people get their feeds right.

In addition to checking that a feed is valid JSON, it could also check:

  • Proper content-type headers
  • Presence of required fields
  • Valid URLs for the feed
  • Make recommendations (like fully qualified URLs in #8)

Thanks!
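For what it's worth, the checks listed above could start out as something as small as the sketch below. It is not an official validator: validate_feed is a hypothetical name, the required fields (version, title, items, and id on each item) are taken from version 1 of the spec, and the Content-Type check is deliberately loose.

```python
import json
from urllib.parse import urlparse


def validate_feed(text: str, content_type: str = "") -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        feed = json.loads(text)
    except ValueError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(feed, dict):
        return ["top level must be an object"]
    problems = []
    if "json" not in content_type.lower():
        problems.append("Content-Type header does not mention JSON")
    for key in ("version", "title", "items"):
        if key not in feed:
            problems.append(f"missing required field: {key}")
    for key in ("feed_url", "home_page_url"):
        url = feed.get(key)
        # Recommendation only: relative URLs are flagged, not rejected.
        if url is not None and not urlparse(str(url)).scheme:
            problems.append(f"{key} is not a fully qualified URL")
    items = feed.get("items")
    for index, item in enumerate(items if isinstance(items, list) else []):
        if not isinstance(item, dict) or "id" not in item:
            problems.append(f"items[{index}] has no id")
    return problems
```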

JSON Schema definition ?

It might be an idea to specify JSONFeed using JSONschema which will give you a formal, machine verifiable definition of the spec.

Creation of a schema can be accelerated by taking an existing feed and using something like https://jsonschema.net/ to generate the first iteration of the schema which can then manually be annotated e.g. using the editor on the same site.

Once you have the schema there's loads of tools to assist in validation (http://json-schema.org/implementations.html) of compliance of a feed, generate test data (e.g. https://github.com/bojand/json-schema-test-data-generator) or generate code (e.g. C#, Typescript, Java etc) from JSON schema (e.g. https://github.com/RSuter/NJsonSchema)

Below is an example schema (generated from the "daring fireball" feed).
One would probably remove most default values, edit descriptions and mark certain fields as required.

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "definitions": {},
    "id": "http://example.com/example.json",
    "properties": {
        "author": {
            "id": "/properties/author",
            "properties": {
                "name": {
                    "default": "John Gruber",
                    "description": "An explanation about the purpose of this instance.",
                    "id": "/properties/author/properties/name",
                    "title": "The Name Schema",
                    "type": "string"
                },
                "url": {
                    "default": "https://twitter.com/gruber",
                    "description": "An explanation about the purpose of this instance.",
                    "id": "/properties/author/properties/url",
                    "title": "The Url Schema",
                    "type": "string"
                }
            },
            "type": "object"
        },
        "favicon": {
            "default": "https://daringfireball.net/graphics/favicon-64.png",
            "description": "An explanation about the purpose of this instance.",
            "id": "/properties/favicon",
            "title": "The Favicon Schema",
            "type": "string"
        },
        "feed_url": {
            "default": "https://daringfireball.net/feeds/json",
            "description": "An explanation about the purpose of this instance.",
            "id": "/properties/feed_url",
            "title": "The Feed_url Schema",
            "type": "string"
        },
        "home_page_url": {
            "default": "https://daringfireball.net/",
            "description": "An explanation about the purpose of this instance.",
            "id": "/properties/home_page_url",
            "title": "The Home_page_url Schema",
            "type": "string"
        },
        "icon": {
            "default": "https://daringfireball.net/graphics/apple-touch-icon.png",
            "description": "An explanation about the purpose of this instance.",
            "id": "/properties/icon",
            "title": "The Icon Schema",
            "type": "string"
        },
        "items": {
            "id": "/properties/items",
            "items": {
                "id": "/properties/items/items",
                "properties": {
                    "author": {
                        "id": "/properties/items/items/properties/author",
                        "properties": {
                            "name": {
                                "default": "John Gruber",
                                "description": "An explanation about the purpose of this instance.",
                                "id": "/properties/items/items/properties/author/properties/name",
                                "title": "The Name Schema",
                                "type": "string"
                            }
                        },
                        "type": "object"
                    },
                    "content_html": {
                        "default": "\nWalt Mossberg:\n\n\n  This is my last weekly column for The Verge and Recode — the last\nweekly column I plan to write anywhere. I’ve been doing these\nalmost every week since 1991, starting at the Wall Street Journal,\nand during that time, I’ve been fortunate enough to get to know\nthe makers of the tech revolution, and to ruminate — and\nsometimes to fulminate — about their creations.\n\nNow, as I prepare to retire at the end of that very long and\nworld-changing stretch, it seems appropriate to ponder the\nsweep of consumer technology in that period, and what we can\nexpect next.\n\n\nGodspeed on whatever’s next, Walt.\n\n\n ★ \n\n\n\t",
                        "description": "An explanation about the purpose of this instance.",
                        "id": "/properties/items/items/properties/content_html",
                        "title": "The Content_html Schema",
                        "type": "string"
                    },
                    "date_modified": {
                        "default": "2017-05-26T03:56:39Z",
                        "description": "An explanation about the purpose of this instance.",
                        "id": "/properties/items/items/properties/date_modified",
                        "title": "The Date_modified Schema",
                        "type": "string"
                    },
                    "date_published": {
                        "default": "2017-05-26T03:56:37Z",
                        "description": "An explanation about the purpose of this instance.",
                        "id": "/properties/items/items/properties/date_published",
                        "title": "The Date_published Schema",
                        "type": "string"
                    },
                    "external_url": {
                        "default": "https://www.recode.net/2017/5/25/15689094/mossberg-final-column",
                        "description": "An explanation about the purpose of this instance.",
                        "id": "/properties/items/items/properties/external_url",
                        "title": "The External_url Schema",
                        "type": "string"
                    },
                    "id": {
                        "default": "https://daringfireball.net/linked/2017/05/25/mossberg",
                        "description": "An explanation about the purpose of this instance.",
                        "id": "/properties/items/items/properties/id",
                        "title": "The Id Schema",
                        "type": "string"
                    },
                    "title": {
                        "default": "Mossberg: The Disappearing Computer",
                        "description": "An explanation about the purpose of this instance.",
                        "id": "/properties/items/items/properties/title",
                        "title": "The Title Schema",
                        "type": "string"
                    },
                    "url": {
                        "default": "https://daringfireball.net/linked/2017/05/25/mossberg",
                        "description": "An explanation about the purpose of this instance.",
                        "id": "/properties/items/items/properties/url",
                        "title": "The Url Schema",
                        "type": "string"
                    }
                },
                "type": "object"
            },
            "type": "array"
        },
        "title": {
            "default": "Daring Fireball",
            "description": "An explanation about the purpose of this instance.",
            "id": "/properties/title",
            "title": "The Title Schema",
            "type": "string"
        },
        "version": {
            "default": "https://jsonfeed.org/version/1",
            "description": "An explanation about the purpose of this instance.",
            "id": "/properties/version",
            "title": "The Version Schema",
            "type": "string"
        }
    },
    "type": "object"
}

Specific publisher extensions can be added by including something like:

"patternProperties": {
        "^_": {
          "$ref": "#/definitions/publisherExtension"
        }
      }

in the properties section and:

    "publisherExtension": {
      "description": "Any property starting with _ is valid.",
      "additionalProperties": true,
      "additionalItems": true
    },

in the definitions section

Clarify WebSub support

As discussed on Twitter (see thread) the current hubs property in JSON Feed is not supported by the discovery mechanisms provided by WebSub (formerly known as PubSubHubbub).

This partly ties into #44 as WebSub currently only supports discovery through the Link Relations standardized in the RFC 5988 that is mentioned in #44. (And currently also through hostmeta, but that support is at risk)

Of the standard ways outlined in RFC 5988, only the HTTP Link: header one can be applied to JSON Feed. There is no standardized way to include link relations within JSON as there is within XML and HTML.

So unless WebSub is extended to specifically support a JSON Feed specific discovery mechanism, then the only supported discovery mechanism is the HTTP Link: header.
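To illustrate what header-based discovery involves on the subscriber side, here is a minimal sketch that pulls the hub and self links out of a Link: header value. The function name websub_links is hypothetical, and the parsing is simplified: it assumes URLs are wrapped in angle brackets and that parameter values contain no commas or semicolons, which a full RFC 5988 parser would not assume.

```python
import re

# One link-value: <url> followed by any number of ";param" segments.
LINK_VALUE = re.compile(r"<([^>]*)>((?:\s*;\s*[^;,]*)*)")
REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";,]+)"?', re.IGNORECASE)


def websub_links(link_header: str) -> dict[str, list[str]]:
    """Collect the URLs advertised with rel=hub and rel=self."""
    found = {"hub": [], "self": []}
    for url, params in LINK_VALUE.findall(link_header):
        match = REL_PARAM.search(params)
        if match is None:
            continue
        # A single rel parameter may list several relation names.
        for rel in match.group(1).lower().split():
            if rel in found:
                found[rel].append(url)
    return found
```

A subscriber would then send its subscription request to one of the hub URLs, using the self URL as the topic.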

Ping @manton @julien51

Allow markup and/or specifying the language in all human readable strings

For ids, urls, version… that's not needed, but in all places where natural languages are expected (author names, titles, descriptions...), it must be possible to at the very least specify which human language is used, and also to use markup.

Without a language tag, text-to-speech engines, from Alexa and Siri to accessibility tools for blind people, will be unable to reliably read things out loud properly.

Also, in some languages, text rendering looks wrong if the language is not specified, because of some ambiguities in unicode.

Also, some languages require markup to be written properly, and would need to drop content if forced into plain text. The classic example is Chinese/Japanese/Korean with ruby markup: 豊臣(とよとみ)秀吉(ひでよし)

See also this article.

Feed Icon Link

The current example for linking one's feed with the JSON feed icon is missing an alt text. Since there is no neighbouring text link, that makes it somewhat invisible to assistive technology. An example showing best practice should have an alt text, I feel.

As an aside, the link could also have a link relation of rel=alternate and a media type of application/feed+json, if that's the media type used. Autodiscovery should detect this link, shouldn't it?

Split home_page_url into two separate fields

home_page_url is described as being the url of the resource the feed describes, which may or may not be a "home" page.

I think this should be broken into two different (and optional) fields for clarity of purpose:

  • site_url should reference the actual home page for the site the feed is for
  • resource_url should reference the url of the resource the feed describes (the current stated purpose of home_page_url).

The need to have a separate site_url and resource_url comes when the home_page_url does not point at a home page, or should not exist because (for whatever reason) there is no explicit resource the feed describes.

For example, if home_page_url == https://subdomain.example.org/someplace/, it might be desirable to instead say

  • site_url = https://example.org/
  • resource_url = https://subdomain.example.org/someplace/

Alternatively, a feed may exist for which there is no url that adequately describes it. For example, one can imagine a (not necessarily web-based) library that generates a jsonfeed based on some arbitrary criteria provided by its user. The library may wish to specify a site_url to advertise itself, but have no resource_url because there is none.

Backwards Compatibility

Simply changing home_page_url to resource_url would be a breaking change. To follow the spec's stated forward compatibility goal of making version 1 feeds valid version 2 feeds, we could declare that new parsers should interpret the two fields as synonyms of each other: if resource_url is provided, then that value will be used; else if home_page_url is present, then that value will be interpreted as the resource_url. home_page_url would be marked as deprecated in the spec. If both are present, home_page_url is ignored.

Alternatively, we can keep home_page_url named as-is, and note its apparent mislabeling as a historical oddity in the spec.
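The synonym rule described under Backwards Compatibility is small enough to sketch directly. resource_url is the field proposed in this issue, not part of the current spec, and the function name is hypothetical.

```python
from typing import Optional


def resolve_resource_url(feed: dict) -> Optional[str]:
    """Apply the proposed precedence between resource_url and home_page_url."""
    if "resource_url" in feed:
        # When both fields are present, home_page_url is ignored.
        return feed["resource_url"]
    # Version 1 feeds only carry home_page_url, read here as a synonym.
    return feed.get("home_page_url")
```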

Other issues

  • This issue would obviate #28.
  • Support for relative urls (#8) is also related.

Why allow non-string values for a feed's id?

The spec says

If an id is presented as a number or other type, a JSON Feed reader must coerce it to a string.

If you have a moment to explain, I'd love to understand the motivation for allowing other types in this place at all. So far, to me, it seems like an unnecessary complication.

It would be easier (for me) to reject a feed as invalid if it has anything other than a string there.

It wouldn't be so hard if I wanted to convert whatever value shows up while decoding, and then treat it as a string from then on, but I'm also trying to consider software that needs to both decode and then encode again. For example, an aggregator. Presumably it should faithfully encode whatever value was in the original file, even if it wasn't a string. That means I can't just make the internal representation be string, I have to make it be anything. In that case, accessing the string form of the id is less convenient for a client of my library. Instead of just item.id it would have to be something like item.id.toString().
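Here is a sketch of the kind of wrapper I mean, in Python for brevity. ItemId is a hypothetical type, and coercing non-strings through their JSON text form is my assumption, since the spec only says the reader must coerce.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ItemId:
    """Keeps the decoded value so an aggregator can re-encode it unchanged."""

    raw: Any

    def __str__(self) -> str:
        if isinstance(self.raw, str):
            return self.raw
        # Assumption: non-string ids are coerced via their JSON representation.
        return json.dumps(self.raw)


item = json.loads('{"id": 42, "content_text": "hi"}')
item_id = ItemId(item["id"])
round_tripped = json.dumps({"id": item_id.raw})
```

Clients have to call str(item_id) to compare ids, while the encoder has to reach for item_id.raw, which is exactly the extra surface area a string-only rule would avoid.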

Now this isn't the end of the world but it is a downside. My question is: what's the upside? Why is this in the spec?

(Usually the advice I give when considering this sort of thing is: it's much easier to add things than to take them out. In fact, for JSON Feed it's impossible to take them out. So if there's any doubt, it's better to make the spec more restrictive to start with. In this case that would mean saying it's required to be a string. If there's lots of demand for non-string ids in the future, then maybe that can be added. I'm not trying to argue for that here because I think it's probably too late. This is more of just a question so I (and others) can understand the reasoning behind this. Apologies if this isn't the right place to ask.)

Language & Internationalization

People outside the United States speak other languages. Sometimes they even write in another script. It happens.

One thing JSON feeds lose without the XML ecosystem is XML's great support for i18n. When processing a JSON feed for end users, language information is somewhat important: for assistive technologies (e.g. pronunciation), for display (directionality of text, other advanced layout), or maybe just for filtering multilingual feeds.

item.content_html in theory could carry its own html:lang or xml:lang attribute. But title, description, item.title, item.content_text, item.summary, item.attachment.title and maybe item.tags[] don't have language labeling.

Proposal:

  • a new key named language (optional, string) which contains a BCP 47 language tag.
  • the key should be used top-level, item-level, maybe attachment-level and maybe in extensions. Which would put a new, small limit on extensions.
  • As with xml:lang and html:lang, child elements should inherit their parent's language definition unless they define their own.
  • I would make top-level language required, because I can't think of a feed example which doesn't have some form of human language somewhere. Even commits or server logs or other technical stuff often have language in them.
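The inheritance rule in the proposal could be resolved as in the sketch below. The language key is the one proposed here, and effective_language is a hypothetical helper; extensions are left out for simplicity.

```python
from typing import Optional


def effective_language(
    feed: dict, item: dict, attachment: Optional[dict] = None
) -> Optional[str]:
    """Return the BCP 47 tag that applies, looking from the innermost scope outward."""
    for scope in (attachment, item, feed):
        if scope and scope.get("language"):
            return scope["language"]
    # Only reachable if the top-level language is missing, which the proposal forbids.
    return None
```

For a feed tagged "en" that contains one item tagged "ja", strings in that item resolve to "ja" and everything else to "en".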

Also: I never got into the details of the Bidi algorithm. But html:dir and xml:dir exist and seem to fill a need that is sometimes reasonable.

"mime_type" should be "media_type"

I believe "media_type" (as in RFC6838) would be a more appropriate property name for use with attachments, as it is more general than the email-related "MIME type" name. Assuming most attachments will be referring to HTTP URLs, this naming would also align with the HTTP spec.

Consider a distinct media type ("MIME type")

While I realize that it is probably too late to change the decision, the fact that JSON Feed does not have a distinct media type from all other JSON content is a problem for code that distinguishes HTTP responses by their Content-Type header value.

The current situation is akin to saying that Atom and RSS are of type 'application/xml' -- that's not terribly helpful (hence the 'application/rss+xml' and 'application/atom+xml' media types used in practice).

Of course, there may be a good reason why not to use a distinct media type for JSON Feed -- if so, I would suggest that it should be included as part of the specification document for rationale.
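To make the cost concrete: with only application/json to go on, code that routes responses by Content-Type has to parse the body and sniff it. A minimal sketch, assuming the version URL prefix is a reliable marker (looks_like_json_feed is a hypothetical name):

```python
import json


def looks_like_json_feed(content_type: str, body: str) -> bool:
    """Guess whether a generic JSON response is a JSON Feed by inspecting the body."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        return False
    try:
        document = json.loads(body)
    except ValueError:
        return False
    version = document.get("version") if isinstance(document, dict) else None
    # Assumption: every JSON Feed version URL starts with this prefix.
    return isinstance(version, str) and version.startswith(
        "https://jsonfeed.org/version/"
    )
```

With a distinct media type, the first comparison alone would settle the question without reading the body.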

№ of subscribers in User-Agent header fragments caches

I suspect the (n subscribers) substring in User-Agent is from existing reader behavior, but it means edge caches that Vary on UA accumulate a dead copy every time a feed gets a new subscriber. (UA-sniffing is inadvisable, but widespread.)

Not a huge issue, but I’m curious what the authors think.
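One workaround on the cache side would be to normalize the header before it becomes part of the cache key. This is only a sketch: the exact shape of the subscriber substring is an assumption (a number followed by "subscribers"), and cache_key_user_agent is a hypothetical helper, not a feature of any particular cache.

```python
import re

# Assumed shape of the subscriber count inside the User-Agent value.
SUBSCRIBERS = re.compile(r"\b\d+\s+subscribers?\b", re.IGNORECASE)


def cache_key_user_agent(user_agent: str) -> str:
    """Replace the subscriber count with a constant so the cache key stays stable."""
    return SUBSCRIBERS.sub("N subscribers", user_agent)
```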

Items with summary but without content_*

There are feeds whose items don’t have real content but a short summary. In my opinion, it would be more appropriate to use summary instead of content_text or content_html in those cases.

Consider better (sub)-versioning and a changelog

Yesterday the allowed syntax for dates was the whole scope of ISO 8601. The next version of the spec in the repo references RFC 3339, a more limited scope. Strictly speaking, this is a breaking change per semantic versioning, but at this early stage it can obviously be forgiven.

In the long term it is a pain. Following the WHATWG's specs is a stressful endeavour, which only multiple developers at Big Browser™ backed by billions of $ can do. On a smaller scale, specs on wikis like those in the Microformats or IndieWeb space give this feeling of uncertainty, which pushes people into religiously reading git logs or, worse, writing sarcastic blog posts.

Nobody likes the ground under her changing even a little bit. But of course change exists, happens and will happen. Documenting the change makes life easier for those for whom JSON Feeds is not a full time job.

Thought experiment: using schema.org + JSON-LD instead

First: don't shoot me. This is just a thought experiment, but one that I believe may prove useful.

My reasoning is as follows:

  • Many content producers already have to produce schema.org+JSON-LD for their content. If they could reuse it, it would be better than having to produce a second JSON representation with essentially the exact same information but a different syntax. (For people who don't already produce this, the differences for the most part boil down to taste.)
  • Many consumers already understand it, notably all major search engines.
  • The extensibility model is already covered: no need to think about it.
  • Several issues are already covered (I'm just listing a few that are obvious, there are certainly more): #44, #26, #22, #23, #19, #18, #38, #39, #13, #10, #15, #7.
  • It's already documented and already has a process for change and evolution, a lively community, a successful repo on GitHub, etc. All we need to define is the vernacular.

Downside:

  • It's a bit more verbose (but not that much).

This is just a translation of the example from the site's front page to give a feel for the result. If there's interest we can dig through more details.

{
  "@type": "Blog",
  "name": "My Example Feed",
  "mainEntity": "https://example.org/",
  "url": "https://example.org/feed.json",
  "blogPost": [
    {
      "@type": "BlogPosting",
      "@id": "https://example.org/second-item",
      "articleBody": "This is a second item.",
      "url": "https://example.org/second-item"
    },
    {
      "@type": "BlogPosting",
      "@id": "https://example.org/initial-post",
      "articleBody": { "@type": "rdf:HTML", "@value": "<p>Hello, world!</p>" },
      "url": "https://example.org/initial-post"
    }
  ]
}

Define a top-level field that describes the feed's update frequency

Motivation

Some feeds, like Twitter, might update by the second. Some others, like a typical blog, might update daily. Perhaps a podcast only updates once a week. Others still might even only update once a year. Some utility APIs generate infinite feeds on-demand and so update frequency would not be detectable.

The readers that support JSONFeed might not want to check every feed at the same rate; choosing something arbitrarily will inevitably leave some feeds updating too slowly and others too rapidly. Attempting to use algorithms, heuristics, or AI to determine an appropriate rate per feed might result in missed posts if the feed suddenly becomes more active, or wasted polls if it suddenly becomes inactive, or, worst of all, might end up using far too many system resources checking an infinite feed.

Ideas

Here are my ideas, in no particular order. Feel free to shoot them down, improve on them, or propose entirely new ones:

Recommended number of seconds between checks

{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Podcast",
  "update_delay" : 86400
}
{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Twitter",
  "update_delay" : 60
}
{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Blog",
  "update_delay" : 3600
}

Recommended number of checks per day

... or per whatever unit. I chose "per day" for the examples:

{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Podcast",
  "update_frequency" : 1
}
{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Twitter",
  "update_frequency" : 1440
}
{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Blog",
  "update_frequency" : 12
}

A strictly-defined enumeration

For instance, "per-second", "hourly", "daily", etc.

This one might behave quite differently from the others, since it's not a strict recommendation. For that reason, I kinda prefer this; you say "I update this often" and the reader might check twice as often (or so) just in case you post an important update.

{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Podcast",
  "update_frequency" : "weekly"
}
{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Twitter",
  "update_frequency" : "per-minute"
}
{
  "version" : "https://jsonfeed.org/version/2",
  "title" : "My Blog",
  "update_frequency" : "daily"
}

Priority

The presence of this field should be seen as a suggestion, and should be overridden by the expired field.

Fallback

If implemented, this should not be a required field. If no frequency is specified, then some arbitrary rate like once every few minutes should probably be used, or the client can infer the frequency using its own algorithms.
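
A minimal sketch of how a reader might apply the priority and fallback rules above, assuming the hypothetical update_delay field (seconds between checks) from the first idea. Neither that field nor the default interval is part of the spec; DEFAULT_DELAY is an arbitrary illustrative value, and next_poll_delay is a made-up helper name.

```python
from typing import Optional

DEFAULT_DELAY = 300  # arbitrary fallback in seconds; not defined by the spec


def next_poll_delay(feed: dict) -> Optional[int]:
    """Return seconds to wait before the next check, or None to stop polling."""
    # `expired` takes priority over any frequency hint.
    if feed.get("expired") is True:
        return None
    delay = feed.get("update_delay")  # hypothetical field from this proposal
    # Ignore missing or nonsensical values and fall back to the default.
    if isinstance(delay, bool) or not isinstance(delay, (int, float)) or delay <= 0:
        return DEFAULT_DELAY
    return int(delay)
```

A reader could equally replace the fixed default with its own inferred rate; the point is only that the hint is optional and never overrides expired.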

Use 'feed' or other link relation for discovery?

There was a comment on a media type discussion that I think deserves to be its own issue:

The W3C and WHATWG versions of HTML overloaded rel=alternate just for the media types application/rss+xml and application/atom+xml. If one of these is not there it means alternative representation of the current document, not "there is a feed for this blog". If a new media type gets registered, someone should also petition WHATWG and W3C for its inclusion in the HTML specs.

(There was a proposal of rel=feed somewhere. That proposal seems dead except for some usage for the HTML feeds of the Indieweb movement. Maybe that's easier)

#22 (comment)

The reason why it's separate from the media type issue is that a 'feed' or 'jsonfeed' link relation has value in and of itself:

It makes feed discovery more reliable under circumstances where discovery based on media types is unavailable (e.g. due to server configuration or lack of resources) or complicated.

Having a link relation also makes it easier to use JSON feeds in other formats like ebooks or digital publications. For those, the current use of 'alternate' with an 'application/json' media type isn't really workable since many of these formats already have alternate json representations in different json formats.

Some of these problems would also be solved by having a new media type but I think there is value in having both a link relation and a media type in any case.

For reference:

Minting an extension link relation is much easier than registering a media type but might look odd to many:

<link rel="alternate https://jsonfeed.org" href="https://example.com/feed.json" type="application/json" title="My Feed">
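
To give a feel for what discovery could look like on the client side, here is a rough sketch using Python's standard html.parser. It assumes the extension relation https://jsonfeed.org shown above (which is only a suggestion in this issue, not a registered relation) and otherwise falls back to alternate plus the application/json type. FeedLinkFinder and find_feeds are illustrative names.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect hrefs of <link> elements that look like JSON Feed references."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        # Match either the hypothetical extension relation or alternate + JSON type.
        by_rel = "https://jsonfeed.org" in rels
        by_type = "alternate" in rels and mime == "application/json"
        if (by_rel or by_type) and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html: str) -> list:
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds
```

Note that the by_type branch is exactly the ambiguous case described above: any alternate JSON representation would match, which is the argument for a dedicated relation.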

Version number is unnecessary

A version 1 feed will be a valid version 2 feed, and so on. Future versions may add things, but won’t make older feeds invalid.

What about future feeds in older readers? The spec doesn't contain a detailed processing model, but chances are that the reasonable behavior when encountering the lack of version or a version from the future is to simply ignore the version.

If that's the case (I hope it is), it would make the most sense not to declare a version at all. E.g. CSS never did and HTML no longer does. Please consider dropping the top-level version key.

Unique ID for feed

Have you considered adding a unique ID field to the feed? The spec comments on the problems that occur when items have no ids, but errors also happen when the feed url is adopted as the unique id. Feed urls often change, either via permanent redirects or the itunes:new-feed-url tag, and the same url can often be written with or without a trailing slash. This can create problems, e.g. when syncing clients on multiple devices.

Take for example John Gruber's The Talk Show. If you search the iTunes API for The Talk Show, it returns the following feed url: https://daringfireball.net//feeds/serve?feed=thetalkshow
Then you download and parse the feed, and there both atom:link and new_feed_url report a different feed url: https://daringfireball.net//feeds/serve?feed=thetalkshow

I don't know why iTunes API doesn't parse new_feed_url and update its record, but that's the way it is.

Non-object extensions

As far as I understand, extensions have to be an object:

Publishers can use custom objects in JSON Feeds. Names must start with an _ character and be followed by a letter. Custom objects can appear anywhere in a feed.

However, @manton just wrote in another issue:

… let's imagine that sticker in one extension is a string value (maybe a URL), and in another extension is an object (maybe with type, size, and URL).

Do extensions have to be an object or are non-object extensions also valid?

Add a Swagger spec to the docs

A swagger spec would be a really nice addition to the documentation—especially if you include a live Swagger UI page where people can experiment with a live feed.

Relative URLs?

The spec doesn't mention relative URLs — are they allowed? And if so, what is the base URL they're interpreted relative to?

I recall this coming up as a bit of a compatibility issue in handling RSS/Atom feeds.

(My preference would be to allow them, since they make the feed more compact, and that they should be relative to the home_page_url, since that's likely to be a parent or sibling of where the individual articles' URLs will be. But then there are edge cases like what if there's no home_page_url...)
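
A small sketch of the resolution rule suggested above, using urljoin from the standard library. Resolving against home_page_url is the preference stated here, not spec behaviour, and falling back to the feed's own URL when home_page_url is missing is purely an assumption made for this example; resolve_item_url is an illustrative name.

```python
from urllib.parse import urljoin


def resolve_item_url(feed: dict, item_url: str, feed_url: str) -> str:
    """Resolve item_url against home_page_url, else against the feed's own URL."""
    # Assumption: with no home_page_url, the URL the feed was fetched from is the base.
    base = feed.get("home_page_url") or feed_url
    return urljoin(base, item_url)
```

Absolute URLs pass through urljoin unchanged, so existing feeds would not be affected by such a rule.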

Allow only UTF-8 encoding

JSON Feeds should be encoded using UTF-8 — but any encoding that’s legal JSON is legal for JSON Feeds.

Please only allow UTF-8.

  • The web platform is trying to converge on UTF-8. UTF-16 is being removed wherever possible, and is not supported in new web features.
  • UTF-32 is utterly useless for file interchange.
  • Support for multiple encodings is a headache for clients.
    • Other encodings have to be detected and converted
    • UTF-16/32 are endian-dependent and are not ASCII supersets
    • UTF-16 is a pain to work with due to surrogate pairs and environments that confuse UCS-2 and UTF-16.
    • The majority of libraries produce UTF-8 (or JS-escaped ASCII), so UTF-16/32 is going to be rare in practice and probably unsupported/broken in most clients anyway
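
For illustration, a UTF-8-only client can be very small. This sketch decodes the response body explicitly instead of letting the JSON library guess the encoding (Python's json.loads would otherwise also accept UTF-16 and UTF-32 byte input). parse_feed_bytes is an illustrative name, not part of any library.

```python
import json


def parse_feed_bytes(data: bytes) -> dict:
    """Parse a feed body, accepting UTF-8 only."""
    # Strict decode: raises UnicodeDecodeError (a ValueError) on invalid UTF-8,
    # which includes UTF-16/32 input that starts with a byte order mark.
    text = data.decode("utf-8")
    return json.loads(text)
```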

Worth noting: No newlines in JSON strings

If the history of RSS is any guide, people are going to be writing code that generates JSON feeds by ad-hoc string concatenation or template substitution, without going through a real JSON encoder. And they're going to make mistakes that result in invalid JSON, most likely when writing article bodies.

JSON parsers will generally barf on these, which should mean that most of these mistakes get caught in casual testing before being released into the wild, but the different parsers vary in strictness, so it's possible someone will test with a more lenient parser and then their feed(s) will fail for others. Or the mistakes might only occur in some cases that aren't hit during testing.

There are two things I think are worth calling out in the spec:

JSON strings cannot contain newlines or tabs — they must be escaped as \n or \t. (The RFC requires that all control characters be escaped.) Some parsers seem not to mind if this is violated, but some do.

JSON has some very specific rules for how to escape Unicode characters. If someone uses a different library to do the encoding, the results may work most of the time but not always; for example Latin characters might make it through OK but not non-Roman ones. Again, this might slip past the kind of rudimentary testing that a lot of web-devs do (I'm talking about you, PHP kiddies.) For example, I've found that JSON-encoding NSStrings is tricky because NSString's "characters" are not Unicode codepoints but rather UTF-16-encoded values, and if you don't wrap your head around that, lots of higher-Unicode characters come out wrong. (Actually, the popularity of emoji is a real boon here, as emoji represent the most complex case of Unicode character encoding; so if you don't get the escaping correct, emoji tend to break, which is quickly apparent in real world use.)

The best advice for escaping Unicode is probably "don't do it." The spec clearly says that only double-quote, backslash and control characters need to be escaped. Everything else can appear literally in a string.
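
A short demonstration of both points, using Python's standard json module. The feed content is made up; the only claim is that string concatenation breaks on a raw newline or quote while a real encoder handles them, and that non-ASCII characters can be left unescaped.

```python
import json

body = "First line\nSecond line with a \"quote\" and an emoji \U0001F600"

# Ad-hoc concatenation: the raw newline and quotes end up inside the JSON string.
broken = '{"content_text": "' + body + '"}'
try:
    json.loads(broken)
    concatenation_ok = True
except json.JSONDecodeError:
    concatenation_ok = False

# A real encoder escapes the newline and quotes; ensure_ascii=False leaves
# non-ASCII characters (including emoji) literal instead of using \u escapes.
encoded = json.dumps({"content_text": body}, ensure_ascii=False)
round_trip = json.loads(encoded)["content_text"]
```

Python's parser is strict about control characters by default, so the concatenated version is rejected here; a more lenient parser might accept it, which is exactly the testing trap described above.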

Is an item's url required or optional?

The spec says:

url (optional, string) is the URL of the resource described by the item. It’s the permalink. This may be the same as the id — but should be present regardless.

Given the phrase "should be present regardless," is the url actually required? Or am I misreading something?

Empty properties

Currently, https://micro.blog/feeds/manton.json includes the following item:

{
  "author": {
    "_microblog": {
      "username": "SciPhi"
    },
    "avatar": "http://www.gravatar.com/avatar/e59f6b41f418b9e6d37aad95db19ea7c?s=96",
    "name": "Phi.Sanders",
    "url": ""
  },
  "content_html": "<p>🔗 Mike Monteiro #μβ — <a href=\"https://twitter.com/monteiro/status/865270045806444544\">twitter.com/monteiro/…</a> </p>",
  "date_published": "2017-05-18T19:09:00+00:00",
  "id": "83721",
  "url": "http://SciPhi.micro.blog/2017/05/18/mike-monteiro-httpstwittercommonteirostatus.html"
}

As you can see, author.url is present but empty (and therefore not a valid URL). Is this considered valid according to the spec?
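
Whatever the spec ends up saying, a defensive reader will probably want to treat an empty string the same as a missing key. A minimal sketch, assuming URLs in a feed are expected to be absolute (the spec does not currently say so) and using an illustrative helper name:

```python
from urllib.parse import urlparse


def clean_url(value):
    """Return value if it looks like an absolute URL, else None."""
    if not isinstance(value, str) or not value.strip():
        return None
    parts = urlparse(value)
    # Assumption: only absolute URLs with a scheme and host are usable.
    return value if parts.scheme and parts.netloc else None
```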

Support unicode characters in keys

You support emojis in keys, but limit the character set to ASCII instead of supporting unicode? This seems backwards and unnecessary.

Handling of newlines

The spec does not say what \n and \r\n mean in plain text fields.

  • Are newlines meaningful?
  • Is text wrapping allowed/forbidden/required?
  • Are newlines allowed in titles? In summaries?
  • Is a paragraph break in content_text one or two newlines?

feedback

Since I didn't see another place to post feedback, I'm posting it here.

have noticed that JSON has become the developers’ choice for APIs, and that developers will often go out of their way to avoid XML.

This, this, this, and this, 1000 times. I'm a developer. I've got 17 year old APIs I still support using XML and/or SOAP. In every case, high on the TODO list is deprecating the XML support entirely. XML is big, slow, and we have to upgrade the XML libs regularly due to vulnerabilities. I've had to write XML generators, validators, and parsers over the years so I don't fear XML, but fondness is something I've never felt for it.

JSON just works. A line or two of code in every programming language that matters.

DurationInSeconds and SizeInBytes are bad names

I think those should simply be duration and size and have the spec define how those are represented in terms of units. It is extremely uncommon to have descriptive names like this in formats.

MIME-type should be more specific

The MIME-type of JSONFeed is currently application/json. I believe it should be a bit more specific, like application/feed+json. RSS and Atom feeds also use application/rss+xml and application/atom+xml.

If you declare <link rel=alternate type="application/json" href="/feed.json"> in your HTML-page, you're telling the user agents that feed.json is just a JSON-representation of the current document, not necessarily a feed. If you use application/feed+json, you're telling them specifically that it's a feed that can be subscribed to.
