epubweb's Introduction

Advancing Portable Documents for the Open Web Platform: EPUB-WEB

This Repo should be considered as closed now. The work has been transferred to the DPUB Interest Group and is now part of the deliverables of that group. See the new repo for the continuation work. (2015-09-26)

Markus Gylling, Ivan Herman.

epubweb's Issues

EPUB3 for Integrated and Customizable Representation of a Scientific Publication and its Associated Resources

Scientific publications point to many associated resources,
including videos, prototypes, slides, and datasets. However, discovering
and accessing these resources is not always straightforward: links
could be broken, readers may be offline, or the number of associated
resources might make it difficult to keep track of the viewing order. In
this paper, we explore potential integration of such resources into the
digital version of a scientific publication. Specifically, we evaluate the
most common scientific publication formats in terms of their capability
to implement the desirable attributes of an enhanced publication and to
meet the functional goals of an enhanced publication information system:
PDF, HTML, EPUB2, and EPUB3. In addition, we present an
EPUB3 version of an exemplary publication in the field of computer
science, integrating and interlinking an explanatory video and an interactive
prototype. Finally, we introduce a demonstrator that is capable
of outputting customized scientific publications in EPUB3. By making
use of EPUB3 to create an integrated and customizable representation
of a scientific publication and its associated resources, we believe that
we are able to augment the reading experience of scholarly publications,
and thus the effectiveness of scientific communication.

http://linkedscience.org/wp-content/uploads/2014/10/lisc2014_submission_3.pdf

reword - again on the passive.

"While various browser extensions supporting EPUB exist (Readium-Chrome, EPUB-Firefox, etc.), as well as solutions for delivering EPUB files in browsers (Readium-Cloud, EPUB.js, Safari Books Online, etc.) these solutions require relatively complex server and/or client software and in some cases depend on non-standardized transformation of packaged .epub files into more network-delivery-suitable forms."

How about "Various browser extensions supporting EPUB exist (Readium in Chrome, EPUB in Firefox, et al.). Other solutions exist for delivering EPUB files in browsers (Readium-Cloud, EPUB.js, Safari Books Online, et al.). Browser- and cloud-based solutions require relatively complex server and/or client software. In many cases browser- and cloud-based solutions depend on a proprietary transformation of the packaged EPUB files into formats more suitable to network delivery."

wordsmithing 3

"EPUB can be viewed as simply defining a specialization of Web content that assures that a collection of content items has the needed properties of completeness and logical structure, and does so in a standard way so that other processing tools and services can reliably create, manipulate, and present such collections."

delete "so" before "that other processing tools"

Good reference to the scholarly publishing issue (if needed)

The following paper:

T. K. Attwood, D. B. Kell, P. McDermott, J. Marsh, S. R. Pettifer, and D. Thorne, “Calling International Rescue: knowledge lost in literature and data landslide!,” Biochemical Journal, vol. 424, pp. 317-333, 2009. See http://www.biochemj.org/bj/ev/424/0317/bj4240317_ev.htm

has a number of nice references to various experiments that try to use the facilities of HTML for scholarly publishing. The paper is not new but may be useful nevertheless...

Wordsmithing 5

Change "has not been downloaded." to "has not yet been downloaded."

Wordsmithing

re: "The current format- and workflow-level separation between offline/portable (EPUB) and online (Web) document publishing is diminished to zero."

May be better worded as "In this vision, the current format- and workflow-level separation between offline/portable (EPUB) and online (Web) document publishing is diminished to zero."

Key words added "In this vision"

Privacy section

Per issue #41 the former privacy and security section has been reduced to security. It is a question whether there should be a separate privacy issue, and what it should contain.

Add a section on a possible API specification

(This issue was originally raised by Daniel Glazman on W3C's AC forum mailing list. Although that list is member-confidential, these extracts from Daniel's email have been reproduced with his authorization.)

[…]I miss one thing that is in our expertise area: a standard Object Model for EPUB packages.[…] Let me give you a concrete example about why I think we need one:

Reaching (reading AND setting) the individual metadata of an EPUB package is a pretty complex task because of the potential multiple levels of indirection between the package level and the single piece of data you're trying to reach. This is of course doable through the HTML Document Object Model, but it's tedious, to say the least, and quite error-prone. The metadata system of EPUB3 is IMHO so complex (I wrote several times about it) that it prevents reading and authoring systems from using it to its full potential. A dedicated Object Model would allow such systems to use a higher-level API in a standard way to easily manipulate all the data contained in an EPUB package. Another example: adding a file and inserting a reference to that file into the various tables of contents EPUB provides is more than tedious, it's painful; similarly, getting a reference to "the next file in package according to ToC" from a given file is also painful. All of that could be greatly simplified in an OM and would allow more cross-browser frameworks and applications.
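
As a rough illustration of the kind of higher-level API the email asks for, here is a minimal Python sketch. It is not part of any EPUB specification or of the original proposal: the class name PackageModel, its method names, and the sample package document are all hypothetical. It also substitutes the spine (linear reading order) for the ToC when finding "the next file", and it ignores EPUB3 metadata refinements, which is where much of the real complexity lies.

```python
import xml.etree.ElementTree as ET

# Namespaces used by EPUB package documents (OPF) and Dublin Core metadata.
NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A tiny, made-up package document used only for illustration.
SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Publication</dc:title>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="c1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="c2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
  </manifest>
  <spine>
    <itemref idref="c1"/>
    <itemref idref="c2"/>
  </spine>
</package>
"""


class PackageModel:
    """Hypothetical wrapper that hides the manifest/spine indirection."""

    def __init__(self, opf_xml):
        self._root = ET.fromstring(opf_xml)
        self._metadata = self._root.find("opf:metadata", NS)

    def get_metadata(self, name):
        """Return the text of the first Dublin Core element `name`, or None."""
        element = self._metadata.find(f"dc:{name}", NS)
        return element.text if element is not None else None

    def set_metadata(self, name, value):
        """Set the first Dublin Core element `name`, creating it if absent."""
        element = self._metadata.find(f"dc:{name}", NS)
        if element is None:
            element = ET.SubElement(self._metadata, f"{{{NS['dc']}}}{name}")
        element.text = value

    def reading_order(self):
        """Resolve each spine itemref through the manifest to a file href."""
        href_by_id = {
            item.get("id"): item.get("href")
            for item in self._root.iterfind("opf:manifest/opf:item", NS)
        }
        return [
            href_by_id[ref.get("idref")]
            for ref in self._root.iterfind("opf:spine/opf:itemref", NS)
        ]

    def next_in_reading_order(self, href):
        """Return the href that follows `href` in the spine, or None at the end."""
        order = self.reading_order()
        index = order.index(href)  # raises ValueError if href is not in the spine
        return order[index + 1] if index + 1 < len(order) else None


model = PackageModel(SAMPLE_OPF)
print(model.get_metadata("title"))
print(model.next_in_reading_order("chapter1.xhtml"))
```

Even this toy version shows the indirection the email describes: finding the next file means joining the spine against the manifest, a step a standard Object Model could hide behind one call.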

Specificity

"Fig. 1 The same content can be turned into an archived file and back without any inherent change"

How about "The same content can be turned into an archived file and back without any inherent changes to the core content or associated digital assets."

Provenance and Logic in scholarly publishing

Reproduced from http://www.w3.org/blog/2015/06/planning-the-future-of-the-digital-publishing-interest-group/#comment-88030 (comments by Paul Tyson):

I see 2 very large gaps in the program (not to discount the fine work that has been done, all essential for moving forward).

  1. Provenance. Those trained in classic techniques of scholarship still seek an assurance of authenticity for digital experiences that is comparable to what the elaborate machinery of scholarship provided.
  2. Logic. Not just the first fruits of the semantic web in the form of advanced finding aids, but integrating structured logical definitions, propositions, and arguments into EPUB+WEB to allow e-readers to do automatic fact- and fallacy-checking.

rewording/passive

"On mobile platforms, usage of Web sites is diminishing in favor of native applications. "

"Mobile platform web site use is diminishing in favor of native applications."

Run-on sentence

"EPUB can be considered to be at an inflection point. EPUB has been broadly adopted on a global basis for “trade” eBooks, and is beginning to be adopted for e-textbooks and other types of documents, but has been largely an “offline” format. "

How about:

"EPUB can be considered to be at a tipping point. EPUB has been broadly adopted globally for trade ebooks, and is starting to gain adoption among textbook publishers as well as corporate marketing departments. However, EPUB has largely been seen as an "offline" format up until now."

Mention the pagination issue

Pagination may be one of the biggest challenges in digital publishing, and it is currently poorly supported by the OWP.

I believe this issue is under-represented in the current draft, e.g.:

  • Section 2. "Why work on this now"

    In many cases browser- and cloud-based solutions depend on a proprietary transformation of the packaged EPUB files into formats more suitable to network delivery.

    One of the primary reasons RS (reading system) developers need to transform the packaged EPUB files is for concatenation and pagination of individual files.

  • Section 2.6 "Web Browsers"

    the inclusion of EPUB-WEB capabilities to browsers should be fairly straightforward (...) the “extras” to make them a native feature of the Web is limited to some comparatively simple tasks

    that prose may understate the complexity of pagination in the OWP...

new use cases: stable references/citations

An additional use case that would be extremely valuable to academic and scholarly publishing is stable references and citations. There are currently several methods for citing online works (disputed among style guides), but there is no standard method for citations in ebooks. Even if a scholar reads a reflowable ebook, she must refer to a PDF, paper copy, or HTML version to cite it in her bibliography. EPUB-Web should enable stable citations.

word use

"The convergence of EPUB and the Open Web Platform provides a common set of solutions and opportunities for various stakeholders:"

to

"The convergence of EPUB and the Open Web Platform provides a common set of solutions and opportunities TO various stakeholders:"

a couple of use cases.

An interesting use case is documents that include forms, which one may want to download from the web, fill in locally and offline, and resend as a package. This could probably be achieved with EPUB-WEB, with the OWP technologies.

So maybe adding a few words about forms could be of interest.

Another use case is documents to which the editor wants to add protection (with a password, by disallowing copy and paste, etc.).
These features are available in PDF and would also be of great interest.

Last is downloading of a subset of pages of a book in an EPUB
package. But maybe this is part of section "3.3 Document and fragment
identification"

wordsmithing 6

"Book publishers are currently investing in the development of technical expertise in Web technologies; an expertise that is not part of the “traditional” work of publishers. Whilst gaining some level of expertise is important, the lack of synergy between trade publishers and the Web application developers’ community also means unnecessary duplications, i.e., unnecessary investments.

Being able to collaborate more closely with the Web content development community, with all its inventiveness and agility, would have major benefit for publishers. Through the usage of a universal, interoperable format, it will let publishers concentrate on engaging with the content authors to produce finished and high quality content, and rely on the Web content community to deal with the more sophisticated technical issues (such as intricacies of CSS or of SVG). Possible future content formats on the Web (e.g., 3D rendering) or various interactive Web programs (e.g., visualization tools like D3) will also naturally flow into the realm of publishers through EPUB-WEB and hence increase the publishers’ possibilities. This is especially true for areas of publishing that are traditionally at the leading edge of electronic content, such as STM and educational publishing (see also the section on scholarly publishing below). Basically, a more converged platform, instead of parallel universes, will support more tools and services and a much larger population of trained practitioners."

to

"Book publishers are investing in the development of technical expertise in web technologies. People with technical expertise by anyone other than information technology departments were not previously required in traditional publishing workflows. However, while gaining some understanding of technical topics is important to new and future publishing workflows, the lack of communication between the trade publishers and web application developer communities is resulting in unnecessary duplication and investments in effort.

Collaboration between the web content development and publishing communities will result in major benefits to publishers. Adopting a universal and interoperable format means publishers can concentrate on engaging content authors in the production of high quality content. The web content development community can be relied on to deal with sophisticated technical issues (e.g., CSS, SVG). Potential future web content formats (e.g., 3D rendering) and various interactive web programs (e.g., visualization tools like D3 [Jean note: better be prepared to tell people what D3 is...]) will naturally flow into the publishing realm through EPUB-WEB, hence increasing publishers' opportunities to sell new content products across the board.

Realizing new opportunities is a reality for publishers traditionally considered to be on the leading edge of technological advances in working with content. These publishers include STM and educational publishing houses, as well as scholarly and journal publishing organizations (see the section on scholarly publishing in this document).

A converged platform will support more tools and services and a much larger population of trained practitioners compared to the current state of working in parallel universes."

Rewrite to make sense to mere mortals

"EPUB can be considered to be at an inflection point."

What exactly do you mean by "inflection point"? Are you trying to avoid the overused "tipping point" cliche? Mere mortals will understand "tipping point" more quickly than "inflection".

Forms as use case; protection issues

(Comment from Thierry Michel, reproduced with his permission)

Ivan,

An interesting use case is documents that include forms, which one may want to download from the web, fill in locally and offline, and resend as a package. This could probably be achieved with EPUB-WEB as with all OWP technologies.

So maybe adding a few words about forms could be of interest.

Another use case is documents to which the editor wants to add protection (with a password, by disallowing copy and paste, etc.). These features are available in PDF and would also be of great interest.

but maybe that is included in the following statement of section 3.6
"EPUB-WEB must incorporate a state agnostic security and privacy model that defines rules for both the online and portable states."

Last is downloading of a subset of pages of a book in an EPUB package. But maybe this is part of section "3.3 Document and fragment identification"

Thierry

More to re-word.

"A focused effort to make EPUB a first-class citizen of the Open Web Platform and as a result significantly reduce the complexity of deploying EPUB content into browsers, for online as well as offline consumption, would increase momentum for EPUB and associated Web Standards adoption in the many communities who are looking for an open, non-proprietary, next-generation portable document format."

"A focused effort to make EPUB a first-class Open Web Platform citizen will result in significant reduction in the complexity of deploying EPUB content into browsers for both online and offline consumption. Further, this focused effort will increase the momentum of EPUB and associated web adoption across communities who are..." (Do something about that "next-generation portable document format." Too many people will think of PDF as a knee-jerk reaction just from seeing the words.)

Wordsmithing 4

"While Web documents and portable documents have distinguishing characteristics, the differences can also be viewed as situational and gradual rather than representing bright-line distinctions."

How about "The differences between the distinguishing characteristics of web documents and portable documents can be viewed as situational and gradual rather than as representative of bright-line distinctions."

(Someone beat me over the head with a red marker about passive voice once. Not really. But they did instill the point about avoiding the passive...)

More grammar/re-wording

"However, the specific means of delivering such hybrid and Web-technology-based system applications is presently proprietary to particular application and/or browser platforms. Working on EPUB-WEB will increase momentum in solving problems in packaging, metadata, and offline support applicable to both portable documents and installed applications and help ensure the broadest possible adoption for the Open Web Platform in general."

to

"The specific means of delivering hybrid and web-technology-based system applications is currently proprietary to specific application frameworks and/or browser platforms. The point of EPUB-WEB is to increase problem-solving momentum in packaging, metadata, and offline support applicable to both portable documents and installed applications. Open and native solutions to replace proprietary packaging, metadata, and offline support are intended to ensure the broadest possible general adoption OF the Open Web Platform."

wordsmithing 7

"Most scholarly journal publishers also provide the articles for download these days; currently, the format is usually [PDF]. This reflects the history of the scholarly community: scholarly publications were traditionally linear texts, and the PDF files are meant to be a faithful reproduction of the printed formats. Indeed, the original (pre-Web) goal of putting these files online was to allow readers to print the content, instead of borrowing the printed journal and making photocopies."

to

"Scholarly journal publishers also provide articles for download these days. The most popular distribution format for journal articles continues to be [PDF], a direct reflection of the scholarly community, which highly prioritizes linear text and preservation of print typography. Indeed, the original goal for scholarly publishers in making files available online was to enable readers to download and print content directly, instead of borrowing a paper copy of a journal issue and photocopying relevant articles."

Reference the Service Workers (draft) spec ?

Service Workers is one of the possible technical answers to improving offline use of web apps.

I'm not quite sure how it fits with the Packaging spec, and it's probably more geared towards enabling an offline use of web content than enabling a portable package, but I believe it is certainly a spec to watch and which may play a role in EPUB-WEB.

Passive voice... Please revise

"Reliable navigation of a Web site would increase usability and accessibility."

How about "Reliable navigation of Web sites increases usability and accessibility."

Up the authoring-publisher POV

(authorized reproduction of an internal email)

on that note...it would be interesting to up the authoring-publisher POV here...the ability to painlessly transition between offline and online is only gently touched on with reference to content production, but this would actually be the most significant gain for current publishers, whether they be 'web first' or a paper-first publisher that also produces EPUB. It could potentially help save publishers an enormous amount of time and expense (in typesetting and document conversion) if the production storage format was EPUB.

This paper points suggestively in that direction but the argument might be made stronger ...of course that might only confuse the issue...I guess it would only really make a difference to advocacy if publishers wanted a voice in their future, i.e., if publishers were 1) aware of the importance of 'the open web platform' in their future and 2) members of the W3C

adam

more wordsmithing

"EPUB is built on Web Standards, and the individual items that make up an EPUB publication are exactly the same types of content that make up a Web site: [HTML5], [SVG], [CSS21], [ECMAScript], [JPEG] and [PNG] images, etc."

Why not just say "EPUB is built on Web Standards, and the individual items that make up an EPUB publication are identical to the types of content on a Web site:...."

Issues by Dave Pawson

(Reproduced from email with authorization)

Terminology could do with a glossary

  • Portable document
  • Open web platform
  • Page (as used here)
  • STM (not expanded, ditto other acronyms)

No overall goal? See 2. for possibility " this effort will provide an open, non-proprietary, portable document format"

'Bright line distinctions' - could be clearer? Suggest different categories.

'or on-demand use of their distinguishing characteristics' suggest 'separate' characteristics

' People with technical expertise by anyone other than information technology departments were not previously required in traditional publishing workflows. ' Requires clarification?

'in case the data is too large to be distributed offline' Is this a proposed feature? Very useful for small memory devices, would be a worthwhile goal IMHO.

'as well complex administrative' s/well/well as/

Complex user manuals. A requirement is strict version control. Optional feature perhaps?

2.5 No mention of variant screen size? How to address 32" flat screen and 4" mobile issues? Making content flexible enough to be usable on both, or address this issue in the case of non-scalability? I note 2.8 intimates this but reality is that this is not always the case.

2.7 The archival of s/archival/archiving/

Issue: Elephant in the room? Google books, Kindle etc? Possible major users, not mentioned? Hinted at in 3.6

Section 3 reads very well!

3.3 may be very hard! No mention of learning from xlink, xpointer, SGML architectural documents.

3.7 Very important for me as a user.

Summary. An excellent start, gentlemen. I wish you well. If I can help further, please ask.

Best wishes.

Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
http://www.dpawson.co.uk

Use of "Portable Document Format"

I recommend that we only use "Portable Document Format" when we actually are referring to PDFs - the PDF standard. We need to find some other way of saying "Portable Document Format" when we do not mean PDF.

Additional (or alternative) example for the scholarly publishing section

Here is a link to the Brembs article I mentioned: http://f1000research.com/articles/3-176/v1. Fig 3 is the figure that is just data+code, that you then plot on the fly and can adjust parameters within the code. Fig 4 is currently a static figure but Version 2 of the article will turn this into a living figure and should be out in the next 1-2 months.

Kind regards
Rebecca

Rebecca Lawrence, PhD
Managing Director, F1000 Research Ltd

more rewording

"Usage of hybrid applications (that utilize Web content along with native application technologies), and Web-technology-based system applications is growing."

to

"Hybrid applications that use web content alongside native application technology, and web-technology-based system applications, are growing."
