
document-viewer's Introduction

This is the repository for the legacy DocumentCloud site. Please see the current repository here:

https://github.com/muckrock/documentcloud

______                                      _   _____ _                 _
|  _  \                                    | | /  __ \ |               | |
| | | |___   ___ _   _ _ __ ___   ___ _ __ | |_| /  \/ | ___  _   _  __| |
| | | / _ \ / __| | | | '_ ` _ \ / _ \ '_ \| __| |   | |/ _ \| | | |/ _` |
| |/ / (_) | (__| |_| | | | | | |  __/ | | | |_| \__/\ | (_) | |_| | (_| |
|___/ \___/ \___|\__,_|_| |_| |_|\___|_| |_|\__|\____/_|\___/ \__,_|\__,_|

DocumentCloud is a catalog of primary source documents and a tool for annotating, organizing and publishing them on the web. Documents are contributed by journalists, researchers and archivists.

This codebase contains the entirety of DocumentCloud.org, and pulls together the rest of our open-source projects: Docsplit is used to extract data from incoming documents; that work is parallelized across CloudCrowd; data on the client-side is modeled by Backbone.js, which depends on Underscore.js for all of its abilities; Jammit concatenates and compresses the dozens of CSS and JS files into a single asset package; the NYTimes' Document Viewer displays the documents, while Pixel Ping records the traffic.

If you find a security issue while browsing the source, please email [email protected] to inform us of the problem.

Code contributed to this project is provided under the MIT license (see the LICENSE file). Some components of the project are subject to their own licenses as indicated (see /vendor and /public/javascripts/vendor directories).

document-viewer's People

Contributors

amclean, ashaw, bkoski, croby, etodanik, jashkenas, knowtheory, nathanstitt, palewire, reefdog, samuelclay


document-viewer's Issues

Page number text input does not switch to proper page when changed

When the page number is changed in the text input at the top of the sidebar, the viewer should switch to the appropriate page. Instead it pauses briefly, then either the text input resets to the previous value or the viewer jumps to an apparently random page.

I've traced it to the logic in the acceptInputCallBack helper in helpers/search.js:

var pageIndex = parseInt(this.elements.currentPage.text(),10) - 1;

Note that it calls .text() to get the value of this.elements.currentPage. This would be fine, except this.elements.currentPage is set to two references. One is the input in the sidebar, and the other is in the footer.

So if the viewer is on the 2nd page and the input has 3 typed in it, text() will return "32": 3 for the user-typed value and 2 for the hidden input in the footer. If the document happens to have 32 pages, it will jump to that page; otherwise it will revert the edit and set the value of the text input back to "2".

In the past the footer was not rendered if the sidebar was present, so there would only be a single element contained in the this.elements.currentPage jQuery reference.

The check for the sidebar in footer.jst was removed as part of the responsive commit by jashkenas: 8577a01#diff-364047077bc9bf26782abf38ee33fa67

And then the options being sent to footer.jst were removed as well by knowtheory: f172e52

The quick & easy fix would be to change the selector in acceptInputCallBack so it's more specific and only reads the input that is focused. Or should we attempt to figure out why the footer is being rendered and then hidden when the sidebar is visible?
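To make the failure mode concrete, here is a minimal sketch of the concatenation problem and the "read only the focused input" fix. Plain objects stand in for the two matched DOM nodes; the { text, focused } shape and both helper names are hypothetical and are not taken from the viewer source.

```javascript
// Plain objects stand in for the two DOM nodes matched by
// this.elements.currentPage (the sidebar input and the hidden footer copy).
// The { text, focused } shape is hypothetical, for illustration only.
var sidebarInput = { text: '3', focused: true };   // value the user typed
var footerCopy   = { text: '2', focused: false };  // stale value in the hidden footer
var matched = [sidebarInput, footerCopy];

// Mimics reading the text of every matched element at once:
// the individual strings are concatenated.
function combinedText(elements) {
  return elements.map(function (el) { return el.text; }).join('');
}

// Reads only the element the user is currently editing.
function focusedText(elements) {
  var focused = elements.filter(function (el) { return el.focused; })[0];
  return focused ? focused.text : '';
}

var buggyIndex = parseInt(combinedText(matched), 10) - 1; // parses "32"
var fixedIndex = parseInt(focusedText(matched), 10) - 1;  // parses "3"

console.log(buggyIndex, fixedIndex);
```

In the real viewer the equivalent change would be a narrower selector inside acceptInputCallBack; whether the hidden footer should be rendered at all remains the open question raised above.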

Viewer CSS and JavaScript aren't served gzip-compressed

The DocumentCloud Viewer CSS and JavaScript aren't served gzip-compressed from the Amazon S3 servers.

Example URL:

http://s3.amazonaws.com/s3.documentcloud.org/viewer/viewer.js

Response Headers:

Date: Mon, 22 Dec 2014 17:42:35 GMT
ETag: "c461f0969585f2aaff54e405967d12be"
Last-Modified: Tue, 16 Dec 2014 18:18:14 GMT
Server: AmazonS3
x-amz-id-2: ukt2TSsLb5DM0RUlxdYsr/1pozp/gsREOVRJ9xlWIlPlniZjqA3vf8ZACmCu7mP95yWV78oKMXs=
x-amz-request-id: 16053EED6EADE536

While Amazon S3 doesn't support on-the-fly compression, you can compress the files locally and then upload them. You'll need to add the Content-Encoding: gzip header to the files.

A better solution might be to use CloudFront.
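As a rough illustration of the "compress locally, then upload" route, the sketch below gzips a buffer with Node's built-in zlib module and lists the two headers the uploaded object would need. The sample source string and the uploadHeaders object are stand-ins; the actual upload step depends on whichever S3 tool is in use and is not shown.

```javascript
var zlib = require('zlib');

// Stand-in for the contents of viewer.js; a real script would read the file from disk.
var source = Buffer.from('var DV = DV || {}; /* viewer code */\n'.repeat(200));

// Compress once, locally, before uploading.
var compressed = zlib.gzipSync(source, { level: 9 });

// Headers to attach to the uploaded object. How they are passed is
// tool-specific; this object is only an illustration.
var uploadHeaders = {
  'Content-Type': 'application/javascript',
  'Content-Encoding': 'gzip'
};

console.log('original bytes:  ', source.length);
console.log('compressed bytes:', compressed.length);
console.log(uploadHeaders);
```

One trade-off with this approach: the object is always served compressed, so any client that does not accept gzip would receive bytes it cannot decode.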

enable annotations

Hi, how can I enable annotations? I can't figure it out.

Thanks.

P.S. Is there a guide to deploying the project on my own server?

Document zoom level often too low; document would still fit with one or two notches higher zoom

This happens with many documents and browser widths, but for a concrete example: set your browser window width to 1110 (or thereabouts, check in console with window.innerWidth), then go to https://www.documentcloud.org/documents/527670-oregon-zoo-elephant-contract.html

(Also make sure your browser is at default zoom and that the sidebar appears.)

The zoom level that the document loads at is lower than it could be. On Chrome/Mac, you could up the zoom by two notches and still contain the entire document image plus the "p. 1" text. On Safari/Mac you could increase by one notch.

Getting the image as large as possible is important for Overview, because users rely in part on scanning many documents very rapidly, and we don't have any screen real estate to waste on needless padding.

don't use document.write to insert script

Chrome emits this warning:
A Parser-blocking, cross-origin script, https://assets.documentcloud.org/viewer/viewer.js, is invoked via document.write. This may be blocked by the browser if the device has poor network connectivity.

Here's the code in question:

/* Request the viewer JavaScript. */
document.write('<script type="text/javascript" src="//assets.documentcloud.org/viewer/viewer.js"></scr' + 'ipt>');

Please replace this with var s = document.createElement('script'); or similar.
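A minimal sketch of the requested replacement follows. The loadScript name and its parameters are hypothetical; the document object is passed in as doc only so the function can be exercised outside a browser.

```javascript
// Insert a script element instead of using document.write.
// `doc` is normally the page's `document`.
function loadScript(doc, src, onLoad) {
  var s = doc.createElement('script');
  s.src = src;
  s.async = true;                 // do not block the HTML parser
  if (onLoad) s.onload = onLoad;  // runs once the script has loaded
  doc.head.appendChild(s);
  return s;
}

// In a page, usage might look like this:
// loadScript(document, '//assets.documentcloud.org/viewer/viewer.js', function () {
//   /* the viewer script is available here */
// });
```

Note that, unlike document.write, this loads the script asynchronously, so any code that expects the viewer to be defined immediately afterwards would need to move into the callback.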

mini is not defined

Hello,
I just downloaded the code and ran it as is. I get a "mini is not defined" error in firebug. When I replace the viewer.js with the one on the live documentcloud site, the error is gone and everything works fine. Is the code on here the latest code?

When I do a diff of viewer.js I do see differences. Thanks.

Translation support

I've asked this on IRC, but it is more visible here. Does the DocumentCloud viewer support translations? And how can we contribute?

I'm happy to collaborate on the Spanish version, if possible.

Images loaded twice

I'm running an embedded docViewer that pulls images from Rails. I put a debug output statement in the image-return controller action. On first load, it reports just one request for each of the first 3 pages. Each page loaded after that generates 2 requests and gets 2 full responses from rails. One request is coming from DV/lib/page.js#drawImage.js, but I don't know where the other comes from. The page doesn't load until the second request completes.

Is this behavior intended? Is there a way to limit it to a single request?

Create a way to remove player from page

The popcorn.js guys have written us a plugin so that popcorn users can control the viewer. As a result, we've discovered that removing the viewer from the page leaves some orphaned listeners which complain.

Responsive option

In addition to the full screen and fixed options, is there a way to offer a viewer option that takes the width of its parent element (calculating an appropriate height from that value)?

We’ve run into an issue on our site now where the Viewer doesn’t fit in with our current responsive grid. I wrote a tiny function to calculate the width/height and pass it to the .load() method, but I’m hoping we can drop this hack at some point.
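For reference, the kind of helper described above might look like the following sketch. The viewerSize name and the default 4:3 height-to-width ratio are assumptions made for illustration, not the function from the issue.

```javascript
// Derive viewer dimensions from the width of the parent element.
// The default ratio (height = width * 4/3) is an arbitrary assumption.
function viewerSize(parentWidth, heightRatio) {
  var ratio = heightRatio || 4 / 3;
  var width = Math.floor(parentWidth);
  return { width: width, height: Math.round(width * ratio) };
}

console.log(viewerSize(600));
console.log(viewerSize(300, 1.5));
```

The resulting width and height would then be handed to .load(), as the issue describes, and recomputed whenever the parent element is resized.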

Suggestion: Remove annotation padding

I've been using DocumentCloud to highlight scientific documents, and we were struggling a bit when it came to highlighting multiple cells in a small table (even with zoom). There were 'dead zones' where the highlight cursor changed to the page grab cursor, near an existing annotation and in a small page-spanning row above it.

I was able to remove this issue by removing the 'padding: 3px' property from '.DV-annotation'. To help users trying to make similarly close highlights, this may be helpful to change.

Variables defined in block scope

I have been documenting the source and have noticed that a lot of variables are declared inside blocks as if they were block-scoped, or are redeclared after being passed as a function's argument. These should be fixed to reflect function scoping, to keep my editor's linter from crying ;)

"e is undefined" error when using display:none on an embedded player

I'm creating a DocumentCloud plug-in for Popcorn.js, and this error is proving to be a bit of a blocker in terms of hiding and displaying the players.

The plug-in can insert multiple players into one div and hide/display them successfully with the visibility attribute, but this proves to be a bad choice simply because the hidden players "push" the visible players to the side or down. With in-line display I can make it seem like the hidden players aren't even there.

This is when I get the error. I traced it to a function called sortPages that returns undefined when called in the drawPages function.

Here's a case where the bug is present: http://www.chrisdecairos.ca/projects/documentcloudissue/plugins/documentcloud/popcorn.documentcloud.html

Let me know if I can help with anything, thanks!

Adding an annotation or redaction resets the zoom level

When the viewer is zoomed in or out, adding an annotation or redaction resets the zoom to the default size.

This happens because DV.Api.addAnnotation calls redraw(), which in turn calls this.viewer.helpers.autoZoomPage().

As a test, I removed the call to redraw() and the newly drawn annotations then failed to display. Therefore it appears that the redraw() call is necessary.
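One possible workaround, sketched here against a stub rather than the real viewer, is to remember the zoom level before the redraw and restore it afterwards. Every name in the stub (zoom, defaultZoom, annotations, redraw) is hypothetical; it only imitates the behaviour described above, where redraw() resets the zoom.

```javascript
// Stub with the behaviour described in the issue: redraw() resets the zoom.
// All property and method names here are hypothetical.
function makeStubViewer() {
  return {
    defaultZoom: 700,
    zoom: 700,
    annotations: [],
    redraw: function () { this.zoom = this.defaultZoom; }
  };
}

// Remember the zoom, let the redraw happen, then put the zoom back.
function addAnnotationKeepingZoom(viewer, annotation) {
  var previousZoom = viewer.zoom;
  viewer.annotations.push(annotation);
  viewer.redraw();
  viewer.zoom = previousZoom;
}

var viewer = makeStubViewer();
viewer.zoom = 1000;
addAnnotationKeepingZoom(viewer, { page: 1, title: 'Example' });
console.log(viewer.zoom);
```

In the real code the alternative would be to make redraw() skip the autoZoomPage() call when it is triggered by an annotation change; which option fits better depends on what else relies on redraw().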

Documentation reflects old (NY Times Document Viewer) syntax

It appears the README is a bit out of date.

When following the current documentation, it seems like DocumentViewer's sections/chapter display is broken.

It's not. The documentation is wrong (well... just out of date). It reflects the syntax of the original NY Times Document Viewer, not the updated DocumentCloud syntax.

The documentation says to pass sections like so:

sections :
  [
    { title : CHAPTER_TITLE, pages: "1-10" },
    { title : CHAPTER_TITLE, pages: "11-20" }
  ],

The file the viewer.html

That ain't right! Ted pointed me to a DocumentCloud JSON where sections are working correctly.

Here's how its sections code works:

"sections":[
  {
    "title":"1. Introduction",
    "page":8
  },
  {
    "title":"Abstract",
    "page":7
  },
  {
    "title":"Table of Contents",
    "page":4
  }
],

Aside from the section pages being passed as an int rather than a string (range) there are a few other minor changes.
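For anyone migrating existing configuration, a small conversion along these lines may help. It is a sketch that assumes the new "page" value should be the first page of the old range; the convertSections name is hypothetical, and the other minor changes mentioned above are not covered.

```javascript
// Convert old-style sections ({ title, pages: "1-10" }) to the newer
// shape ({ title, page: 1 }) by keeping the first page of each range.
// Assumption: "page" means the page on which the section starts.
function convertSections(oldSections) {
  return oldSections.map(function (section) {
    var firstPage = parseInt(String(section.pages).split('-')[0], 10);
    return { title: section.title, page: firstPage };
  });
}

var converted = convertSections([
  { title: 'Chapter One', pages: '1-10' },
  { title: 'Chapter Two', pages: '11-20' }
]);
console.log(JSON.stringify(converted));
```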

Meanwhile the stock viewer.html still points to a NYT json rather than a DV json, which reflects this old markup.

p.s.
I had a pull request ready to go when I thought it was just about sections. I'll be amending that and will be sending a new one shortly.

Add download option on embed view

Currently you have to click the fullscreen button to reach the full-width page in order to download a PDF. Is it possible to add a button to download a document directly from the embedded viewer?

thanks

how to convert the original source PDF into GIF Images to be viewed in this HTML5 document viewer

I was trying to make this plugin work with a plain PDF file but wasn't able to. Then I read somewhere that this plugin works only with GIF images created from the original PDF file, but there's no converter available or documentation on how to do that.
Could you help me?

BTW: This is an awesome plugin, but I think it lacks the documentation needed to be usable by the open source community.

Installing Document-Viewer

I copied the uncompressed source to my web server. However, when I load viewer.html it says loading but the document never gets rendered. Did I miss anything during installation?

Support translations in viewer interface

Sprung from: documentcloud/documentcloud#121

Would like to pack and deliver (like UPS trucks) translations to the viewer chrome, which will then decide which language to offer the user based on:

  1. Default language for viewer defined at org level
  2. Custom language for viewer defined on document info
  3. Custom language for viewer defined by reader and saved as cookie
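The three levels above could be resolved with a simple fallback chain, sketched below. It assumes that the more specific setting wins (reader, then document, then organization) and that 'en' is the final fallback; the function and property names are hypothetical.

```javascript
// Pick the viewer language from the three sources listed above.
// Assumption: reader cookie > document setting > organization default,
// falling back to 'en' when none is set.
function pickLanguage(settings) {
  return settings.readerCookie ||
         settings.documentLanguage ||
         settings.orgDefault ||
         'en';
}

console.log(pickLanguage({ orgDefault: 'es' }));
console.log(pickLanguage({ orgDefault: 'es', documentLanguage: 'fr' }));
console.log(pickLanguage({ orgDefault: 'es', documentLanguage: 'fr', readerCookie: 'de' }));
```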

List of file formats supported

Hi,

I find this project exciting. Is it only for PDF? Are text (.txt), Word (.doc), Excel (.xls) and PowerPoint (.ppt) files supported too? Can you list the file formats you support?

Edit capability?

Hello,
I was wondering whether the document-viewer contains the UI for editing (i.e. redactions, annotations, etc.). I see from DocumentCloud that it should, but going through the code I can't tell whether this is only a 'viewer' or whether there's a way to turn on the editing capabilities. I can see that there are callback methods I can attach, but that's about the extent of it. Can someone point me to the source files to go through, or give an overview of how I can turn it on if it's there? Thanks.

Expose document metadata in viewer API

This isn't currently even provided by the document JSON, but when it is, we should have a generic viewer.api.getMetadata(fieldname) function to read it.

Range of z-indexes for viewer is very large

The fellows over at the New Jersey Star-Ledger have indicated that the viewer CSS is messing with their headers, since our z-indexes range from 0 all the way up to 20000.

Someone needs to go through all the z-index declarations in our CSS files and bump them down in a manner that keeps all of our declarations proportional, but reduces the values by either an order of magnitude (20000 to 2000, 15999 to 1599, and 1 to 1) or into simple increments of 5.
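The order-of-magnitude option can be expressed as a one-line mapping. The sketch below reproduces the three examples given above (20000 to 2000, 15999 to 1599, 1 to 1); leaving zero and negative values unchanged is an assumption, and the scaleZIndex name is hypothetical.

```javascript
// Scale a z-index down by one order of magnitude while keeping
// positive values at 1 or above so stacking order is preserved.
// Assumption: zero and negative values are left unchanged.
function scaleZIndex(z) {
  if (z <= 0) return z;
  return Math.max(1, Math.floor(z / 10));
}

console.log([20000, 15999, 1, 0].map(scaleZIndex));
```

Values below 20 all collapse to 1 under this mapping, so any low-range declarations that depend on their relative order would still need to be checked by hand.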

Getting stuck on certain searches

Paging through search results seems to fail at certain spots.

To reproduce:

http://www.documentcloud.org/documents/1004523-christie-document-exhibit-b.html#document/p1

  1. Search for the word "general".
    -Shows matching page 1 of 60.
  2. Advance to the next result twice using the "Next" button in the top-middle of the viewer.
  3. It's now on matching page 3 of 60, which is page 30 in the document. It has no highlighted match. At this point, clicking either the "Previous" or "Next" button does nothing.

Searching for "jersey" causes the same problem when you hit the second matching page (page 9 in the document).

Occurred in both Chrome 33.0.1750.149 and Firefox 27.0.1.

Multi-document viewer

Jeremy's had a think with one of his collaborators at the NYT about multi-document viewers. This is what's come of it (and of the discussion with me):

  • We could repurpose the existing document viewer and chaptering system.
  • Provide an additional level of hierarchy in the chaptering interface to distinguish separate documents.
  • Accept an array of document data, rather than just a single record.
  • Ideally, the viewer should respect both the absolute number of pages and the in-document page number.

Viewer w/Public Annotations: "Log In" / "Log out" cleanup

On a document with public annotations available, logging in should be via a button, and the button should use the same case ("Log in") as the home page.

Current:
[screenshot, 2013-08-15, 2:36:17 pm]

Currently, the homepage log-in button reads "Log in":
[screenshot, 2013-08-15, 2:48:00 pm]

The "LOGOUT" button should follow the same formatting: "Log out".

js error: "print_notes_url is not defined"

In the downloaded zip: documentcloud-document-viewer-0.1-555-g38f00b4zip
I get this error when I try and run the included production file viewer.html

print_notes_url is not defined

Maintain in-page scroll position when moving between viewer tabs

Steps to reproduce mild annoyance:

  1. Open a document with a long page of text
  2. Scroll down a bit (so the top of the current page extends beyond the top of the viewport)
  3. Flip to the Text tab and scroll it down a bit as well
  4. Flip back to the Document tab
  5. Flip back to the Text tab

What happens: Every time you change tabs, scroll position is returned to the top of the current page.

What would be ideal: Unless the current page has changed, the viewer should maintain the old scroll position. This is particularly helpful when flipping between the Document (or Pages) view and Text, since you're often referencing the text view to check OCR accuracy and scrolling back down to the relevant bit becomes tiring.

on getPageText, DocumentViewer returns undefined if text has not yet been presented in the viewer

Go to this URL: https://www.documentcloud.org/documents/1314000-sbrown-2014-2017-executed-contract.html

In console, run:

DV.viewers[_.keys(DV.viewers)[0]].api.getPageText(2)

It returns undefined.

Now go to page 2. Click the text tab. The text loads. Then in the console, run this again:

DV.viewers[_.keys(DV.viewers)[0]].api.getPageText(2)

This time, it returns page 2 of the document text.

It would be great if the API would pull the requested text on call to getPageText -- even if the user has not clicked that text tab yet. I need this functionality in order to (cleanly) add span tags around each token in the document text. Trying to do this: datamade/parserator#16

An error handling endpoint

We currently provide an afterLoad callback, but no mechanism to inspect whether the ajax response for the requested resource has returned a non 200 status.

One current route around this is to check whether viewer.api.getState() returns 'undefined' in the afterLoad callback.
