Git Product home page Git Product logo

Comments (5)

sean-roberts avatar sean-roberts commented on May 24, 2024

After some research, here are my findings:

  • The Chomsky referred to above has two elements that are in that area. The plain text and a pdf link that is an overlay type of element.
  • The error is triggering in pdf.coffee but it's because a null value is being sent to the function instead of a Node.
  • The function that is unexpectedly grabbing a null value is in annotator.js around line 621 nr.start = r.start.nextSibling;
  • r.start.nextSibling is null for this element. I don't know why exactly. However, this is actually the startContainer that the selection range is returning. And if it does not have access to the nextSibling then that totally fails the root assumption that the code makes.

I don't know what the thoughts are on modifying annotatorjs, but I am going to do it to at least have something to point at in code for what my suggestions are.

from client.

sean-roberts avatar sean-roberts commented on May 24, 2024

@robertknight can you let me know your thoughts on the above + the change?

from client.

robertknight avatar robertknight commented on May 24, 2024

I don't know what the thoughts are on modifying annotatorjs, but I am going to do it to at least have something to point at in code for what my suggestions are.

Upstream the source code for BrowserRange has been removed and the functionality for converting between XPaths and Range objects has been moved into xpath-range, which is conceptually similar to the dom-anchor-* libraries that we use for anchoring quote and position selectors. Ultimately what I'd like to do is replace this code from Annotator with xpath-range which should be easier to hack on.

However, to start with it should be possible to create a test case in anchoring/html/html-test.js which reproduces the problem here - since the BrowserRange#normalize method is also called when resolving XPath RangeSelector selectors to Range objects. Once that is done, we can modify the Annotator code to make the test pass, then later swap out the implementation.

from client.

robertknight avatar robertknight commented on May 24, 2024

Stepping back one level - it isn't completely clear why the range needs to be normalized using the code in Annotator in the first place. My guess is that it is because getNodeTextLayer() requires its argument to be an Element rather than just a Node. If we can figure out what invariants the rest of the describe() function in pdf.coffee requires of the Range, another way we could approach this might be able to replace the call to BrowserRange#normalize with something simpler.

For reference, here is the original CoffeeScript code for BrowserRange with comments which explains a bit more amount what normalization is for - but bear in mind that this code was written several years ago to cope with differences in how browsers represented selections at the time.

from client.

judell avatar judell commented on May 24, 2024

@klemay, @jeremydean here's another thing to be aware of if you weren't already. PDF has its own notion of annotations. I don't think they are very common, but when they exist, as in this example, they interfere with our mechanism.

from client.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.