
Comments (3)

tombh commented on May 23, 2024

Replying here to @tobimensch's query from #42.

Since you're saying you need special treatment for each character, and this is of course slowing things down, I've one very simple idea to speed it up:
Assume that if the first character in a word is visible, then the rest is in 99.9999% of cases visible as well. For example, if the e in example is visible, there's no need to test for xample. I don't know a single webpage where words get split up into multiple lines or anything like that. If an average word has five characters then this might speed part of your algorithm up by a factor of five. Not sure how much that helps or if it helps at all.

You're definitely thinking in the right direction here! But there are 2 other questions we need to answer first. What is the true error rate for making assumptions on whole words? And where exactly is the bottleneck?

To answer the first question, consider the case of overlays, say a dropdown menu, where the menu might obscure text at the start of an article. Here you can imagine that quite a few lines of text could be truncated mid-word by the body of the dropdown. Furthermore, it is arbitrary which text gets overwritten: the current rule is that whichever text nodes appear later in the DOM end up clobbering text parsed earlier in the DOM, and there is no guarantee that menus will always appear later in the DOM. And of course menus are just one example of text overlaying other text; interestingly, hidden text is probably the most common case of text fighting for the same position in the TTY grid. But why not just use getComputedStyle() to get an element's CSS visibility and z-index? That leads onto the next question...
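For concreteness, here is a rough sketch of the kind of per-element query being proposed. This is purely illustrative, not browsh's actual code: the function names and the tie-break rule are hypothetical, and it only covers the obvious style properties.

```javascript
// Hypothetical sketch: decide whether an element's text should be drawn, and
// which of two overlapping elements should win a TTY grid cell.
// Function names and the tie-break rule are illustrative, not browsh's code.
function isTextRenderable(element) {
  const style = window.getComputedStyle(element);
  return (
    style.display !== 'none' &&
    style.visibility !== 'hidden' &&
    parseFloat(style.opacity) > 0
  );
}

function pickWinner(a, b) {
  // Prefer the higher z-index; "auto" parses to NaN, so fall back to 0.
  const zA = parseInt(window.getComputedStyle(a).zIndex, 10) || 0;
  const zB = parseInt(window.getComputedStyle(b).zIndex, 10) || 0;
  if (zA !== zB) return zA > zB ? a : b;
  // Otherwise keep the current rule: whichever node comes later in the DOM wins.
  return a.compareDocumentPosition(b) & Node.DOCUMENT_POSITION_FOLLOWING ? b : a;
}
```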

getComputedStyle() is both expensive and causes DOM reflow. Reflow in particular might be solved by using the Fast DOM library mentioned in the title of this issue. However, the fact is that there is currently no hard data indicating where the algorithm's actual bottlenecks are; for lack of profiling data we can only guess. BTW, I've tried using Firefox's built-in devtools profiler, but from what I can tell it doesn't seem to hook into webextension code? But that might just be my lack of experience with JS profiling. Anyway, as you quite rightly point out, the most obvious bottleneck is the innermost loop of the algorithm: the character-specific code. Essentially the algorithm is O(n), so, as you say, making assumptions based on whole words would give us O(n/5)! But it's not actually clear that that is where the bottleneck is. I think there are 2 other candidates: firstly the pre-rendering screenshot couplet, and secondly DOM reflows.
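In the absence of a working profiler, one stopgap is crude manual instrumentation. A minimal sketch, where the phase names and the functions being wrapped (takeScreenshotCouplet, parseAllCharacters) are hypothetical stand-ins rather than real browsh code:

```javascript
// Manual timing as a stopgap while the devtools profiler can't reach the
// webextension code. Phase names and wrapped functions are hypothetical.
const timings = {};

function timed(label, fn) {
  const start = performance.now();
  const result = fn();
  timings[label] = (timings[label] || 0) + (performance.now() - start);
  return result;
}

// Example usage inside one text-parsing frame:
//   const shots = timed('screenshot couplet', () => takeScreenshotCouplet());
//   timed('character loop', () => parseAllCharacters(shots));
//   console.table(timings); // accumulated per-phase totals
```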

The screenshot couplet occurs before every text parsing frame. As a small side note, text rendering doesn't actually happen for every frame; it should only happen on 3 occasions: the small and fast TTY-sized text frame at page load, the 6-times-larger text frame at page load, and a small and fast TTY-sized frame after every DOM-mutation event (e.g. when you click on a dropdown menu). The 2 screenshots consist of the browser's vision of the page with and without text. With this information it is possible to detect the visibility of characters: if the average colour at the coordinates of a given character changes between screenshots, then the character is most likely visible. Although taking 2 screenshots sounds excessive, it is actually usually heavily optimized by the browser (think of screen recording extensions), possibly even using the GPU. Not only that, but it dramatically simplifies the character visibility detection code (apart from a couple of edge cases, see #23 for one example) and gives us the text colour for free. But is this really worth it? What's the relationship between page size/character count and rendering times? We really need solid profiling data.
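To make the per-character test concrete, here is a minimal sketch of the average-colour comparison, assuming the two screenshots are already available as ImageData and that each character has a known pixel rectangle; the threshold and the plain RGB averaging are illustrative guesses, not browsh's real values.

```javascript
// Sketch: does this character's cell change between the with-text and
// without-text screenshots? Assumes ImageData frames and a per-character
// pixel rectangle; the threshold and averaging are illustrative only.
function averageColour(frame, rect) {
  let r = 0, g = 0, b = 0, count = 0;
  for (let y = rect.top; y < rect.top + rect.height; y++) {
    for (let x = rect.left; x < rect.left + rect.width; x++) {
      const i = (y * frame.width + x) * 4; // RGBA, 4 bytes per pixel
      r += frame.data[i];
      g += frame.data[i + 1];
      b += frame.data[i + 2];
      count++;
    }
  }
  return [r / count, g / count, b / count];
}

function isCharVisible(withText, withoutText, rect, threshold = 5) {
  const a = averageColour(withText, rect);
  const b = averageColour(withoutText, rect);
  const delta =
    Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]) + Math.abs(a[2] - b[2]);
  // If the cell's average colour moved, a glyph most likely rendered there;
  // the with-text average also approximates the text colour "for free".
  return delta > threshold;
}
```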

And so finally onto DOM reflow performance. I actually feel like this is the key to all of this. Hypothetically, if DOM reflows were free, this would be the perfect solution, because we would finally have perfectly rendered text, without edge cases: you could just query the visibility, z-index, colour, etc. of every single character. Which is why I think it'd be good to experiment with Fast DOM, as it claims that significant efficiencies can be gained if DOM reads and writes are properly batched.
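Roughly, that batching might look like the sketch below. The fastdom.measure()/fastdom.mutate() calls are the library's actual API, but the surrounding element collection and callback are hypothetical placeholders, not browsh's code.

```javascript
// Sketch of batching all the layout reads into fastdom's measure phase so
// they don't interleave with writes (the interleaving is what triggers
// repeated reflows). textElements and onStylesReady are placeholders.
import fastdom from 'fastdom';

function collectStyles(textElements, onStylesReady) {
  fastdom.measure(() => {
    // All the reads happen together in one batch...
    const styles = textElements.map((el) => {
      const cs = window.getComputedStyle(el);
      return { el, visibility: cs.visibility, zIndex: cs.zIndex, color: cs.color };
    });
    // ...and any resulting writes are deferred to the mutate phase.
    fastdom.mutate(() => onStylesReady(styles));
  });
}
```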


tobimensch commented on May 23, 2024

Thanks for the explanation.

Maybe my "detect part of a word and assume that the whole word is there" approach can be improved like this:
Sticking with the word "example" for my examples.

Let's say we detect that "e" is there; as you explained, this isn't always enough to conclude that the whole word is there.

But if instead we skip every second character and detect that a, p and the final e are also visible, wouldn't it be very reasonable (close to 100%) to assume that x, m and l are visible too?

Now we know that there is space for exactly one character between e and a, between a and p and between p and e. There's no logical reason for those characters to be replaced with a dot or something like that.

This wouldn't result in the same theoretical speedup as my first suggestion, but it might still give a factor of 1.5 to 2 improvement, depending on the rendered page.

Of course, if any of the checked characters isn't detected, we must fall back to testing the skipped characters as well.
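A minimal sketch of that heuristic, with checkCharVisible() standing in for whatever per-character test ends up being used (screenshot diff, computed style, ...); none of these names are real browsh functions:

```javascript
// Sketch of "check every second character, fall back on a miss".
// checkCharVisible(word, index) is a hypothetical stand-in for the real
// per-character visibility test.
function wordVisibility(word, checkCharVisible) {
  const visible = new Array(word.length).fill(false);

  // First pass: sample characters 0, 2, 4, ...
  let allSampledVisible = true;
  for (let i = 0; i < word.length; i += 2) {
    visible[i] = checkCharVisible(word, i);
    if (!visible[i]) allSampledVisible = false;
  }

  if (allSampledVisible) {
    // Every sampled character was visible, so assume the skipped ones are
    // too: each sits between two visible neighbours with no room for
    // anything else to take its place.
    for (let i = 1; i < word.length; i += 2) visible[i] = true;
    return visible;
  }

  // Fallback: a sampled character was missing, so test the skipped
  // characters individually after all.
  for (let i = 1; i < word.length; i += 2) {
    visible[i] = checkCharVisible(word, i);
  }
  return visible;
}
```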


tombh commented on May 23, 2024

That's certainly a possible approach, but there are so many other areas to try first. I've still got hope that there are some dramatic improvements to be found from properly managing DOM reflows.

