
Comments (19)

agitter commented on July 18, 2024

Here's the docker run command you can use to test Athena outside of the Manubot build script:

  docker run \
    --rm \
    --shm-size=1g \
    --volume="$(pwd)/output:/converted/" \
    --security-opt=seccomp:unconfined \
    arachnysdocker/athenapdf:2.16.0 \
    athenapdf \
    --delay=5000 \
    --timeout=10000 \
    --pagesize=A4 \
    manuscript-athena.html manuscript.pdf

Both the delay and timeout are set fairly high there. This assumes you have the HTML in output/manuscript-athena.html relative to where you run the Docker command.
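A small preflight check can catch a missing input file before Docker starts. This is only a sketch: the function name `check_athena_input` is made up for illustration, and the default path is the one assumed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm the HTML input exists and is non-empty
# before invoking the docker run command.
check_athena_input() {
  local html="${1:-output/manuscript-athena.html}"
  if [ ! -s "$html" ]; then
    echo "Missing or empty input: $html" >&2
    return 1
  fi
  echo "Found input: $html"
}
```

Calling `check_athena_input && docker run ...` would then skip the conversion when the file is absent.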

Starting with the HTML file I linked above may be okay. It had no plugins, only the theme.

from covid19-review.

vincerubinetti commented on July 18, 2024

This is something I missed in #1036.

The issue (I'm almost positive) is these lines in back-matter.md:

<style>
  *:not(h2)[data-collapsed="true"] {
    display: none;
  }
</style>

It should instead be:

<style>
  @media only screen {
    *:not(h2)[data-collapsed="true"] {
      display: none;
    }
  }
</style>

To prevent the section from hiding when printing. Luckily I did it correctly in rootstock so it is not an issue there. I guess it was missed here because this was a quick "hotfix".

agitter commented on July 18, 2024

I tested that change in 045149c, but now there is no text at all in the PDF.

Here's a screenshot from the PDF artifact:
image

vincerubinetti commented on July 18, 2024

To be honest, I have no explanation for that. That seems like a rendering glitch with Athena. Can you try taking out that <style> tag altogether?

vincerubinetti commented on July 18, 2024

That screenshot you posted above isn't really something that could happen via CSS as far as I'm aware. What that really looks like is when a browser is slowly trying to load and render content. In fact I can see something very similar to that in my Chrome browser when loading this giant manuscript. I also notice that when I try to print the HTML manuscript right from the browser, the preview takes a very long time to show up.

Is there some kind of Athena timeout setting, like how long it tries to render something before giving up?

agitter commented on July 18, 2024

I'm still puzzled about what's going on. It makes sense that this is related to rendering and Athena. However, in #1132 I doubled the rendering delay and then doubled it again, and the issue still persists. It's there with and without the <style> tag in the back matter file.

rando2 commented on July 18, 2024

I've been following this though it's unlikely I'll have anything useful to offer -- however, in case my "beginner's mind" could be helpful, is there any chance this has something to do with the default collapsing for references we implemented in the HTML output?

{% if format is not defined or format is defined and format != "tex"-%}

### References <!-- $data-collapsed="true" --> {.page_break_before}

vincerubinetti commented on July 18, 2024

@rando2 I could see that affecting the references section somehow, but I can't think of how it would affect the rest of the document.

And we're sure that this problem started happening with #1036, and nothing else changed? Any settings in AppVeyor or whatever else is involved in producing the PDF?

The only thing I can think to do is start to revert some of the changes in #1036. Maybe start with disabling the scite plugin completely (though really none of that even should be running when printing anyway)?

Another thing we could try is removing some of the sections of the document to see if making it smaller gives Athena less trouble. Was the Athena version changed at all?

I'd offer to help more with this, but as I recall this repo takes an hour for me to clone and just as long to build. But if we can't find the problem I'll have to figure out a way to be able to troubleshoot it locally.

vincerubinetti commented on July 18, 2024

Note, I previously had a debug message in here about whether that window.matchMedia("print") was actually working in Athena or elsewhere, but I removed it. My best hunch is that something about this changed, or I was wrong when I tried to verify it with Athena. It doesn't seem to work in Chrome as expected right now: https://jsfiddle.net/zbxLc6u5/show (try print preview).

I'd definitely say try disabling all the plugins for printing first. In fact, I can't recall... does Manubot have a way to have different configs for different export formats, HTML/PDF/DOCX? If so, really all plugins (except maybe mathjax?) should be disabled for everything but HTML anyway, as they're for interactive things.

agitter commented on July 18, 2024

I'm going to move my attempts to fix this out of #1132 so we can merge that. I need to test some more builds locally. I'm not confident that this started with #1036. Testing with a much smaller manuscript is a good idea for a next step.

does Manubot have a way to have different configs for different export formats

Yes, we're using a different config for athenapdf. I can try disabling more plugins. scite is already disabled. We use a versioned athenapdf image from DockerHub, so that shouldn't have changed.

agitter commented on July 18, 2024

I'm trying some trial and error debugging. I started from be61b3f. My changes are in https://github.com/agitter/covid19-review/tree/pdf-references. My local build command is $ BUILD_HTML=false BUILD_INDIVIDUAL=false build/build.sh.

I removed all sections except for inequality (a small one). The local athenapdf build did not have any references in the PDF.

I added the <style> block above to the back matter. None of the text showed up. The intermediate output file output/manuscript-athena.html does have text. The references are there, initially collapsed.

I increased the athenapdf delay to 5000 and timeout to 10000. The text still did not appear. I increased delay to 20000 and had the same result. The delay setting does appear to take effect, because the output log now includes PDF Conversion: 23531.283ms.
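For anyone repeating this kind of delay experiment, a dry-run sweep along these lines may help. It is a sketch built on the docker run command from earlier in the thread: the function name `print_athena_sweep`, the per-delay output file names, and the choice of timeout as twice the delay are all assumptions, not settings taken from the build script.

```shell
#!/usr/bin/env bash
# Print (do not run) one athenapdf command per delay value given as an argument.
# Assumption: timeout is set to twice the delay; adjust as needed.
print_athena_sweep() {
  local delay timeout
  for delay in "$@"; do
    timeout=$((delay * 2))
    echo "docker run --rm --shm-size=1g" \
      "--volume=\"\$(pwd)/output:/converted/\"" \
      "--security-opt=seccomp:unconfined" \
      "arachnysdocker/athenapdf:2.16.0 athenapdf" \
      "--delay=${delay} --timeout=${timeout} --pagesize=A4" \
      "manuscript-athena.html manuscript-delay-${delay}.pdf"
  done
}
```

Running `print_athena_sweep 5000 20000` prints two commands for inspection; piping the output to `sh` would execute them and leave one PDF per delay in the output directory.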

Opening output/manuscript-athena.html and printing to PDF with Chrome works fine.

I reset delay to 5000 and removed all themes and plugins from athenapdf.yaml. The references no longer appear in the HTML. However, there is text in the PDF!

I restored these lines of athenapdf.yaml and the text disappeared again:

  • build/themes/default.html
  • build/plugins/core.html # needed for all first-party plugins
  • build/plugins/accordion.html
  • build/plugins/attributes.html
  • build/plugins/mathjax.html

I removed accordion.html. The HTML version does not have references. The PDF does not have text.

I removed core.html and accordion.html. Same result.

I removed mathjax.html (leaving only themes/default.html). Same result.

Here are the current build outputs as of agitter@5e560cf:

Then I removed the entire <style> block in the back matter (agitter@619361a). Now, everything appears to be working. My preliminary conclusions are that

  1. themes/default.html is required for Athena builds for the CSS
  2. Something in themes/default.html interacts poorly with that style block

vincerubinetti commented on July 18, 2024

I'm going to bite the bullet and try to get the manuscript working locally so I can test this.

Or perhaps I can just find a way to use Athena directly with your manuscript-athena.html.txt, which seems to be sort of a minimum working example of the problem? Do you think that would still show the problem? Because being able to just edit the .html file and print immediately, without having to run the whole Manubot process, would speed up troubleshooting.

agitter commented on July 18, 2024

This was accidentally closed automatically by my commit message

Test PDF references fix
https://github.com/greenelab/covid19-review/issues/1133#issuecomment-1100715551

vincerubinetti commented on July 18, 2024

I can provide the HTML files from local builds using any combinations of
content and plugins you'd like to test.

Thank you for the offer but there would be too many to test. Also, the permutations I'm testing aren't linear/deterministic; they're informed by the results as I go along.

I was finally able to get Athena working locally on my M1 Mac, and reproduce the problem. I'm currently testing lots of different changes to the CSS, trying to nail down exactly which combination of selectors and media queries makes it malfunction, but it seems clear it's a glitch with Athena. All of these things I'm trying are perfectly valid CSS.

vincerubinetti commented on July 18, 2024

Just to clarify though... you tried removing the <style> block and it still didn't work for you? Because for me (I had to run Athena v2.13.0 to get it to run on my Mac), removing the style block fixed the issue.

vincerubinetti commented on July 18, 2024

Here are my test results:

Test results
/* fails */
@media only screen {
  *:not(h2)[data-collapsed="true"] {
    display: none;
  }
}

/* works */
*:not(h2)[data-collapsed="true"] {
  display: none;
}

/* fails */
@media only screen {
  *:not(h2) {
    display: none;
  }
}

/* fails */
@media only screen {
  :not(h2) {
    display: none;
  }
}

/* works */
@media only screen {
  h2 {
    display: none;
  }
}

/* only certain (bold) text shows, even within paragraphs?????? */
@media only screen {
  p:not(h2) {
    display: none;
  }
}

/* works */
@media only screen {
  * {
    /* display: none; */
    color: red;
  }
}

/* fails */
@media only screen {
  [data-collapsed="true"] {
    display: none;
  }
}

/* works */
[data-collapsed="true"] {
  display: none;
}

/* fails */
@media only screen {
  *[data-collapsed="true"] {
    display: none;
  }
}

/* works */
*[data-collapsed="true"] {
  display: none;
}

/* works */
@media only screen {
  h2[data-collapsed="true"] {
    display: none;
  }
}

/* works */
h2[data-collapsed="true"] {
  display: none;
}

Something about combining @media only screen with a complex selector within it is causing an issue.
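A test matrix like the one above can be produced by injecting each candidate rule into a copy of the intermediate HTML and converting every copy separately. The following is a sketch only: `inject_style` is a hypothetical helper, not part of Manubot or Athena, and it assumes the source file has a closing `</body>` tag on its own or a shared line.

```shell
#!/usr/bin/env bash
# Copy an HTML file, inserting one <style> block just before the first </body>.
# Usage: inject_style <source.html> <output.html> <css-rule>
inject_style() {
  local src="$1" out="$2" css="$3"
  # Pass the CSS through the environment so newlines and quotes survive intact.
  CSS_RULE="$css" awk '
    /<\/body>/ && !done {
      print "<style>"
      print ENVIRON["CSS_RULE"]
      print "</style>"
      done = 1
    }
    { print }
  ' "$src" > "$out"
}
```

Each generated file can then be passed to the athenapdf command in place of manuscript-athena.html, so that a single rule changes between runs.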

Athena is using Electron 3.0.5. The current version of Electron is 18. Electron 3.0.5 is using Chromium version 66.0.3359.181. The current version of Chrome is ~100.

I happen to have a Browser Stack subscription, and I tested printing the test manuscript in Chrome 66. It appears normal, but then going to print preview, lo and behold:

image

Given that this looks exactly like our issue, I'm assuming Athena simply loads up Chrome (via Electron), opens the document, and has Chrome print it to a PDF.

Regardless though, this seems to be a Chrome bug. And the bug is likely that Chrome 66 doesn't have a long enough timeout, as again, this is exactly what it looks like when the browser is in the process of trying to paint/render the whole page.

Considering all of the above, I can say with confidence that it's a Chrome/Athena bug and not anything I'm doing with the CSS/JS. I'd make a strong recommendation to ditch Athena for anything else viable. It is abandonware.

agitter commented on July 18, 2024

Just to clarify though... you tried removing the <style> block and it still didn't work for you?

I didn't remove the entire style block. I tested with the style block currently on the master branch and your corrected version from #1133 (comment) above.

Great debugging. That is a convincing explanation of what's going on. I agree with your decision to ditch Athena. In the immediate short term, what should we do for this particular manuscript? Some ideas (I haven't thought through carefully):

  • Revert the collapsed references and accept the longer HTML load times
  • Try weasyprint PDF builds
  • Tolerate the missing references from the PDF until we can ditch Athena

vincerubinetti commented on July 18, 2024

It looks like this will work too:

<style>
  @media only screen {
    #references[data-collapsed="true"] {
      display: none;
    }
  }
</style>

However, I believe it's probably better to just remove the style block (and its associated comments). Perhaps you'd want to remove the pre-collapsing with <!-- $data-collapsed="true" --> as well?

The benefit of that block was only a tenuous theoretical guess. Even if we observed some speed up from it in one browser, there's no guarantee that it will behave the same in another browser, or even the same browser down the road, because it depends on the implementation details and optimizations of the browser.

Also, even if it did work as expected, I'm not sure that the time saved would be much compared to the loading of the Scite badges, which we've already optimized with IntersectionObserver (only loading them when they come into view).
