dginev / ar5iv
A web service offering HTML5 articles from arXiv.org as converted with latexml
Home Page: https://ar5iv.org
License: MIT License
Exact location of issue
https://ar5iv.org/html/2105.07028#bib.bib2
Also the authors list at the top, Hervé Ménager
should be Hervé Ménager
Desktop (please complete the following information)
Thanks for the great service!
Exact location of issue
Table 1
Example:
Column 1, row 1 of Table 1 shows text saying "\rowcolor[gray]0.8" - that text is a LaTeX command that should not be visible.
Problem details
The "\rowcolor[gray]0.8" LaTeX command appears in the visible text of a table. This LaTeX command should not be shown as text in the HTML page.
Desktop
Fatal error in the conversion of Lurie's HTT. Apparently, \ar from xypic isn't supported.
Exact location of issue
https://ar5iv.org/html/1608.05377#S0.F1
Problem details
Figure 1, a vector graphic embedded in the original PDF, has been converted incorrectly to an SVG image in the HTML5 version of the paper.
Figure 1 as it appears in the original PDF:
Figure 1 as rendered on ar5iv.org:
Suggested resolution
Although vector images are generally preferable, in my opinion the image converter should be able to tell if it's encountering elements it can't handle and fall back to a raster conversion. Lightweight graphics are nice, but silently corrupting images is very bad.
Desktop
Smartphone:
Additional context
This image was originally created in Keynote on MacOS. (I am the author.)
Conversion is overall very good:
https://ar5iv.org/html/2111.00171
https://arxiv.org/pdf/2111.00171.pdf
Several issues, roughly in order of importance:
1 - Citation style has been switched from name+year to numbered. I prefer numbered, but this LaTeX doc was written to directly reference articles (\cite{ }) and indirectly reference articles (\citep{ }). Example from PDF:
Maintaining compatibility with both \cite{ } and \citep{ } is probably important for many docs.
2 - Equation numbering messed up. Example:
No idea what happened here, since it seems correct everywhere else.
3 - hyperref package has not been fully included. All links, refs and labels seem to work, but no longer have the red colour applied to them (defined via color package). This seems weird since for example section headers are red in the ar5iv doc:
I think coloured links improve the readability of the doc, as in the PDF version:
Of course now I am just realizing I used the incorrect red for one of colour package defs . . .
4 - Custom symbol for tensor is floating away in ar5iv:
while it's in the correct place in PDF:
Probably my fault for wanting tensors to look different from vectors.
Exact location of issue
Reported on twitter
Still has issues with the fancyvrb package as code examples are not rendered properly,
e.g., https://ar5iv.org/html/1405.2330 (Fig. 1, 4, 5, 6).
Problem details
The HTML is not accurately recreating the fancyvrb listing structure.
(Optional) Expected behavior
Should roughly match the PDF
Desktop (please complete the following information)
Screenshots
conversion fails due to missing macros
for example
Warning:missing_file:jmlr Can't find binding for class jmlr (using OmniBus)
at main.tex; line 4 col 0 - line 4 col 1
Anticipate undefined macros or environments
perhaps adding style files from https://github.com/JmlrOrg/jmlr-style-file
could help
Exact location of issue
For example, in the first section of https://ar5iv.org/html/2012.05155.
Problem details
I expected to see something when hovering (or clicking) the reference [EFLS95]
(for example), but nothing happens.
Desktop
Exact location of issue
Todo note, such as: https://ar5iv.org/html/2104.13225v3#todo1
Problem details
These todo notes should not be displayed, as the source disables them via
\usepackage[disable]{todonotes}
Exact location of issue
Unsupported latex commands in header (title and author information):
Image display not working correctly:
Problem details
In the header there are 4 unsupported LaTeX commands (marked red) in the author information section, and \capitalisewords in the title.
Figure 1 is not displayed correctly. The PDF image is not truncated as in the PDF version (using the Tex source files).
(Optional) Expected behavior
Either hide the unknown formatting commands since visually it would still look ok, or support the underlying document style.
Allow the auto-cropping of \includegraphics of PDFs? (Not sure how it works exactly, or whether this can be supported for HTML.)
Desktop
Smartphone
Screenshots
Exact location of issue
https://ar5iv.org/html/1608.05377#S0.F1
Problem details
The latex \rangle math symbol isn't rendering on mobile (at least for Chrome and Safari, iPhone 12, iOS 15.2.1). For example, it should appear where the red circles are:
This occurs in many places in the paper.
Expected behavior
This math is displayed correctly on desktop:
Note that this problem doesn't appear on either the mobile or desktop versions of the arxiv-vanity rendering of this paper. Interestingly, arxiv-vanity has some errors of its own for this paper which do not afflict the ar5iv version. Note, however, that as arxiv-vanity only has v2 of the paper, rather than v1, it's possible (though I think unlikely) the discrepancies are due to differences in those versions.
Note also that "requesting desktop version" of the page does not fix this problem on mobile even though the webpage renders correctly on an actual desktop.
Smartphone (please complete the following information):
Exact location of issue
Section 1.1 here: https://ar5iv.org/html/2012.05155#S1.SS1.2.1.1
Problem details
Figure reference numbers are incorrect.
For example, it reads Fig. 0(c) instead of Fig. 1(c).
Desktop
We have received requests to take down certain articles when authors are dissatisfied with the output fidelity, for a number of separate specific reasons.
I think the ar5iv web service should automate this. An article_exclusions.toml file could hold a list of IDs, turned into a HashSet when the rocket service is deployed, and checked against when articles are served. An article_excluded.tera.html template can then explain that the specific page is missing on author request.
It has also been suggested to add a button where an author could request an article to be added to exclusions. The main question there is how to authenticate the person in question is indeed the author -- to avoid a malicious bot just requesting all articles be taken down. For now I won't work on a button, but will respect private communication that requests takedowns, as I have done during the first week of the launch.
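The exclusion check proposed above could look roughly like the following. This is only an illustrative sketch in JavaScript (the service itself runs on Rocket, so a real implementation would be in Rust); the plain-list input format, the function names `loadExclusions`, `normalizeId` and `isExcluded`, and the sample IDs are all assumptions, not part of ar5iv.

```javascript
// Hypothetical sketch: load excluded arXiv IDs once at startup,
// then consult the set on every request.

// Assumed input format: one arXiv ID per line, `#` starts a comment.
// (The issue proposes a TOML file; a plain list keeps this sketch dependency-free.)
function loadExclusions(text) {
  const ids = text
    .split("\n")
    .map((line) => line.replace(/#.*$/, "").trim())
    .filter((line) => line.length > 0);
  return new Set(ids);
}

// Strip a trailing version suffix ("v2") so that every version of an article matches.
function normalizeId(id) {
  return id.replace(/v\d+$/, "");
}

function isExcluded(exclusions, requestedId) {
  return exclusions.has(normalizeId(requestedId));
}

// Made-up IDs, for illustration only.
const exclusions = loadExclusions("0000.00001  # author request\n0000.00002\n");
console.log(isExcluded(exclusions, "0000.00001v3")); // true
console.log(isExcluded(exclusions, "0000.00003"));   // false
```

Building the set once keeps the per-request cost to a single lookup; a request that hits the set would then render the "excluded on author request" template instead of the article.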
Link:
https://ar5iv.org/html/2109.07151
This article https://ar5iv.org/html/2003.04321 does not render beyond the abstract and metadata due to "fatal errors". The first package that seems to cause issues is overpic. From the log:
Error:undefined:{overpic} The environment {overpic} is not defined.
As requested on Twitter, here's one example of several where translation has failed, I guess because this article was prepared using lhs2tex - a preprocessor for generating LaTeX with nice pretty-printing of Haskell programs (and some related languages, such as Agda and Idris).
Exact location of issue
Equation 3 of https://ar5iv.org/html/1505.00444#S2.E3
Problem details
The original LaTeX source code (generated by LyX) \left\Vert ... \right\Vert has been replaced by \left| ... \right| in the ar5iv version, which then displays only the "filler" ... between the | ... |.
Example:
None.
Expected behavior
The original LaTeX source code (generated by LyX) \left\Vert ... \right\Vert should generate || ... ||, which is a double vertical line at the left and right of the bracketed expression
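For reference, the difference between the two delimiters can be written out as below. The symbol x is only a placeholder for the elided "..." in the report, and \text assumes amsmath is loaded.

```latex
% x is a placeholder; it stands in for the elided "..." in the report.
\[
  \left\Vert x \right\Vert
  \quad \text{(double bars, expected)}
  \qquad \text{vs.} \qquad
  \left| x \right|
  \quad \text{(single bars, as emitted)}
\]
```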
Desktop (please complete the following information)
Exact location of issue
Throughout text of https://ar5iv.org/html/1810.13321
Problem details
Citations in parentheses (\citep) are malformed: they contain double commas.
Examples: see screenshots
Desktop
(Optional) Screenshots
(Optional) Additional context
This is a fantastic project, keep up the great work! 👍
Exact location of issue
Sadly, this is in the abstract so it's tough to pinpoint it with an anchor.
Example:
The authors section, for instance with ID id1.1.id1
Problem details
Well, it seems the authors section has been parsed wrongly, see the image below:
As requested on Twitter, an example of an article that uses minted.sty for syntax highlighting, whose translation breaks.
Exact location of issue
https://ar5iv.org/html/2101.04108 (Sec 3.4)
Sorry for not linking the exact location. I can't find the anchored links on this and other subsections, and I think this may be another problem?
Problem details
(Optional) Expected behavior
The images should render correctly as well as the equation.
Desktop (please complete the following information)
Not related to the issue but general:
Thank you for doing this.
Exact location of issue
Bibliography of ar5iv:2102.07081, right before Appendix A.
Problem details
The bibliography is missing (instead there is a line that says \printbibliography).
Desktop
Exact location of issue
https://ar5iv.org/html/1811.01740#p1.2
Problem details
Biblatex is not handled and thus the \bibliography command, which with biblatex just says which bibliography files to use, is interpreted as in ordinary latex, and dumps the contents of the bibliography on the spot, containing commands that do not exist without biblatex and thus are dumped verbatim.
(Optional) Expected behavior
If using BibLaTeX, the \bibliography command just says which file to use, and the \printbibliography command prints the bibliography.
Desktop (please complete the following information)
Ubuntu Chromium but that's irrelevant
Exact location of issue
the superset marks seem to have been parsed as emails. Also, the "topic" type seems not to have been parsed (and left a stray mark on the page)
Example:
The supersets seem to be parsed as parts of the email https://ar5iv.org/html/1705.08023
Problem details
As we discussed in a Twitter thread, support for the JAIR LaTeX style (i.e. jair.sty) is requested for proper rendering of JAIR papers posted on arXiv.
Some noticeable bugs:
Expected behavior
Display of cross-refs without any ?; full visibility of footnote URLs in the default display mode.
Desktop
Exact location of issue
Everywhere. The wrong latex file is being shown.
Problem details
arXiv handles supplementary materials by compiling them and concatenating them at the end of the main PDF (as determined by toplevelfile in the 00README.XXX file).
Currently, ar5iv does not seem to care about a 00README.XXX file, and (at least for the following examples) compiles only the supplementary materials and presents them as the main paper.
Examples:
Expected behavior
At a minimum, if there is a 00README.XXX file that has a toplevelfile declared, ar5iv should try compiling that and ignore everything else.
At best it should also compile all other LaTeX files and other text files that are not ignored, either by 00README.XXX or by having %auto-ignore as the first line, and finally present them at the end of the main file after the references.
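The selection logic described above might be sketched as follows. This is a rough illustration, not ar5iv's or arXiv's actual behaviour: it assumes each 00README.XXX line has the form "<filename> <directive>" (for example "paper.tex toplevelfile" or "old_draft.tex ignore"), it only considers .tex files, and the function names and sample submission are hypothetical.

```javascript
// Hypothetical sketch: decide which submitted files to convert, and in what order.
// `files` maps file names to their text content.

// Assumed 00README.XXX line format: "<filename> <directive>".
function parseReadme(readmeText) {
  const directives = new Map();
  for (const line of readmeText.split("\n")) {
    const parts = line.trim().split(/\s+/);
    if (parts.length >= 2) {
      directives.set(parts[0], parts[1]);
    }
  }
  return directives;
}

// The issue describes %auto-ignore as the first line of a file.
function hasAutoIgnore(content) {
  return content.split("\n")[0].trim().startsWith("%auto-ignore");
}

function selectSources(files) {
  const directives = parseReadme(files["00README.XXX"] || "");
  const texFiles = Object.keys(files)
    .filter((name) => name.endsWith(".tex"))
    .filter((name) => directives.get(name) !== "ignore")
    .filter((name) => !hasAutoIgnore(files[name]))
    .sort();
  const main = texFiles.find((name) => directives.get(name) === "toplevelfile");
  if (main === undefined) {
    return texFiles; // no declared top-level file: no ordering preference
  }
  // Main paper first, every other non-ignored file after it.
  return [main, ...texFiles.filter((name) => name !== main)];
}

// Made-up submission, for illustration only.
const submission = {
  "00README.XXX": "paper.tex toplevelfile\nold_draft.tex ignore\n",
  "appendix.tex": "\\documentclass{article}\n",
  "paper.tex": "\\documentclass{article}\n",
  "old_draft.tex": "\\documentclass{article}\n",
  "macros.tex": "%auto-ignore\n\\newcommand{\\R}{\\mathbb{R}}\n",
};
console.log(selectSources(submission)); // [ 'paper.tex', 'appendix.tex' ]
```

Returning only the first element would give the "at a minimum" behaviour; the full list corresponds to the "at best" case, with supplementary files appended after the main paper.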
Exact location of issue
Section 6 experiments
https://ar5iv.org/html/2105.01344#S6
Problem details
The display abruptly stops in section 6, whereas the original paper is longer
https://arxiv.org/pdf/2105.01344.pdf
Desktop (please complete the following information)
Ubuntu Chromium
Hi,
Thank you for developing the website. I really like it.
I wonder whether we could also have keyboard shortcuts for some functionality, especially since the toolbar is at the bottom of the page.
For example, having shortcuts for Feeling Lucky? and Report an Issue could speed up the process of testing and reporting issues.
Below is a JS snippet that I've written to add a shortcut for the Feeling Lucky? feature:
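// Shortcut `shift+r` for `ar5iv.org`'s `Feeling lucky?` functionality
// addEventListener leaves any existing keydown handlers on the page intact.
document.addEventListener("keydown", function (ev) {
  // `ev.key` replaces the deprecated `ev.keyCode == 82` check for the R key.
  if (ev.shiftKey && ev.key.toLowerCase() === "r") {
    window.location = "/feeling_lucky";
  }
});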
Exact location of issue
https://arxiv.org/abs/0907.1520 vs https://arxiv.org/abs/0907.1520v3
Problem details
The wrong article is loaded. https://arxiv.org/abs/0907.1520 has three versions: v1, v2, v3 with urls https://arxiv.org/abs/0907.1520v1, https://arxiv.org/abs/0907.1520v2, https://arxiv.org/abs/0907.1520v3. The root url https://arxiv.org/abs/0907.1520 should lead to the most current version, just like arxiv usually behaves when clicking the button to download a pdf.
Even manually appending v3 to https://arxiv.org/abs/0907.1520 does not help.
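One way to make the requested behaviour explicit is sketched below: a versionless ID resolves to the latest version that is available, and an explicit version that is not available falls back with a flag the page could use to warn the reader. The function name `resolveVersion`, the `fellBack` flag, and the lists of available versions are assumptions for illustration, not ar5iv's actual API.

```javascript
// Hypothetical sketch: pick which stored version to serve for a requested ID.
// `availableVersions` is assumed to be a non-empty array of version numbers.
function resolveVersion(requestedId, availableVersions) {
  // Split "0907.1520v3" into base ID "0907.1520" and version 3 (if present).
  const match = /^(.*?)(?:v(\d+))?$/.exec(requestedId);
  const baseId = match[1];
  const requested = match[2] === undefined ? null : Number(match[2]);
  const latest = Math.max(...availableVersions);
  // No version in the URL: serve the most recent one we have.
  // Explicit version: serve it if available, otherwise fall back to the latest.
  const served =
    requested !== null && availableVersions.includes(requested) ? requested : latest;
  return {
    id: `${baseId}v${served}`,
    // True when the reader asked for a version that could not be provided.
    fellBack: requested !== null && served !== requested,
  };
}

console.log(resolveVersion("0907.1520", [1, 2, 3])); // { id: '0907.1520v3', fellBack: false }
console.log(resolveVersion("0907.1520v3", [1]));     // { id: '0907.1520v1', fellBack: true }
```

Always returning an ID with the version appended means the reader can see which version is actually being shown.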
Desktop (please complete the following information)
Problem details
Conversion complete: 30 warnings; 4 errors; 4 undefined macros[\AND, \addr, \email, \name]
on the title page there are undefined macros
Exact location of issue
Md frame around Fig 1, and several others.
Problem details
mdframed environment is not displayed.
Smartphone (please complete the following information):
Hi guys,
thank you for your very cool project!
The article does not render at all, presumably because of how the bibliography is handled:
We include it via
\usepackage[style=numeric-comp,sorting=none,giveninits=true,maxbibnames=10,backend=biber]{biblatex}
\addbibresource{main.bib}
\AtEveryBibitem{
\iffieldundef{doi}{}{\clearfield{url}} % removes URL field from bibliography if DOI is present
}
in the preamble and
\printbibliography[title={Bibliography}]
in the text.
Cheers!
This was a first test of the "issue template", but this randomly chosen article should still get support.
Exact location of issue
Frontmatter, and consequences of raw interpretation in following sections of math/0003013.
Problem details
One of the smaller problems is the optional argument to title, which I recently patched in this latexml PR.
The bigger ones come from interpreting smfthm.sty raw, which may need dedicated binding support, or raw interpretation upgrades in latexml.
(Optional) Expected behavior
No errors, and full transport of the article content to HTML.
Exact location of issue
Equation 1 of math/1711.02437
Problem details
Latex equation is not rendered (with red warning: 'Math input error').
Desktop (please complete the following information)
Exact location of issue
Hi. 👋 Thanks for making this project — really nice idea and work! The render for basically all of arXiv:2109.04981 (Publishing statistical models: Getting the most out of particle physics experiments) is broken. (I'm one of the authors and have the full LaTeX source so if you need me to give specifics on any part let me know — unfortunately, one of my colleagues uploaded a broken version of the source files, but I can send you it if needed (not sure if this is part of the problem).)
Problem details
The render fails throughout the document. It starts with printing some information from the source files
\usemintedstyle
friendly \setminted[json]fontsize=, numbersep=5pt, frame=lines, framesep=2mm, gobble=0 \setminted[python]fontsize=, numbersep=5pt, frame=lines, framesep=2mm, gobble=0, linenos
and then continues to fail to typeset the document properly for the remainder; cf. https://ar5iv.org/html/2109.04981#S1 for an example.
(Optional) Expected behavior
For the document to render fully without errors.
Desktop (please complete the following information)
Exact location of issue
https://ar5iv.org/abs/2108.04105
https://arxiv.org/abs/2108.04105
Problem details
ar5iv worked perfectly on this document, amazing work team! The only issue is that the arXiv original was updated to a second revision. Both title and content on ar5iv are out of date. Would be great if https://ar5iv.org/abs/2108.04105 was showing second version and not the first one. Can this be done?
Note: this is related to #10 however I would expect, maybe naively, to see the most recent version of an article.
Thanks!
the conversion fails for unknown reasons
perhaps missing "addtokomafont" macros
Exact location of issue
https://ar5iv.org/html/1510.08473#id1
Problem details
latex constructs missing support
\SetWatermarkText
DRAFT \SetWatermarkLightness0.92 \SetWatermarkScale5
Desktop (please complete the following information)
Exact location of issue
Figures 2 and 6 of https://ar5iv.org/html/1907.07998#S5.F6.1
Problem details
Expected behavior
The figures should look properly aligned and sized in the article frame.
Jess Riedel was sadly taken for a ride by ar5iv not making it clear we only have "v1" of each article.
Rather than drop the version, we should always have it appended, and even send a clear warning when redirecting from "v2", "v3", ... "vN" down to "v1", along the lines of:
"Hey there! ar5iv only has access to the v1 source of all articles. This may be different than what you expect to find."
For example, after equation (1) here:
https://ar5iv.org/html/2012.05155#S1.E1
Problem details
An equation reference is expected. Instead, it reads \reftagform@1.
Desktop
Exact location of issue
Renders with no spaces for some reason
https://ar5iv.org/html/2109.04981
This is a super cool project! Thanks for your work!
Exact location of issue
First paragraph after the last image of: https://ar5iv.org/html/2110.05444
Problem details
All listing references are shown as "(LABEL:code:socket)".
(Optional) Expected behavior
Expecting "Listing 2.".
Exact location of issue
It seems the emails of authors have been parsed a bit incorrectly.
Example:
The authors section in https://ar5iv.org/html/2102.06911 is parsed a bit wrong on two counts
- The @google.com emails are missing the { and }
- The affiliations are not parsed correctly.
Problem details
versus the paper's PDF
I see three hiccups in this article, which uses an OSA template, since the paper was submitted to an OSA journal.