davidar / pandiff
Prose diffs for any document format supported by Pandoc
License: MIT License
This is a wonderful tool! I am just wondering how one could apply other pandoc filters, such as cite-proc or eqnos.
I have a workflow in which I would like to use:
git show tag:foo.md | pandiff - foo.md
However, this hangs. Instead, what I end up doing is:
T=$(mktemp)
git show tag:foo.md > "$T"
pandiff "$T" foo.md
So the request is that pandiff allow - as an input file without hanging.
Thanks.
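Until - is accepted as an input, the temp-file workaround above could be wrapped in a small Node helper. This is only a sketch under stated assumptions: materializeStdin and the stdin.md file name are hypothetical (not part of pandiff), and it assumes the piped text is Markdown.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper, not part of pandiff: replace every "-" argument with
// the path of a temp file that holds the text read from standard input.
function materializeStdin(args, stdinText) {
  if (!args.includes("-")) {
    return { args, cleanup: () => {} };
  }
  // Assumption: the piped content is Markdown, hence the .md extension.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "pandiff-"));
  const file = path.join(dir, "stdin.md");
  fs.writeFileSync(file, stdinText);
  return {
    args: args.map((arg) => (arg === "-" ? file : arg)),
    // Remove the temp directory once the diff has been produced.
    cleanup: () => fs.rmSync(dir, { recursive: true, force: true }),
  };
}
```

A wrapper script could read standard input, call the helper, run pandiff with the returned arguments, and then call cleanup().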
When a diff results in two or more adjacent paragraph insertions, the HTML output returns as if they are one giant paragraph. (I would imagine this applies to two or more adjacent deletions as well).
I've tracked it down to the point in postprocess when pandoc is run on the markdown representation of the diffed output. It seems like pandoc does not count a new paragraph as a paragraph if it's completely wrapped in <ins> or <del> tags. This means that this Markdown:
<ins>Paragraph 1</ins>

<ins>Paragraph 2</ins>
When run through pandoc results in:
<ins>Paragraph 1</ins>
<ins>Paragraph 2</ins>
If, for example, spaces are inserted before the <ins> tags in the above markup, the correct output is produced:
<p><ins>Paragraph 1</ins></p>
<p><ins>Paragraph 2</ins></p>
As a quick workaround, I added this line in postrender before the final pandoc call. It produces the correct output, but I don't know if it's the best solution.
text = text.replace(/\n\n<ins>/g, '\n\n <ins>')
Any thoughts?
(I'm running pandoc 2.2.1.)
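Since the report suspects that adjacent deletions are affected too, the one-line workaround could be widened to cover both tags. The following is a sketch only: indentChangeParagraphs is a hypothetical name, and whether <del> behaves the same way as <ins> is the report's guess rather than something verified here.

```javascript
// Hypothetical helper, not part of pandiff: put one space in front of any
// paragraph that starts with <ins> or <del>, so the block is not treated as
// raw HTML. Same idea as the one-line workaround, applied to both tags.
function indentChangeParagraphs(text) {
  // "\n\n" marks a paragraph break; $1 re-inserts whichever tag matched.
  return text.replace(/\n\n<(ins|del)>/g, "\n\n <$1>");
}
```

Like the original line, this does not touch a wrapped paragraph at the very start of the text, because no blank line precedes it.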
Used default pandiff output with the atx-headers and wrap: preserve options.
Input:
## Data Protection Principles\n\n* We identify Non-public data and label it as such\n* We put in place reasonable and appropriate safeguards to protect you, corruption, and modification to data\n* We limit PII collection, sharing, disclosure, and use to business need\n* Yes\n\nSystem | Tolerable outage (RTO) | Tolerable data loss (RPO)\n-- | -- | --\nApplication | 24 hours | 24 hours\nNetwork | 24 Hours | 24 hours\n\n
Output:
## Data Protection Principles\\n\\n- We identify Non-public data and label it as such\\n- We put in place reasonable and appropriate safeguards to protect you, corruption, and modification to data\\n- We limit PII collection, sharing, disclosure, and use to business need\\n- Yes\\n\\n<table>\\n<thead>\\n<tr class=\\\"header\\\">\\n<th>\\nSystem\\n</th>\\n<th>\\nTolerable outage (RTO)\\n</th>\\n<th>\\nTolerable data loss (RPO)\\n</th>\\n</tr>\\n</thead>\\n<tbody>\\n<tr class=\\\"odd\\\">\\n<td>\\nApplication\\n</td>\\n<td>\\n24 hours\\n</td>\\n<td>\\n24 hours\\n</td>\\n</tr>\\n<tr class=\\\"even\\\">\\n<td>\\nNetwork\\n</td>\\n<td>\\n24 Hours\\n</td>\\n<td>\\n24 hours\\n</td>\\n</tr>\\n</tbody>\\n</table>\\n
Hi @davidar,
As far as I understand, pandiff shows the full document.
How could one show only the diff, like a typical unified diff, with a bit of context before and after the changes?
Thanks!
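One way to approximate this without changes to pandiff is to post-process its CriticMarkup output and keep only the paragraphs that contain a change, plus a few neighbours. The sketch below is an assumption, not a pandiff feature: changedWithContext is a hypothetical helper, paragraphs are taken to be separated by blank lines, and a change that itself spans several paragraphs is only detected in the paragraph that holds its opening marker.

```javascript
// Opening markers of CriticMarkup insertions, deletions and substitutions.
const CHANGE = /\{(\+\+|--|~~)/;

// Hypothetical helper: keep changed paragraphs plus `context` paragraphs on
// each side, and mark every skipped stretch between kept groups with "[...]".
function changedWithContext(markdown, context = 1) {
  const paras = markdown.split(/\n{2,}/);
  const keep = new Set();
  paras.forEach((para, i) => {
    if (!CHANGE.test(para)) return;
    const first = Math.max(0, i - context);
    const last = Math.min(paras.length - 1, i + context);
    for (let j = first; j <= last; j++) keep.add(j);
  });
  const out = [];
  let prev = -1;
  for (const i of [...keep].sort((a, b) => a - b)) {
    if (prev !== -1 && i > prev + 1) out.push("[...]");
    out.push(paras[i]);
    prev = i;
  }
  return out.join("\n\n");
}
```

It would be applied to the text that pandiff writes to standard output, with the context size chosen to taste.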
pandiff old.md new.md --wrap none --output diff.md --> wrap property is ignored and diff is wrapped
pandiff old.md new.md --wrap none > diff.md --> wrap property is used and diff isn't wrapped
The issue happens when trying to get the diff result in CriticMarkup while passing the to: gfm option.
Insert output looks fine: {++Hey++}
Delete output: {–Hey–} ('–' instead of '--')
Substitution output: {\\\\~<sub>protect access,</sub>\\\\>protect,\\\\~\\\\~}
It also adds \n characters at the same places in the md text regardless of the wrap: preserve/none or columns options. It seems to just ignore them when using to: gfm.
The same kinds of broken CriticMarkup output also occur with to: commonmark and to: markdown.
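As a stopgap for the deletion markers only, the output could be repaired after the fact. This sketch assumes the en dashes come from -- being rewritten as typographic punctuation somewhere in the conversion, which is a guess and not confirmed by the report. repairDeletions is a hypothetical helper, and it does not attempt to fix the mangled substitution output.

```javascript
// Hypothetical helper, not part of pandiff: turn {–text–} (en dash, U+2013,
// on both sides) back into the CriticMarkup deletion {--text--}.
function repairDeletions(text) {
  // [^{}]* keeps each match inside a single pair of braces.
  return text.replace(/\{\u2013([^{}]*)\u2013\}/g, "{--$1--}");
}
```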
From greenelab/meta-review#200:
- table cells get shifted/munged, likely pandiff bug:
In the resulting output (screenshot of docx below) the diffs aren't friendly. Notice the sentence "Additionally, the lessons learned from that time are at risk of being forgotten" is fragmented, part of it is buried in deleted text and the rest follows all the deleted text. Is there a diff option that could improve on this?
Currently, track changes in DOCX output are tagged as author="unknown". It would be handy if "unknown" could be replaced with a user-provided parameter.
For example, we now get a warning every time pandiff is executed:
[WARNING] Deprecated: --atx-headers. Use --markdown-headings=atx instead.
It might also be nice to pass all arguments and their values straight through to Pandoc, so that the CLI doesn't have to keep up with changes in the Pandoc CLI.
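A rough sketch of that pass-through idea follows. Everything here is an assumption rather than pandiff's actual behaviour: splitArgs, OWN_OPTIONS and RENAMED are hypothetical names, the list of options handled by the wrapper is invented for illustration, and only long options in the form --name or --name=value are supported.

```javascript
// Hypothetical: options the wrapper handles itself. Anything else is forwarded.
const OWN_OPTIONS = new Set(["--output", "--to", "--wrap"]);

// Rename taken from pandoc's deprecation warning quoted above.
const RENAMED = new Map([["--atx-headers", "--markdown-headings=atx"]]);

// Assumption: options look like --name or --name=value (no "--name value").
function splitArgs(argv) {
  const own = [];
  const passthrough = [];
  const files = [];
  for (const raw of argv) {
    const arg = RENAMED.get(raw) ?? raw;
    if (!arg.startsWith("--")) {
      files.push(arg); // positional argument, i.e. an input file
    } else if (OWN_OPTIONS.has(arg.split("=")[0])) {
      own.push(arg); // handled by the wrapper
    } else {
      passthrough.push(arg); // handed to pandoc untouched
    }
  }
  return { own, passthrough, files };
}
```

With this split, an option the wrapper does not know about would be forwarded to pandoc instead of being rejected.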
Thanks for this great program! I'm looking to convert changes between two tex files to docx; including in-text references and a bibliography. However, specifying a bibliography has no effect, and adding a csl file results in an error message:
/usr/lib/node_modules/pandiff/node_modules/command-line-args/dist/index.js:1350
throw err
^
UNKNOWN_OPTION: Unknown option: --csl=/path/to/apa.csl
at commandLineArgs (/usr/lib/node_modules/pandiff/node_modules/command-line-args/dist/index.js:1347:21)
at Object.<anonymous> (/usr/lib/node_modules/pandiff/cli.js:77:6)
at Module._compile (node:internal/modules/cjs/loader:1109:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1138:10)
at Module.load (node:internal/modules/cjs/loader:989:32)
at Function.Module._load (node:internal/modules/cjs/loader:829:14)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:76:12)
at node:internal/main/run_main_module:17:47 {
optionName: '--csl=/path/to/apa.csl'
}
I'm effectively running pandiff --csl=/path/to/apa.csl --bibliography=/path/to/references.bib -o output.docx old.tex new.tex
Is there something I'm doing wrong? Otherwise it would also be helpful to have a MWE for this kind of case.
It would be nice to add support for files in a git repository:
# diff the current working copy against the last commit on the current branch
pandiff path/to/file.md
# diff the current working copy against the version in a tag/branch/commit
pandiff path/to/file.md <tag/branch/commit>
# diff the version of the file in tag1/branch1/commit1 against the version in tag2/branch2/commit2
pandiff path/to/file.md <tag1/branch1/commit1> <tag2/branch2/commit2>
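The three proposed forms could map onto git show roughly as follows. This is a sketch of the request, not existing pandiff behaviour: resolveSources is a hypothetical helper that only describes where the old and new versions would come from, and it relies on git show <rev>:<path>, which resolves the path from the repository root unless it starts with ./.

```javascript
// Hypothetical helper: describe the [old, new] inputs for the three proposed
// command forms, without running git or pandiff.
function resolveSources(file, revs = []) {
  if (revs.length > 2) {
    throw new Error("expected at most two revisions");
  }
  // The version of the file stored in a tag, branch or commit.
  const atRev = (rev) => ({ kind: "git", command: ["git", "show", `${rev}:${file}`] });
  const workingCopy = { kind: "file", path: file };

  if (revs.length === 0) return [atRev("HEAD"), workingCopy];
  if (revs.length === 1) return [atRev(revs[0]), workingCopy];
  return [atRev(revs[0]), atRev(revs[1])];
}
```

Each git entry would have its command output written to a temporary file before the two inputs are handed to pandiff.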
Thank you for your great package. Unfortunately, there seems to be a problem with YAML blocks for metadata. I defined my title there, but it is ignored when creating the diff file (a warning is emitted and a default empty title is inserted). I can't pass additional arguments either. Can this please be fixed? It is quite inconvenient if the title is not taken into account.
I would like a Markdown output file with Pandoc-style track changes markup instead of CriticMarkup. There doesn't appear to be a combination of command line options that gives this result. This is what I end up doing instead,
T=$(mktemp)
pandiff old.md new.md -t docx -o "$T"
pandoc -f docx -t markdown+mark --track-changes=all "$T"
The reason is that my workflow does some transformations on the markdown before running it through Pandoc to generate DOCX files:
pandiff old.md new.md -t docx_but_markdown | frob-the-markdown | pandoc -t docx -o output.docx
I'm getting the following two errors when running pandiff.
[WARNING] Could not convert TeX math '\textrm{PPV} = \frac{55}{210} = 0.26', rendering as TeX
which may be avoided by adding --mathjax to the command line.
! Package inputenc Error: Unicode char \u8: not set up for use with LaTeX.
…
Try running pandoc with --pdf-engine=xelatex.
which may be avoided by adding --pdf-engine=xelatex to the command line.
Any suggestions how to fix this?
In the same run, I also get this error:
(node:45417) UnhandledPromiseRejectionWarning: 43
(node:45417) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:45417) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
When images are missing, pandoc fails with
pandoc: /Sites/Handelshogskolan/Static/Images/apsia.svg: withBinaryFile: does not exist (No such file or directory)
And then pandiff does not output any diff.
It would be great if pandiff handled missing images gracefully.
Thanks for the great tool!