Comments (13)

pfoerster commented on July 28, 2024

The latest master branch of TexLab now uses citeproc-rs in combination with bibutils to render citations. Node.js is no longer needed. A 1.8.0 release is on the way.

from texlab.

cormacrelf commented on July 28, 2024

Quick ping: there have been a few breaking changes recently to support more intuitive numbering in editor plugin code (zotero/citeproc-rs#24):

  • rename Cluster2 -> Cluster
  • Cluster doesn't have a note number on it any more. You should now call set_cluster_order(&[ClusterPosition { id: 1, note: Some(1) }]) after inserting the single cluster, as it is not considered part of the document until you give it a number.
  • get_cluster now returns Option; None if the cluster isn't included in the document.

KOLANICH commented on July 28, 2024

The main concern is backdoors in npm packages. Avoiding Node.js at install time does not eliminate it, since JS code from npm is still used, and the JS community has ultimately failed to protect npm from malware. (IDK why other languages are not plagued as badly; probably JS and PHP are just considered more valuable targets. We definitely need to do something about it. I have in mind a combination of taint checking and permissions, but I have failed to prove the system's security properties in the case where taints can be explicitly modified by a package developer.)

BTW, there are other citeproc implementations: https://github.com/cormacrelf/citeproc-rs, https://github.com/michal-h21/citeproc-lua, https://github.com/michel-kraemer/citeproc-java and https://github.com/fouke-boss/citeproc-dotnet

cormacrelf commented on July 28, 2024

I am entirely aware of what’s missing. It’s about 33% through the CSL test suite, but I’m aiming for CSL 1.0.1 by about the year’s end. It’s a Zotero project and they intend to use it to replace citeproc-js, so when that happens, it should be a good gauge for when it’s ready to use elsewhere.

It’s very nice to see a Rust project that could link to it without any WASM serialisation overhead! It would be great if someone could elaborate on the use case and how I could accommodate texlab — I assume you’re currently wrapping citeproc-js quite heavily? What are the inputs and outputs, and what format are they in? Synchronous or worker thread? Etc.

Note that I believe citeproc-js is dependency-free. The periphery and tests etc use NPM but the main part doesn’t even (afaik) support linking NPM packages in its custom build process. So I am not sure spooky NPM malware is a good enough reason not to use it, especially sandboxed as you have. There is also deno’s runtime as a crate, if for whatever reason Duktape makes you uncomfortable. Moreover, you probably don’t need NPM to install it, as it’s a bundled single file CommonJS build, you could literally curl Unpkg.

cormacrelf commented on July 28, 2024

Oh, easy. Converting BibTeX and doing what citation.js does (the wrapping) are the two pieces.

  • BibTeX conversion is something I had wanted to do, so may be upstreamable. Originally I was going to write it in Haskell with Pandoc’s parser as part of larger Pandoc support. Lots of BibTeX/Pandoc users are mathematicians who need full-power LaTeX math in their titles, and also use Markdown with $$ math-inlines, italics, etc, even using Pandoc in cite prefixes/suffixes, so getting the Pandoc parser in there was inevitable. (It’s not easy from there.) A basic parser/converter may be a fit in citeproc-rs. It’s the curly braces inside fields that are hard, especially name fields with weird particles. The conversion target would be a citeproc_io::Reference with CSL-JSON micro-HTML in the strings.
  • The simplified Citation.js API is not tricky to do. I can provide that. You’ll likely have to bundle APA.csl and the right locale, or get them from a CDN which may be in the works.
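
To illustrate why braces inside name fields are awkward, here is a minimal sketch of one small piece of the problem: splitting an author field on "and" only outside braces. It is not code from citeproc-rs, TexLab or Pandoc; the function name splitBibtexNames is made up for this example, and it deliberately ignores name particles, LaTeX commands and everything else that makes real BibTeX names hard.

```javascript
// Hypothetical helper (not part of any library mentioned in this thread).
// Splits a BibTeX name list on the lowercase word "and", but only at brace
// depth 0, so a braced institutional name is kept as a single entry.
function splitBibtexNames(field) {
  const names = [];
  let depth = 0;
  let current = '';
  let i = 0;
  while (i < field.length) {
    const ch = field[i];
    if (ch === '{') depth++;
    else if (ch === '}') depth = Math.max(0, depth - 1);
    // A separator is "and" surrounded by whitespace, outside any braces.
    const sep = depth === 0 ? field.slice(i).match(/^\s+and\s+/) : null;
    if (sep) {
      names.push(current.trim());
      current = '';
      i += sep[0].length;
      continue;
    }
    current += ch;
    i++;
  }
  if (current.trim() !== '') names.push(current.trim());
  return names;
}

console.log(splitBibtexNames('{Barnes and Noble, Inc.} and Knuth, Donald E.'));
// prints [ '{Barnes and Noble, Inc.}', 'Knuth, Donald E.' ]
```

A naive split on " and " would cut the braced name in two, which is the kind of mistake a basic converter has to avoid before it even starts on particles.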

Additional blockers for integrating are:

  • I haven’t finished bibliography support
  • I haven’t published it as a crate

pfoerster commented on July 28, 2024

TexLab makes use of the citeproc-js library to turn BibTeX entries into rendered citations. Note that this dependency is only required to compile the server; it is not needed at runtime. Instead of relying on a Node.js installation, TexLab uses the embeddable Duktape engine.

efoerster commented on July 28, 2024

Thanks for the feedback. First of all, we already considered all other citeproc libraries but they were either not good enough, abandoned or not embeddable.

Another point is that we are not using Node.js at runtime. As stated by @pfoerster, we use Duktape as the JavaScript interpreter. Therefore, the JavaScript code is heavily sandboxed and has no access to any I/O resources or Node/browser APIs, because Duktape does not even implement them.

KOLANICH commented on July 28, 2024

I feel like no sandbox can be considered reliable protection as long as it is possible to implement a Turing machine on top of it. I know that Duktape uses no JIT, so it should be fully protected by DEP and ASLR, but there are ways to derandomize ASLR, and some of them use microarchitectural side channels (though IDK whether they can be exploited from Duktape).

Another concern is that a backdoor could be used to infect a developer's machine through Node.js and then insert a backdoor into the compiled part.

XVilka commented on July 28, 2024

Maybe it makes sense to open issues in citeproc-rs for the buggy or missing features, so that the JS dependency can be dropped?

cc @cormacrelf

pfoerster commented on July 28, 2024

TexLab uses citeproc-js (or to be precise citation.js) to convert BibTeX entries to formatted references when completing citations or hovering over them.

For example, the BibTeX entry

@article{Rivest:1978:MOD:359340.359342,
    author = {Rivest, R. L. and Shamir, A. and Adleman, L.},
    title = {A Method for Obtaining Digital Signatures and Public-key Cryptosystems},
    journal = {Commun. ACM},
    issue_date = {Feb. 1978},
    volume = {21},
    number = {2},
    month = feb,
    year = {1978},
    issn = {0001-0782},
    pages = {120--126},
    numpages = {7},
    url = {http://doi.acm.org/10.1145/359340.359342},
    doi = {10.1145/359340.359342},
    acmid = {359342},
    publisher = {ACM},
    address = {New York, NY, USA},
    keywords = {authentication, cryptography, digital signatures, electronic funds transfer, electronic mail,
                factorization, message-passing, prime number, privacy, public-key cryptosystems, security},
} 

gets converted to the following reference:

Rivest, R. L., Shamir, A., & Adleman, L. (1978). A Method for Obtaining Digital Signatures and Public-key Cryptosystems. Commun. ACM, 21(2), 120–126. https://doi.org/10.1145/359340.359342

As citeproc-js does not support BibTeX, we are using citation.js to convert BibTeX to CSL-JSON but we could (probably) implement this conversion on our own.
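
For a sense of what such a conversion involves, here is a minimal sketch that maps an already-parsed BibTeX entry to a CSL-JSON item. It is not what citation.js or TexLab actually does. The function names (bibtexToCsl, parseName), the shape of the input object and the fallback type are assumptions made for this example, and the name handling ignores braces and particles entirely. Only the CSL-JSON field names on the output side are taken from the CSL-JSON format.

```javascript
// Hypothetical converter, for illustration only.
const TYPE_MAP = {
  article: 'article-journal',
  book: 'book',
  inproceedings: 'paper-conference',
};
const MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
                'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];

// Handles only "Family, Given" and "Given Family"; no braces, no particles.
function parseName(name) {
  const comma = name.indexOf(',');
  if (comma !== -1) {
    return { family: name.slice(0, comma).trim(), given: name.slice(comma + 1).trim() };
  }
  const parts = name.trim().split(/\s+/);
  const family = parts.pop();
  return parts.length > 0 ? { family, given: parts.join(' ') } : { family };
}

// `entry` is assumed to look like { type, key, fields: { author, title, ... } }.
function bibtexToCsl(entry) {
  const f = entry.fields;
  // Fallback type 'article' is an arbitrary choice for this sketch.
  const item = { id: entry.key, type: TYPE_MAP[entry.type] || 'article' };
  if (f.author) item.author = f.author.split(/\s+and\s+/).map(parseName);
  if (f.title) item.title = f.title;
  if (f.journal) item['container-title'] = f.journal;
  if (f.volume) item.volume = f.volume;
  if (f.number) item.issue = f.number;
  if (f.pages) item.page = f.pages.replace(/-+/g, '-'); // "120--126" -> "120-126"
  if (f.doi) item.DOI = f.doi;
  if (f.url) item.URL = f.url;
  if (f.year) {
    const parts = [Number(f.year)];
    const month = MONTHS.indexOf(String(f.month || '').slice(0, 3).toLowerCase());
    if (month !== -1) parts.push(month + 1);
    item.issued = { 'date-parts': [parts] };
  }
  return item;
}
```

Applied to the Rivest entry above, this would yield type 'article-journal', three author objects such as { family: 'Rivest', given: 'R. L.' }, page '120-126' and issued date-parts [[1978, 2]]. Fields like acmid or keywords are simply dropped.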

> What are the inputs and outputs, and what format are they in?

The input is a reference in CSL-JSON format and the output is the result of executing a style against that reference (HTML format, which is converted to markdown by TexLab).
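
The HTML-to-markdown step can be pictured roughly as follows. This is only a sketch under assumptions: the function name htmlToMarkdown is made up, the regex approach is not TexLab's actual converter, and it only covers the handful of tags and entities that a simple bibliography entry is likely to contain.

```javascript
// Hypothetical, regex-based converter for simple bibliography HTML.
// Not a general HTML parser; nested or unusual markup is not handled.
function htmlToMarkdown(html) {
  return html
    .replace(/<\/?(i|em)>/g, '*')        // italics
    .replace(/<\/?(b|strong)>/g, '**')   // bold
    .replace(/<[^>]+>/g, '')             // drop remaining tags such as <div ...>
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&')              // decoded last to avoid double-decoding
    .replace(/\s+/g, ' ')
    .trim();
}

const html =
  '<div class="csl-entry">Rivest, R. L., Shamir, A., &amp; Adleman, L. (1978). ' +
  '<i>Commun. ACM</i>, <i>21</i>(2), 120–126.</div>';
console.log(htmlToMarkdown(html));
// prints: Rivest, R. L., Shamir, A., & Adleman, L. (1978). *Commun. ACM*, *21*(2), 120–126.
```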

> Synchronous or worker thread?

At the moment, TexLab uses the API synchronously. All entries are rendered lazily.
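
"Synchronous and lazy" can be pictured with a small sketch like the one below. It is an illustration only, not TexLab's implementation: the helper name createLazyRenderer is made up, and whether TexLab caches rendered results at all is an assumption of this example.

```javascript
// Hypothetical wrapper: an entry is rendered only when first requested,
// and (as an assumption of this sketch) the result is cached by key.
function createLazyRenderer(render) {
  const cache = new Map();
  return function renderEntry(key, source) {
    if (!cache.has(key)) {
      cache.set(key, render(source)); // synchronous call, blocks until done
    }
    return cache.get(key);
  };
}

// Usage with a stand-in render function.
let calls = 0;
const renderEntry = createLazyRenderer((source) => {
  calls++;
  return `rendered(${source})`;
});
renderEntry('rivest1978', '@article{...}');
renderEntry('rivest1978', '@article{...}');
console.log(calls); // prints 1
```

Nothing is rendered for entries that are never completed or hovered over, which is what makes a slow synchronous renderer tolerable.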

> I assume you’re currently wrapping citeproc-js quite heavily?

Actually not, the JavaScript code we use is quite simple:

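// Importing the plugins registers BibTeX input and CSL output with citation.js.
import { Cite } from '@citation-js/core';
import '@citation-js/plugin-bibtex';
import '@citation-js/plugin-csl';

// Takes BibTeX source text and returns an APA-style bibliography entry as HTML.
export default function(code) {
  const cite = new Cite(code);
  const html = cite.format('bibliography', {
    format: 'html',
    template: 'apa',
    lang: 'en-US',
  });
  return html;
}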

XVilka commented on July 28, 2024

There are already some projects for parsing BibTeX in Rust that could probably be reused:

cormacrelf commented on July 28, 2024

nom-bibtex looks much better. Neither does the hard part I mentioned, but Pandoc’s implementation is a guide, especially https://github.com/jgm/pandoc-citeproc/blob/ae1aaffc1b7a53468c7f3e51bac86283f02459ba/src/Text/CSL/Input/Bibtex.hs#L1069

Edit: Pandoc is GPLv2. Take care not to copy it.

KOLANICH commented on July 28, 2024

Thank you!
