cyphar / paperback

Paper backup generator suitable for long-term storage.

License: GNU General Public License v3.0

Rust 100.00%

paper backup shamir-secret-sharing secret-sharing encryption user-friendly

paperback's Issues

Can we compress the main document contents?

Compress-then-encrypt has been known to be unsafe for interactive sessions (such as web browsers) since 2012, thanks to CRIME and BREACH. However, since paperback is not an interactive system (and we have fairly strict density requirements, because the QR codes need to fit on paper), maybe it would be safe to compress the main document contents?

It would be nice to get a cryptographer's opinion on this...

Using a "backdoor" password in addition to Shamir?

Does this only support SSSS or can you use a password as well with AES encryption?

What does "paper-based" mean?

What is the paper, and what does "paper-based" mean?

Can not paste more than 1023 chars into recovery input

Problem description

When pasting more than 1023 chars into the recovery input, the command stalls after 1024 chars and does not accept any more key presses.

How to reproduce

# Generate dummy ssh key 
$ ssh-keygen -t rsa -b 4096 -f /tmp/id_rsa -N ""

# Create a paperback backup. From the output copy the first line below "Main Document:" 
$ target/release/paperback backup --quorum-size 1 --shards 1 /tmp/id_rsa

# delete the dummy ssh key
$ rm /tmp/id_rsa

# Execute the recovery command and paste the copied line into the recovery command prompt
$ target/release/paperback recover --interactive /tmp/id_rsa
Enter a main document code (unknown number of codes remaining): 9253239620054760480773498079880048392085023609742842040441216447577496617446522511370875848613996704392955370327328922021505813624459430649615647004836257715583803915725820234492619487563511735058047789402755199339859092601746721608432634524157899952992320457872099698050063905737816078540584111814371613885832431003302577055324833129473947363860462659423259525211314871908819730229296258076477521533657856983503245554846583692028058674287473573834129460088792647819166374931328080799339353723362304667836773508987216392810973673734903634617168285294218709368415340502361941598376136512904673448181173749369421372015801785805191848425038073416836444570342722255581674236030369275690062713106590311335700342677911770602043028693566314131764350369055707158650741524998825631546473357742808615494982181991857662250928740189539612597602590152772154238218901242586014158935194393650542887393610351381599452557668098113435215653343861277404681664851781973715584822873633221126356012335405179879816615450526302967659493552671496472^C

# kill the command with ctrl+c

Workaround

Use pbpaste | fold -w 1023 to split the Main Document line into lines of at most 1023 characters, then paste the output into the recovery command:

$ target/release/paperback recover --interactive /tmp/id_rsa
Enter a main document code (unknown number of codes remaining): 925323962005476048077349807988004839208502360974284204044121644757749661744652251137087584861399670439295537032732892202150581362445943064961564700483625771558380391572582023449261948756351173505804778940275519933985909260174672160843263452415789995299232045787209969805006390573781607854058411181437161388583243100330257705532483312947394736386046265942325952521131487190881973022929625807647752153365785698350324555484658369202805867428747357383412946008879264781916637493132808079933935372336230466783677350898721639281097367373490363461716828529421870936841534050236194159837613651290467344818117374936942137201580178580519184842503807341683644457034272225558167423603036927569006271310659031133570034267791177060204302869356631413176435036905570715865074152499882563154647335774280861549498218199185766225092874018953961259760259015277215423821890124258601415893519439365054288739361035138159945255766809811343521565334386127740468166485178197371558482287363322112635601233540517987981661545052630296765949355267149647
226410538752391687911178318761080263108873022289284775587263668520331196972717697729788699398000173167806414614066258077067583088872821697226010398021823269373626541263340690902799891290195493231182505245988875228760735053610597622081372604957530939286812127279080393969338927405945210253668028668802396229106118040187749421774338451359750232325256546621164852961903428689581143751595660495424206222631944811791136556562665493854032676701329609059487180961119135380038394503182348329712756562548915305529935105716937911622233399595667033090052582507631333501183018417557726767882193277875868037920955394012806979455562452417268489519269164171473060246786822153865734340408197037615747796220862512927762776884634853344962301319327658743875225210860203381248682319300437723223314079610373260298521740246726570523473560457579068620266950310955013429105611246634681922824973450998433633929358510851168741554990897898123776323662065302374765725725752498255316418580610280332212084157119108874124773153735166470779108917882795390
09196407271449258621087535011601344373819201070644111198228597890282749477039387676183312395301367402243850223592732949953864817270418614449223792966

Enter a main document code (3 codes remaining):

Build Env

paperback: commit 0bd9e493b1220dc28b9241946b80d53bd7d38cfe (main at the time of writing)
OS: macOS Ventura 13.1 (Apple Silicon M1)
Cargo: 1.66.0 (d65d197ad 2022-11-15)

Print placeholders instead of codewords by default

With printers these days you never know what they store or where the documents they print go. Thus it would be insecure to print the whole shard-pdf (or, at least all shards) on a printer as the shard content could be recovered from it. Since there is already a shard password, it would make sense to not print it by default and have the user write it down manually. While this requires quite a bit of manual labor it is the only way I could figure out how to print the shards securely when designing a similar system.

Unable to Input Complete Single-Line QR Code Data on macOS Terminal

Description

On macOS, there appears to be an issue when attempting to input a complete set of QR code data into the terminal in a single line. The macOS terminal input is restricted to 1024 characters as defined by the system limits (source). Once 1024 characters have been entered on a single line, any further input (including the Enter key) is not accepted. Typically, the data extracted from QR codes exceeds this 1024-character limit. As a result, this limitation forces users to engage in inconvenient and confusing multiple manual copy-paste operations to input the entire data string.

Possible Solutions

Modify the QR code generation process to output data in multiple lines rather than a single long line. This approach can help users to more easily handle and input data within the limitations of the macOS terminal character count.
Explore using libraries like rustyline which might allow bypassing the 1024 character limit per line. This could provide a more seamless input experience for users dealing with lengthy single-line data.

This issue is particularly critical as the program necessitates the entry of large amounts of data, making it inevitable for macOS users to face this limitation. Users unfamiliar with this system constraint might not understand why they are unable to input data successfully, leading to significant usability concerns.

We hope this issue can be prioritized for a resolution to enhance the functionality and user experience of macOS terminal operations. Thank you for your attention and looking forward to any updates or feedback from the development team.
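
Both proposed solutions come down to never requiring more than 1023 characters on a single input line. A minimal Rust sketch of that idea follows, using only the standard library. The names fold_code, read_folded_code and MAX_LINE_WIDTH are hypothetical and not part of paperback; treating a blank line as the end of a code is an assumption, as is taking 1023 (the width from the reported workaround) as the limit.

```rust
use std::io::{self, BufRead};

/// Maximum characters per line. 1023 matches the width used in the reported
/// macOS workaround; it is an assumption, not a paperback constant.
pub const MAX_LINE_WIDTH: usize = 1023;

/// Split a long code into lines of at most `width` characters, so that each
/// line can be pasted into a terminal with a per-line input limit.
pub fn fold_code(code: &str, width: usize) -> Vec<String> {
    assert!(width > 0, "width must be non-zero");
    let chars: Vec<char> = code.chars().collect();
    chars
        .chunks(width)
        .map(|chunk| chunk.iter().collect())
        .collect()
}

/// Read lines until a blank line (or end of input) and join them back into a
/// single code. Surrounding whitespace on each line is ignored.
pub fn read_folded_code<R: BufRead>(reader: R) -> io::Result<String> {
    let mut code = String::new();
    for line in reader.lines() {
        let line = line?;
        let trimmed = line.trim();
        if trimmed.is_empty() {
            break; // assumed convention: a blank line ends the current code
        }
        code.push_str(trimmed);
    }
    Ok(code)
}
```

With something like this, the generator could print fold_code(&code, MAX_LINE_WIDTH) one line at a time, and the recovery prompt could call read_folded_code(std::io::stdin().lock()) instead of expecting the whole code on one line.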

spurious test failures in "safety" checking

I've run into two cases of failures in safety-checking code, leading me to wonder whether the logic is sound.

The sanity check in Dealer::shard that checks whether the evaluated polynomial equals x failed once with thread '<unnamed>' panicked at 'assertion failed: self.threshold == 1 || y != poly.constant()', pkg/paperback-core/src/shamir/dealer.rs:111:17. I guess it is possible for the polynomial to equal the secret at a random x value, but we still should avoid allowing that even if an attacker couldn't know whether that was the case. Maybe we should handle it better? Dealer::next_shard should probably re-try at a different x value rather than crashing the program.
In CI, the test for limited_recover_fail failed because it seems one of the polynomials returned the same value for a random x value. thread 'shamir::dealer::test::limited_recover_fail' panicked at '[quickcheck] TEST FAILED. Arguments: (9, [30], [GfElem(4011157905), GfElem(1345881083)])', /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/quickcheck-1.0.3/src/tester.rs:165:28

Maybe we need to reconsider if these properties are actually guaranteed or just very unlikely to happen (probably the latter) and how we should protect against them without using asserts.
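
The retry idea for Dealer::next_shard could look roughly like the sketch below. It is not paperback's code: a small prime field is swapped in for paperback's own GfElem arithmetic to keep the example short, the candidate x values are passed in by the caller rather than drawn from an RNG, and the names P, eval_poly and next_shard are hypothetical. The condition mirrors the quoted assertion, except that a bad x value is skipped instead of causing a panic.

```rust
/// Prime modulus of the toy field (2^31 - 1). Assumption: paperback uses its
/// own GfElem field; a prime field is used here only for brevity.
const P: u64 = 2_147_483_647;

/// Evaluate the polynomial (constant term first) at `x` with Horner's rule.
fn eval_poly(coeffs: &[u64], x: u64) -> u64 {
    let x = x % P;
    coeffs.iter().rev().fold(0, |acc, &c| (acc * x + c % P) % P)
}

/// Hypothetical replacement for the assertion: walk through candidate x
/// values and return the first usable (x, y) pair instead of panicking.
/// Returns None if every candidate had to be rejected.
fn next_shard(
    coeffs: &[u64],
    candidates: impl IntoIterator<Item = u64>,
) -> Option<(u64, u64)> {
    let secret = *coeffs.first()? % P;
    let threshold = coeffs.len();
    for x in candidates {
        let x = x % P;
        if x == 0 {
            continue; // f(0) is the secret itself, never hand it out
        }
        let y = eval_poly(coeffs, x);
        // Same condition as the assertion, but a failure means "try again".
        if threshold == 1 || y != secret {
            return Some((x, y));
        }
    }
    None
}
```

For example, with coefficients [5, 3, P - 3] the polynomial evaluates to the secret 5 at x = 1, so next_shard skips that candidate and moves on to the next one.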

How to handle multi-page documents

At the moment paperback has an artificial limit of 9 QR codes as part of the main document. The reason for this is that once you go above 9 QR codes, it will be necessary to create additional pages which may result in cases where someone has lost one of the pages and the data is basically irrecoverable. We can't really make the QR codes dynamically smaller because that'll affect the ability of QR readers to be able to reliably read the encoded data.

Switch to azul-text-layout

At the moment all of our text layout code is mostly hard-coded or calculated using somewhat loose approximations. In order to have nicely formatted text (and properly wrapped codewords) we need to switch to azul-text-layout for laying out the text and then outputting the laid out text to the PDF.

This would probably require some more complicated internal code to make the PDF generation less painful (namely having our own justification code for vertical justification, based on the azul-text-layout-computed heights and widths). There were some bug reports in azul-text-layout last year indicating it seems to struggle with line breaking, but hopefully it works okay now...

Questions regarding paperback

Is this the same as OllyBdg's original PaperBack?
What barcode does it use to store data?
Does it store everything over a single barcode, or is it multiple?
Can it store data over multiple pages for bigger data?

pdf: allow "re-printing" a new PDF from old data

Given a single document it would be nice to be able to generate a new PDF from the data. This would solve two problems:

Minor stylistic changes in the PDF that a user would like to have (such as improving the descriptions of sections). This would be at least somewhat useful for early adopters of paperback that want to be able to update some of the documents (if they choose) without needing to form a quorum and create a new backup needlessly.
To salvage a document which has degraded significantly but is still readable -- this would allow you to potentially avoid needing to come to a quorum to issue new shards if the shard holder makes use of this feature when they notice the shard is getting a bit long in the tooth.

This doesn't require any fancy cryptography since it's just a repackaging of the existing data in the document.

Building paperback results only in "Hello world!"

I'm unsure if this is still a work-in-progress, but when I read the design document, it seemed pretty finished.
Am I doing something wrong, or does building it result in an executable that just says "hello world"?

Release it as precompiled binary on github

Thanks for your work on this interesting tool.

I haven't tried it out yet, but I'm eager to.
Would it be possible to release it as a precompiled Rust binary on GitHub? At least compiled for x86_64?

It would make it much easier for people who don't have (or don't want to download) the 500 MB Rust distribution just to quickly try it out.

This would also make it easier to provide packages for other linux distros.

Perhaps you can also have a look at how other cli rust tools do it, for example,

skim, https://github.com/lotabout/skim/releases
fd, https://github.com/sharkdp/fd

I assume you are aware of github actions right ?
Have a look at how ripgrep, https://github.com/BurntSushi/ripgrep does it, https://github.com/BurntSushi/ripgrep/tree/master/.github/workflows

Thanks in advance.

When creating a backup, quorum size and shards should default to 1 unless otherwise specified

In order to streamline the process of creating a backup, the software should assume a threshold and shard count of 1, unless the user specifies otherwise.

printable minimal paperback algorithm for long-term backups

If someone is storing paper archives for decades, they may not have access to the paperback software when they are trying to recover the data (internet issues, etc)

Would it be feasible to include manual decoding steps as an optional print out (some combination of command line tools?)
Would it be possible to encode a small decoding program as part of the archive?

shamir: optimise expansion by using interpolation

Recovering the full polynomial is a neat party trick but is not necessary for quorum expansion, because we can just evaluate the polynomial at the X values we'd like. However, in order to do this efficiently we need to make use of the barycentric form of the Lagrange polynomials, so we can evaluate multiple X values more efficiently.
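
As a rough illustration of the barycentric form, here is a minimal Rust sketch. It works over a small prime field rather than paperback's own field type (an assumption made for brevity), and all names (P, barycentric_weights, eval_at) are hypothetical. The weights are computed once in O(n^2) field operations; each further X value then needs only one pass over the known points.

```rust
/// Prime modulus of the toy field (2^31 - 1). Assumption: not paperback's field.
const P: u64 = 2_147_483_647;

// Field helpers. All inputs are assumed to already be reduced modulo P.
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { a * b % P }

/// Modular exponentiation by squaring.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    while exp > 0 {
        if exp & 1 == 1 { acc = mul(acc, base); }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Multiplicative inverse via Fermat's little theorem (a must be non-zero).
fn inv(a: u64) -> u64 { pow(a, P - 2) }

/// Barycentric weights w_j = 1 / prod_{k != j} (x_j - x_k), computed once.
/// The x values must be distinct.
fn barycentric_weights(xs: &[u64]) -> Vec<u64> {
    xs.iter().enumerate().map(|(j, &xj)| {
        let denom = xs.iter().enumerate()
            .filter(|&(k, _)| k != j)
            .fold(1, |acc, (_, &xk)| mul(acc, sub(xj, xk)));
        inv(denom)
    }).collect()
}

/// Evaluate the interpolating polynomial at `x` using precomputed weights:
/// f(x) = l(x) * sum_j w_j * y_j / (x - x_j), with l(x) = prod_j (x - x_j).
fn eval_at(xs: &[u64], ys: &[u64], weights: &[u64], x: u64) -> u64 {
    // At a known point the formula would divide by zero, so return y directly.
    if let Some(j) = xs.iter().position(|&xj| xj == x) {
        return ys[j];
    }
    let mut node_poly = 1;
    let mut sum = 0;
    for ((&xj, &yj), &wj) in xs.iter().zip(ys).zip(weights) {
        let diff = sub(x, xj);
        node_poly = mul(node_poly, diff);
        sum = (sum + mul(mul(wj, yj), inv(diff))) % P;
    }
    mul(node_poly, sum)
}
```

For expansion, the weights would be computed from the quorum's shards once, after which eval_at is called for each new shard's X value, without ever recovering the polynomial's coefficients.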

pdf: qr: should we add tags to every QR code?

At the moment we have a type byte in the split main document QR codes. I originally planned to have such a tag in every QR code, which would allow us to:

Detect if someone scanned the wrong code and warn them appropriately (rather than producing random other errors).
Make auto-scanning of the PDFs easier in the future (we can scan all QR codes on the page then use the tags to figure out which code is which -- which would allow us to handle cases where the QR code scanner returns the codes in a different order to the expected order).

The only downside of this approach (aside from the minor increase in data payload) is that the document ID is no longer easily defined as being the last 8 characters of the checksum (because adding a prefix with the type information will offset all of the later bytes, changing the zbase32 representation). We can work around this but it would mean that our document IDs need to embed this PDF-specific information, which seems a bit ugly. This is the main reason I haven't implemented this yet.

Maybe we don't need tags for the checksums (it's obvious given the document we scanned)?
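
To make the tagging idea concrete, here is a minimal Rust sketch of a one-byte type prefix. The QrKind variants, the tag byte values and the function names are all hypothetical and do not reflect paperback's actual wire format. It shows the two uses described above: rejecting a code of the wrong type with a clear error, and classifying codes scanned in arbitrary order.

```rust
/// Hypothetical set of QR code types; the byte values are made up.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QrKind {
    MainDocument,
    Checksum,
    Shard,
}

impl QrKind {
    fn to_byte(self) -> u8 {
        match self {
            QrKind::MainDocument => 0x01,
            QrKind::Checksum => 0x02,
            QrKind::Shard => 0x03,
        }
    }

    fn from_byte(byte: u8) -> Option<Self> {
        match byte {
            0x01 => Some(QrKind::MainDocument),
            0x02 => Some(QrKind::Checksum),
            0x03 => Some(QrKind::Shard),
            _ => None,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum TagError {
    Empty,
    UnknownTag(u8),
    WrongKind { expected: QrKind, found: QrKind },
}

/// Prefix the payload with its type byte before encoding it as a QR code.
pub fn tag_payload(kind: QrKind, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 1);
    out.push(kind.to_byte());
    out.extend_from_slice(payload);
    out
}

/// Identify a scanned code without knowing in advance what it is
/// (useful when a scanner returns codes in an unexpected order).
pub fn classify(data: &[u8]) -> Result<(QrKind, &[u8]), TagError> {
    let (&tag, rest) = data.split_first().ok_or(TagError::Empty)?;
    let kind = QrKind::from_byte(tag).ok_or(TagError::UnknownTag(tag))?;
    Ok((kind, rest))
}

/// Strip the tag, failing with a descriptive error if the wrong code was scanned.
pub fn untag_payload(expected: QrKind, data: &[u8]) -> Result<&[u8], TagError> {
    let (found, rest) = classify(data)?;
    if found != expected {
        return Err(TagError::WrongKind { expected, found });
    }
    Ok(rest)
}
```

The WrongKind error carries both the expected and the scanned type, which is enough to tell the user which code they scanned by mistake. The sketch does not address the document ID problem described above, since the prefix still shifts the bytes that feed the zbase32 representation.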

Dumb proofing

Hi,

I was evaluating paperback for future use when I ran into some issues while trying to recover a document (a private RSA SSH key, whatever).

The document is spread over 3 QR codes. Each time I tried to recover, I was greeted with Error: failed to parse data after inputting the main document. I tried to input them in different ways: one by one, or concatenating everything first in a text file. It worked with a "small" file, so I'm still convinced that I did something wrong on my part, which leads me to the next thought.

Would it be nice to attach to the main document a notice page for dummies, non-tech-savvy people, or people not even born yet (it's supposed to be long-term storage after all; GitHub may not even exist when someone tries to recover)? It could also be a good idea to have some metadata in each QR code to guide the user through the process (for things like out-of-order input or partial input), or even to add parity QR codes in case the main document is damaged?

Best regards,

PS: as a dev, I know it's the worst kind of ticket, but I had to do it ;)

Issue

Add build instructions to the README

Hello, I found this at https://privacytoolslist.com, but there are no instructions here or there on how to actually build and use it. Is it Rust? Would it be possible to add instructions on how to quickly set up the environment and build it? Thanks

"Super Shamir" support?

Trezor finally has developed SSS support (though it lacks some of the expansion and strong verification features we have -- their design where they use interpolation as part of their polynomial generation also looks a little strange). They've also implemented an interesting mode called "super shamir" where after creating key shards for your bitcoin key, you then further split the shards into sub-shards.

I suspect this was done in order to work around their 16-share limitation by making reconstruction more performant. But the interesting aspect of this feature is it allows you to create segregated groups of shard holders that may completely betray you and still would not gain any information.

However, I'm not entirely convinced that this is solving a real threat model -- if you can confidently segregate N people into A groups of size B, where you will never have inter-group betrayals above the threshold, you could create the same number of shares but use a B*2 or higher threshold (the bound is probably lower depending on the thresholds of each group, but this is the worst case).

Add file size limit of a sheet of paper to README.md

Just thought it would be a really common thing for someone to be skimming through the readme to find how much data they could store on one sheet of paper in order to judge if they would like to use paperback or not. Statements like "as most users won't need more than 50 sheets" aren't nearly as helpful as for example, "you can store ~12kb of data on one sheet."