paperback's Introduction

paperback

NOTE: While paperback is currently fully functional, all of the development of "paperback v0" is experimental and the format of the various data portions of paperback is subject to change without warning. This means that a backup made today might not work with paperback tomorrow. However, once there is a proper release of paperback, the format of that version of paperback will be set in stone and any new changes will be done with a new version of paperback (paperback can detect the version of a document, so older documents will always be handled by paperback).

paperback is a paper-based backup scheme that is secure and easy-to-use. Backups are encrypted, and the secret key is split into numerous "key shards" which can be stored separately (by different individuals), removing the need for any individual to memorise a secret passphrase.

This system can also be used as a digital will, because the original creator of the backup is not required to be present for (or consent to) the decryption of the backup if enough of the "key shards" are collected. No individual knows the secret key (not even you), and thus no party can be compelled to provide the key without the consent of k-1 other parties.

To make this system as simple-to-use as possible, paperback creates several PDFs which you can then print out and laminate, ready for recovery. Here are some examples of the generated documents:

[Images omitted: mockups and current-status renderings of the main document and a key shard.]

These "key shards" can then be given to a set of semi-trusted people. paperback also supports (k, n) redundancy, allowing for n key shards to be created but only k being required in order for the backup to be recovered.

"Semi-trusted" in this context means that you must be sure of the following two statements about the parties you've given pieces to:

  1. At any time, at least k of the parties you've given pieces to will provide you with the data you gave them. This is important to consider, as human relationships can change over time, and your friend today may not be your friend tomorrow.

  2. At any time, no party will maliciously collude with k-1 or more other parties in order to decrypt your backup information (however, if you are incapacitated, you could organise with the parties to cooperate only in that instance). Shamir called this having a group of "mutually suspicious individuals with conflicting interests". Ideally, each of the parties will be unaware of each other (or how many parties there are), and would only come forward based on pre-arranged agreements with you. In practice, a person's social graph is quite interconnected, so a higher level of trust is required.
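
As a rough illustration of the (k, n) threshold property described above, here is a minimal Rust sketch of Shamir-style secret sharing over a small prime field. It is not paperback's implementation: the modulus P, the function names, and the fixed coefficients are assumptions made for this example, and a real dealer must choose the non-constant coefficients at random. See the design document for the actual construction.

```rust
/// Toy prime modulus (2^31 - 1). An assumption for this sketch only.
const P: u64 = 2_147_483_647;

/// Modular exponentiation by squaring.
fn mod_pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc: u64 = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// Modular inverse via Fermat's little theorem (P is prime).
fn mod_inv(a: u64) -> u64 {
    mod_pow(a, P - 2)
}

/// Evaluate the polynomial at x with Horner's rule.
/// coeffs[0] is the secret; coeffs.len() is the threshold k.
fn eval_poly(coeffs: &[u64], x: u64) -> u64 {
    coeffs.iter().rev().fold(0u64, |acc, &c| (acc * x + c) % P)
}

/// Create n shares (x, y) at x = 1..=n.
fn make_shares(coeffs: &[u64], n: u64) -> Vec<(u64, u64)> {
    (1..=n).map(|x| (x, eval_poly(coeffs, x))).collect()
}

/// Recover the secret by Lagrange interpolation at x = 0.
/// Only gives the right answer when at least k distinct shares are supplied.
fn recover_secret(shares: &[(u64, u64)]) -> u64 {
    let mut secret: u64 = 0;
    for (i, &(xi, yi)) in shares.iter().enumerate() {
        let mut num: u64 = 1;
        let mut den: u64 = 1;
        for (j, &(xj, _)) in shares.iter().enumerate() {
            if i != j {
                num = num * (P - xj) % P; // (0 - xj)
                den = den * ((xi + P - xj) % P) % P; // (xi - xj)
            }
        }
        secret = (secret + yi * num % P * mod_inv(den)) % P;
    }
    secret
}
```

With coefficients [1234, 166, 94] (so k = 3) and n = 5, any three of the five shares recover 1234, while interpolating from only two shares yields an unrelated value.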

Each party will get a copy of their unique "key shard", and optionally a copy of the "main document" (though this is not necessary, and in some situations you might want to store it separately so that even if the parties collude they cannot use the "master key" as they do not have the "main document"). We recommend laminating all of the relevant documents, and printing them duplex (with each page containing the same page on both sides).

Note that this design can be used in a more "centralised" fashion (for instance, by giving several lawyers from disparate law firms each an individual key shard, with the intention to protect against attacks against an individual law firm). Paperback doesn't have a strong opinion on who would be good key shard holders; that decision is up to you based on your own risk assessment.

A full description of the cryptographic design and threat model is provided in the included design document.

Usage

Paperback is written in Rust. In order to build paperback you need to have a copy of cargo. Paperback can be built like this:

% cargo build --release
warning: patch for the non root package will be ignored, specify patch at the workspace root:
package:   /home/cyphar/src/paperback/pkg/paperback-core/Cargo.toml
workspace: /home/cyphar/src/paperback/Cargo.toml
    Finished release [optimized] target(s) in 3m 42s
% ./target/release/paperback ...

The general usage of paperback is:

  • Create a backup using paperback backup -n THRESHOLD -k SHARDS INPUT_FILE. The -n threshold is how many shards are necessary to recover the secret (must be at least one), and the -k shards value is the number of shards that will be created (must be at least as large as the threshold). Note that these flag letters are the reverse of the (k, n) notation used in the introduction, where n shards are created and k are required. The input file is the path to a file containing your secret data (or - to read from stdin).

    The main document will be saved in the current directory with the name main_document-xxxxxxxx.pdf (xxxxxxxx being the document ID), and the key shards will be saved in the current directory with names resembling key_shard-xxxxxxxx-hyyyyyyy.pdf (with hyyyyyyy being the shard ID).

  • Recover a backup using paperback recover --interactive OUTPUT_FILE. You will be asked to input the main document data, followed by the shard data and codewords. The output file is the path to where the secret data will be output (or - to write to stdout).

    Note that for key shards, the QR code data will be encoded differently to the "text fallback". This is because it is more space efficient to store the data in base10 with QR codes. As long as you copy the entire payload (in either encoding), paperback will handle it correctly.

    Paperback will tell you how many QR codes from the main document remain to be scanned (they can be input in any order), as well as how many remaining key shards need to be scanned (along with a list of the key shards already scanned).

  • Expand a quorum using paperback expand-shards -n SHARDS --interactive. The -n shards number is the number of new shards to be created. You will be asked to input enough key shards to form a quorum.

    Paperback will tell you how many remaining key shards need to be scanned (along with a list of the key shards already scanned).

    The new key shards will be saved as PDF files in the same way as with paperback backup.

  • Re-generate key shards with a specific identifier using paperback recreate-shards --interactive SHARD_ID.... You can specify as many shard IDs as you like. Shard IDs are of the form "haaaaaaa" ("h" followed by 7 alphanumeric characters). You can specify any arbitrary shard ID.

    This operation is mostly intended for allowing a shard holder to recover their key shard (which may have been lost). Using recreate-shards is preferable because (assuming you're sure the ID you recreate is the ID of the shard you originally gave them) it means that they cannot trick you into getting new distinct shards by pretending to lose an old shard. The recreated shards are identical in almost every respect to the old shards (except with a new set of codewords), so having many copies gives you no more information than just one.

    Paperback will tell you how many remaining key shards need to be scanned (along with a list of the key shards already scanned).

    The new key shards will be saved as PDF files in the same way as with paperback backup.

  • Re-print an existing paperback document using paperback reprint --[type] --interactive. --[type] can either be --main-document or --shard and indicates what type of document needs to be reprinted.

    You will be asked to enter the data of the document you have specified. The new document will be saved as a PDF file in the same way as with paperback backup.

    When reprinting a main document, paperback will tell you how many QR codes from the main document remain to be scanned (they can be input in any order).
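
To make the earlier note about base10 QR payloads more concrete, here is a small Rust sketch that compares how many bits a QR code spends on the same payload in its different character modes. The per-mode costs (numeric: 10 bits per 3 digits, alphanumeric: 11 bits per 2 characters, byte: 8 bits per character) come from the QR code standard; the base32 comparison is a hypothetical text encoding chosen for this example, not necessarily what paperback's "text fallback" uses, and mode headers are ignored.

```rust
/// Bits used by QR numeric mode for `digits` decimal digits:
/// 10 bits per group of 3, 7 bits for a trailing pair, 4 for a single digit.
fn numeric_mode_bits(digits: usize) -> usize {
    let tail = match digits % 3 {
        0 => 0,
        1 => 4,
        _ => 7,
    };
    (digits / 3) * 10 + tail
}

/// Bits used by QR alphanumeric mode: 11 bits per pair, 6 for a trailing char.
fn alphanumeric_mode_bits(chars: usize) -> usize {
    (chars / 2) * 11 + (chars % 2) * 6
}

/// Bits used by QR byte mode: 8 bits per character.
fn byte_mode_bits(chars: usize) -> usize {
    chars * 8
}

/// Decimal digits needed to write any value that fits in `payload_bytes` bytes.
fn decimal_digits_for(payload_bytes: usize) -> usize {
    ((payload_bytes * 8) as f64 * 2f64.log10()).ceil() as usize
}

/// Characters needed for the same payload in a (hypothetical) base32 encoding.
fn base32_chars_for(payload_bytes: usize) -> usize {
    (payload_bytes * 8 + 4) / 5
}
```

For a 256-byte (2048-bit) payload this gives 617 decimal digits costing 2057 bits in numeric mode, against 410 base32 characters costing 2255 bits in alphanumeric mode (uppercase only) or 3280 bits in byte mode.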

Note that when inputting data in "interactive mode" you have to enter an extra blank line to indicate that you've finished inputting the data for that QR code. This is to allow you to break the input up over several lines.
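
The multi-line input convention can be sketched as follows. This is an assumption-laden illustration rather than paperback's actual input code: it assumes that the end of one QR code's data is signalled by an empty line, and the function name read_code is made up for this example.

```rust
use std::io::BufRead;

/// Read one code that may be split over several lines.
/// Lines are concatenated (with surrounding whitespace trimmed) until an
/// empty line or end of input is reached. Assumed behaviour, for illustration.
fn read_code<R: BufRead>(input: &mut R) -> std::io::Result<String> {
    let mut code = String::new();
    let mut line = String::new();
    loop {
        line.clear();
        let bytes_read = input.read_line(&mut line)?;
        let piece = line.trim();
        if bytes_read == 0 || piece.is_empty() {
            break;
        }
        code.push_str(piece);
    }
    Ok(code)
}
```

Calling read_code repeatedly on the same reader would then yield one code per call, regardless of how many lines each code was broken into.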

Currently, paperback only supports "interactive" input. In the future, paperback will be able to automatically scan the data from each QR code in an image or PDF version of the documents.

Paper Choices and Storage

One of the most important things when considering using paperback is to keep in mind that the integrity of the backup is only as good as the paper you print it on. Most "cheap" copy paper contains some levels of acid (either from processing or from the lignin in wood pulp), and thus after a few years will begin to yellow and become brittle.

Archival paper is a grade of paper that is designed to last longer than ordinary copy paper, and has standardised requirements for acidity levels and so on. The National Archives of Australia have an even more stringent standard for Archival paper and will certify consumer-level archival paper if it meets their strict requirements. Though archival paper is quite a bit more expensive than copy paper, you can consider it a fairly minor cost (as most users won't need more than 50 sheets). If archival paper is too expensive, try to find alkaline or acid-free paper (you can ask your state or local library if they have any recommendations).

In addition, while using hot lamination on a piece of paper may make the document more resistant to spills and everyday damage, the lamination process can cause documents to deteriorate faster due to the material most lamination pouches are made from (not to mention that the process is fairly hard to reverse). Encapsulation is a process similar to lamination, except that the laminate is usually made of more inert materials like BoPET (Mylar) and only the edges are sealed with tape or thread (allowing the document to be removed). Archival-grade polyester sleeves are more expensive than lamination pouches, though they are not generally prohibitively expensive (you can find ~AU$1 sleeves online).

The required lifetime of a paperback backup is entirely up to the user, and so making the right price-versus-longevity tradeoff is fairly personal. However, if you would like your backups to last indefinitely, I would recommend looking at the National Archives of Australia's website which documents in quite some detail what common mistakes are made when trying to preserve paper documents.

It is recommended that you explain some of the best practices of storing backups to the people you've given shard backups to -- as they are the people who are in charge of keeping your backups safe and intact.

For even more recommendations (from archivists) about how best to produce and store paper documents, the Canadian Conservation Institute has publicly provided very detailed explanations of their best practice recommendations. Unfortunately, there aren't as many details given about what a producer of a document should do.

License

paperback is licensed under the terms of the GNU GPLv3+.

paperback: resilient paper backups for the very paranoid
Copyright (C) 2018-2022 Aleksa Sarai <[email protected]>

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.

paperback's People

Contributors

cyphar, dependabot[bot], tsoutsman

paperback's Issues

shamir: optimise expansion by using interpolation

Recovering the full polynomial is a neat party trick but is not necessary for quorum expansion, because we can just evaluate the polynomial at the X values we'd like. However, in order to do this efficiently we need to make use of the barycentric form of the Lagrange polynomials so that we can evaluate multiple X values more efficiently.
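
A minimal sketch of the barycentric approach, assuming a toy prime field rather than the field paperback actually uses (the modulus P and all function names are hypothetical). The weights depend only on the known X values, so they are computed once and reused for every new X value that needs evaluating.

```rust
/// Toy prime modulus (2^31 - 1). An assumption for this sketch only.
const P: u64 = 2_147_483_647;

fn sub(a: u64, b: u64) -> u64 {
    (a + P - b) % P
}

fn mul(a: u64, b: u64) -> u64 {
    a * b % P
}

/// Modular inverse via Fermat's little theorem: a^(P-2) mod P.
fn inv(a: u64) -> u64 {
    let (mut base, mut exp, mut acc) = (a % P, P - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// w_j = 1 / prod_{m != j} (x_j - x_m). Computed once per set of shares.
fn barycentric_weights(xs: &[u64]) -> Vec<u64> {
    xs.iter()
        .enumerate()
        .map(|(j, &xj)| {
            let prod = xs
                .iter()
                .enumerate()
                .filter(|&(m, _)| m != j)
                .fold(1u64, |acc, (_, &xm)| mul(acc, sub(xj, xm)));
            inv(prod)
        })
        .collect()
}

/// Evaluate the interpolating polynomial at x:
/// L(x) = l(x) * sum_j w_j * y_j / (x - x_j), with l(x) = prod_j (x - x_j).
fn evaluate(xs: &[u64], ys: &[u64], weights: &[u64], x: u64) -> u64 {
    // If x is one of the known points, the formula would divide by zero.
    if let Some(j) = xs.iter().position(|&xj| xj == x) {
        return ys[j];
    }
    let ell = xs.iter().fold(1u64, |acc, &xj| mul(acc, sub(x, xj)));
    let sum = (0..xs.len()).fold(0u64, |acc, j| {
        (acc + mul(mul(weights[j], ys[j]), inv(sub(x, xs[j])))) % P
    });
    mul(ell, sum)
}
```

Computing the weights costs O(k^2) field operations, after which each additional X value needs only O(k) multiplications and inversions, without ever reconstructing the polynomial's coefficients.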

Print placeholders instead of codewords by default

With printers these days you never know what they store or where the documents they print go. Thus it would be insecure to print the whole shard-pdf (or, at least all shards) on a printer as the shard content could be recovered from it. Since there is already a shard password, it would make sense to not print it by default and have the user write it down manually. While this requires quite a bit of manual labor it is the only way I could figure out how to print the shards securely when designing a similar system.

How to handle multi-page documents

At the moment paperback has an artificial limit of 9 QR codes as part of the main document. The reason for this is that once you go above 9 QR codes, it will be necessary to create additional pages which may result in cases where someone has lost one of the pages and the data is basically irrecoverable. We can't really make the QR codes dynamically smaller because that'll affect the ability of QR readers to be able to reliably read the encoded data.

pdf: allow "re-printing" a new PDF from old data

Given a single document it would be nice to be able to generate a new PDF from the data. This would solve two problems:

  1. Minor stylistic changes in the PDF that a user would like to have (such as improving the descriptions of sections). This would be at least somewhat useful for early adopters of paperback that want to be able to update some of the documents (if they choose) without needing to form a quorum and create a new backup needlessly.
  2. To salvage a document which has degraded significantly but is still readable -- this would allow you to potentially avoid needing to come to a quorum to issue new shards if the shard holder makes use of this feature when they notice the shard is getting a bit long in the tooth.

This doesn't require any fancy cryptography since it's just a repackaging of the existing data in the document.

Switch to azul-text-layout

At the moment all of our text layout code is mostly hard-coded or calculated using somewhat loose approximations. In order to have nicely formatted text (and properly wrapped codewords) we need to switch to azul-text-layout for laying out the text and then outputting the laid out text to the PDF.

This would probably require some more complicated internal code to make the PDF generation less painful (namely having our own justification code for vertical justification, based on the azul-text-layout-computed heights and widths). There were some bug reports in azul-text-layout last year indicating it seems to struggle with line breaking, but hopefully it works okay now...

Release it as precompiled binary on github

Thanks for your work on this interesting tool.

I haven't tried it out yet, but I'm eager to.
Would it be possible to release it as a precompiled Rust binary on GitHub? At least compiled for x86_64?

It would make it much easier for people who don't have (or don't want to download) the 500 MB Rust distribution just to quickly try it out.

This would also make it easier to provide packages for other Linux distros.

Perhaps you could also have a look at how other Rust CLI tools do it, for example:

skim, https://github.com/lotabout/skim/releases
fd, https://github.com/sharkdp/fd

I assume you are aware of GitHub Actions?
Have a look at how ripgrep (https://github.com/BurntSushi/ripgrep) does it: https://github.com/BurntSushi/ripgrep/tree/master/.github/workflows

Thanks in advance.

spurious test failures in "safety" checking

I've run into two cases of failures in safety-checking code, leading me to wonder whether the logic is sound.

  • The sanity check in Dealer::shard that checks whether the evaluated polynomial equals x failed once with thread '<unnamed>' panicked at 'assertion failed: self.threshold == 1 || y != poly.constant()', pkg/paperback-core/src/shamir/dealer.rs:111:17. I guess it is possible for the polynomial to equal the secret at a random x value, but we still should avoid allowing that even if an attacker couldn't know whether that was the case. Maybe we should handle it better? Dealer::next_shard should probably re-try at a different x value rather than crashing the program.
  • In CI, the test for limited_recover_fail failed because it seems one of the polynomials returned the same value for a random x value. thread 'shamir::dealer::test::limited_recover_fail' panicked at '[quickcheck] TEST FAILED. Arguments: (9, [30], [GfElem(4011157905), GfElem(1345881083)])', /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/quickcheck-1.0.3/src/tester.rs:165:28

Maybe we need to reconsider if these properties are actually guaranteed or just very unlikely to happen (probably the latter) and how we should protect against them without using asserts.
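
One way the "re-try at a different x value" idea could look is sketched below. This is a hypothetical illustration, not the real Dealer code: the function names, the explicit modulus p (assumed to be a prime below 2^32 so that the u64 arithmetic cannot overflow), and the caller-supplied list of candidate x values are all assumptions of this example.

```rust
/// Evaluate the polynomial at x modulo p using Horner's rule.
/// coeffs[0] is the secret. Assumes p < 2^32 and every coefficient < p.
fn eval_poly(coeffs: &[u64], x: u64, p: u64) -> u64 {
    coeffs.iter().rev().fold(0u64, |acc, &c| (acc * x + c) % p)
}

/// Return the first candidate x that yields an acceptable shard, instead of
/// asserting and panicking on an unlucky x. In real use the candidates would
/// come from a cryptographically secure random number generator.
fn next_shard(
    coeffs: &[u64],
    p: u64,
    candidates: impl IntoIterator<Item = u64>,
) -> Option<(u64, u64)> {
    let threshold = coeffs.len();
    for candidate in candidates {
        let x = candidate % p;
        if x == 0 {
            continue; // evaluating at x = 0 would hand out the secret itself
        }
        let y = eval_poly(coeffs, x, p);
        if threshold > 1 && y == coeffs[0] {
            continue; // y happens to equal the secret: try another x
        }
        return Some((x, y));
    }
    None
}
```

Returning an Option rather than asserting lets the caller decide what to do in the (very unlikely) case that every candidate is rejected.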

Questions regarding paperback

  1. Is this the same as OllyBdg's original PaperBack?
  2. What barcode does it use to store data?
  3. Does it store everything over a single barcode, or is it multiple?
  4. Can it store data over multiple pages for bigger data?

printable minimal paperback algorithm for long-term backups

If someone is storing paper archives for decades, they may not have access to the paperback software when they are trying to recover the data (internet issues, etc)

  1. Would it be feasible to include manual decoding steps as an optional print out (some combination of command line tools?)
  2. Would it be possible to encode a small decoding program as part of the archive?

pdf: qr: should we add tags to every QR code?

At the moment we have a type byte in the split main document QR codes. I originally planned to have such a tag in every QR code, which would allow us to:

  • Detect if someone scanned the wrong code and warn them appropriately (rather than producing random other errors).
  • Make auto-scanning of the PDFs easier in the future (we can scan all QR codes on the page then use the tags to figure out which code is which -- which would allow us to handle cases where the QR code scanner returns the codes in a different order to the expected order).

The only downside of this approach (aside from the minor increase in data payload) is that the document ID is no longer easily defined as being the last 8 characters of the checksum (because adding a prefix with the type information will offset all of the later bytes, changing the zbase32 representation). We can work around this but it would mean that our document IDs need to embed this PDF-specific information, which seems a bit ugly. This is the main reason I haven't implemented this yet.

Maybe we don't need tags for the checksums (it's obvious given the document we scanned)?
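
A tagged payload of the kind discussed above might be sketched like this. The tag values, the QrKind variants, and the function names are all hypothetical and do not reflect paperback's actual wire format; the point is only to show how a leading type byte lets a scanner reject the wrong kind of code with a clear message.

```rust
/// Hypothetical QR code kinds with made-up tag bytes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum QrKind {
    MainDocumentPart = 0x01,
    MainDocumentChecksum = 0x02,
    KeyShard = 0x03,
}

impl QrKind {
    fn from_tag(tag: u8) -> Option<QrKind> {
        match tag {
            0x01 => Some(QrKind::MainDocumentPart),
            0x02 => Some(QrKind::MainDocumentChecksum),
            0x03 => Some(QrKind::KeyShard),
            _ => None,
        }
    }
}

/// Prefix the payload with its one-byte type tag.
fn tag_payload(kind: QrKind, data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len() + 1);
    out.push(kind as u8);
    out.extend_from_slice(data);
    out
}

/// Check the tag of a scanned payload and return the data that follows it.
fn expect_kind(expected: QrKind, scanned: &[u8]) -> Result<&[u8], String> {
    let (&tag, rest) = scanned
        .split_first()
        .ok_or_else(|| "empty QR payload".to_string())?;
    match QrKind::from_tag(tag) {
        Some(kind) if kind == expected => Ok(rest),
        Some(kind) => Err(format!("expected {:?} but scanned {:?}", expected, kind)),
        None => Err(format!("unknown QR tag byte {:#04x}", tag)),
    }
}
```

The same tags would also let an automatic scanner sort the codes on a page by kind, independent of the order in which the QR reader returns them.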

Dumb proofing

Hi,

I was evaluating paperback for future use when I ran into some issues while trying to recover a document (a private RSA SSH key, whatever).

The document is spread over 3 QR codes. Each time I tried to recover it, I was greeted with an Error: failed to parse data after inputting the main document. I tried to input them differently, one by one, or by concatenating everything first in a text file. It worked with a "small" file, so I'm still convinced that I did something wrong on my part, which led me to the following thought.

Would it be nice to attach to the main document a notice page for dummies, non-tech-savvy people, or people not even born yet (it is supposed to be long-term storage after all, and GitHub may not even exist when someone tries to recover)? Also, it could be a good idea to have some metadata in each QR code to guide the user through the process (for things like out-of-order input or partial input), or even to add parity QR codes in case the main document is damaged.

Best regards,

PS: as a dev, I know it's the worst kind of ticket, but I had to do it ;)

Building paperback results only in "Hello world!"

I'm unsure if this is still a work-in-progress, but when I read the design document, it seemed pretty finished.
Am I doing something wrong, or does building it result in an executable that just says "hello world"?

Can we compress the main document contents?

Compress-then-encrypt is known to be unsafe for interactive sessions such as web browsers since 2012 (thanks to CRIME and BREACH). However, since paperback does not operate as an interactive session system (and we have pretty strong density requirements since we need to fit QR codes on paper), maybe it would be safe to compress the main document contents?

It would be nice to get a cryptographer's opinion on this...

Add file size limit of a sheet of paper to README.md

Just thought it would be a really common thing for someone to be skimming through the readme to find how much data they could store on one sheet of paper in order to judge if they would like to use paperback or not. Statements like "as most users won't need more than 50 sheets" aren't nearly as helpful as for example, "you can store ~12kb of data on one sheet."

"Super Shamir" support?

Trezor finally has developed SSS support (though it lacks some of the expansion and strong verification features we have -- their design where they use interpolation as part of their polynomial generation also looks a little strange). They've also implemented an interesting mode called "super shamir" where after creating key shards for your bitcoin key, you then further split the shards into sub-shards.

I suspect this was done in order to work around their 16-share limitation by making reconstruction more performant. But the interesting aspect of this feature is it allows you to create segregated groups of shard holders that may completely betray you and still would not gain any information.

However, I'm not entirely convinced that this is solving a real threat model: if you can confidently segregate N people into A groups of size B, where you will never have inter-group betrayals above the threshold, you could create the same number of shares but use a B*2 or higher threshold (the bound is probably lower depending on the thresholds of each group, but this is the worst case).

Can not paste more than 1023 chars into recovery input

Problem description

When pasting more than 1023 chars into the recovery input, the command stalls after 1024 chars and does not accept any more key presses.

How to reproduce

# Generate dummy ssh key 
$ ssh-keygen -t rsa -b 4096 -f /tmp/id_rsa -N ""

# Create a paperback backup. From the output copy the first line below "Main Document:" 
$ target/release/paperback backup --quorum-size 1 --shards 1 /tmp/id_rsa

# delete the dummy ssh key
$ rm /tmp/id_rsa

# Execute the recovery command and paste the copied line into the recovery command prompt
$ target/release/paperback recover --interactive /tmp/id_rsa
Enter a main document code (unknown number of codes remaining): 9253239620054760480773498079880048392085023609742842040441216447577496617446522511370875848613996704392955370327328922021505813624459430649615647004836257715583803915725820234492619487563511735058047789402755199339859092601746721608432634524157899952992320457872099698050063905737816078540584111814371613885832431003302577055324833129473947363860462659423259525211314871908819730229296258076477521533657856983503245554846583692028058674287473573834129460088792647819166374931328080799339353723362304667836773508987216392810973673734903634617168285294218709368415340502361941598376136512904673448181173749369421372015801785805191848425038073416836444570342722255581674236030369275690062713106590311335700342677911770602043028693566314131764350369055707158650741524998825631546473357742808615494982181991857662250928740189539612597602590152772154238218901242586014158935194393650542887393610351381599452557668098113435215653343861277404681664851781973715584822873633221126356012335405179879816615450526302967659493552671496472^C

# kill the command with ctrl+c

Workaround

Use pbpaste | fold -w 1023 to split the Main Document line into lines of (max) 1023 chars length and paste the output into the recovery command:

$ target/release/paperback recover --interactive /tmp/id_rsa
Enter a main document code (unknown number of codes remaining): 925323962005476048077349807988004839208502360974284204044121644757749661744652251137087584861399670439295537032732892202150581362445943064961564700483625771558380391572582023449261948756351173505804778940275519933985909260174672160843263452415789995299232045787209969805006390573781607854058411181437161388583243100330257705532483312947394736386046265942325952521131487190881973022929625807647752153365785698350324555484658369202805867428747357383412946008879264781916637493132808079933935372336230466783677350898721639281097367373490363461716828529421870936841534050236194159837613651290467344818117374936942137201580178580519184842503807341683644457034272225558167423603036927569006271310659031133570034267791177060204302869356631413176435036905570715865074152499882563154647335774280861549498218199185766225092874018953961259760259015277215423821890124258601415893519439365054288739361035138159945255766809811343521565334386127740468166485178197371558482287363322112635601233540517987981661545052630296765949355267149647
226410538752391687911178318761080263108873022289284775587263668520331196972717697729788699398000173167806414614066258077067583088872821697226010398021823269373626541263340690902799891290195493231182505245988875228760735053610597622081372604957530939286812127279080393969338927405945210253668028668802396229106118040187749421774338451359750232325256546621164852961903428689581143751595660495424206222631944811791136556562665493854032676701329609059487180961119135380038394503182348329712756562548915305529935105716937911622233399595667033090052582507631333501183018417557726767882193277875868037920955394012806979455562452417268489519269164171473060246786822153865734340408197037615747796220862512927762776884634853344962301319327658743875225210860203381248682319300437723223314079610373260298521740246726570523473560457579068620266950310955013429105611246634681922824973450998433633929358510851168741554990897898123776323662065302374765725725752498255316418580610280332212084157119108874124773153735166470779108917882795390
09196407271449258621087535011601344373819201070644111198228597890282749477039387676183312395301367402243850223592732949953864817270418614449223792966

Enter a main document code (3 codes remaining):
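
For platforms without pbpaste or fold, the same splitting can be sketched in a few lines of Rust. This is only an illustration of the workaround: the function name fold_width is made up, and it assumes the code consists solely of ASCII characters (as the decimal payload above does).

```rust
/// Split an ASCII string into chunks of at most `width` characters,
/// mirroring what `fold -w WIDTH` does for a single long line.
fn fold_width(code: &str, width: usize) -> Vec<&str> {
    assert!(width > 0, "width must be positive");
    assert!(code.is_ascii(), "this sketch only handles ASCII input");
    code.as_bytes()
        .chunks(width)
        .map(|chunk| std::str::from_utf8(chunk).expect("ASCII is valid UTF-8"))
        .collect()
}
```

Each returned chunk can then be pasted as its own line, with the pieces joined back together by paperback once the input for that code is finished.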

Build Env

paperback: commit 0bd9e493b1220dc28b9241946b80d53bd7d38cfe (main at the time of writing)
OS: MacOS Ventura 13.1  (Apple Silicon M1) 
Cargo: 1.66.0 (d65d197ad 2022-11-15)
