I was tempted to open this as a bug, but that would be provocative: age does not make

sensitive payloads can be identified by their size alone

Padmé padding for age? about age HOT 6 CLOSED

filosottile commented on July 19, 2024

Padmé padding for age?

from age.

Comments (6)

cyb3rz3us commented on July 19, 2024

sensitive payloads can be identified by their size alone

Help me understand how this works for many of the targets that would be encrypted by 'age'?

For example, let's say I have a TAR-ball composed of, say, 5000 pictures that comes in at around 25GB when encrypted. I then have another TAR-ball of documents that might come in at around 12GB when encrypted. And finally, a third TAR-ball of videos, pics, and documents at 42GB when encrypted. How does the size of the payload help identify the payload?

I'm sure there's something I'm not understanding so please receive my question as only seeking to understand...not push-back...


colmmacc commented on July 19, 2024

Well let's say that the tarball is 'illegal' material that you have an interest in denying possession of. There are awful and cruel examples of this, but there are also righteous and good examples of this. For example the encrypted file might be the 'Bible' for a religion that an oppressive state has made illegal to possess, or it might be a tarball of material that a (corporate or state) whistle-blower leaked to a news-organization.

Without any padding, an exact byte-match of the illicit material is extremely suggestive circumstantial evidence of possession. With well-crafted padding, there is a greater range of potential content that encrypts to that size, and the match is not as definitive.
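To make the "greater range of potential content" point concrete, here is a minimal sketch of the Padmé length rule as I understand it from the PURBs paper that introduced it. It only computes the padded size; the function name `padme_length` is hypothetical, and nothing here reflects how age behaves today or how it would implement padding if it adopted it.

```python
def padme_length(length: int) -> int:
    """Return the Padmé-padded size for a payload of `length` bytes.

    Sketch only: the name and the handling of tiny inputs are assumptions.
    """
    if length < 2:
        return length  # too small for the formula below; left unpadded here
    e = length.bit_length() - 1   # floor(log2(length))
    s = e.bit_length()            # floor(log2(e)) + 1
    low_bits = e - s              # number of low-order bits forced to zero
    mask = (1 << low_bits) - 1
    # Round up to the next multiple of 2**low_bits.
    return (length + mask) & ~mask


if __name__ == "__main__":
    for n in (9, 1000, 1001, 1025):
        print(n, "->", padme_length(n))
```

Under this rule every length from 993 to 1024 bytes pads to 1024, so an observer who sees a 1024-byte payload cannot tell which of those 32 lengths the original had. The bucket width grows with the payload size, which is what keeps the relative overhead small.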


colmmacc commented on July 19, 2024

I should add too that padding can be a useful mitigation against another kind of attack: shared compression tables.

Suppose that the tarball is a gzipped backup of your email inbox and that you make that backup every day and send it to a Cloud service for storage. The size of the backup can be seen by anyone who can observe the upload.

Now suppose I send you an email every day, and also observe the size of the backup. Over time, by trying different strings in my email, I can statistically profile where there is overlap between the strings in my email and other strings in your inbox (other emails), or at least the frequency of them. This is because the compression table entries will be shared, and so when there's overlap the output size is smaller than I would otherwise expect it to be. Exposing exact payload sizes makes these kinds of shared-compression-table attacks very practical.

For example: suppose I send you a 200-byte email that contains the string "TOP SECRET DOCUMENTS" over the weekend, when you're not getting much other email. If the size of your backup only goes up by 12 bytes, I can guess that the string "TOP SECRET DOCUMENTS" appears elsewhere in your inbox.
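The effect is easy to reproduce with any DEFLATE-style compressor. The sketch below uses Python's `zlib` as a stand-in for the gzipped backup; the inbox contents, the probe strings, and the helper names `backup_size` and `growth` are all made up for illustration. Encryption is left out because it does not hide the compressed length.

```python
import zlib


def backup_size(inbox: bytes) -> int:
    """Size of the compressed backup, which is what an observer of the upload sees."""
    return len(zlib.compress(inbox, 9))


def growth(inbox: bytes, email: bytes) -> int:
    """How much the backup grows when the attacker's email lands in the inbox."""
    return backup_size(inbox + email) - backup_size(inbox)


# Hypothetical victim inbox; the second message contains the phrase being probed for.
INBOX = (
    b"Subject: lunch\nSee you at noon.\n"
    b"Subject: re: TOP SECRET DOCUMENTS\nAttached as requested.\n"
)

PROBE_HIT = b"TOP SECRET DOCUMENTS"   # already present in the inbox
PROBE_MISS = b"QVX JZKWPB MFGHYULCR"  # same length, shares no substring with the inbox

if __name__ == "__main__":
    print("growth with matching probe:    ", growth(INBOX, PROBE_HIT))
    print("growth with non-matching probe:", growth(INBOX, PROBE_MISS))
```

The matching probe compresses to a short back-reference, so the backup grows by noticeably fewer bytes than it does for the non-matching probe of the same length. That difference is the signal the attacker is measuring.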

Padding doesn't prevent these attacks; the attacker can pad their own email until the backup crosses a padding boundary. But it helps make these attacks much more costly and slow, and only one bit of information is leaked each time a padding boundary is crossed, so it's just not nearly as practical.
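A rough sketch of why this is slow for the attacker, under simplifying assumptions: sizes are rounded up to a fixed block (the 256-byte block and the names `padded_size` and `probes_to_boundary` are hypothetical, not anything age does), each probe is one observed backup, and the attacker's filler adds exactly one byte to the compressed size per step.

```python
def padded_size(n: int, block: int = 256) -> int:
    """Round n up to the next multiple of `block` (assumed padding scheme)."""
    return -(-n // block) * block


def probes_to_boundary(true_size: int, block: int = 256) -> int:
    """Filler bytes the attacker must add before the observed size changes.

    Each loop iteration stands for one observed upload, and each observation
    yields a single bit: "the size jumped" or "it did not".
    """
    base = padded_size(true_size, block)
    extra = 0
    while padded_size(true_size + extra, block) == base:
        extra += 1
    return extra


if __name__ == "__main__":
    print(padded_size(1000), probes_to_boundary(1000))
```

Without padding, a single observation reveals the exact size. With it, the attacker in this model needs a series of observations just to locate one boundary, and a smarter search only reduces that cost rather than removing it.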


cyb3rz3us commented on July 19, 2024

Taking the second set of comments first, as you mentioned, padding does nothing to prevent that type of attack. Also, what you outline is an attack far beyond the scope of 'age' or really any tool used for only file encryption.

Now, the first set of comments...I'm a bit dubious about entities being able to reliably determine that someone has an encrypted form of a given work from only the byte count. And even if they might be able to do this, thinking just about obscurity for a moment, I think we first need to ask whether providing that type of obscurity is within the scope of 'age' itself. From my reading of FiloSottile's blog and the front page of this project, it seems to me that the goal is to provide a very easy-to-use and lightweight file encrypt/decrypt tool. Perhaps I'm wrong here but that's my interpretation.


colmmacc commented on July 19, 2024

In context that's absolutely not what I wrote, but I should have been more careful not to say 'prevent' so casually. For compression attacks, sufficient padding does increase the costs for attackers, well beyond practicality in most cases. One bit per padding length is very very low bandwidth with which to try and work out the compression table collision.

Sounds like in your threat model you don't care about an attacker being able to identify the plaintext from a set of known plaintexts. That's ok, but it's a pretty non-standard assumption.


cyb3rz3us commented on July 19, 2024

I don't see this as particularly "my threat model"...I see it as the likely threat models that are relevant to the typical user of 'age'. Said differently, I don't see the typical 'age' user as one who is encrypting a lot of widely known and/or disseminated plain-texts.

The issue you describe is more of a privacy concern as opposed to an encryption concern and again, based on my interpretation of FiloSottile's writings re: PGP and the reason why 'age' was developed in the first place, I view 'age' as an encryption tool...nothing more. That's not to say it can't be more but then that is really the dev's decision...

