apoelstra commented on July 22, 2024

Heh, your comment inspired me to actually read Dual-EC-DRBG. It's quite elegant, aside from the obvious backdoor and the fact that it's obscenely slow. My read of it is that the backdoor allows an attacker who has seen 30 (aligned) bytes of output to learn the internal state of the RNG, with a little bit (16 bits) of grinding. Once they know the internal state they can predict every future random number. However, they cannot learn previous states of the RNG, and in particular there is no way for them to learn the seed, assuming the discrete log problem is hard.

So for this particular use case, Dual-EC-DRBG would actually be perfectly fine :P.

But regarding scrypt parameters, let's back up a bit. I spoke on the phone with @roconnor-blockstream and we briefly touched on this. One observation he made is that any considerations applied to the identifier also apply to the shares (we should assume our attacker has at least one share; arguably we should assume they have k-1 shares). So I think we should do the same thing in both cases, and in this case the 20-bit ID is the least of our problems next to the key-length-many-bits shares.

Our attack scenario is: an attacker has access to the 20-bit ID, a share, multiple shares, whatever. And they are trying to access the actual master secret. The scenarios are:

  • If the data was generated maliciously then the seed could just be leaked outright.
  • If the data was generated correctly but with a fast RNG, then an attacker can grind out the secret far faster (by a large constant factor) by re-generating the data than by deriving addresses.
  • If the data is generated by a slow RNG/KDF, then an attacker with the data has no advantage over one without.

Having auditability forces us into the middle choice. If we don't worry about auditability then we'll be in either the "all good, no attacker advantage to having share data" case or the "all bad, the seed was totally leaked" case. And we can bias toward "all good" with the usual techniques. Open source software, independent audits, etc.

Given this, my feeling is that if somebody cares enough to audit their random data by hand, they care enough to just generate all their data by hand (which would actually be much faster than auditing, no matter what RNG we chose). And if they don't care enough, they'd be better off if we used scrypt everywhere rather than chacha everywhere.

BenWestgate commented on July 22, 2024

Auditing a wallet always requires a computer to verify the public keys. So hand-auditable share sets are not very important, considering that, as you say, the paranoid could generate the share set by hand faster than they could audit it.

That said, it makes sense for a given codex32 secret to always produce the same shares and share index order on electronics, but it is fine to require electronics to compute the shares with an expensive KDF.

I initially used scrypt because it is built into hashlib, but the current state of the art for memory-hard KDFs is Argon2id. Its main advantage is being able to set the time cost and memory cost independently, whereas with scrypt the time cost depends on the memory cost.
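
A minimal sketch of that difference, assuming the third-party argon2-cffi package (all names and parameter values here are illustrative, not from the BIP):

from argon2.low_level import Type, hash_secret_raw

derived = hash_secret_raw(
    secret=b"example user entropy",  # hypothetical inputs
    salt=b"example app entropy",     # must be at least 8 bytes
    time_cost=4,                     # passes over memory, tuned independently...
    memory_cost=64 * 1024,           # ...of the memory used, in KiB (64 MiB here)
    parallelism=1,
    hash_len=16,
    type=Type.ID,                    # Argon2id
)

With scrypt, by contrast, raising n to increase memory use necessarily increases running time as well.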

apoelstra commented on July 22, 2024

Unfortunately it looks like no, in general, this sort of parity check is not preserved by interpolation. Damn.

apoelstra commented on July 22, 2024

To improve this, I propose an update to the BIP where, during hand creation, dice are rolled to select each new share index.

This would actually complicate things a fair bit -- generating shares in order allows us to have a fixed set of tables that users can use to derive additional shares. If you were generating shares in random order you would need to do something akin to the recovery process -- and for each share index we'd need a "recovery wheel" for that index.

I wouldn't mind releasing a set of "recovery wheels" and suggesting people do this as an additional privacy measure (we've wanted to do this anyway as a "recover the remaining shares from an incomplete set" process), but I think it's too onerous to suggest doing as the default. Especially as the k/n parameters are not something that we think of as secret data. k is even encoded in the share header.

But I agree that this generation mode should be the default for electronic implementations.

Below is an example of how I generated the share payloads in Python:

Because you're starting with the master seed, I don't think there's any value in hardening this. If the seed itself is of low-quality randomness then the user is screwed anyway. So I would actually suggest going in the other direction, toward chacha20 rather than scrypt, in the hope of coming up with something computationally simple enough that it could be (theoretically) audited by hand.

Similarly, with the share indices I don't even think we need cryptographic randomness -- the goal is simply to break the implication "share G implies that shares B,C,D,E,F exist". Though of course, if you are using the master seed as entropy, you want to run that through a cryptographic RNG to make sure that no information leaks into the indices themselves. I also don't think the share indices need to be auditable, though there's no reason for them not to be so we might as well.

BTW, I've been holding off on PR'ing to the BIP repo because it seems like you (and us) are still iterating on this, but I definitely do intend to incorporate your suggestions once we've nailed everything down. Thanks a ton for digging into this!

apoelstra commented on July 22, 2024

@BenWestgate also, if you are on IRC, #volvelle-wizards is our discussion channel for this stuff.

BenWestgate commented on July 22, 2024

Thank you for your response. I will make changes to my implementation to bring it closer to a standard recommendation.

I will get rid of the KDF and use ChaCha20 for filling the secret payload padding, the share payloads, and shuffling the indices, so auditing electronic codex32 share sets by hand is possible.

Share indices MUST be derived from cryptographically secure randomness if not fixed, and we agree they shouldn't be fixed by default in electronic implementations, for the small "n" privacy gain.

By my math, if the user made 31 shares and numbered them 1 to 31, the ordering can leak a surprising amount of data: 87 bits.
So even with encrypted payloads, the seed would be effectively leaked by a malicious non-deterministic implementation with enough indices known. Although maybe 2nd level SSS encryption should encrypt the plaintext share's threshold, identifier, and indices for reasons like this. #16

Without encryption the max threshold is 9, so the maximum data leak we're concerned with is from 8 indices:
log2(31*30*29*28*27*26*25*24) ≈ 39 bits leaked

Leaving only 89 bits of security for the master_seed, which is probably guessable by TLAs in our lifetime, if not already.

Regarding the algorithm for the share index selection, using python random's random.seed() and then random.sample() is probably not hand auditable due to the Mersenne Twister.

Instead I could get at least 87 bits from ChaCha20, convert to bech32, and take the first unique character to be the next share index.

But how to know how much data to produce from the hash so I don't run out? Is there a better way to deterministically "shuffle" a list by hand computations?

  1. Use ChaCha20 with a secret key (master_seed) to generate a keystream.
  2. Take the first byte of the keystream (or any desired number of bits, depending on the number of elements in the list) and use it as an index to select the first symbol in the shuffled list, unless the value is "s" or a previously selected character.
  3. Use the next byte of the keystream (or any desired number of bits) as an index to select the next symbol for the shuffled list.
  4. Increment the counter for ChaCha20 by 1 if out of bytes.
  5. Repeat steps 2 to 4 until all symbols are selected, creating the shuffled list.

Perhaps take the ChaCha20 hash mod 31, mod 30, ... etc.
What is the right balance between the length of secure hash output needed and the math operations to derive each index that's most efficient for hand auditing and also doesn't introduce modulo bias?
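
One standard way to avoid modulo bias is rejection sampling: discard keystream bytes that fall outside the largest multiple of the range. A minimal sketch (not from the BIP; since the list never exceeds 31 symbols, one keystream byte per draw suffices):

def sample_uniform(keystream, n):
    """Return a uniform integer in [0, n), consuming bytes from an iterator
    over the keystream and rejecting values that would introduce bias."""
    limit = 256 - (256 % n)  # largest multiple of n that fits in one byte
    for byte in keystream:
        if byte < limit:
            return byte % n
    raise RuntimeError("keystream exhausted")

The acceptance probability limit/256 stays above 0.9 for every n up to 31, so on average barely more than one byte is consumed per draw, which keeps a hand audit tractable.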

The share payloads are much easier in this regard as they allow repeating characters and use the full 5 bits per character so the chacha20 output stream can be directly used.

I joined the IRC channel as benwestgate.

BenWestgate commented on July 22, 2024

Because you're starting with the master seed, I don't think there's any value in hardening this. If the seed itself is of low-quality randomness then the user is screwed anyway. So I would actually suggest going in the other direction, toward chacha20 rather than scrypt, in the hope of coming up with something computationally simple enough that it could be (theoretically) audited by hand.

from Cryptodome.Cipher import ChaCha20

# pad the master seed up to the 32-byte ChaCha20 key length
padding_length = 32 - len(master_seed)
key = master_seed + bytes(padding_length)

# 8-byte nonce: "ms" + threshold + identifier + "s", e.g. b"ms2tests"
nonce = bytes(hrp + k + identifier + "s", 'utf-8')

cipher = ChaCha20.new(key=key, nonce=nonce)
cipher_stream = cipher.encrypt(bytes(8))

Does this look good as the RNG stream for setting the padding bits in the last symbol of the codex32 secret payload, the share indices and k-1 share payloads?

The nonce can be incremented by 1 whenever a 512-bit block is exhausted.

Do you prefer zero-padding the decoded master_seed bytes as shown above, or directly utf-8 encoding the 32 bech32 characters after "s", i.e. key = bytes(codex32_secret[9:41], 'utf-8')?

Unsure which is easier by hand.

Sad to see double_sha256 go as the padding, as 256-bit master_seeds were a direct truncation of the WIF serialization. But that novelty is a liability for auditing by hand.

Likewise, it feels wrong to set the secret payload padding bits with an RNG stream that depends on the header, causing the payload for identical master_seeds to vary if the header changes.

Yet it seems smart to conserve labor for hand auditors by avoiding computing a fixed-nonce ChaCha20 block only to set 2-4 padding bits and then discarding it. These bits should be the first data extracted from the stream, as a fast first check for malicious tampering. The next should be the share payloads, as these are also recoverable, error-detecting, auditable data; the last should be the indices, since the order the software provided them in is meant to be lost during recovery, so they cannot detect tampering.

apoelstra commented on July 22, 2024

Share indices MUST be derived from cryptographically secure randomness if not fixed, and we agree they shouldn't be fixed by default in electronic implementations, for the small "n" privacy gain.

By my math, if the user made 31 shares and numbered them 1 to 31, the ordering can leak a surprising amount of data: 87 bits.

Heh, yeah, this seems reasonable. It seems pretty implausible that the user would number their shares 1 to 31 like that, but why have a leak when we don't need one. If users want to do this by hand they can use dice rolls, which are cryptographically secure.

Regarding the algorithm for the share index selection, using python random's random.seed() and then random.sample() is probably not hand auditable due to the Mersenne Twister.

Hmm, true. And I see now the value of auditability, since in theory each share index could be leaking 5 bits (to an attacker who knows the order the shares were generated in, anyway).

But how to know how much data to produce from the hash so I don't run out? Is there a better way to deterministically "shuffle" a list by hand computations?

Knuth's "Algorithm P" can do this. (The question is about finding a uniform subset, but conveniently this algorithm finds a subset in a uniform order. If you want a sorted order, which is just as good from a security POV, you can use "Algorithm S" or just sort manually.) Howevery, algorithm P assumes you can uniformly sample from the range [i, 31] for each i, which is hard to do without a theoretically indefinite stream of randomness. Though if you use rejection sampling, the probabilities are at least much easier to compute.

Let me think about this.

Does this look good as the RNG stream for setting the padding bits in the last symbol of the codex32 secret payload, the share indices and k-1 share payloads?

The RNG initialization looks good to me, although

  • I guess identifier comes out of a different RNG?
  • I also don't know what encrypt() does; I assume it xors the input with the random stream.
  • It's a cool idea to set the nonce uniquely per seed; I'd have probably just left it at all-zeroes...see below comments.

Also, I wouldn't bother setting the padding bits randomly. You can just leave them as 0. They aren't used and do not feed into BIP32 derivations.

The nonce can be incremented by 1 whenever a 512-bit block is exhausted.

Yep, though this is a tiny bit subtle. If you were to produce 256 blocks then you'd be incrementing the identifier part of the nonce, and would produce the same output as you would've if you'd simply started with that identifier.

But since we're only producing 31 shares, each of which only takes half a block, this is fine.

Do you prefer zero-padding the decoded master_seed bytes as shown above, or directly utf-8 encoding the 32 bech32 characters after "s", i.e. key = bytes(codex32_secret[9:41], 'utf-8')?

Unsure which is easier by hand.

Directly using the bytes is definitely easier, and I think is also more elegant.

Sad to see double_sha256 go as the padding, as 256-bit master_seeds were a direct truncation of the WIF serialization. But that novelty is a liability for auditing by hand.

Yeah, that is a cool property. Though TBH I don't think users frequently see WIF serialization of xprivs so it's something of a curiosity.

Likewise, it feels wrong to set the secret payload padding bits with an RNG stream that depends on the header, causing the payload for identical master_seeds to vary if the header changes.

Yeah, hmm. There's two ways you could think of this:

  • Changing the threshold or identifier is a "reset the world" change and we should put this stuff into the RNG. This prevents users from re-sharing the same secret in different contexts while producing the same shares, which is good because the different share sets would interact algebraically in possibly surprising ways.
  • Changing the threshold or identifier is a "relabelling" which we ought to support. I actually considered producing a worksheet for this because in the early days I found myself relabelling stuff a lot (to get encrypted shares with distinct but human-readable headers).

The threshold value should definitely go into the RNG. Under no circumstances should you "relabel" the threshold, unless you are doing something dangerous and way off-script, in which case the spec doesn't matter anyway. After thinking about it for a bit, I also think we should do this with the identifier. It creates a nice workflow where "if you want to re-split the secret, you have to change the identifier" and conversely "if the identifiers are the same (and they come from the same seed), they're compatible shares".

Re-sharing secrets is a safe thing to do as long as the initial shares are independently random, and there are lots of reasons that users might want to do it.

BenWestgate commented on July 22, 2024

I guess identifier comes out of a different RNG?

#54 (comment)

identifier is either selected by the user or defaults to the hash160(master_pubkey), making it possible to tell which share set recovers a given watch-only wallet, since the fingerprint is prepended to child xpubs in descriptors. Of the various options, this one seems most useful for these 4 characters.
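
For concreteness, a sketch of that default (assuming the convertbits helper from the segwit_addr reference code; the function name is ours):

from segwit_addr import convertbits

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def default_identifier(fingerprint):
    # re-encode the first 20 bits of ripemd160(sha256(master_pubkey))
    # as the 4 bech32 characters of the identifier
    return ''.join(CHARSET[d] for d in convertbits(fingerprint, 8, 5)[:4])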

I also don't know what encrypt() does; I assume it xors the input with the random stream.

That's exactly what it does; the stream is XOR'd with zeros in order to output the keystream directly.

It's a cool idea to set the nonce uniquely per seed; I'd have probably just left it at all-zeroes...see below comments.

It is an essential feature.

Having the nonce be a function of the header allows creating new backups for the same master_seed that don't reuse the original randomness. For example, if a share or two were compromised, the old set could be destroyed and replaced with a share set using a new header (new randomness) to restore security without sweeping funds. (Whether it's advisable to "rotate" shares, given an attacker may reach a threshold before you've completely destroyed the partially-compromised share set, is another question.)

If the nonce were fixed to bytes(8), you'd get the same "independent" shares for the same master_seed every time, even if you changed the threshold or identifier. A header-dependent nonce even helps with access control: if you want one group to be able to recover at threshold 3 and another group to recover at threshold 2, make two share sets, one with each threshold, and one set will NOT leak payload data of the other set.

apoelstra commented on July 22, 2024

It is an essential feature.

This is a great argument. Agreed on all counts.

BenWestgate commented on July 22, 2024

Also, I wouldn't bother setting the padding bits randomly. You can just leave them as 0. They aren't used and do not feed into BIP32 derivations.

I dislike wasted space, especially in handwriting. But I agree padding should not be set randomly. Let's see how to make those bits do some work:

  1. These bits are as secret as the secret; it's a waste to set them with a non-linear secure hash function (double_sha256(seed), etc.), which has no efficient erasure-correction ability.
  2. These bits are as secret as the secret; it's a waste to set them to some public data like the hash160(master_pubkey), as you can't check them until you can already compute the pubkey. This is just another form of 1 with extra wasted work.
  3. But they could be parity bits or a linear binary error-correcting code. These algorithms are pretty fast to do on the master_seed bytes by hand.

Do you have any suggestions?

A naive idea is upper-half and lower-half parity, which has the nice property of keeping the same 64-bit chunk length for 128/256-bit seeds, but it falls apart at 512-bit seeds, which have 3 bits of padding (170-bit chunks?).
This is not the most efficient error coding. Proper ECC should correct contiguous bit erasures anywhere in the master_seed that are the length of the checksum, and is what should be used. This offers a little extra help in the absolute worst-case scenario, where you have threshold shares but some are invalid beyond correction by the codex32 BCH code yet are less than a symbol away from the right correction, potentially making brute-forcing your valid codex32 secret 4 to 16 times cheaper.
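
A sketch of the naive half-parity idea for the 128-bit case (2 padding bits, one parity bit per 64-bit half; illustrative only):

def half_parities(master_seed):
    """Two padding bits for a 16-byte seed: the parity of each 64-bit half."""
    assert len(master_seed) == 16
    parity = lambda chunk: bin(int.from_bytes(chunk, 'big')).count('1') % 2
    return parity(master_seed[:8]), parity(master_seed[8:])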

What is the best ECC to put here? Will it wastefully repeat the same info as the codex32 secret checksum or can it supplement it?

Agreed, regarding re-sharing the same secret:

The BIP already says only shares with a matching identifier can be combined. Allowing "relabeling" is not very useful because it changes the checksum, even if share payloads do not change. And then you'd create a situation the BIP says is not supported: two shares with different identifiers can be combined, but only if you ignore their checksums and identifiers and just combine payloads. It's neat to just increment the LSB of the identifier to re-share an existing master_seed (see the sketch below): it still retains 3 symbols of the master_pubkey fingerprint while allowing 32 re-shares. And setting a custom identifier to re-share should also be supported, as long as it's unique from all previously used identifiers at that threshold.
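
A sketch of that increment convention (one reading of "increment the LSB"; the helper name is hypothetical):

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def next_identifier(identifier):
    # bump the value of the last bech32 character by one (mod 32), keeping
    # the first 3 characters of the master_pubkey fingerprint intact
    last = (CHARSET.index(identifier[-1]) + 1) % 32
    return identifier[:-1] + CHARSET[last]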

On this topic of identifier uniqueness for re-sharing, it's not clear a user would know all previously used identifiers besides the one they have just combined with to do a reshare. Especially if they are not destroying old sets, they may combine with an older share set, and if we naively increment that, we will re-use the same nonce! So I think the default for "re-shares" needs to generate some independent randomness when users don't know all outstanding identifiers for this seed and threshold. Then we have a birthday problem and just pray for no collisions... The alternative is to not support useful cases:

Can you think of any reason to have 2 outstanding share sets of the same threshold but different identifiers? I did: prevent one set from combining with another set to defend against custodian collusion.

I might make a 2-of-3 with my dad and executor and a 2-of-3 with my mom and executor, each with a unique identifier for the same master_seed. Now recovering the master_seed requires me or my executor plus 1 parent, but my two parents cannot combine shares to recover without me or my executor, making them less attractive targets for a physical attack than if this were a 2-of-4 between my parents, myself and my executor. Useful.

How much randomness do we have to inject into the identifier, if we don't know all the old identifiers, to be pretty safe we don't reuse one? Can we still keep some bech32 encoding of the public key fingerprint, to identify the underlying wallet master_seed and the connection between unique share sets of it, or must we inject all the entropy we can to be safe?

apoelstra commented on July 22, 2024

I dislike wasted space, especially in handwriting. But I agree padding should not be set randomly. Let's see how to make those bits do some work:

This is an interesting direction. We can definitely stick a checksum in here and we can definitely make it "independent" of the existing checksum in the sense that it will meaningfully add error correction. But it will be a bit-level error correcting code which has some unintuitive properties when applied to data that's normally encoded as 5-bit characters.

In particular I think a 2-bit code will only be able to detect one error (I guess, two, if they're consecutive), so it would mean we could detect certain character errors but not others. This is not zero value but it's fairly limited. I think we should keep thinking whether there's a better use of 2 bits before committing to this.

Allowing "relabeling" is not very useful because it changes the checksum, even if share payloads do not change.

It does, but you can compute what the checksum will change to, far more quickly than recomputing the whole thing. You can even use a computer to do most of the work without exposing any secret data to the computer.

On this topic of identifier uniqueness for re-sharing, it's not clear a user would know all previously used identifiers besides the one they have just combined with to do a reshare. Especially if they are not destroying old sets, they may combine with an older share set, and if we naively increment that, we will re-use the same nonce! So I think the default for "re-shares" needs to generate some independent randomness when users don't know all outstanding identifiers for this seed and threshold. Then we have a birthday problem and just pray for no collisions... The alternative is to not support useful cases:

Agreed. Unfortunately I think we just need to live with the birthday problem. In practice I think it'd be okay, both because the chance of collisions is fairly low (you need close to 1024 re-shares before the probabilities get high, and like, what are you doing with your secrets then..) and because the failure mode is that you'll generate exactly the same share set as before. Which is not much worse than having the original set still out there.

Can you think of any reason to have 2 outstanding share sets of the same threshold but different identifiers? I did: prevent one set from combining with another set to defend against custodian collusion.

Yeah, this is sorta what I'm imagining. That, or having "backup shares" that might have the same/overlapping custodian sets, but nonetheless shouldn't be mixable. Unfortunately these are the exact cases where collisions would undermine your security model.

How much randomness do we have to inject into the identifier, if we don't know all the old identifiers, to be pretty safe we don't reuse one? Can we still keep some bech32 encoding of the public key fingerprint, to identify the underlying wallet master_seed and the connection between unique share sets of it, or must we inject all the entropy we can to be safe?

The birthday paradox shows up at around sqrt(N). So if you used the entire 20-bit identifier you'd expect collisions around the time you had 2^10 (1024) outstanding identifiers. The curve is pretty steep, so you have very low probabilities for quite a while; it's fine to use 1000 as a rough estimate. If you used only half the identifier, 10 bits, you'd expect collisions after around 30 ... and have nontrivial probability of collisions after only a few.

My feeling is that you'd need to use the entire identifier for this.
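
The approximation behind these numbers (a sketch; the function name is ours):

import math

def collision_probability(n, bits=20):
    # birthday bound: chance of any collision among n identifiers drawn
    # uniformly at random from 2**bits possibilities
    return 1 - math.exp(-n * (n - 1) / (2 * 2**bits))

collision_probability(1000) is about 0.38, while collision_probability(30, bits=10) is already about 0.35, matching the estimates above.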

BenWestgate commented on July 22, 2024

I think using the entire identifier is fine. This is a sub-case of the "support custom identifiers" case, where we MUST randomize it when the user and software can't recall all previously generated share set identifiers for this master_seed.

Otherwise implementations can default to suggesting to increment it, preserving most of the hash160(master_pubkey) fingerprint, or as always the user can pick anything unique: "faml" "moma" "dada" "lawr" "acct" "aunt" etc. Wallets should store the identifiers they have created to prevent the user from repeating one, or at least warn in the event the repetition is deliberate in order to repair or extend an existing share set.

Since this "I forgot" case requires new randomness that cannot be derived from master_seed the following procedure from auditable bitcoin wallets is needed so that the new random identifier can't leak seed data or deliberately repeat an identifier to break security:

identifier = convertbits(hashlib.scrypt(password=user_data, salt=app_entropy, n=2**20, r=8, p=1, maxmem=1025**3, dklen=3), 8, 5)[:4]
  • These parameters will need to be reduced to support HWWs and hand auditing (see below)

Here the user provides randomness, the machine generates 20 bits and unconditionally displays them, and the two are mixed with a one-way compression function.

Process of creating the new identifier should be functionally equivalent to the following steps:

  1. Wallet generates a pseudo-random number (20 bits or more) that we call app entropy.
  2. Wallet unconditionally displays app entropy to the user (in bech32, hex etc).
  3. User is then asked for some data to be entered that allows to improve security in case app entropy generator has a backdoor (see Dual EC DRBG). This could be a one-time text entered by the user only for this purpose, or a passphrase containing enough entropy that is used later for other purposes.
  4. Wallet uses computationally expensive KDF to deterministically mix app entropy and user-entered text to produce the final identifier.

Important considerations:

  • App entropy must be displayed unconditionally so that wallet cannot know if user is going to audit it or not.
  • User-entered data should be visible and reproducible by the user to be able to arrive to the same data in an auditing software. Good: text or pattern of button presses. Bad: accelerometer data, mouse movements (hard to record and reproduce).
  • User-entered data should contain enough entropy and should be unique. If the wallet has a backdoored RNG which does not generate unique numbers, we're relying on user-entered data to avoid nonce reuse and if the wallet is trying to leak the master_seed we're relying on user entropy for secure key mixing.
  • Unless necessary, wallet should not rely solely on user input to generate identifier because they may inadvertently give a non-unique input. In such case, honest wallet with correct RNG implementation would with strong probability generate a unique nonce.
  • Process of mixing app entropy and user input must use computationally expensive KDF to compensate for non-uniqueness OR low-entropy in user input.

On second thought, it may not need to be an expensive KDF, but it does need to be a one-way compression function: HMAC-SHA256 or our ChaCha20 hash, with the user input as key and app_entropy as nonce. I believe simple XOR won't provide protection from master_seed leakage when users enter less than 20 bits of entropy, which is likely when limited to 4 bech32 characters.

The worst case is that an attacker who predicts the user's input can recover 20 bits of leaked master_seed. My KDF above costs $0.0002 to check all possible 20-bit values with known user data. So there's no way to secure this other than letting the user type more than 4 characters, to maximize their chances of a unique string with 20 bits of entropy.

from Cryptodome.Cipher import ChaCha20
from Cryptodome.Random import get_random_bytes
from segwit_addr import convertbits

# user data is limited to 32 bytes
padding_length = 32 - len(user_data)
key = user_data + bytes(padding_length)

nonce = get_random_bytes(8)
cipher = ChaCha20.new(key=key, nonce=nonce)

identifier = convertbits(cipher.encrypt(bytes(3)), 8, 5)[:4]

For auditing by hand, providing 8 de-biased dice rolls to select the custom identifier avoids doing a ChaCha20 hash just for this.

apoelstra commented on July 22, 2024

Agreed. I don't have much of an opinion between scrypt and argon2id. I'd be tempted to use scrypt because it's more widely supported and has been tested pretty heavily by the various cryptocurrencies using it as PoW. I'm also not sure there's a ton of value, for us, in being able to set time and memory separately. (And we could emulate that somewhat by just iterating scrypt.)

BenWestgate commented on July 22, 2024
  • If the data was generated correctly but with a fast RNG, then an attacker can grind out the secret far faster (by a large constant factor) by re-generating the [share] data than by deriving addresses.

  • If the data is generated by a slow RNG/KDF, then an attacker with the data has no advantage over one without.

This means the KDF hardness parameters only need to cost more than deriving an address? The cheapest "address" is the master pubkey hash we suggest to use as default identifier.

Cost to beat is: HMAC-SHA512, produce a pubkey, then ripemd160(sha256(master_pubkey)).

That has a 1 in 2^20 chance of a false positive (or it's faked); in that case, a known address must be derived.

Setting time cost and memory cost an order of magnitude (or more) higher than an address derivation should be safe and able to run on HWWs.

apoelstra commented on July 22, 2024

This means the KDF hardness parameters only need to cost more than deriving an address? The cheapest "address" is the master pubkey hash we suggest to use as default identifier.

Yeah, that's my thinking.

Setting time cost and memory cost an order of magnitude (or more) higher than an address derivation should be safe and able to run on HWWs.

+1

BenWestgate commented on July 22, 2024

I did some research with the chat robot and it claims HWWs use "a few kB" of memory for address derivation from a BIP32 seed, and most popular HWWs could spare as much as 64-512 kB for KDF memory cost parameters.

This makes memory-hard algorithms on HWWs nearly useless against GPUs (8 GB / 0.5 MB exceeds the number of GPU cores), but they may still help somewhat against ASICs.

PBKDF2 in BIP39 is especially weak against GPUs (needs 8+ random word passphrases to be secure).

Given this, creating a secure auditable master_seed may want to use different parameters on mobile and desktop wallets (1+ GB) than hardware wallets (512kB). Here's an example from my implementation that costs 1GB and a few seconds:

def fresh_master_seed(seed_length, user_entropy, bitcoin_core_entropy):
    """Derive a fresh master seed of seed_length bytes from user entropy and Bitcoin Core's entropy (aka app entropy)."""
    if not 16 <= seed_length <= 64:
        return None
    master_seed = hashlib.scrypt(password=bytes(user_entropy + str(seed_length), "utf"),
                                 salt=bytes(bitcoin_core_entropy, "utf"),
                                 n=2**20, r=8, p=1, maxmem=1025**3, dklen=seed_length)
    return master_seed

A time_cost of 5-10 seconds may be recommended to reduce the disadvantage of weak hardware. Unlike authentication, where the user must wait time_cost on each login and so is limited to 1-2 seconds or less, this is a one-time wait at master_seed generation.

As long as the parameters are unconditionally displayed, like the app entropy, anyone can verify they get the same codex32 secret by computing the KDF on a trusted device.

For share set generation, which only needs to cost more than an address derivation (which I'm told takes "a few seconds or less" on a HWW), targeting something that takes 10 seconds on the worst hardware we want to support and uses 64 kB or more should be 2-10x as time-costly and 10-100x as memory-costly as address derivation.

Here all wallets should use the same parameters, so codex32 secrets always derive identical codex32 share sets, much like how 2048 rounds is part of BIP39.

apoelstra commented on July 22, 2024

Yeah, "a few KB" for address generation sounds about right. Memory-wise it's extremely cheap and therefore extremely parallelizable. There's also a time-memory tradeoff. I think I could probably do it with <100 bytes of RAM but it wouldn't be very fast.

Given this, creating a secure auditable master_seed may want to use different parameters on mobile and desktop wallets (1+ GB) than hardware wallets (512kB). Here's an example from my implementation that costs 1GB and a few seconds:

Is the assumption here that the user entropy or Core entropy might be weak? I think if together they exceed 128 bits of entropy then there isn't any need for hardening at all. Or am I confused?

For share set generation, which only needs to cost more than an address derivation (which I'm told takes "a few seconds or less" on a HWW), targeting something that takes 10 seconds on the worst hardware we want to support and uses 64 kB or more should be 2-10x as time-costly and 10-100x as memory-costly as address derivation.

Yep, sounds good to me.

BenWestgate commented on July 22, 2024

Is the assumption here that the user entropy or Core entropy might be weak? I think if together they exceed 128 bits of entropy then there isn't any need for hardening at all. Or am I confused?

It's to protect against RNG backdoors in "secure element" HWWs.

Neither the device nor the user should be trusted, so the hardening makes the most of whatever entropy the user is willing to provide to protect themselves, increasing the cost tremendously vs SHA2 for the backdoor author to steal.

This is the "lite" effort version of the dice debiasing worksheet. KDF offers real protection even without achieving 128-bits user entropy, even half that is act of congress expensive:

Attack cost estimates of memory hard KDFs by entropy

65 bits would be on the order of 10 billion USD with 1 GB, 1-second argon2id parameters.

50 bits would be around $1 million. It's less profitable to backdoor the RNG if the user can mash their keyboard for a minute or toss a die 10 times and verifiably make this attack cost more than they'll ever store.

BenWestgate commented on July 22, 2024

In particular I think a 2-bit code will only be able to detect one error (I guess, two, if they're consecutive), so it would mean we could detect certain character errors but not others. This is not zero value but it's fairly limited. I think we should keep thinking whether there's a better use of 2 bits before committing to this.

From BIP-173:

...ordering is chosen to minimize the number of pairs of similar characters that differ in more than 1 bit.

The most commonly substituted characters are all 1-bit errors.

For 256-bit seeds w/ 4-bit ECC as padding, the total error correction becomes 4 substitutions plus 1 substitution if it is a 1-bit error.
For 128-bit seeds w/ 2-bit ECC as padding, there are no corrections, but it can narrow down a 1-bit error substitution to the upper or lower half of the code, maybe better.
512-bit seeds also gain a 1-bit substitution error correction boost.

If the same 2- or 4-bit code is added to each independent share, do all the derived shares get valid ECC padding as well?

apoelstra commented on July 22, 2024

It's to protect against RNG backdoors in "secure element" HWWs.

OK, yeah, this sounds good to me. Though of course, verifying this is done correctly will require redundant hardware. But better to have the ability than to not.

If the same 2- or 4-bit code is added to each independent share, do all the derived shares get valid ECC padding as well?

I am 90% sure, yes

For 256-bit seeds w/ 4-bit ECC as padding, the total error correction becomes 4 substitutions plus 1 substitution if it is a 1-bit error.

Similarly, I am 90% sure this is right ... the two codes don't directly add in a way that gives a combined error correction algorithm (as far as I'm aware). But I do think you can do a form of soft error correction where you find all the possible strings that are 5 errors away from the target string. There will be several, but I think only one, with pretty good probability, will have the extra parity bits set correctly.

And even if there's more than one, it's easy to manually check them all. In fact, in general you should be able to do this -- "correct" more than four errors by using error correction to reduce the search space by a factor of 2^20. Then having a couple extra bits will give you a couple extra bits' worth of reduction.

BenWestgate commented on July 22, 2024
from electrum import bip32

# one HMAC-SHA512 + one EC mult + hash160 per candidate fingerprint
for candidate_seed in range(2**20):
    bip32.BIP32Node.from_rootseed(candidate_seed.to_bytes(16, 'big'), xtype="standard").calc_fingerprint_of_this_node()

This takes 100 seconds on i5-1135G7 laptop and calculates fingerprints for 2^20 candidate seeds.

For comparison, checking the first wpkh() address of 2**16 candidate seeds:

from electrum import bip32
from electrum import ecc
from electrum.crypto import hash_160

# BIP84 path m/84'/0'/0'/0/0: five extra HMAC-SHA512 child derivations per candidate
for candidate_seed in range(2**16):
    priv_masterkey = bip32.BIP32Node.from_rootseed(candidate_seed.to_bytes(16, 'big'), xtype="standard")
    hash_160(priv_masterkey.subkey_at_private_derivation([2147483732, 2147483648, 2147483648]).subkey_at_public_derivation([0, 0])[1].get_public_key_bytes(compressed=True))

Took 46 seconds, so 736 seconds to check 2**20 seeds. The master_pubkey fingerprint identifier is a useful 7.4x speed up vs known wpkh addresses.

BenWestgate commented on July 22, 2024

Back to the desired hardening of share payload derivation:

import hashlib
for candidate_seed in range(2**16):
    hashlib.scrypt(password=candidate_seed.to_bytes(16,'big'), salt=bytes('2test','utf'), n=2**6, r=8, p=1, maxmem=68*1024, dklen=16)

This generates the 1st share's payload for 2^16 candidate master_seed's with codex32 secret threshold = '2' and identifier = 'test' in 7 seconds.

This is 2.2x slower than checking the fingerprint (3.1 seconds) but still 7x faster than checking the first wpkh() address.

  • If the data is generated by a slow RNG/KDF, then an attacker with the [share] data has no advantage over one without.

The maxmem cannot be raised any higher, as most HWWs have 128 kiB of RAM and there is no reason not to support them. So, following your advice, let's slow it down by doing 16 rounds. If you believe share payloads must be slower to derive than an address, then we can use your idea:

(And we could emulate that somewhat by just iterating scrypt.)

import hashlib

for candidate_seed in range(2**14):
    intermediate_key = hashlib.scrypt(password=candidate_seed.to_bytes(16, 'big'), salt=bytes('2test', 'utf'), n=2**6, r=8, p=1, maxmem=68*1024, dklen=16)
    # iterate scrypt 15 more times, feeding each result back in as the salt
    for _ in range(15):
        intermediate_key = hashlib.scrypt(password=candidate_seed.to_bytes(16, 'big'), salt=intermediate_key, n=2**6, r=8, p=1, maxmem=68*1024, dklen=16)

This generates the 1st share's payload for 2^14 candidate master_seed's with codex32 secret threshold = '2' and identifier = 'test' in 43 seconds, or ~172 seconds for 2^16 which is much slower than 46 seconds to derive 2^16 wpkh() address payloads.

I have no idea why iterating it 16 times caused a 25x+ slowdown, computers are magical.

Any problems here?

I'm still wondering why the shares need to be hardened, since the search space is the 128-bit master_seed. Are we concerned that it may some day not be long enough to be secure against brute force without the slowdown of address derivation?

Our attack scenario is: an attacker has access to the 20-bit ID, a share, multiple shares, whatever. And they are trying to access the actual master secret. The scenarios are:

If the data was generated correctly but with a fast RNG, then an attacker can grind out the secret far faster (by a large constant factor) by re-generating the data than by deriving addresses.

I thought it was not even possible to increment a 128-bit counter through all values without 10 nuclear plants running at full capacity for a year (or something like that).

So let's confirm with a 4th person that the threat model is valid before we roast these poor Trezors. If the share payloads were fast to derive with ChaCha20, they'd become an almost unlimited source of error detection, with each independent share acting as a cryptographic hash checksum the length of the secret. Is that more useful than making it faster for guessers to check a number that cannot be counted to?

These scrypt parameters with more iterations will work for hardening the master_seed against RNG backdoors with user supplied entropy on 128kiB RAM HWWs, as well as for deriving a unique identifier securely when re-sharing a master_seed when not all previous identifiers are known.

apoelstra commented on July 22, 2024

I'm still wondering why the shares need to be hardened, since the search space is the 128-bit master_seed. Are we legitimately concerned that it may some day not be long enough to be secure against brute force without the slowdown of address derivation?

If we really have 128 bits of good entropy then I agree that hardening has no value. But maybe the seed isn't really that strong, or some of it is leaked, or whatever ... then my thinking is that we ought to try to "not make the situation worse" than bare BIP32, and we can do that by hardening all our randomness roughly as much as an EC mult.

Now, I guess you're suggesting that we make the 128 bits itself hard, even against weak entropy, by doing a serious amount of hardening on the user's input. This will protect us from bad input but not from key leakage.

So maybe the "should we bother hardening the shares/IDs" question comes down to "should we attempt to ameliorate the damage caused by leaked seed data"? My feeling is that the hardening doesn't add much cost so we might as well.

This is contradicting this:

Yeah, sorry, my thinking has changed a bit over the course of this thread.

BenWestgate commented on July 22, 2024

So maybe the "should we bother hardening the shares/IDs" question comes down to "should we attempt to ameliorate the damage caused by leaked seed data"? My feeling is that the hardening doesn't add much cost so we might as well.

Makes sense.
This is like medicine: "First, do no harm."

So then do we need to harden the master_pubkey fingerprint default identifier? Otherwise it's a 7.4x speed up with a 1 in a million chance of being a false positive when grinding out a partially leaked master_seed?

Or is that not important because the attacker could get the fingerprint from the watch-only wallet?

apoelstra commented on July 22, 2024

If we are actually using the BIP32 fingerprint then we already have enough hardening, since the bip32 fingerprint requires an EC mult to calculate it from the seed :).

If we are using some alternate notion of "fingerprint" then yes, we should make sure that it's at least as hard as an ecmult.

BenWestgate commented on July 22, 2024

If we are actually using the BIP32 fingerprint then we already have enough hardening, since the bip32 fingerprint requires an EC mult to calculate it from the seed :).

Yes, that is the fingerprint we have in mind.

True about the EC mult, but the fingerprint is hash_160(master_pubkey), while a wpkh() address is something like the hash_160 of [BIP32_fingerprint/84'/0'/0']xpub.../0/0, so there are 5 additional HMAC-SHA512 child derivations to slow it down, assuming the BIP84 standard derivation path is used. That slowdown is 7-fold, as I calculated earlier, unless the HMAC-SHA512 slowdown can be ignored due to ASICs or something. We also don't specify that implementations use a particular derivation path; it may only be 1 HMAC-SHA512 if they use m/*.

A cheeky fix would be HMAC-SHA512-ing the master_pubkey fingerprint 5 times to set the identifier, making it identical work to checking a known address. But this unfortunately breaks the ability to check that the fingerprint matches the share identifier by converting from hex to bech32 in your head.

My feeling is that since descriptors felt it safe to include 4 bytes of hash_160(master_pubkey) even in encrypted wallets, it should be safe to include 20 bits in our shares, which are much less likely to be stolen than an online descriptor. An attacker also has to consider that the identifier may just be random noise, since it's only a default.

apoelstra commented on July 22, 2024

My feeling is that since descriptors felt it safe to include 4 bytes of hash_160(master_pubkey) even in encrypted wallets, it should be safe to include 20 bits in our shares, which are much less likely to be stolen than an online descriptor.

Agreed on all counts.

I would further say that, while BIP49-style keys (which I guess is what 99.9% of all keys are..) do 7 derivations, in principle, people could be using their master keys to generate addresses. Or m/* or something.

A single constant-time EC mult is something like 100x the cost of an HMAC (500ns vs 50us). I think we should try to match the 100x factor, but not bother with the extra 7x factor since we're getting into increasingly tenuous justifications for it..

BenWestgate commented on July 22, 2024

The electrum bip32 library I used must be unrepresentatively slow at the HMACs; it should have been a few percent slower, not 7x. The chat robot agrees that 80%+ of the time is spent going from private to public.

One scrypt round with n=2**6, r=8, p=1 would be twice as slow on my hardware, but I don't trust the numbers, in part because this library gave bad data.

But in general, there's no particular scrypt iterations we can say for sure will be more expensive than the EC mult on all hardware and software that may ever exist to attackers, without grievous overestimation. But what if we did the exact same work, plus some?

This might sound dumb, but why not use public key hash for the share payloads?

Consider that this is the time to beat:

import hashlib
import hmac

# assumption: any secp256k1 library would do; coincurve is used here for illustration
from coincurve import PrivateKey

def hash_160(data):
    return hashlib.new('ripemd160', hashlib.sha256(data).digest()).digest()

def compute_bip32_fingerprint(master_seed):
    # Step 1: Calculate the master private key from the master seed using HMAC-SHA512, per BIP32.
    m = hmac.new(b"Bitcoin seed", master_seed, hashlib.sha512).digest()[:32]

    # Step 2: Compute the compressed master public key from the master private key.
    M = PrivateKey(m).public_key.format(compressed=True)

    # Step 3: Take the first 4 bytes of the hash160 of the serialized public key.
    return hash_160(M)[:4]

then,

def compute_share_payload(master_seed, threshold, identifier, share_index):
    # Step 1: Same HMAC-SHA512, but keyed with the share header instead of b"Bitcoin seed".
    m = hmac.new(bytes(threshold + identifier + share_index, 'utf'), master_seed, hashlib.sha512).digest()[:32]

    # Step 2: Compute the corresponding compressed public key.
    M = PrivateKey(m).public_key.format(compressed=True)

    # Step 3: Take the first 16 bytes of the hash160 of the serialized public key.
    return hash_160(M)[:16]

This is guaranteed to not be cheaper, and it's also not any slower than it needs to be. It's also very easy for wallet developers to implement; they nearly already have (perhaps too close, and they forget to swap out b"Bitcoin seed"...). But since it's a 128-bit match rather than 20 bits or 4 bytes, it seems there should be one more step inserted somewhere to slow it down. Open to suggestions, if you agree.

It also needs to support 16-64 byte outputs. So something like HMAC-SHA512, scrypt, pbkdf2 could work to stretch and slow it. The master_seed should be mixed back in to preserve the original entropy. The target cost to exceed is an address with 5 extra derivations.

apoelstra commented on July 22, 2024

This might sound dumb, but why not use public key hash for the share payloads?

I think this is too much of a burden on implementors. Though I guess if they support BIP32 they need to support this anyway. But my feeling is, demanding scrypt is fine because it's a standard function in most any general-purpose crypto library. But support for constant-time multiplication on a specific elliptic curve is a bit more of a demand, especially since you do want it to be efficient.

EC mult is also way easier for implementors to screw up and introduce sidechannels (though again, the same risk exists for BIP32 derivations, so maybe this isn't an additional problem).

But in general, there's no particular scrypt iterations we can say for sure will be more expensive than the EC mult on all hardware and software that may ever exist to attackers.

Sure, but I don't think we really need to. Let's just assume a 100x differential.

I apologize, I think I've been barking up the wrong tree here. On my system an HMAC takes 1.35us and a single ecmult gen takes 25.3us, so there's a ~20x differential, which is already less than I expected. But this is not the right comparison. If an attacker is grinding through different EC mults then their marginal operation will be an EC add (0.35us) or double (0.15us), both of which are an order of magnitude faster than an HMAC. (You might think that, since the attacker needs to hash-then-ec-mult, all the ecmults will be with respect to different random values, so he has to do separate ecmults for each. Not so. Algorithms like Pippenger's will let him reuse computations across all of them. Or maybe you assume the attacker has to do 7 ecmults in series; same story, but with a 7x slowdown.) So against a naive software-only attacker, just using a sha256-hmac already provides more protection than the EC mult.

Now, you might assume that the attacker has a sha2 ASIC (or an scrypt ASIC, or whatever) and can do these HMACs a million times faster, but does not have an EC ASIC. Or you might assume the opposite. And I agree that "just do both operations", which is what "copy the BIP32 derivation logic" gets us, will protect us no matter what.

But I think we're way overthinking this. My new vote is that we should just use sha256-hmac for these derivations, on the basis that it's easy to implement and guaranteed to be available with any crypto library, even one that has no awareness of bitcoin, and will be constant-time even if the implementor wasn't thinking about this. We could use sha512-hmac if you want to imitate BIP32, but I don't think it matters. You could iterate it 100x if you really want to cover the case where public addresses only occur after several derivation steps, but I don't think it matters.

BenWestgate commented on July 22, 2024

we should just use sha256-hmac for these derivations

For a simple HMAC, you'd want HMAC-SHA512, because we need to make 512-bit share payloads for the 512-bit master_seed case. It's also slightly slower.

But I believe pbkdf2_hmac is also in every standard crypto library, and it can implement the iteration for us. It also has a dklen parameter that somehow produces 64 bytes even with iterations=1 and hash_name='sha256'.

hashlib.pbkdf2_hmac()

import hashlib
share_payload = hashlib.pbkdf2_hmac('sha256', master_seed, b'ms2testa', 100, dklen=len(master_seed))

Example deriving payload of independent share "a" from a codex32 secret starting with ms12test.
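
The "somehow" above is PBKDF2's block structure: it derives ceil(dklen / hash_len) independent HMAC chains, each run for the full iteration count, so asking SHA-256 for 64 bytes doubles the work rather than weakening anything. A quick check:

import hashlib

# two 32-byte SHA-256 blocks are derived, each with the full iteration count
assert len(hashlib.pbkdf2_hmac('sha256', b'example seed', b'example salt', 1, dklen=64)) == 64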

If this is sufficient, I think we've covered every implementation detail except the shuffling algorithm. (and the ECC code padding, since we haven't thought of a better use for 2 to 4 bits yet)

We were still hanging on to hand auditability back then; now we want something that can be seeded by the pbkdf2_hmac() above, but with salt=b"ms1tests" for the codex32 secret (since we don't know which share comes first yet), and that always produces the same share index shuffle across many popular crypto libraries.

The robot tells me:

One such algorithm is the Fisher-Yates shuffle (also known as the Knuth shuffle), which is widely used and known for its simplicity and efficiency.

...[Knuth shuffle] is often the default shuffling algorithm used in many standard crypto libraries and programming languages.

I haven't experimented with this enough to have a code snippet yet. I've only ever used random.sample() and shuf on the command line, although perhaps I'm lucky and random.sample() IS this algorithm.

apoelstra commented on July 22, 2024

If this is sufficient, I think we've covered every implementation detail except the shuffling algorithm. (and the ECC code padding, since we haven't thought of a better use for 2 to 4 bits yet)

Nice :) yeah, I think this is fine.

If I had to guess the "Knuth shuffle" is my "Algorithm P", which comes from Knuth. So I guess we should use that. Like all other algorithms we considered, this involves an indefinite amount of randomness, but if we're not worried about hand-auditability, I think we could just pick any arbitrary way of stretching our kdf output (say, sha2'ing it repeatedly, or sha512'ing repeatedly, whatever we feel is more consistent with the rest of everything).

BenWestgate commented on July 22, 2024

Nice :) yeah, I think this is fine.

We can note that PBKDF2 was chosen for simplicity: over scrypt, which needed iterating (due to HWW memory constraints) to cost more than address derivation, and over argon2id due to library availability.

From the 13-year-old NIST SP 800-132 (being revised this summer):

A minimum iteration count of 1,000 is recommended.

import hashlib

for candidate_seed in range(2**15):
    hashlib.pbkdf2_hmac('sha256', candidate_seed.to_bytes(16, 'big'), b'2testa', 1000, dklen=16)

This checks 2^15 candidate_seeds in 10 seconds. That's still twice as fast as checking an address in electrum's library; 21 seconds with hash_name='sha512', still a little faster than addresses in this library.

From the competing BIP39:

To create a binary seed from the mnemonic, we use the PBKDF2 function with a mnemonic sentence (in UTF-8 NFKD) used as the password and the string "mnemonic" + passphrase (again in UTF-8 NFKD) used as the salt. The iteration count is set to 2048 and HMAC-SHA512 is used as the pseudo-random function. The length of the derived key is 512 bits (= 64 bytes).

We wouldn't want codex32 shares to be weaker against leaked seed data than the competition.

import hashlib

for candidate_seed in range(2**15):
    hashlib.pbkdf2_hmac('sha512', candidate_seed.to_bytes(16, 'big'), b'2testa', 2048, dklen=16)

This runs in 44 seconds, about twice as long as checking 2^15 addresses from candidate seeds. With a coincidence like this, it's worth copying BIP39, especially if, as you say, the EC mult work is reusable in an optimized attack, making the HMAC the main slowdown.

If I had to guess...

The Fisher-Yates shuffle algorithm, which is also known as the Knuth shuffle, is often referred to as "Algorithm P".

Both random.sample() and shuf use it by default. But my initial test failed to reproduce the same sort from the same seed:

head -c16 /dev/urandom > random.seed
string=qpzry9x8gf2tvdw0s3jn54khce6mua7l
for ((i=0; i<${#string}; i++)); do
    echo ${string:i:1}
done | shuf --random-source=random.seed

produced a different order than:

import random

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

with open('random.seed', 'rb') as file:
    index_seed = file.read()
random.seed(index_seed)
print(random.sample(CHARSET, 32))

So we may need to explicitly write the shuffle implementation unless we can find some reproducible shuffles between multiple different libraries and tools. Let me know if you can make two different libraries produce the same shuffle from the same seed.

If we have to write it, the most promising method is:

chacha20_key = hashlib.pbkdf2_hmac('sha512',master_seed, b'2tests', 2048, dklen=32)

Use this as the key stream for "Algorithm P", now the chacha20 nonce / counter can be set to 0 with bytes(8) since the hmac did the header mixing and we have indefinite RNG that's as hardened as the share payloads.
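For concreteness, here is a sketch of "Algorithm P" driven by a deterministic keystream (an HMAC-SHA512 keystream stands in for chacha20 here; the rejection step avoids modulo bias; names are ours, not spec):

import hashlib
import hmac

def keystream_byte(key, n):
    # byte n of an HMAC-SHA512 keystream: block n // 64, offset n % 64
    block = hmac.new(key, (n // 64).to_bytes(8, 'big'), hashlib.sha512).digest()
    return block[n % 64]

def algorithm_p(items, key):
    # Fisher-Yates / Knuth shuffle; rejection sampling keeps each swap unbiased
    out = list(items)
    pos = 0
    for i in range(len(out) - 1, 0, -1):
        while True:
            r = keystream_byte(key, pos)
            pos += 1
            if r < 256 - 256 % (i + 1):
                break
        out[i], out[r % (i + 1)] = out[r % (i + 1)], out[i]
    return out

algorithm_p(CHARSET.replace('s', ''), chacha20_key) would then give one fixed index order per seed, whichever library computes it.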

say, sha512'ing it repeatedly

With hmac-sha512, master_seed as the key and a counter as the message? So it's a slow stream cipher? Like this, incrementing the counter each time more randomness is needed for "Algorithm P"?

import hashlib
import hmac

def generate_key_stream(key, counter):
    # counter is an int; serialize it so it can be the HMAC message
    hash_obj = hmac.new(key, counter.to_bytes(8, 'big'), hashlib.sha512)
    return hash_obj.digest()

Using HMAC-SHA512 with the message as counter can be a valid method for generating a key stream, especially for shorter key streams.

However, ChaCha20 has several advantages: It is optimized for generating large amounts of random data, extensively used in protocols like TLS, and has strong security properties due to thorough analysis.

If the goal is to generate a long key stream efficiently, ChaCha20 is a better choice. But if a relatively short key stream is sufficient and you prefer consistency with other HMAC-based operations, using HMAC-SHA512 with the counter as the message is a viable option.

I understand this function is safe; if it isn't, neither is BIP32 with its indices.

Is it preferred to use hmac for consistency or more extensively-audited ChaCha20 designed specifically for key stream generation?

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

I've written the cryptographic shuffle, I don't know what the algorithm is called, but it's correct.

I use pbkdf2 to create index_seed the same way as share payloads.

Then hmac-sha512(index_seed, counter//32) and assign 2 bytes to each of the 31 characters. If it finds a repeat, it increments the counter; if that happens more than once, it has to move to the next hmac block at message=1, and it handles that case correctly. There's about a 1% chance of a collision among the 31 2-byte tags, so in 99.99% of runs only the first hmac-sha512 is needed.

import hashlib
import hmac

# note: "s" has been removed
CHARSET = "qpzry9x8gf2tvdw03jn54khce6mua7l"

# hmac-sha512 keystream, 16-bits per count
def rand_func(index_seed, counter):
    digest = hmac.new(index_seed, (counter // 32).to_bytes(8,"big"), hashlib.sha512).digest()

    return digest[counter%32*2:counter%32*2+2]


# Function to assign 16-bit values to characters
def assign_2_bytes_to_characters():
    counter = 0
    assigned_values = {}
    for char in CHARSET:
        while True:
            value = rand_func(index_seed, counter)
            counter += 1
            if value not in assigned_values.values():
                assigned_values[char] = value
                break
    return assigned_values


# Example usage:
if __name__ == "__main__":
    master_seed = bytes(16)
    index_seed = hashlib.pbkdf2_hmac('sha512', master_seed, b'ms2tests', 2048, dklen=len(master_seed))

    # Assign 16-bit values to characters
    character_values = assign_2_bytes_to_characters()

    # Sort the characters based on their assigned values
    sorted_characters = sorted(character_values.keys(), key=lambda x: character_values[x])

    # Print the sorted characters and their assigned values
    for char in sorted_characters:
        print(f"{char}: {character_values[char]}")

Code is slow because it hmacs for every 2 byte tag instead of only as necessary.

I let it grind for a while and it correctly handled an event where the counter hit 33 (3 tag collisions)

index_seed = b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00K\xc5\x88'
hmac(index_seed,0) = b"\xa0\xd8t\xdbm\xd7\x11\xf0\x8dCB\x0c\x8e\\\xf4\xdcft\x1a\x94\xff\x02T\t\x13#y\xc2r\xcc\xf0\x0f\xa0\xe5\x11\xf0$\xb1I\x1bq\xbe\xe8}m\xd7\x02\xb7\x1aq\xb5\x9a\x87oI\x1b\\v\xeb9\xef\xa4'\x84"
hmac(index_seed,1) = b'B\x1eq\xd5y\xa2a\xa1\xff7\x10\xbd*\xf6\xf0\xa0*f\x9b\xe3\x0bEl\xcf=N6\xf5\x9f\x08\x12\xe5J[\xb9\xda\x92\x04\x89\x10\xec\x98zc<m\xa5i\xe4\x18\xabP_C\'\xce\x1b\xb2?\xc1"m\\\x8d'
{'q': b'\xa0\xd8', 'p': b't\xdb', 'z': b'm\xd7', 'r': b'\x11\xf0', 'y': b'\x8dC', '9': b'B\x0c', 'x': b'\x8e\\', '8': b'\xf4\xdc', 'g': b'ft', 'f': b'\x1a\x94', '2': b'\xff\x02', 't': b'T\t', 'v': b'\x13#', 'd': b'y\xc2', 'w': b'r\xcc', '0': b'\xf0\x0f', '3': b'\xa0\xe5', 'j': b'$\xb1', 'n': b'I\x1b', '5': b'q\xbe', '4': b'\xe8}', 'k': b'\x02\xb7', 'h': b'\x1aq', 'c': b'\xb5\x9a', 'e': b'\x87o', '6': b'\\v', 'm': b'\xeb9', 'u': b'\xef\xa4', 'a': b"'\x84", '7': b'B\x1e', 'l': b'q\xd5'}

And you can see the last two character tags before sorting came from the first 4 bytes of the second hmac. Some of the colliding tags were b'\x11\xf0', b'm\xd7', b'I\x1b'.

chacha20 could drop in for hmac-sha512 here, and would have been a bit easier to work with, as the libraries for it usually have a call like "give me keystream byte N" rather than requiring careful subscripting.

And from the previous post for share payloads:
hashlib.pbkdf2_hmac('sha512',candidate_seed.to_bytes(16,'big'),b'2testa', 2048, dklen=16)
It would be better to do dklen=derived_shares*len(master_seed) and split the derived key rather than pbkdf each derived share's index independently.

Finally, the pbkdf2 for indexes could be skipped, keeping it to one hardening operation per codex32 secret > codex32 share set by deriving 16 extra bytes in dklen= to use as the index_seed when fetching the share payloads. 16-bytes is sufficient to produce every permutation.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Interesting approach. You probably need to call the HMAC only once (expected number of RNG calls is 31.007 and you get 32 values from each HMAC). Let me take a bit of time to get the probability you need to call it 3 times, which I suspect will be an extremely low number, and you can rewrite your code to simply always call it twice, and cache the values.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

...and cache the values.

Here I re-wrote it to cache the digest if not counter % 32. It's cleaner to read this way also.

import hashlib
import hmac

# note: "s" has been removed
CHARSET = "qpzry9x8gf2tvdw03jn54khce6mua7l"

# Function to assign 16-bit values to characters
def assign_2_bytes_to_characters():
    counter = 0
    digest = b''
    value = b''
    assigned_values = {}
    for char in CHARSET:
        while value in assigned_values.values() or not value:
            if not counter % 32: # hmac-sha512 keystream
                digest = hmac.new(index_seed, (counter // 32).to_bytes(8, "big"), hashlib.sha512).digest()
            value = digest[counter%32*2:counter%32*2+2] # 16-bits per count
            counter += 1
        assigned_values[char] = value
    return assigned_values


# Example usage:
if __name__ == "__main__":
    master_seed = bytes(16)
    # index_seed = hashlib.pbkdf2_hmac('sha512', master_seed, b'ms2tests', 2048, dklen=len(master_seed))
    index_seed = b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00K\xc5\x88' # a seed that needs 2 hmacs counter>32
    # Assign 16-bit values to characters
    character_values = assign_2_bytes_to_characters()

    # Sort the characters based on their assigned values
    sorted_characters = sorted(character_values.keys(), key=lambda x: character_values[x])

    # Print the sorted characters and their assigned values
    for char in sorted_characters:
        print(f"{char}: {character_values[char]}")

Side note: the only reason I chose 16-bit tags to sort was that 512 // 31 = 16; other than that it was completely arbitrary. Unsure if 1-byte tags or 5-bit tags would be faster by consuming less randomness.

you can rewrite your code to simply always call it twice, ...

There's no point in calling HMAC any fixed number of times, if limited randomness is sufficient, then get it directly from pbkdf2_hmac.

derived_key = hashlib.pbkdf2_hmac('sha512', master_seed, b'2tests', 2048, dklen=128 + len(master_seed) * (k - 1))
share_payload = [derived_key[i * len(master_seed):(i + 1) * len(master_seed)] for i in range(k - 1)]
share_index_randomness = derived_key[(k - 1) * len(master_seed):]  # 128 bytes of randomness

It only saves one line of code to do it the wrong but extremely-often-correct way.
ChaCha20 would save the same line because it auto-increments its internal counter to stream from new 512-bit blocks without the implementer having to write it, or even be aware of it happening.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Unsure if 1-byte tags or 5-bit tags would be faster by consuming less randomness.

In terms of expectation it seems like 7 bits is the minimum (needs 247.4 bits from the RNG, on average), but 8 is really close (263.8), more convenient to implement, and it also has a tighter distribution (much lower probability of needing much more extra randomness than the expected value).

In particular I think, based on explicitly computing out probabilities for needing up to 38 samples then extrapolating:

  • With the 16-bit sampler we'll need more than 64 samples (1024 bits) with probability less than 2^-300
  • With the 8-bit one we'll need more than 64 samples (512 bits) with probability less than 2^-40
  • With the 7-bit one we'll need more than 74 samples (518 bits) with probability less than 2^-20

There's no point in calling HMAC any fixed number of times, if limited randomness is sufficient, then get it directly from pbkdf2_hmac.

Heh, nice. I think though that we ought to reduce our sampler to an 8-bit one, and in that case we arguably can't get away with a fixed amount of randomness. I also prefer to minimize the use of the KDF in the spec, which feels opaque to me in a way that sha2-hmac or chacha don't. Ideally, if somebody had a strong justification and wanted to generate the raw seed data without the KDF, or with reduced rounds, or something, they'd still be able and willing to follow the rest of the spec.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Ideally, if somebody had a strong justification and wanted to generate the raw seed data without the KDF ... or something, they'd still be able and willing to follow the rest of the spec.

This is not currently possible because we used PBKDF2(master_seed, header) to get the randomness that sets the share payloads and the stream cipher key for the index shuffle. The other option was advantaging someone who has part of your seed (i.e., 1.3 shares at threshold 2), because the share payloads could be checked faster than an address or fingerprint. Or copying BIP32, but that was called hard to implement correctly.

..., or with reduced rounds, ...

If they reduced the rounds, they'd get a different set of shares and indexes so I'm not seeing the value.

A "Codex32 Secret" is the input to the deterministic share split. That can be made however is secure. But then it is hardened before assigning indexes or share payloads.

The whole header (perhaps excluding '1') or the whole string should be used, in case the hrp ever gets used for something else or the checksum changes; then an identifier-and-threshold collision won't re-use randomness. The share index MUST be in the header to support sub-sharing in the future. I like password=codex32_secret, salt=master_seed as the kdf parameters, since the codex32_secret may contain user entropy, while the salt is random bytes common to all codex32 secrets with that payload.

But we can rely less on the dk_len= parameter, which is sketchy and probably reduces security when set greater than the hash_name's output length. Produce a derived key of default length, which is the length of the hash_name; then that key= can be hmac'd with message=b'Index seed', with the output fed to chacha20 or the custom iterated hmac construction, and the payloads for the independent shares hmac'd with message=b'Share data 0', message=b'Share data 1' ... etc.

I have a feeling share data should be produced with a stream cipher too. Not because an unlimited amount is needed, but because the amount needed ranges from tiny 128-bit to 4096-bits (512-bits * 8) and it's cleaner to ask for the required bytes from a stream cipher than make implementers remember to iterate their hash enough times for all threshold-1 * len(master_seed) possibilities.

Either this, or getting them directly w/ dk_len= from pbkdf2, is very easy. Although I wasn't sure how it makes 512 bytes from a 64-byte hash: pbkdf2 concatenates hash-length blocks, each derived independently with the full iteration count, so the effective iterations per block are unchanged, but an attacker who only needs one block does proportionally less work than the defender. The dk_len = hash length case is straightforward and easy to implement even without a library for pbkdf2.

7 bits is the minimum (needs 247.4 bits from the RNG, on average), but 8 is really close (263.8)

This suggests another way to optimize the cipher stream for shuffling, if not using ChaCha20: hmac-sha256. Sometimes it'll only need 1 hash, and the hash is more than twice as fast.

Speaking of which, if grabbed directly from pbkdf2's dklen=, how many bytes should index_seed be? It's the key to the stream cipher, so it must be 256 bits if we switch to ChaCha. The total number of possible sorts is 31! ≈ 2^113; Sha2 probably has some collisions, so you may not get every sort with 15 bytes; 16 bytes is a round number close by and as secure as the seed itself.

Or if doing index_seed = hmac(key=kdf_derived_key, message=b'Index Seed') use the entire hmac output.

The key= for the share payload stream cipher should support up to 64 bytes, which rules out ChaCha20 here, because otherwise the shares produced would have less entropy than the master_seed when they didn't need to.


from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

This is not currently possible because we used PBKDF2(master_seed, header) to get the randomness that sets the share payloads and the stream cipher key for the index shuffle. The other option was advantaging someone who has part of your seed (i.e., 1.3 shares at threshold 2), because the share payloads could be checked faster than an address or fingerprint. Or copying BIP32, but that was called hard to implement correctly.

I no longer believe this is an advantage. I think we could just use an HMAC, or maybe an HMAC iterated a dozen times, and then checking the 20 bits would be (basically) just as slow as computing an entire address.

If we were to use an HMAC here, then the PBKDF would only be used to produce the actual seed data, and this doesn't really need to be deterministic, the user just needs to somehow know that they got enough entropy into it.

I have a feeling share data should be produced with a stream cipher too. Not because an unlimited amount is needed, but because the amount needed ranges from tiny 128-bit to 4096-bits (512-bits * 8) and it's cleaner to ask for the required bytes from a stream cipher than make implementers remember to iterate their hash enough times for all threshold-1 * len(master_seed) possibilities.

I think we should use an HMAC keyed on the share index, since we always need a fixed amount of data per index, and using an HMAC gets us close to the "as expensive as doing a BIP32 derivation" ideal. If we were to use chacha20, it would be faster for somebody to grind through secrets trying to get the share, than it would be to grind through them trying to get an address.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

I no longer believe this is an advantage. I think we could just use an HMAC, or maybe an HMAC iterated a dozen times, and then checking the 20 bits would be (basically) just as slow as computing an entire address.

Isn't this extremely close or identical to derived_key = pbkdf2('sha512', header, master_seed, 12)?

Also, I see pbkdf2 pre-hashes passwords that are longer than the hash's block size, so only the header b'ms1cashs' should be used as the password (or message, if raw HMAC); otherwise 32+ byte codex32_secrets have an extra step. Likewise, it means using 'sha512' to support the 64-byte master_seed option without entropy-reducing pre-processing.

If we were to use an HMAC here, then the PBKDF would only be used to produce the actual seed data, and this doesn't really need to be deterministic, the user just needs to somehow know that they got enough entropy into it.

Yeah that would be nice. Are you sure the pbkdf2 snippet above is not identical to dozen iterations of hmac-sha512 you have in mind? Can you share a code snippet for how you're thinking of processing the codex32_secret?

I have a feeling share data should be produced with a stream cipher too. Not because an unlimited amount is needed, but because the amount needed ranges from tiny 128-bit to 4096-bits (512-bits * 8) and it's cleaner to ask for the required bytes from a stream cipher than make implementers remember to iterate their hash enough times for all threshold-1 * len(master_seed) possibilities.

I think we should use an HMAC keyed on the share index, since we always need a fixed amount of data per index, and using an HMAC gets us close to the "as expensive as doing a BIP32 derivation" ideal. If we were to use chacha20, it would be faster for somebody to grind through secrets trying to get the share, than it would be to grind through them trying to get an address.

So HMAC-SHA512(key=derived_key, message=b'ms1cashq')[:len(master_seed)] or HMAC-SHA512(key=master_seed, message=b'ms1cashq')[:len(master_seed)]?

Truncating hmac-sha512 to the seed length to set dependent payloads.

Even though this is extra hmacs vs making a stream cipher from hmac.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Isn't this extremely close or identical to derived_key = pbkdf2('sha512', header, master_seed, 12)?

Oh, so it is :). For some reason I thought that PBKDF2 was much more complicated, but Wikipedia says that it's more-or-less just iterating the rng function.

I think it's fine then to just use PBKDF2 everywhere as you suggest.

So HMAC-SHA512(key=derived_key, message=b'ms1cashq')[:len(master_seed)] or HMAC-SHA512(key=master_seed, message=b'ms1cashq')[:len(master_seed)]?

I think, the latter. Are we still using derived_key elsewhere? If so, then we could use it here ... but if not, we could save the implementor the trouble of computing it.

Truncating hmac-sha512 to the seed length to set dependent payloads.

Even though this is extra hmacs vs making a stream cipher from hmac.

Yep.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

I like the symmetry where the password or message parameter is always the header including the share index.

That way index "s" is used to seed the hmac stream and the other indexes are truncated and used directly as share data.
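A sketch of that symmetry, using the running 'ms1cash' example header and the BIP39-style iteration count from earlier (names assumed, not normative):

import hashlib
import hmac

derived_key = hashlib.pbkdf2_hmac('sha512', master_seed, b'ms1cashs', 2048)
# the codex32 secret's own header (index "s") keys the shuffle keystream
index_seed = hmac.new(derived_key, b'ms1cashs', hashlib.sha512).digest()
# any other index's header yields that share's payload directly, truncated
share_a = hmac.new(derived_key, b'ms1casha', hashlib.sha512).digest()[:len(master_seed)]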

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Ah, yeah, I like that too.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Oh, so it is :). For some reason I thought that PBKDF2 was much more complicated, but Wikipedia says that it's more-or-less just iterating the rng function.

There might be slight differences between what is passed each round. Pbkdf2 uses the output of the last hash as the message for the next hash, the password stays constant as the HMAC key for every iteration, and all the intermediate outputs are XORed together to form the block.
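For reference, a minimal single-block PBKDF2 that makes this concrete (it matches hashlib.pbkdf2_hmac('sha512', ...) whenever dklen is at most 64):

import hashlib
import hmac

def pbkdf2_block(password, salt, iterations):
    # U1 = HMAC(password, salt || INT(1)); the password stays the key every round
    u = hmac.new(password, salt + (1).to_bytes(4, 'big'), hashlib.sha512).digest()
    result = u
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha512).digest()  # prior output becomes the message
        result = bytes(a ^ b for a, b in zip(result, u))    # all rounds are XORed together
    return result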

I think it's fine then to just use PBKDF2 everywhere as you suggest.

So HMAC-SHA512(key=derived_key, message=b'ms1cashq')[:len(master_seed)] or HMAC-SHA512(key=master_seed, message=b'ms1cashq')[:len(master_seed)]?

I think, the latter. Are we still using derived_key elsewhere? If so, then we could use it here ... but if not, we could save the implementor the trouble of computing it.

The point of making a derived_key is to not repeat hardening on the same or nearly the same input to produce multiple output keys. From master_seed you'd need a secure amount of iterations for 'ms1cashs' for the index_seed, again for k-1 independent shares like ms1casha, ms1cashb.

Instead the iterations could just be tripled for the same creation time and triple the security.

Since the attacker is going to be checking against known indexes or a known share, that's only 1-2 pbkdf2 runs for them, not k (one for the index shuffle, k-1 for the dependent shares).

derived_key = pbkdf2('sha512', password=b'ms1cashs', salt=master_seed, iterations=k*(iter-1)) creates a set as fast as separate pbkdf2's for each payload, but it is harder to attack: all the work must be done to check whether a payload matches, and the attacker can't skip pbkdf2 rounds by checking only one share's payload or, worst case (1 pbkdf2), only the order of the first 8 indexes (if they somehow learned these).

I do like including the share index in the message or password, it requires the attacker to compute 31 shares to check a payload or to do the shuffle first.

Truncating hmac-sha512 to the seed length to set dependent payloads.

Even though this is extra hmacs vs making a stream cipher from hmac.

Yep.

Sounds good.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

The point of making a derived_key is to not repeat hardening on the same or nearly the same input to produce multiple output keys. From master_seed you'd need a secure amount of iterations for 'ms1cashs' for the index_seed, again for k-1 independent shares like ms1casha, ms1cashb.

Instead the iterations could just be tripled for the same creation time and triple the security.

Ah, yeah, these are good points. Though as a point of terminology I think we should refer to the input to the pbkdf as entropy or something, and reserve the term "master seed" for the s share, since that's what actually gets used as the "master seed" in BIP32.

So agreed, let's make a derived_key, which I guess we should define as some xprv that users are super unlikely to derive for other reasons?

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

we should refer to the input to the pbkdf as entropy or something, and reserve the term "master seed" for the s share

When using the more efficient and secure derived_key approach, the entropy input to the pbkdf2 is always the header of the codex32_secret ("ms1idens"), and master_seed is the same as "S" from bip32:

From BIP32:

Generate a seed byte sequence S of a chosen length (between 128 and 512 bits; 256 bits is advised) from a (P)RNG.
Calculate I = HMAC-SHA512(Key = "Bitcoin seed", Data = S)

As long as you don't change the codex32 header to be "Bitcoin seed" instead of "ms1seeds" and set iterations=1, derived_key will never be their extended private masterkey / xprv.

From HMAC:

K' = H(K) if K is larger than block size
K' = K otherwise
K' is a block-sized key derived from the secret key, K; either by padding to the right with 0s up to the block size, or by hashing down to less than or equal to the block size first and then padding to the right with zeros.

It is counter-intuitive that BIP32 uses the secret master_seed as 'data' and a public constant as 'key'. My suspicion why is so that it handles any arbitrary length of S, because K gets hashed or padded.
It would be wrong if 0xdeadbeef produced the same master xprv as 0xdeadbeef000000. The hardened children also use the "public" data as Key= and the private key data as Data=. Maybe when the hmac hash output is a secret, it's better to reverse them?

If so (hardened child): let I = HMAC-SHA512(Key = cpar, Data = 0x00 || ser256(kpar) || ser32(i)).

Speaking of which, we need to define which pbkdf2('sha512', password=?, salt=?) parameter is master_seed and which is header:

Using HMAC to derive keys from our KDF "master" derived_key may be non-standard:

And then for the index_seed and share_payload(), there's the question of whether derived_key and the header (including index) is the key or the message.
Typically HMACs are sent with the data they authenticate and the key is secret, so that would mean data=header, key=derived_key.

The correct primitive to use to derive share payloads from key-ring quality material appears to be HKDF:
https://en.wikipedia.org/wiki/HKDF

It supports a salt, context data and variable length and HKDF is standardized, avoiding custom stuff again.

The extra context field isn't needed for shares but could protect implementations from developer typos. ie context="Share data 1"

And lets the keystream generator use the derived_key directly without chance of rolling over the ms1seeds "s" character and reusing a share's entropy in a cataclysmic shuffling event.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

When using the more efficient and secure derived_key approach, the entropy input to the pbkdf2 is always the header of the codex32_secret ("ms1idens"), and master_seed is the same as "S" from bip32:

I'm a little lost. This sounds fine to me but then how is master_seed derived? I thought the s share was the output of the PBKDF.

Speaking of which, we need to define which pbkdf2('sha512', password=?, salt=?) parameter is master_seed and which is header:

I think we should do the "obvious" thing and say password is master_seed and salt is the header. BIP32 makes confusing choices ... your theory that they did this to avoid hashing the secret sounds plausible to me. The chaincodes in BIP32 come from a similar "entropy accounting" reasoning. I don't think this reasoning is useful, though it's certainly not harmful, other than causing some confusion. Pieter has suggested to me (and, I think, in public posts on stackexchange and elsewhere) on a number of occasions that if he were designing BIP32 today, he wouldn't have bothered with this stuff.

The correct primitive to use to derive share payloads from key-ring quality material appears to be HKDF:

Great find! This sounds perfect.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

As for pbkdf2 parameters:
From RFC 2898:

The salt can be viewed as an index into a large set of keys derived
from the password, and need not be kept secret.

This suggests salt should include the identifier.

From RFC 8018, PKCS #5: Password-Based Cryptography Specification Version 2.1:

For instance, one might derive a set of keys with a single application of a key derivation function, rather than derive each key with a separate application of the function. The keys in the set would be obtained as substrings of the output of the key derivation function. This approach might be employed as part of key establishment in a session-oriented protocol.

This shows my original approach of taking substrings of the pbkdf2 derived_key is a standardized way to use KDF outputs.

If there are concerns about interactions between multiple uses of the same key:

the salt should contain data that explicitly distinguishes between different
operations and different key lengths, in addition to a random part that is at
least eight octets long, and this data should be checked or regenerated by
the party receiving the salt.

In PBKDF2 it's the salt (with a block index appended) that gets passed as the "Data/Message" to the first HMAC, and the password that is the key. If my explanation why bip32 uses Data=S is correct, it seems better:

  • Don't pass master_seed as the Key so that it does not get padded or hashed
    • Adding q's to the payload could produce the same derived_key
    • Gets reused on subsequent re-shares with new headers.

I had guessed password = message in PBKDF2, but it's the reverse: the password is the HMAC key (so it gets padded or pre-hashed), while salts are supposed to be unique per password hash, see "viewed as an index" above.

And applicable for us, since this has to be deterministic:

Note: If a random number generator or pseudorandom generator is not
available, a deterministic alternative for generating the salt (or
the random part of it) is to apply a password-based key derivation
function to the password and the message M to be processed. For
instance, the salt could be computed with a key derivation function
as S = KDF (P, M). This approach is not recommended if the message M
is known to belong to a small message space (e.g., "Yes" or "No"),
however, since then there will only be a small number of possible
salts.

If we won't take substrings of PBKDF2's output, so that it's a simpler implementation with hash_name='sha512', dk_len = 64, then HKDF is the way to expand that keying material into multiple keys. But again, this is probably just recreating what pbkdf2 does when dk_len > hash_len.

From RFC 5869 for HKDF:

We only need the "expand" stage to turn derived_key into the payloads

In some applications, the input may already be a good pseudorandom
key; in these cases, the "extract" stage is not necessary, and the
"expand" part can be used alone.

The second stage "expands" the pseudorandom key to the desired
length; the number and lengths of the output keys depend on the
specific cryptographic algorithms for which the keys are needed.

import hashlib
import hmac

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    temp = b""
    output_key_material = b""
    i = 0
    while len(output_key_material) < length:
        i += 1
        temp = hmac.new(prk, temp + info + bytes([i]), hashlib.sha256).digest()
        output_key_material += temp
    return output_key_material[:length]

3.2. The 'info' Input to HKDF

While the 'info' value is optional in the definition of HKDF, it is
often of great importance in applications. Its main objective is to
bind the derived key material to application- and context-specific
information. For example, 'info' may contain a protocol number,
algorithm identifiers, user identities, etc. In particular, it may
prevent the derivation of the same keying material for different
contexts (when the same input key material (IKM) is used in such
different contexts). It may also accommodate additional inputs to
the key expansion part, if so desired (e.g., an application may want
to bind the key material to its length L, thus making L part of the
'info' field). There is one technical requirement from 'info': it
should be independent of the input key material value IKM.

We specify an Info field to produce the "Share data" or "Index seed"

I think this could be easier to follow than slicing a larger pbkdf2 dk_len to produce index_seed and share_data. But both are pretty easy to implement.
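For comparison, a sketch of the two options side by side (master_seed and header are assumed from earlier in the thread; k is an int here; hkdf_expand() is the snippet above; 16-byte seeds assumed):

import hashlib

# Option A: slice one long PBKDF2 output (dk_len greater than the hash length)
okm = hashlib.pbkdf2_hmac('sha512', master_seed, header, 2048, dklen=16 * k)
index_seed, share_data = okm[:16], okm[16:]  # share_data holds k-1 concatenated payloads

# Option B: derive one hash-length key, then expand it per purpose
prk = hashlib.pbkdf2_hmac('sha512', master_seed, header, 2048)
index_seed = hkdf_expand(prk, b'Index seed', 16)
shares = [hkdf_expand(prk, b'Share data %d' % i, 16) for i in range(k - 1)]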

Additional feature (I need it for my project):

Another aspect that the deterministic implementation should support is optionally passing pre-existing shares to leave unchanged in the new set. This means parsing the strings for their indexes and deleting them from CHARSET before the shuffle. Just like we already do with the codex32 secret and its index "s". And outputting fewer new independent deterministic shares.

This is generalizing the function from turning 1 codex32 string into a deterministic share set, to turning any codex32_strings_provided <= k into an auditable deterministically generated and shuffled share set.

An example application is if the user has memorized a "brain share" and doesn't wish to change it. Or to re-share and reuse k-2 existing shares. Or produce decoy seeds & shares from k-1 correct shares, etc.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

*One of the up to k codex32 strings provided MUST be the codex32 secret. Only index "s" can be fed to PBKDF2 to derive shares!

The provided shares are only used to delete indexes, decrement the quantity of new independent shares to create, and then, together with the secret and the new independent share(s) (if any), to derive the rest of the set. This means the concept of "secure re-shares require a new identifier" still applies.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

When using the more efficient and secure derived_key approach,entropy input to the pbkdf2 is always header of the codex32_secret ("ms1idens") and master_seed the same as "S" from bip32:

I'm a little lost. This sounds fine to me but then how is master_seed derived? I the s share was the output of the PBKDF.

The BIP32 seed is produced with User_Entropy (dice rolls, key mashing, coin flips, etc.) and unconditionally displayed App_Entropy (a 64-byte pseudo-random number from the device). Then they are passed through as computationally expensive a KDF as the hardware supports.

In order of preference (a standard-library sketch of option 2 follows the list):
1. Argon2id using half the device's memory and 10 seconds as the time cost.
2. Scrypt using half the device's memory; it needs iterating to increase time cost and is not recommended for low-memory devices.
3. PBKDF2 with iterations taking approximately 10 seconds on the device's hardware.
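A minimal sketch of option 2 using only the standard library; the function name and parameter values are illustrative assumptions, not part of the spec:

import hashlib

def master_seed_from_entropy(user_entropy, app_entropy, seed_length=16):
    # scrypt's n parameter sets the memory cost (128 * r * n bytes, 1 GiB here);
    # raise n, or iterate the call, on stronger hardware for more time cost
    return hashlib.scrypt(user_entropy, salt=app_entropy,
                          n=2 ** 20, r=8, p=1, maxmem=2 ** 31, dklen=seed_length)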

master_seed can also be created from debiased dice rolls as per the current booklet for hand generation. This is the BIP32 master seed, so how it is generated is not pertinent to the share set generation. The user may already have a BIP32 master_seed they are using that they want to split into a codex32 SSS backup for example.

Edit: I think to help with these questions, I'll fork this repo and add a Python reference for auditable implementations. Commenting out parts still under consideration or displaying multiple implementation options.

Then we can do Draft reviews until it's Ready. This thread is unwieldy but the code explains itself much more simply than words do.

For example here:
https://github.com/BlockstreamResearch/codex32/tree/master/reference/python-codex32/src/lib.py

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

https://github.com/BenWestgate/codex32/tree/master/reference/python-codex32/src

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Okay, I solved a great problem:

password=master_seed indeed has the flaw that zero padding it without changing the header won't change the derived key.

Which is a problem because it changes the master xprv (since bip32 has them reversed).

len(master_seed) in salt was my original quick fix, and perhaps it still belongs, since it's data revealed in the shares. What would be even better is adding the master pubkey fingerprint after the codex32 secret header.

salt=header||fingerprint

A salt is supposed to include a pseudorandom component anyway. And this one is expensive.

Alternatively, salt=master_seed password=header avoids the flaw.

Salts aren't supposed to be reused (they would be here when re-sharing a secret).

Adding len(master_seed) to salt doesn't fix its reuse across different passwords (master_seeds).

Adding the fingerprint is better: the salt will always be unique per unique master_seed OR header.
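A sketch of that salt construction; master_fingerprint() is a hypothetical helper standing in for a BIP32 library call, and k / identifier are the usual header fields:

import hashlib

header = b'ms1' + k.encode() + identifier.encode() + b's'
# the salt is public data: the codex32 secret header plus the (expensive) fingerprint
salt = header + master_fingerprint(master_seed)  # master_fingerprint() is assumed, not a real API
derived_key = hashlib.pbkdf2_hmac('sha512', master_seed, salt, 2048)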

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Wow, this padding in HMAC is a huge footgun. Good catch. I agree with you then that we should only use fixed/non-secret data for the "key" and put our actual key data into "data".

It also sounds like you have good justification from RFCs for all of the choices you're making, which is great because this has gotten complicated enough that it's quite hard to vet.

This is generalizing the function from turning 1 codex32 string into a deterministic share set, to turning any codex32_strings_provided <= k into an auditable deterministically generated and shuffled share set.

Innteresting. So, there is a danger here that the user can abuse this to create multiple sets that share some shares (but not others). I don't mind supporting this but we should have some pretty strong warnings saying not to do this, and if they want to have the same person participate in multiple sharings, they should give that person multiple shares.

There is also a danger that users would do this accidentally; they generate some shares from s as expected, then later they don't have s but do have a bunch of shares, so they try to regenerate the original shares from that. Or maybe they do have s, but also have a, and think "I've got this extra data, might as well enter it right?" and then accidentally generate new shares instead of regenerating old ones.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Wow, this padding in HMAC is a huge footgun. Good catch. I agree with you then that we should only use fixed/non-secret data for the "key" and put our actual key data into "data".

Putting len(master_seed) and a bip32 fingerprint into the salt also solves the password=master_seed 0 padding issue. And ticks more RFC compliance boxes (pseudorandom salt component, salt can be public, salt is never reused across different passwords, password remains secret.).

Innteresting. So, there is a danger here that the user can abuse this to create multiple sets that share some shares (but not others).

Yeah I agree, changing the default identifier could avoid this danger.

It's not the master_seed that defines whether shares are compatible but rather the threshold independent codex32 strings used to interpolate the rest of the backup.

If only the codex32 secret is passed, identifier can default to use the bip32 fingerprint.

If shares are passed, a hash of k shares data (ie data of share a, b and c if k = 3) is a more powerful ID for preventing improper share combinations.

This creates a circular reference if the identifier is in salt and determines the share data for a master_seed. But then the share data provided needs to determine the identifier.

I'll have to think about this more. But a hash of k shares data is definitely the safest default ID for backups made by specifying shares and their data.

I don't mind supporting this but we should have some pretty strong warnings saying not to do this, and if they want to have the same person participate in multiple sharings, they should give that person multiple shares.

Absolutely, and if the default output involves "re-labeling" the data provided, then the need to make a new share set will be obvious. It's the identifier (salt?) reuse that is dangerous.

There is also a danger that users would do this accidentally; they generate some shares from s as expected, then later they don't have s but do have a bunch of shares, so they try to regenerate the original shares from that.

If they have k shares they can already recover everything.

The function is called generate_NEW_shares(), it's not for recovering the original set, we already have that in Python and Rust with ms32_interpolate.

You shouldn't be able to generate a new share set without providing the codex32 secret. If you pass k shares you can calculate the secret but have no degrees of freedom to create any new data, so it'd just set the correct identifier and deterministic shuffle for displaying the dependent shares.

Or maybe they do have s, but also have a, and think "I've got this extra data, might as well enter it right?" and then accidentally generate new shares instead of regenerating old ones.

Titling the functions:

Recover Old Shares
Generate New Backup

Is very important to prevent these mistakes, but if you only pass codex32 string data from an existing backup it made, it ought to produce the exact same backup, as it's deterministic. Unless you override the identifier. Then it uses the provided data and fills in the rest. (That's what it's doing in the first case too but you provided identical data as it was going to generate had you not provided it.)

Only if you give it share data that's not from the default output of a given master_seed and header will it make something unique. (And relabel it)

This will make more sense if I try to code it first.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Here's my first stab at a reworked Generating Shares, For a fresh master seed:

import hashlib
import hmac

def fresh_master_seed(bitcoin_core_entropy, user_entropy='', seed_length=16, k='2', identifier=''):
    """Derive a fresh master seed of seed_length bytes with optional user-provided entropy."""
    share_list = []
    share_payload = []
    if seed_length < 16 or seed_length > 64:
        return None
    if int(k) < 2 or int(k) > 9:
        return None
    letters = sorted(filter(lambda c: c.isalpha(), CHARSET))
    derived_key = hashlib.scrypt(password=bytes(user_entropy + str(seed_length) + k + identifier, "utf"),
                                 salt=bytes(bitcoin_core_entropy, "utf"), n=2 ** 20, r=8, p=1,
                                 maxmem=1025 ** 3, dklen=64)
    relabel = not identifier           # an identifier must be derived later if none was given
    identifier = identifier or 'temp'  # placeholder until we have better info

    for i in range(int(k)):
        # generate random initial shares
        share_index = letters[i]
        header = 'ms1' + k + identifier + share_index
        info = str(seed_length) + '-byte share with header: ' + header
        share_payload.append(hmac.new(derived_key, info.encode(), hashlib.sha512).digest()[:seed_length])
        share_list += [encode('ms1', k, identifier, share_index, share_payload[i])]

    master_seed = recover_master_seed(share_list)
    if relabel:
        salt = b''.join(share_payload)
        identifier_data = hashlib.pbkdf2_hmac('sha512', master_seed, salt, 2048, dklen=3)
        new_identifier = ''.join([CHARSET[d] for d in convertbits(identifier_data, 8, 5)[:4]])
        share_list = relabel_shares('ms1', share_list, new_identifier)
    return master_seed, share_list

I've not implemented the shuffle yet, so it returns master_seed and the k random shares generated at indices a, c, ... etc.

This default identifier was chosen to give incompatible share sets unique IDs, even if they belong to the same master_seed.
I think the bip32 master pubkey fingerprint is more useful (for fresh master seeds, NOT re-shares); I just didn't know which bip32 library to use, or whether to attempt it myself with python's cryptography library.

This identifier is derived by pbkdf2 using BIP39's cost parameters so that valid shares can't be checked for correctness against the identifier faster than against an address.

If you like this, I'll make it the default for re-shares when identifier is not specified and suggest to use fingerprint for reshare=False (when user has an existing bip32 master_seed or extended private masterkey and wants to encode it in a codex32 share set.)

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Identifier.

If you want to use a temporary identifier like this, I'd use 1111 rather than temp. The current code, I believe, will produce the same share data for a user-provided identifier of temp as it will for a non-user-provided identifier.

But instead I'd just split up these cases, encode one of

  • Length: + hex-encoded 2-digit length with 0x prefix + ID: + identifier + Threshold: + k
  • Same as above, with "ID:+identifer" replaced with NO IDEN.

Then in both cases you have a string of the same length where all fields can be found at the same offsets and there is no way that one case can be interpreted as the other.
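A sketch of that encoding (exact syntax still open, as noted): 'ID:' plus a 4-character identifier and 'NO IDEN' are both seven characters, so the two cases can never collide and every field sits at a fixed offset. The helper name is ours, not spec.

def entropy_header(seed_length, identifier, k):
    # returns e.g. b'Length:0x20ID:TEMPThreshold:2' or b'Length:0x10NO IDENThreshold:3'
    ident = 'ID:' + identifier.upper() if identifier else 'NO IDEN'
    return ('Length:0x%02x%sThreshold:%s' % (seed_length, ident, k)).encode()

For instance, entropy_header(32, 'temp', '2') yields b'Length:0x20ID:TEMPThreshold:2', and entropy_header(16, '', '3') yields b'Length:0x10NO IDENThreshold:3'.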

PBKDF.

I'd use the full shares as salt, not just the payload. In particular you want to make sure k is somehow in there so that the total length is unambiguous. (Currently I think you can get the same salt if you have k = 4 and a 128-bit secret as if k = 2 and a 256-bit secret. Or at least, you need to depend on the security of the HMAC to guarantee this doesn't happen, which is a more complicated argument than just having the length be unambiguous.)

Other than that, this looks good to me!

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Re moving the index from position 6 to position 1, @roconnor-blockstream argues on IRC that

  1. I'm going to push back and disagree that the index is the most important. If you are assembling shares and the first 5 data characters don't agree (8 characters if you include the prefix), then you are doing something wrong. It is helpful to have all the constant characters together and at the front.
  2. Because all the constant characters are at the front, it means the first few columns of your checksum worksheet are also going to be constant. That can save a tiny amount of work.

Personally I think (2) is a big deal -- when doing the checksum worksheet by hand, it's refreshing to get a "head start" on the worksheet by having the first several lines all give the same characters. It's faster, it reassures you that you're doing the right thing (or at least, a consistent thing).

Also there is the argument that we already printed a bunch of codex32 books and can't change it :).

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Also there is the argument that we already printed a bunch of codex32 books and can't change it :).

Fair enough, the constants autocomplete anyway on electronic recovery.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Identifier.

If you want to use a temporary identifier like this, I'd use 1111 rather than temp. The current code, I believe, will produce the same share data for a user-provided identifier of temp as it will for a non-user-provided identifier.

Decode would fail with identifier = 1111 due to:

    if not all(x in CHARSET for x in bech[pos + 1:]):
        return (None, None, None, None, None)

If the user provides an identifier parameter of temp that will be the final identifier because relabel is assigned False.

    relabel = not identifier
    identifier = identifier or 'temp'

    if relabel:

Should the functions below not validate the charset?

def ms32_decode(bech):
    """Validate a MS32 string, and determine HRP and data."""
    if ((any(ord(x) < 33 or ord(x) > 126 for x in bech)) or
            (bech.lower() != bech and bech.upper() != bech)):
        return (None, None, None, None, None)
    bech = bech.lower()
    pos = bech.rfind('1')
    if pos < 1 or pos + 46 > len(bech):
        return (None, None, None, None, None)
    if not all(x in CHARSET for x in bech[pos + 1:]):
        return (None, None, None, None, None)
    hrp = bech[:pos]
    k = bech[pos + 1]
    if k == "1" or not k.isdigit():
        return (None, None, None, None, None)
    identifier = bech[pos + 2:pos + 6]
    share_index = bech[pos + 6]
    if k == "0" and share_index != "s":
        return (None, None, None, None, None)
    data = [CHARSET.find(x) for x in bech[pos + 1:]]
    checksum_length = 13 if len(data) < 95 else 15
    if not ms32_verify_checksum(data):
        return (None, None, None, None, None)
    return (hrp, k, identifier, share_index, data[:-checksum_length])


def decode(hrp, codex_str):
    """Decode a codex32 string."""
    hrpgot, k, identifier, share_index, data = ms32_decode(codex_str)
    if hrpgot != hrp:
        return (None, None, None, None)
    decoded = convertbits(data[6:], 5, 8, False)
    if decoded is None or len(decoded) < 16 or len(decoded) > 64:
        return (None, None, None, None)
    if k == "1":
        return (None, None, None, None)
    return (k, identifier, share_index, decoded)

A way to recover master_seed without a temporary identifier is to directly ms32_recover() the share payloads converted to lists of base32 ints, rather than first creating shares and then decode()ing them to recover. This might be more elegant; I can use ms32_payload_list += [convertbits(share_payload[i], 8, 5, False)] to convert bytes to lists of base-32 ints.

Would ms32_recover(ms32_list) work without the identifier in the list? This may be more elegant, do you agree?
But an advantage to keeping it encode(payload[i]) is if we make the padding a bch binary checksum the new generated shares will get it as padding is handled by encode().

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

If the user provides an identifier parameter of temp that will be the final identifier because relabel is assigned False.

Yes, and despite having a different identifier, the share data will be the same.

Decode would fail with identifier = 1111 due to:

It should fail, because what you're hashing here shouldn't be interpretable as a valid share.

A way to recover master_seed without a temporary identifier is to directly ms32_recover() the share payloads converted to lists of base32 ints rather than first create shares and then decode() those to recover. This might be more elegant, I can use ms32_payload_list += [convertbits(share_payload[i], 8, 5, False)] to convert bytes to lists of base 32 ints.

Yes, I think that would be much easier to follow. As written, you create a temporary identifier, stick this into info, stick that into an HMAC, use that to get share payloads, then recover a master seed from them. It took me a while to confirm that the "temporary" identifier was indeed determining all of the data that was being produced, and doing so in a way that would collide with non-temporary identifiers.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

But instead I'd just split up these cases, encode one of

  • Length: + hex-encoded 2-digit length with 0x prefix + ID: + identifier + Threshold: + k
  • Same as above, with "ID:+identifer" replaced with NO IDEN.

Then in both cases you have a string of the same length where all fields can be found at the same offsets and there is no way that one case can be interpreted as the other.

This is referring to this line?
info = str(seed_length)+'-byte share with header: '+header

Can you give an example info= string for each case?

The problem isn't the HMAC; message= is fine with variable length. The problem is that decode() needs a valid identifier or it fails. I call decode() in my brute-force error correction, so it should fail on this too.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Can you give an example info= string for each case?

  • Length:0x20ID:TEMPThreshold:2
  • Length:0x10NO IDENThreshold:3

The problem isn't the HMAC; message= is fine with variable length. The problem is that decode() needs a valid identifier or it fails. I call decode() in my brute-force error correction, so it should fail on this too.

The code you posted doesn't call decode anywhere that I see.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

If the user provides an identifier parameter of temp that will be the final identifier because relabel is assigned False.

Yes, and despite having a different identifier, the share data will be the same.

They will be different because the temporary identifier isn't assigned until after derived_key: identifier is still '' if it's not provided, and only then is it assigned temp. Whereas if you passed temp, then temp would be used in the scrypt KDF and give a different derived_key:

    derived_key = hashlib.scrypt(password=bytes(user_entropy + str(seed_length) + k + identifier, "utf"),
                                 salt=bytes(bitcoin_core_entropy, "utf"), n=2 ** 20, r=8, p=1, maxmem=1025 ** 3, dklen=64)
    relabel = not identifier
    identifier = identifier or 'temp'  # placeholder until we have better info

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

They will be different because the temporary identifier isn't assigned until after derived_key: identifier is still '' if it's not provided, and only then is it assigned temp. Whereas if you passed temp, then temp would be used in the scrypt KDF and give a different derived_key.

Ah, yes, I see. I didn't really look at derived_key. But now that I do, I think we should also change it to include (a) user_entropy with a length prefix, (b) seed_length in a fixed-width format, and (c) identifier (or NO IDEN) in some sort of fixed-width format.

I still think we should avoid using an actual identifier for identifier if none is provided.

TBH I have no idea what Python will do with str(seed_length + k + identifier) if identifier is None or if seed_length is an integer. The text of a specification should clarify this.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Yes, I think that would be much easier to follow.

I will try master_seed = bytes(convertbits(ms32_recover(ms32_payload_list), 5, 8, False)) in a few hours.

I will also make a general function ms32_convert_bytes() that converts from bytes to ms32_payloads that adds the padding checksum. Then that can be used in encode() as well as constructing the ms32_payload_list and converting bip32 fingerprints or the share payload KDF hash into bech32 identifiers.

I'll also push the entire library I have to reference/python-codex32/src/lib.py so you can look at it and get oriented or make progress while I'm afk.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

PBKDF.

I'd use the full shares as salt, not just the payload. In particular you want to make sure k is somehow in there so that the total length is unambiguous.

We only have full shares with identifier = 'temp' at this point. This indent block is for finding the unique ID of the payload set, so all compatible sets will always get the same id.
Are you suggesting using the placeholder?

ALSO, getting the default unique identifier for a distinct set of k shares should be in its own function, as this is not the only place we'll need it. (An existing master seed needs it as well if the parameter reshare = True.)

(Currently I think you can get the same salt if you have k = 4 and a 128-bit secret as if k = 2 and a 256-bit secret. Or at least, you need to depend on the security of the HMAC to guarantee this doesn't happen, which is a more complicated argument than just having the length be unambiguous.)

That's a good point, those shouldn't give the same identifier because they're incompatible sets.

Here's what you define as a set that can be combined:

In order to recover a master seed, one needs a set of valid codex32 shares such that:

All shares have the same threshold value, the same identifier, and the same length.
All of the share index values are distinct.
The number of codex32 shares is exactly equal to the (common) threshold value.

Here's what we have:

A known threshold value, an unknown identifier, a known length
Known indexes, I made these always be first letters as they must be consistent to give the same identifier for a combinable set.
A number of codex32 payloads exactly equal to the (common) threshold value.

Let's write a function that takes k, seed_length, and a list of indexed payloads with len(indexed_payloads) == k, and returns a unique bech32 identifier that will always be identical when the indexed_payloads belong to the same order-k polynomial and seed_length, regardless of which indexed_payloads are passed. It will first ms32_interpolate to get the ms32_indexed_payload for indexes a, c, d ... etc. until it has k many, then convert them to bytes and PBKDF2 them along with k and seed_length:

For example:

salt=bytes('Threshold: '+k+', Length: '+str(seed_len)+'-bytes, Payloads: ', 'utf')+b''.join(fixed_index_payloads)

It may be better that the first known index used is 's', followed by 'a', 'c', 'd' ... etc., as we cannot set this identifier without knowing master_seed, because the data that goes into the hash defines a master_seed and a particular combinable indexed_payload set of it. The codex32 secret is a codex32 string like any other, so there's no reason this shouldn't also produce an ID for k = 0, a length, and the 's' indexed_payload master_seed.
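A sketch of that function under the assumptions above (CHARSET and convertbits() come from the reference library; payloads is the list of k canonical-index payload byte-strings, starting with the 's' payload, i.e. master_seed):

import hashlib

def set_identifier(k, seed_len, payloads):
    # the salt commits to the threshold, the length, and the k canonical payloads
    salt = bytes('Threshold: ' + k + ', Length: ' + str(seed_len) + '-bytes, Payloads: ', 'utf') + b''.join(payloads)
    # payloads[0] is the 's' payload (master_seed), used as the password
    data = hashlib.pbkdf2_hmac('sha512', payloads[0], salt, 2048, dklen=3)
    return ''.join(CHARSET[d] for d in convertbits(data, 8, 5)[:4])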

What do you think?

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

We only have full shares with identifier = 'temp' at this point. This indent block is for finding the unique ID of the payload set. So all compatible sets will always get the same id.
Are you suggesting using the placeholder?

Ahh, right. I am tempted to use the placeholder, yeah.

In fact, I would define an "entropy header" up-front as the Length:0x20ID:TEMPThreshold:2 string (I'm not married to the exact syntax, but what I do want is (a) no ambiguity between the identifier/no identifier case, (b) all fixed width fields) and then use this repeatedly. Both to derive the share data and then again to salt the PBKDF.

Let's write a function that takes k, seed_length, and a list of indexed payloads of len(indexed_payloads) == k and returns a unique bech32 identifier that will Always be identical if the indexed_payloads belong to the same order k polynomial and seed_length.

Ah, yep, this sounds like a good abstraction. I think it should also take an optional identifier so that it can compute and use the "entropy header".

So it will first ms32_interpolate to get ms32_indexed_payload for index a, c, d ... etc until it has k many then convert to bytes to PBKDF2 them along with k and seed_length:

Nice. I was thinking "just use s" but you're correct that we need to encode k shares. I'm indifferent as to whether we should start with s or not. Probably that's a good idea since it emphasizes that knowing this data requires that you know s, and maybe encourages implementors to be a little more careful with it.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Can you give an example info= string for each case?

  • Length:0x20ID:TEMPThreshold:2
  • Length:0x10NO IDENThreshold:3

Needs share_index or all payloads will be identical.

On fixed length:
We learned the other day that it's the password in a KDF and the key in an HMAC that are zero-padded (or pre-hashed) to the hash function's block size. The salt is passed as the message and can be variable length with no issue.

Did you choose hex for the length because other languages are more likely to have the length available as hex? seed_length is an int in my function, and putting 'Length: 16', 'Length: 32', 'Length: 64' conveys the same info in fewer characters.

Here's a counter-proposal:

info = 'Share Length:'+str(seed_length)+'Threshold:'+k+'Index:'+share_index+'Indentifier:'+indentifier

The problem isn't the HMAC; message= is fine with variable length. The problem is that decode() needs a valid identifier or it fails. I call decode() in my brute-force error correction, so it should fail on this too.

The code you posted doesn't call decode anywhere that I see.

encode() calls decode() to self-validate, sorry, I copied pieter's segwit_addr.py as closely as possible when I wrote this.

def encode(hrp, k, identifier, share_index, payload):
    """Encode a codex32 string"""
    if share_index.lower() == 's':  # pad seeds with bits of a double-SHA256 hash
        checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    else:
        checksum = b''  # TODO: use a Reed-Solomon or BCH binary ECC code for padding.
    # keep only as many 5-bit symbols as the padded conversion of payload alone,
    # so the checksum bits only fill the final partial symbol
    data = convertbits(payload + checksum, 8, 5, False)[:len(convertbits(payload, 8, 5))]
    ret = ms32_encode(hrp, [CHARSET.find(x.lower()) for x in k + identifier + share_index] + data)
    if decode(hrp, ret) == (None, None, None, None):  # self-validate, as in segwit_addr.py
        return None
    return ret

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Needs share_index or all payloads will be identical.

Ah, yep.

On fixed length: We learned the other day, it's password in KDF and key in HMAC that are zero padded (or pre-hashed) to the hash function block size. The salt is passed as the message and can be variable length with no issue.

Fixed length is not about padding. It's to ensure that the data fed into a hash function can be uniquely parsed, without any subtle or fragile analysis.

Did you choose Hex for the length because other languages are more likely to have the length available as hex? Seed_length is an int in my function and putting 'Length: 16', 'Length: 32', 'Length: 64' conveys the same info with less characters.

I chose hex because it's more familiar and easier to reason about than decimal. We could drop the 0x prefix. But I guess if our lengths are always between 16 and 64 inclusive, decimal is just as easy to use.

Here's a counter-proposal:

Sounds good! Except identifier is spelled wrong :).

encode() calls decode() to self-validate, sorry, I copied pieter's segwit_addr.py as closely as possible when I wrote this.

This makes sense -- but in general, if something is not intended to be a share, I really think that it should not parse as a share.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Ah, yes, I see. I didn't really look at derived_key. But now that I do, I think we should also change it to include (a) user_entropy with a length prefix, (b) seed_length in a fixed-width format, and (c) identifier (or NO IDEN) in some sort of fixed-width format.

If the user entropy is provided as a UTF string, the length prefixing is unnecessary: 'password0' derives a different key from 'password00' and anything else they could type. If you think bytes should be able to be passed to this for some reason (what human speaks bytes?), then it may need a length prefix if it has the same padding/pre-hashing "feature/bug" as PBKDF2.

I'm quite fine with a long user_entropy getting pre-hashed (if scrypt indeed does this). In 99% of circumstances the bitcoin_core_entropy will be more random, unique, and worth fully preserving, but in the worst case the master xprv can be deserialized to swap password and salt without losing any information from either.

seed_length is already passed to scrypt, in that typo you found below:

password=bytes(user_entropy + str(seed_length) + k + identifier), "utf")

I still think we should avoid using an actual identifier for identifier if none is provided.

TBH I have no idea what Python will do with str(seed_length + k + identifier) if identifier is None or if seed_length is an integer. The text of a specification should clarify this.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

If the user entropy is provided as a utf string, the length prefixing is unnecessary, 'password0' derives a different key from 'password00' and anything else they could type. If you think bytes should be able to be passed to this for some reason (what human speaks bytes?) then it may need a length prefix if it has the same padding/pre-hashing "feature/bug" as PBKDF2.

We should have a length prefix. Otherwise the parse-unambiguity of the data that we are parsing depends on the fixed-lengthedness of our other data, which is subtle and fragile. I also do not think the ambiguity should depend on any assumptions about user behavior.

We could also pre-hash the data, which would have the same effect.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

We should have a length prefix. Otherwise the parse-unambiguity of the data that we are parsing depends on the fixed-lengthedness of our other data, which is subtle and fragile. I also do not think the ambiguity should depend on any assumptions about user behavior.

You would never "parse" the password; it is simply the user's keyboard mashing or coin flips or dice rolls, and it is forgotten after deriving derived_key. Only someone auditing their wallet will need to re-enter it on a new wallet/device.

scrypt Parameters

The scrypt function takes several parameters. The passphrase P is
typically a human-chosen password. The salt is normally uniquely and
randomly generated [RFC4086].

If the parameters are unintuitive, it's harder to audit. Password is designed to accept human-chosen data.

We could also pre-hash the data, which would have the same effect.

https://datatracker.ietf.org/doc/html/rfc7914.html#section-6

scrypt behaves as PBKDF2 does: it will pre-hash passwords longer than the block length of HMAC-SHA256, which is 64 bytes, and it will pad shorter ones.
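That HMAC key rule is easy to check directly (a quick demonstration of standard HMAC behavior, not codex32 code):

    import hashlib, hmac

    long_key = b'k' * 100  # longer than SHA-256's 64-byte block
    assert hmac.new(long_key, b'msg', hashlib.sha256).digest() == \
           hmac.new(hashlib.sha256(long_key).digest(), b'msg', hashlib.sha256).digest()

Keys longer than the block size are replaced by their hash before the zero-padding step, which is the pre-hashing referred to above.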

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

It's true you would never actually parse it. But if you're putting it into a hash function, it needs to be parseable. This is a general rule of thumb for designing cryptosystems, to ensure that it is impossible for two different things to be encoded the same way (and therefore result in the same hash).
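A two-line illustration of the rule (a generic example, not from the implementation):

    import hashlib

    # two different field pairs, one serialization, one hash:
    assert hashlib.sha256(b'pass' + b'word').digest() == hashlib.sha256(b'passw' + b'ord').digest()

    # a length prefix makes the encoding injective:
    def with_len(*fields):
        return b''.join(len(f).to_bytes(1, 'big') + f for f in fields)

    assert with_len(b'pass', b'word') != with_len(b'passw', b'ord')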

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

In fact, I would define an "entropy header" up-front as the Length:0x20ID:TEMPThreshold:2 string (I'm not married to the exact syntax, but what I do want is (a) no ambiguity between the identifier/no identifier case, (b) all fixed width fields) and then use this repeatedly. Both to derive the share data and then again to salt the PBKDF.

Agreed, we should set this local variable up front and reuse it and/or pass it to the proposed get_identifier() function. It can be fixed length by filling in a placeholder of equal length when there is no identifier.

Nice. I was thinking "just use s" but you're correct that we need to encode k shares. I'm indifferent as to whether we should start with s or not. Probably that's a good idea since it emphasizes that knowing this data requires that you know s, and maybe encourages implementors to be a little more careful with it.

We're going to use 's' as the first share_index because it makes the existing-master-seed case easier. I'm also sick of parsing the sorted alphabet, so I'll be using CHARSET[i] to get the initial indexes for generating the set or computing the default identifier.

Sounds good! Except identifier is spelled wrong :).

I do this several times a day. Do you have a suggested alternative variable name, or should I just keep spell-checking? (Implementers are going to typo it too if using a new language.) I've considered id and ident but I always changed them back.

This makes sense -- but in general, if something is not intended to be a share, I really think that it should not parse as a share.

That's why we need a function that does general bytes-to-ms32 conversion. I'll make it soon, as it gets used in at least 4 different places: fingerprint > bech32 id, unique combinable share set hash > bech32 id, indexed_payload > ms32_indexed_payload for ms32_recover(), and master_seed > ms32_indexed_payload for ms32_interpolate().

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

We're going to use 's' as the first share_index because it makes existing master seed easier. I'm also sick of parsing the sorted alphabet, so I'll be using CHARSET[i] to get the initial indexes for generating the set or computing the default identifier.

Sounds good to me.

I do this several times a day. Do you have a suggested alternative variable name or just keep spell checking. (Implementers are going to typo it too if using a new language.) id or ident i've considered but I always changed them back.

We could just misspell it, like referer in HTTP ;).

I would suggest ident or id. Either is fine by me, and both are much easier to spell. If you can't decide, let's go with ident.

That's why we need a function that does general bytes to ms32 converting.

Yep, I think that's generally useful. rust-bech32 has such a function (well, an iterator adaptor). I'm not really sure what exists for Python.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

I would suggest ident or id. Either is fine by me, and both are much easier to spell. If you can't decide, let's go with ident.

I will refactor to 'ident'. It saves characters; some of the lines exceed 100 characters for a single operation. I had not written those lines when I rejected it previously.

That's why we need a function that does general bytes to ms32 converting.

Yep, I think that's generally useful. rust-bech32 has such a function (well, an iterator adaptor). I'm not really sure what exists for Python.

convertbits(your_bytes_to_convert, 8, 5) does it. It doesn't care whether it's given bytes or lists of ints 0-255.
I modified it from the segwit_addr.py version to accept any padding as valid rather than only zeros.

But we need to take it to the next step by always applying ECC coding as padding if pad=True and tobits=5.

def convertbits(data, frombits, tobits, pad=True):
    """General power-of-2 base conversion."""
    acc = 0
    bits = 0
    ret = []
    maxv = (1 << tobits) - 1
    max_acc = (1 << (frombits + tobits - 1)) - 1
    for value in data:
        if value < 0 or (value >> frombits):
            return None
        acc = ((acc << frombits) | value) & max_acc
        bits += frombits
        while bits >= tobits:
            bits -= tobits
            ret.append((acc >> bits) & maxv)
    if pad:
        if bits:
            ret.append((acc << (tobits - bits)) & maxv)
    elif bits >= frombits:
        return None
    return ret
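For example, a 16-byte seed round-trips through 26 five-bit symbols, with the last 2 bits of the final symbol being padding:

    symbols = convertbits(bytes(range(16)), 8, 5)                # 128 bits -> 26 symbols
    assert len(symbols) == 26
    assert convertbits(symbols, 5, 8, False) == list(range(16))  # padding bits dropped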

Regarding the padding:

  • The 1st and 2nd bit together should guarantee detecting 1 bit-error, 2 if sequential, in 128-bit data.
  • The 3rd bit in combination with the first two should guarantee detecting 2 bit-errors, 3 if sequential and correcting 1 bit-error in 512-bit data.
  • The 4th bit in combination with the first three should guarantee detecting 4 sequential bit-errors.
  1. All of these should be independent of the codex32 checksum to add to its correction capacity.
  2. All of these should ms32_interpolate() and remain valid checksums for the interpolated data. Linear codes???

The pad_len = 1 bit case can truncate the 2-bit code.

I'd rather @apoelstra try to do this; otherwise I'm likely to miss quality 1, 2, or both. I think 1 has something to do with codex32 generator and padding generator not sharing coefficients? But as for 2, ms32_interpolate()ing the 2-, 3- or 4-bit code and getting another valid code, I'm lost besides the phrase "linear codes" and would have to trial and error it.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

The 1st and 2nd bit together should guarantee detecting 1 bit-error, 2 if sequential, in 128-bit data.

This is easy; you can, for example, just xor every bit together for bit 1, and xor every other bit for bit 2.

The 3rd bit in combination with the first two should guarantee detecting 2 bit-errors, 3 if sequential and correcting 1 bit-error in 512-bit data.

I don't think this is possible because it would violate the Hamming bound (which is unlimited if you are trying to get distance 1 or 2, i.e. detect 1 error but correct none, but for distance >2 it implies that for length 2^n you need at least n parity bits ... so for length >512 you need at least 9 parity bits).

Similarly for your goals with the 4th bit.

I think 1 has something to do with codex32 generator and padding generator not sharing coefficients?

Well, the two codes are over different fields so there's no way they can "share coefficients" and really no way to algebraically relate them at all. By the Hamming bound again I definitely can't make the 2-bit code purely additive like you want. The best we can do is make it independent of the original code (which my simple parity code is) so that if you have a random error pattern that somehow tricks the real code, with probability 3/4 it won't trick the small code.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

I tried your formula and it looks like 4 bits should be able to correct a bit-error in 256-bit data.

The 3-bit code just needs to detect as many contiguous 3 bit-errors as possible.

By the Hamming bound again I definitely can't make the 2-bit code purely additive like you want.

meaning ms32_interpolate() will produce shares with invalid ecc padding?

Even the parity bit is not additive? In this way? (I'm going to start experimenting when I get home unless you're certain.)

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

I tried your formula and it looks like 4 bits should be able to correct a bit-error in 256-bit data.

The actual formula is the Hamming bound. For binary codes q = 2, and n is the total length of your data including checksum (so 260). To correct a single error you need d = 3, so t = 1, and the formula simplifies to give a total code size of 2^260 / (1 + 260) = 2^251.97. Unfortunately not enough to fit 256 bits of data.

The approximate formula I gave was that for length 256 = 2^8 you would need 8 bits to get distance 3, so even the approximate one suggests that this won't work.
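Written out in full, the bound being applied is (with q = 2, n = 260, t = 1, exactly as computed above):

    2^k \le \frac{2^n}{\sum_{i=0}^{t} \binom{n}{i}}
        = \frac{2^{260}}{\binom{260}{0} + \binom{260}{1}}
        = \frac{2^{260}}{261} \approx 2^{251.97}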

The 3-bit code just needs to detect as many contiguous 3 bit-errors as possible.

I think the simplest approach is:

  • For one bit do a parity check
  • For two bits, do a parity check on the odds and a parity check on the evens
  • For three, do a parity check every third, then every third +1, then every third +2
  • In general, if you've got n bits, use them to interleave parity checks.

This approach is guaranteed to detect 1 error (which is the best the Hamming bound will let us do, for low numbers of bits) and to detect n consecutive errors.
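A minimal sketch of the interleaving (a hypothetical helper, not part of the reference code):

    def interleaved_parity(bits, n):
        """Return n parity bits; check i covers positions i, i+n, i+2n, ..."""
        checks = [0] * n
        for pos, bit in enumerate(bits):
            checks[pos % n] ^= bit
        return checks

A single error flips exactly one check, and a run of up to n consecutive errors touches each check at most once, so both are always detected.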

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

ChatGPT explained like I was five that not even the parity bit stays valid after ms32_interpolate().

The problem with adding a checksum that won't interpolate is that an attacker can guess which shares are derived and which were generated better than chance. That's why I added double_sha256 bits only to master_seed, not to every encode()'d index.

So replacing that on "s" with your parity suggestion is an improvement if what I want (interpolate into more valid binary codes) is impossible.

And in your opinion, would you apply the parity to all k generated shares, to k-1 generated shares and the codex32 secret, or only to the codex32 secret?

If applied to shares, they only help the user if they have that subset, while every recovery will need the secret, so it's good to be certain the code is valid there.

How would they know they had a valid ECC share? It seems like the generated share indexes have to be fixed: a, c, d, up to k or k-1 have valid codes, then the remaining indexes are shuffled (as those are the ones that actually reveal the size of the backup; you can infer a exists given b, from the threshold 2).

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

I'm not sure. I agree that if it's not preserved by ms32_interpolate, then this is much less useful and also a privacy loss. Will need to do some experiments since I'm not sure analytically how to approach this. Parity checks are definitely additively homomorphic (if you add two shares they're preserved) but I'm unsure how they behave under bech32 multiplication.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

In fact -- even if you set these bits to fixed 0, after ms32_interpolate, they might not be 0!

I checked the BIP and it looks like we (correctly) say to choose these at random, but I had never considered this to be particularly important until now, and I think in my personal shares I may have set these bits to 0... (though for my personal shares I also generated them starting from S, A, C, D, in order, so I already don't have index privacy or privacy against initial/derived discriminators).

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

In fact -- even if you set these bits to fixed 0, after ms32_interpolate, they might not be 0!

There's one last thing to try: xor every 5th bit and see what happens. That should give the exact same result as adding every symbol and truncating, so maybe IT will be ms32_interpolate()-able.

Or maybe better: xor the 4th bit of every symbol to pick the 1st bit of padding, and the 5th bit of every symbol for the 2nd, in the 128-bit case. That way we're sampling with the same period as the higher field, so it might work. This way the last, partially filled symbol doesn't get included.

Zeros do enhance the error correction of the codex32 checksum, but only for the last character of generated strings. That's kind of flimsy when they could detect errors anywhere (and enhance the chances of error correction more often).

Leaves 9 choices:

Type of padding:

  1. Zeros
  2. double_sha256
  3. parity bits

Where we should pad:

  1. k generated shares
  2. k-1 generated shares and the codex secret
  3. Only the codex32 secret

My original implementation generated len(master_seed)+1 random bytes for share payloads and then truncated. If generated shares are padded with coding, they need fixed indexes so the user knows they have correction help; this has no impact on redundancy-degree privacy. The derived shares still need deterministically random indexes.

If k-of-(k+1) or smaller backups are most common, k shares is best.

If n > k, then the best choice between (s & k-1 shares) vs (k shares) can be calculated by the number of sets in permutations(k, n) that have superior ECC. I don't think that's complex math, so the function could choose what to do based on k and n. But choosing leaks a bit of redundancy privacy: the likelihood of finding early-alphabet shares is k/n vs (k-1)/n, and which of those it is suggests a lower or upper bound for n, since it was a function of k, versus perfectly private odds if all shares are generated with random padding and indexes.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Basically I'm suggesting calculating the parity symbol in GF(32) excluding the padding symbol, then truncating its first bits to append as padding, and seeing if that will interpolate and remain a parity symbol for every share.

And then this brings up question 1 again: is it orthogonal to the codex32 checksum?

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Intuitively this should work, since you're just adding the symbols in gf32 and "truncating" which is actually taking the value mod x^2 which is linear ... except that these bits aren't sitting isolated outside of the string. They're embedded inside of the last character, where they algebraically interact with the other bits of that character. So I don't think it works.

That is, we're not saying (x1 + x2 + x3) mod x^2 = Y, we're saying (x1 + x2 + x3 mod x^2) + x4 = Y. The former would be compatible with interpolation, the latter won't be.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

When considering among these:

  1. k generated shares
  2. k-1 generated shares and the codex secret
  3. Only the codex32 secret

I realize that from pure error-detection ability, 1 & 2 are strictly superior regardless of which shares the user wants to recover with, because they represent k * pad_len total checksum bits as opposed to 2-4. Here the non-linearity works to our advantage, as threshold shares will always be able to interpolate back to index 16, 0, ... k - 1 and attempt corrections to make all k of these codes valid before checking an address.

Index 16 "s" is padded first so the unshared secret can also benefit from ecc padding.

Finally, fixing the index of the ECC-padded generated shares doesn't have to have any privacy impact at all (besides sometimes revealing this scheme is in use, which is fine if it's the standard) if the share set we output is randomly shuffled from the 31 total.

It's not necessary to grab the share with or without this padding in order to use it once you have k shares and can interpolate to the k padded strings.

These strings are all independent so still no error correction guarantees, but my instinct tells me triple the bits for k=3 gives 1/3 the chance of not detecting an error pattern.

The actual formula is the Hamming bound. For binary codes q = 2, and n is the total length of your data including checksum (so 260). To correct a single error you need d = 3, so t = 1, and the formula simplifies to give a total code size of 2^260 / (1 + 260) = 2^251.97. Unfortunately not enough to fit 256 bits of data.

By this formula k=4 and 128-bit shares and k=3 and 256-bit shares may be able to correct one bit error if the other k-1 strings are valid and correct (reasonable assumption if the codex32 checksum is valid).

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Look at this absolute masterpiece:

def fingerprint_identifier(payload_list):
    from electrum.bip32 import BIP32Node
    from electrum.crypto import hash_160
    pubkeys = b''  # concatenate the compressed master pubkey of each payload
    for data in payload_list:
        root_node = BIP32Node.from_rootseed(data, xtype="standard")
        pubkeys += root_node.eckey.get_public_key_bytes(compressed=True)

    return ''.join([CHARSET[d] for d in convertbits(hash_160(pubkeys), 8, 5)])[:4]

Where payload list is each payload in bytes at index 16, then 0, ... k - 1.

For k = 0 or reshare=False, [master_seed] is passed as payload_list and it returns the same fingerprint as is prepended in its own descriptors. :) For reshare=True it forms a unique combinable share set ident regardless of seed_length and k, thanks to the pre-hashing. And it's expensive and can help further error-detect the k generated shares from above.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Design decision fork:

Should re-shares require new app_entropy and user_entropy to create a fresh derived_key or be determined by master_seed and header alone?

master_seed and header alone means:

  1. a given master_seed has just 2^23 possible corresponding share sets (all thresholds)
  2. shares can be re-derived from master_seed rather than being independent unrecoverable secrets of their own
  3. requires hardening share payloads to be costlier to re-derive than addresses
  4. excellent error detection
  5. re-shares require no new entropy besides selecting a unique ident
  6. ident becomes a nonce to determine share payloads, vs generated payloads determine the ident above
  7. guarantees no collisions of incombinable sets for a re-shared master_seed + unique ident, whereas hash_160(pubkeys) is best-effort
  8. runs risk of accidentally re-using an ident when re-sharing, breaking security
  9. birthday problem is over 2^20 ident rather than avoided by generating fresh entropy for the new shares
  10. Easier to audit
  11. Less steps required to re-share

If we go with header and master_seed alone:
we should probably rename ident to unique_id or unique_ident to emphasize the criticality that it never be reused for the same master_seed unless deliberately trying to add shares to an existing backup.

If we're okay with a risk of share set collisions when the user doesn't know all previously used identifiers to guarantee a unique ident then fresh_master_seed() could simplify to this:

def fresh_master_seed(bitcoin_core_entropy, user_entropy='', seed_length=16, k='2', ident=''):
    """Derive a fresh master seed of seed_length bytes with optional user-provided entropy."""
    import hashlib
    if not 16 <= seed_length <= 64:
        return None
    if not 2 <= int(k) <= 9:
        return None
    master_seed = hashlib.scrypt(password=bytes(user_entropy + str(seed_length) + k + ident, "utf"),
                                 salt=bytes(bitcoin_core_entropy, "utf"), n=2 ** 20, r=8, p=1,
                                 maxmem=1025 ** 3, dklen=seed_length)
    if not ident:
        ident = fingerprint_ident([master_seed])
    return encode('ms', k, ident, 's', master_seed)

Really, it's fresh_codex32_secret() but it could also return (master_seed, ident) to keep the current name accurate.

It's not terrible if fresh entropy creates an ident that matches an existing one (1% chance after 146 sets), because the sets have independent payloads and an incorrect recovered secret should fail the hash_160(pubkeys) last-resort error detection. But it is bad (as discussed earlier) if an ident nonce is accidentally reused to re-share a master_seed, which has the same odds as above.

In the event a previously used ident is not known, the user has to generate fresh entropy anyway (to minimize collision odds). So using the header as a nonce is really only advantageous when all are known, which is always true for the first re-share.

Because the small identifier was not intended to be used as a nonce (or it'd be 32 or 64 bits), this cannot be reduced to insignificant odds: ≈ 2 * 10^-6 is the chance of 2 identifiers colliding when selected randomly. There's a reason why the bech32 checksum is 6 characters. If everyone on earth made a codex32 backup and re-shared an existing master_seed twice, all forgetting the original ident used (having a default may help extend this to a 3rd reshare, since we can assume fresh_codex32_secret() used the bip32 fingerprint), 16,000 of them are going to accidentally reshare the same share set even though they did everything correctly (except not forgetting the old ident). If the identifier were increased to 6 characters, this could be 16 self-victims.

I'm also open to a combination approach: use master_seed and header alone if the user can affirm the new ident is absolutely unique, but require fresh entropy and a new derived_key when they cannot give such assurance. This minimizes the risk of colliding share sets by having 2^((k - 1) * seed_length) possible re-share sets instead of just 2^20, while also minimizing labor and complexity when the user is absolutely sure they picked a unique id, which is true at least for the first re-share and sometimes the 2nd.

ChatGPT informs me that appending a unix timestamp to the header before generating new shares for an existing master_seed will prevent share payload collisions without needing to ask the user for fresh entropy. The time used should be displayed unconditionally for auditing purposes, though; malicious implementations could reuse a previous timestamp, so it's better to display a human-readable date-time. Precision down to the minute is enough, as no one will write and confirm multiple re-shares per minute. Or maybe just the date is enough, and it has better odds of being recalled if there's a need to recreate an existing re-share set.

Make the call here and I'll code for_an_existing_master_seed().

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

By this formula k=4 and 128-bit shares and k=3 and 256-bit shares may be able to correct one bit error if the other k-1 strings are valid and correct (reasonable assumption if the codex32 checksum is valid).

This error correction code would only work for a single blessed share set, not arbitrary sets of k. And it would be visible to a privacy attacker which share set was blessed. And it's still not guaranteed that the error we can correct will be distinct from the errors that the full codex32 code can correct. TBH I think this line of reasoning is just too much complexity for too little benefit and we should just use the bits unused.

Look at this absolute masterpiece:

Heh, the overlap in the k=0 case with BIP32 is pretty nice. And it is easy to implement in terms of off-the-shelf BIP32 methods, even though the bare algorithm feels a little overcomplicated.

Should re-shares require new app_entropy and user_entropy to create a fresh derived_key or be determined by master_seed and header alone?

Great question. Thanks for the very thorough post. I am leaning toward the "stick an external counter/timestamp/weak randomness in there" approach. Requiring user_entropy is too much burden on the user (as is requiring the user to keep track of uniqueness, if we were considering that) but I think it's fine to require the app to generate a unique token. I think our specific recommendation should be: either (a) a monotonic counter, or (b) at least 12 bytes (96 bits) of fresh randomness. The UNIX timestamp isn't a great choice because it only changes every second, and it won't be accessible to most HWWs.
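Option (b) is a one-liner anywhere a CSPRNG is available (a sketch using Python's standard secrets module):

    import secrets

    # 96 bits of fresh randomness; collision odds are negligible even across
    # billions of re-shares, so nothing has to be stored or remembered
    reshare_token = secrets.token_bytes(12)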

Regarding error detection abilities: my feeling is that resharing will be fairly uncommon. So I think there's a reasonable amount of value to having error detection in the initial sharing but greatly diminished value in subsequent sharings. Remember that the checksum itself provides overwhelming protection against random errors; the fingerprint detection gives us the additional ability to detect malicious errors, which is itself a fringe/uncommon case. And of course, we always have this ability if we're willing to derive full addresses.

So basically, if we use the bip32 fingerprint for the initial sharing, this gives us a bunch of nice benefits (cheap/local error detection mainly) at basically no cost. But for re-sharing, the costs become nontrivial ... and especially, there is a risk of ident reuse leading to shareset reuse, which is probable enough that it'll happen to a nonzero number of users.

As a further point, even with the master_seed/header scheme, the error detection is not great. We no longer have the BIP32 fingerprint to compare against. Instead we have a weird hash of the share data itself, which is not something the wallet can produce on demand if the user has a single share they want to check on. Plus it's extra implementation complexity, reduced auditability, etc. So the benefit isn't even as big.

(I had a much longer reply suggesting the opposite, which took me half an hour to write but I decided that I disagreed with it :). This is quite a time-consuming exercise..)

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

This error correction code would only work for a single blessed share set, not arbitrary sets of k. And it would be visible to a privacy attacker which share set was blessed. And it's still not guaranteed that the error we can correct will be distinct from the errors that the full codex32 code can correct. TBH I think this line of reasoning is just too much complexity for too little benefit and we should just use the bits unused.

This will be my last question on the padding shares topic:

If a secret S and shares A, C have ECC padding, the threshold is 3, and the user finds D, E, F but F has uncorrectable errors, can he ms32_interpolate() the candidate over-corrections of share F to secret S, share A, and share C, and then check whether their ECC padding is valid, to save work?

Look at this absolute masterpiece:

Heh, the overlap in the k=0 case with BIP32 is pretty nice. And it is easy to implement in terms of off-the-shelf BIP32 methods, even though the bare algorithm feels a little overcomplicated.

I could've concatenated them as seeds to use calc_fingerprint_of_this_node(), which is a method in every bip32 library, but that is ambiguous for different seed_length values. It doesn't matter, as ident exists to differentiate incombinable share sets and a length difference does that for free.

I have an even better idea for re-share ident however: encrypt the bip32 fingerprint by the birthdate/timestamp.

Should re-shares require new app_entropy and user_entropy to create a fresh derived_key or be determined by master_seed and header alone?

I am leaning toward the "stick an external counter/timestamp/weak randomness in there" approach.

I coded the unique salt for the derived_key as follows:

# date used to create a unique nonce if user forgets any past identifiers
date = datetime.date.today().strftime("%Y%m%d") if forgot else '19700101'
# must be displayed unconditionally for auditing purposes if forgot=True
fingerprint = fingerprint_ident([master_seed])
# pbkdf2_hmac needs bytes, so encode the str header + fingerprint + date
salt = (codex32_secret[:9] + fingerprint + date).encode()
derived_key = hashlib.pbkdf2_hmac('sha512', password=master_seed, salt=salt,
                                  iterations=2048, dklen=64)

I used the 8 digit year date with zero padding to maximize the chances someone would be able to recover the salt. While if someone forgets an identifier within 24 hours and reuses it they don't deserve security.

However, I can imagine someone performing this operation more than once in a day so it might need precision down to the hour or minute, although that makes it harder to regenerate this particular set in the future.

I'd appreciate your feedback on the precision of the timestamp (a trade-off between ease of regenerating the set and avoiding nonce reuse), and on whether the timestamp should just be used by default (it adds extra info to remember/grind in order to regenerate and audit, but would cover the "forgot I forgot" worst-case scenario) and eliminate the forgot parameter.

I also changed fingerprint_ident() to not truncate so it adds the whole hash_160 to the salt, a nice slowdown if master_seed is partially leaked and being brute forced against known shares, for free.

Requiring user_entropy is too much burden on the user (as is requiring the user to keep track of uniqueness, if we were considering that) but I think it's fine to require the app to generate a unique token. I think our specific recommendation should be: either (a) a monotonic counter, or (b) at least 12 bytes (96 bits) of fresh randomness. The UNIX timestamp isn't a great choice because it only changes every second, and it won't be accessible to most HWWs.

Who can write and confirm 2 codex32 share backups in 1 second?
If HWWs don't have a clock, why don't we just ask the user for the date (or another unique string they've never typed before, which avoids trusting the hardware not to leak data via a chosen nonce)? And if the hardware has a clock, display the date used unconditionally so it can be audited.

So I think there's a reasonable amount of value to having error detection in the initial sharing but greatly diminished value in subsequent sharings.

I think we can get it in both now with encryption, by simply remembering or recovering the nonce used on resharing.

I wrote existing_master_seed() so the master_seed provides all the entropy, even for re-shares. It's just simpler this way: using app entropy + user entropy and deriving shares, index_seed and master_seed from that can never regenerate a share set without a threshold of that particular share set's shares, whereas master_seed as the "seed" of the share set allows adding shares to other re-share sets without having to access a threshold of the set; just knowing their threshold & identifier is enough, and possibly now, their birthdate.

(which we can brute force out of the new re-share identifier, with a known bip32 fingerprint maybe, if you don't make it too precise!)

So basically, if we use the bip32 fingerprint for the initial sharing, this gives us a bunch of nice benefits (cheap/local error detection mainly) at basically no cost. But for re-sharing, the costs become nontrivial ... and especially, there is a risk of ident reuse leading to share set reuse, which is probable enough that it'll happen to a nonzero number of users.

As a further point, even with the master_seed/header scheme, the error detection is not great. We no longer have the BIP32 fingerprint to compare against. Instead we have a weird hash of the share data itself, which is not something the wallet can produce on demand if the user has a single share they want to check on. Plus it's extra implementation complexity, reduced auditability, etc. So the benefit isn't even as big.

Why don't we encrypt the bip32 fingerprint by the timestamp (the unique new entropy used for the reshare)? Two things the wallet/user can probably produce on demand (or brute force).

That way we get a random ident and don't lose the error detection the fingerprint provides if we know the birth date.

I think this is a MUCH better idea than that convoluted concatenation of share payloads, and because this "encrypted" fingerprint feeds into the derived_key derivation, it still shuffles the indexes and generates new randomness for payloads just fine.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024
    date = datetime.date.today().strftime("%Y%m%d") if forgot else '19700101'
    # must be displayed unconditionally for auditing purposes if forgot=True
    fingerprint = fingerprint_ident([master_seed])
    salt = (codex32_secret[:9] + fingerprint + date).encode()
    derived_key = hashlib.pbkdf2_hmac('sha512', password=master_seed, salt=salt,
                                      iterations=2048, dklen=64)
    if reshare:
        # 3 bytes = 24 bits of key; the first 20 bits mask the 4-char ident
        encryption_key = hashlib.pbkdf2_hmac('sha512', password=date.encode(), salt=salt,
                                             iterations=2048, dklen=3)
        # xor symbol-by-symbol in the 5-bit bech32 alphabet
        key_symbols = convertbits(encryption_key, 8, 5)[:4]
        new_ident = ''.join(CHARSET[CHARSET.find(c) ^ k5]
                            for c, k5 in zip(old_ident, key_symbols))

Optionally, the hash_160(master_pubkey) may be dropped from the salt, as the wallet might not have access to it. Then the identifier can be decrypted / error-detected with the correct fingerprint, the header used to generate the reshare, and the birthdate.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

If a secret S and shares A, C have ECC padding, the threshold is 3, and the user finds D, E, F but F has uncorrectable errors, can he ms32_interpolate() the candidate over-corrections of share F to secret S, share A, and share C, and then check whether their ECC padding is valid, to save work?

I'm not sure what this means but I don't think so.

I used the 8 digit year date with zero padding to maximize the chances someone would be able to recover the salt. While if someone forgets an identifier within 24 hours and reuses it they don't deserve security.

They don't need to forget it to reuse it. They merely need to fail to actively override their wallet's default ident computation, because they are unlikely to be aware of this "24 hour" rule that doesn't exist in any other scheme. I think this is too much of a burden on users. And yes, users will want to do this more than once per day (likely more than once per second, if their hardware can compute fast enough).

You also still have the problem that HWWs don't know the time, even to 24-hour precision.

Who can write and confirm 2 codex32 share backups in 1 second?

You don't need to write and confirm them before you start generating the next set. You might generate all the sets you need and then write/confirm them all.

If HWWs don't have a clock, why don't we just ask the user for the date (or another unique string they've never typed before, which avoids trusting the hardware not to leak data via a chosen nonce)? And if the hardware has a clock, display the date used unconditionally so it can be audited.

Because this is extra work for the user and has a nonobvious application (we are asking for a current date, which seems like an unnecessary and privacy-reducing thing to answer honestly, but actually we just want a unique string). And we can accomplish it by just generating random bytes.

I really don't like any of these date-based ideas.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

They need to fail to actively override their wallet's default ident computation...

A wallet implementation should store the identifiers and bip32 fingerprints, ideally persistently, so that it never reuses them. Perhaps incrementing the ident LSB or throwing an error message on collisions.

You also still have the problem that HWWs don't know the time, even to 24-hour precision.

Looks like HWWs need (optional?) user-provided uniqueness during reshares if all past IDs aren't stored/remembered.

It's the only auditable way. But the odds they'll input a unique string are higher than with directly choosing a new ident.

You don't need to write and confirm them before you start generating the next set. You might generate all the sets you need and then write/confirm them all.

Without a persistent counter, a complete record of previously used identifiers, or a unique input, ident uniqueness and determinism for auditability cannot be simultaneously guaranteed (to 1/2^10).

but actually we just want a unique string). And we can accomplish it by just generating random bytes.

Agree completely it should be "unique string", and this is a strange hassle to protect users from forgetfulness. But on the other hand, it's less work than for a fresh master seed: 20 bits of unique user entropy fully protects the reshare vs 128 bits for fresh seeds.

It's unsafe to let the wallet alone generate a random nonce for reshares; it could grind for ones that make your shares and/or the ident leak parts of the secret.

If the random bytes are generated by KDF/HMAC(app_entropy, user_unique_string), with the app entropy unconditionally displayed before the user types, this is totally fine; the string "42" or "we" may be unique enough for the purpose.

Anything never typed before is unique to 1/2^10, and perfect uniqueness is only achieved by storing and/or remembering every ident.

I really don't like any of these date-based ideas.

We can drop the date; all we need is a unique input.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

A wallet implementation should store the identifiers and bip32 fingerprints, ideally persistently

Then these would need to be backed up along with the seed data?

If we just used 96 random bits there would be no need for anyone to store anything, persistently or otherwise.

Without a persistent counter, a complete record of previously used identifiers, or a unique input, ident uniqueness and determinism for auditability cannot be simultaneously guaranteed (to 1/2^10).

Right. We should just drop auditability for re-sharing. It's not a common case and all the proposed solutions seem to have serious usability problems.

If the random bytes are generated by HMAC(app_entropy, user_unique_string) with app entropy unconditionally displayed before the user types

This seems good to me. Though I do think we should query for "a unique string" rather than trying to give it any other semantic content. And maybe suggest the user roll dice ten times to get the unique string, or something. If it's too short/predictable the wallet could theoretically still grind.
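A sketch of that flow, with the display-before-typing ordering that keeps the wallet from grinding its contribution (names and parameters are illustrative, not specified anywhere):

    import hashlib, secrets

    def reshare_nonce():
        app_entropy = secrets.token_bytes(16)
        # committing to and displaying app entropy *before* the user types
        # prevents the wallet from grinding it against the user's input
        print('app entropy:', app_entropy.hex())
        unique_string = input('enter a string you have never typed before: ')
        # a slow KDF makes grinding short/predictable unique strings costly
        return hashlib.pbkdf2_hmac('sha512', unique_string.encode(),
                                   app_entropy, 100000, dklen=12)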

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

Then these would need to be backed up along with the seed data?

In a digital backup, yes. On paper or metal, no.

The list helps prevent ident reuse, but it's not essential data; treat it like wallet metadata (e.g., data that prevents address reuse).

It's okay for idents to collide; one is likely every ~2^10 sets generated. It's only bad if payloads collide.

I edited that to mention the app_entropy and unique_user_string should be hardened by PBKDF2 to slow grinding.

How about encrypting (still using PBKDF2) the bip32 fingerprint by the unique string and using cipher text as new_ident?

Then we don't lose the error correction and other benefits as long as the user remembers the unique string.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

How about encrypting (still using PBKDF2) the bip32 fingerprint by the unique string and using cipher text as new_ident?

This is a neat idea. Yeah, let's do it. No need to use fancy encryption; just xor with the first 20 bits is fine.

from codex32.

BenWestgate avatar BenWestgate commented on July 22, 2024

This is a neat idea. Yeah, let's do it. No need to use fancy encryption; just xor with the first 20 bits is fine.

XOR is fine as the encryption algorithm, but the "unique string" shouldn't be recoverable by someone who knows the bip32 fingerprint and the ciphertext, so PBKDF2("unique string") is needed for the key derivation.

Non-zero chance people would enter names, birthdates, street numbers, PINs as a "unique string" and give a thief extra help.

Share payloads, unlike new_ident, would incorporate both app entropy and the unique_string, so those will never collide (if the RNG is honest) even if ident collides.

from codex32.

apoelstra avatar apoelstra commented on July 22, 2024

Yep, agreed -- we'd xor with the (truncated) output of the PBKDF2.

from codex32.
