monerod uses Dandelion++ to propagate txs around the network; this happens in 2 stages

[Proposal] Change how transactions are broadcasted to significantly reduce P2P bandwidth usage (monero-project/monero)

SyntheticBird45 commented on July 20, 2024

I don't think so, for a tx to enter the fluff stage it has to be seen twice in stem or once in fluff

Sending an already existing transaction in stem mode triggers fluff mode => this logic implies that the transaction should already have been fluffed to the network, otherwise the node wouldn't have it in the first place.

Meaning, we're supposing (from a threat model perspective) that a bad actor shouldn't have the Tx ID in the first place. And I think this is a safe assumption.

How is a bad actor supposed to collect the Tx ID from a stem tx? It has to be part of the stem path. The best it can do is broadcast this stem tx to every node and eventually trigger a fluff from nodes prior to itself in the path. If the stem broadcast triggers multiple nodes, then it can't distinguish which one is the first. If only one node is triggered then it is the first node. But this is already something assumed by D++ and is possible without these additional messages. So it shouldn't cause any privacy downgrade per se.

from monero.

jeffro256 commented on July 20, 2024

You can just do this with fluffy blocks, send a fluffy block containing a tx id and if they don't request it, they have it

TBF, the node could theoretically check PoW before requesting the missing transactions, so the tracing attack would require doing PoW (a time intensive calculation) before tracing (a time sensitive operation).

jeffro256 commented on July 20, 2024

IIRC I don't think there is a request for a txpool transaction?

There's NOTIFY_GET_TXPOOL_COMPLEMENT, but that only returns TXIDs that the requester doesn't mention and that are already in fluff phase.
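The complement behaviour described here can be pictured as a filtered set difference. This is only a sketch of the description in this thread, not the real message handler or wire format: the function name, the `pool` mapping and the `"stem"`/`"fluff"` labels are all hypothetical.

```python
def txpool_complement(pool: dict[str, str], requester_has: set[str]) -> set[str]:
    """Return txids the requester did not mention and that are already in fluff phase.

    `pool` maps txid -> "stem" or "fluff" (hypothetical representation).
    Stem txs are never included, whatever the requester lists.
    """
    return {
        txid
        for txid, phase in pool.items()
        if phase == "fluff" and txid not in requester_has
    }


pool = {"a": "fluff", "b": "stem", "c": "fluff"}
print(txpool_complement(pool, {"a"}))
```

In this toy pool only `c` comes back: `a` is already known to the requester and `b` is still in the stem phase.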

When a tx has been fluffed it is as good as public, no need to lie, only stem txs need to be hidden.

Not being an expert in Dand++, this statement seems like it should be true. The point of Dand++ is to hide the originator node of a transaction, which the stem phase accomplishes. Once it is out of the stem phase and flooding the network, I don't know why we would need to lie about whether we know the tx or not. The only information that would give the attacker is how close my node is to the first fluffer.

SChernykh commented on July 20, 2024

Instead of sending the whole tx-blob just send the hash and allow the peer to request the tx if they need it.

Bad wording maybe? What is really needed is to track which peer has sent each tx id to us, and don't send them the full blob back. Because if you send only the tx hash to everyone first, it will spam unnecessary messages before sending full blobs to peers which haven't seen this tx.

Although sending tx hashes to everyone first can save on broadcasts from them to you, so it makes sense.
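A minimal sketch of the peer-tracking idea, assuming a simple in-memory map. `SeenTracker` and its method names are hypothetical and not taken from monerod; the sketch only records which peers have already shown us a txid and leaves them out when relaying the full blob.

```python
class SeenTracker:
    """Remember which peers already sent us a given txid (hypothetical helper)."""

    def __init__(self) -> None:
        self._known_by: dict[str, set[str]] = {}

    def record_from(self, peer: str, txid: str) -> None:
        # Called whenever `peer` sends us (or announces) `txid`.
        self._known_by.setdefault(txid, set()).add(peer)

    def relay_targets(self, txid: str, peers: list[str]) -> list[str]:
        # Peers that have not shown us this txid still need the blob.
        known = self._known_by.get(txid, set())
        return [p for p in peers if p not in known]


tracker = SeenTracker()
tracker.record_from("p1", "tx")
print(tracker.relay_targets("tx", ["p1", "p2", "p3"]))
```

The cost is one set of peers per txid, which is the extra state being weighed against the bandwidth saved.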

SChernykh commented on July 20, 2024

One issue I see with this approach, is while it saves the bandwidth, it will double the latency. Instead of peer1 -> tx -> peer2,
there will be peer1 -> hash -> peer2 -> request tx -> peer1 -> tx -> peer2 - a full round trip time is added.
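To put rough numbers on the two message flows, here is a small worked calculation. It assumes symmetric one-way delays and ignores processing time and fluff timers; the 50 ms figure is purely illustrative.

```python
def direct_latency(one_way_ms: float) -> float:
    """peer1 -> tx -> peer2: a single one-way trip."""
    return one_way_ms


def announce_latency(one_way_ms: float) -> float:
    """peer1 -> hash -> peer2 -> request -> peer1 -> tx -> peer2: three one-way trips."""
    return 3 * one_way_ms


one_way = 50.0  # illustrative one-way delay in milliseconds
extra = announce_latency(one_way) - direct_latency(one_way)
print(direct_latency(one_way), announce_latency(one_way), extra)
```

Under these assumptions the extra delay per hop equals one round trip (two one-way trips).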

Maybe some dynamic algorithm which chooses what to do, based on the current available bandwidth, will be better. For example, sending full tx to 2-3 random peers, and sending only hashes to all other peers - this will ensure fast fluff phase, and a reasonable bandwidth.
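The hybrid idea could be sketched as follows. The function names, the 12-peer list and the 2000-byte tx size are hypothetical; only the 32-byte hash size is a property of Monero tx hashes.

```python
import random

HASH_SIZE = 32  # bytes in a Monero tx hash


def plan_broadcast(peers: list[str], full_count: int, rng: random.Random):
    """Pick `full_count` random peers for the full blob; the rest get only the hash."""
    full = set(rng.sample(peers, min(full_count, len(peers))))
    hash_only = [p for p in peers if p not in full]
    return sorted(full), hash_only


def bytes_sent(n_full: int, n_hash: int, tx_size: int) -> int:
    """Bytes sent in the first round (follow-up requests are not counted)."""
    return n_full * tx_size + n_hash * HASH_SIZE


peers = [f"peer{i}" for i in range(12)]
full, hash_only = plan_broadcast(peers, 3, random.Random(0))
print(bytes_sent(len(full), len(hash_only), tx_size=2000))  # 3*2000 + 9*32 = 6288
```

For comparison, sending the full blob to all 12 peers in this toy setup would cost 24000 bytes.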

Boog900 commented on July 20, 2024

What is really needed is to track which peer has sent each tx id to us, and don't send them the full blob back. Because if you send only the tx hash to everyone first, it will spam unnecessary messages before sending full blobs to peers which haven't seen this tx.

Although sending tx hashes to everyone first can save on broadcasts from them to you, so it makes sense.

Also we don't immediately broadcast txs that we receive, we add them to a queue with a randomized timer, so it is still wasteful as some of our peers would potentially have the tx without having sent it yet.

One issue I see with this approach, is while it saves the bandwidth, it will double the latency

This is true, if this is seen as a problem we can decrease the average fluff timer. Although I don't think the added latency would require this.

Boog900 commented on July 20, 2024

Maybe some dynamic algorithm which chooses what to do, based on the current available bandwidth, will be better. For example, sending full tx to 2-3 random peers, and sending only hashes to all other peers - this will ensure fast fluff phase, and a reasonable bandwidth.

I think decreasing the fluff timer would be better.

Boog900 commented on July 20, 2024

I will just mention stem phase should stay using the current method, so we don't increase latency there, as it would have more impact because the tx is only sent to 1 peer at a time.

SChernykh commented on July 20, 2024

I just checked the code, and the fluff timer is currently at 5 seconds, so an additional round trip delay doesn't really matter. But I'd wait for comments from people who know Dandelion++ better.

vtnerd commented on July 20, 2024

What about the privacy implications? You'll probably have to "lie" during the stem phase (or the path could be leaked), and during the fluff phase this still might help in identifying the stem path ... no? It's not dead simple, but at the same time it's leaking information that is currently unavailable.

SChernykh commented on July 20, 2024

@vtnerd Do you mean some node could spam "Request TX ID" to everyone, and try to observe TX propagation in real time? Yes, that seems too dangerous for Dandelion++ integrity.

Boog900 commented on July 20, 2024

What about the privacy implications? You'll probably have to "lie" during the stem phase

Yes you would lie, but it's already possible to pull off this kind of attack, as monerod will immediately fluff stem txs it receives twice from any connection. This allows someone to send stem txs that it has received to loads of nodes and see which immediately fluff.

IIRC we lie for some other requests as well.

fluff phase this still might help in identifying the stem path ... no

I don't think so, for a tx to enter the fluff stage it has to be seen twice in stem or once in fluff, so I can't see a way to exploit this.

I will just mention that any attack would first require the tx id.

Also this method of tx propagation is what Bitcoin uses, which is what the d++ protocol was made for.

Boog900 commented on July 20, 2024

Do you mean some node could spam "Request TX ID" to everyone, and try to observe TX propagation in real time? Yes, that seems too dangerous for Dandelion++ integrity.

You can just do this with fluffy blocks, send a fluffy block containing a tx id and if they don't request it, they have it

SChernykh commented on July 20, 2024

Right, so if there is already a way to do this, then replacing all tx broadcasts with "tx hash broadcasts first" should be safe, assuming that nodes lie both in stem and fluff phases.

When in stem, they lie that they don't know this tx (request a full tx every time).
When in fluff, they lie sometimes with some high enough probability

Lying when in fluff is needed to confuse possible attackers - they won't be sure if some node knows a tx or not.

SChernykh commented on July 20, 2024

@jeffro256 no need for fluffy blocks and PoW, there is already a protocol message to straight up request a tx id. If I understood it right.

Boog900 commented on July 20, 2024

Lying when in fluff is needed to confuse possible attackers - they won't be sure if some node knows a tx or not.

When a tx has been fluffed it is as good as public, no need to lie, only stem txs need to be hidden.

For a tx to be in fluff mode a node must see it in fluff mode or see a stem tx twice, meaning the attacker should not be able to find the stem route.
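As a rough sketch of the rule described here (a tx enters fluff after being seen once in fluff or twice in stem), the per-tx state can be modelled as a tiny state machine. The class and names are hypothetical, and the sketch ignores embargo timers and fluff epochs entirely.

```python
from enum import Enum


class Phase(Enum):
    STEM = "stem"
    FLUFF = "fluff"


class TxPhaseTracker:
    """Track the phase of each tx as seen by one node (hypothetical model)."""

    def __init__(self) -> None:
        self.phase: dict[str, Phase] = {}

    def on_receive(self, txid: str, via_fluff: bool) -> Phase:
        already_known = txid in self.phase
        if via_fluff or already_known:
            # Seen once in fluff, or seen again after a first stem sighting.
            self.phase[txid] = Phase.FLUFF
        else:
            # First sighting, and it arrived via stem.
            self.phase[txid] = Phase.STEM
        return self.phase[txid]


node = TxPhaseTracker()
print(node.on_receive("a", via_fluff=False))  # first stem sighting
print(node.on_receive("a", via_fluff=False))  # second stem sighting
```

With this rule a node only ever reports a tx as fluffed after a public fluff or a stem loop, which is the property the argument above relies on.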

no need for fluffy blocks and PoW, there is already a protocol message to straight up request a tx id. If I understood it right.

IIRC I don't think there is a request for a txpool transaction?

Boog900 commented on July 20, 2024

There's NOTIFY_GET_TXPOOL_COMPLEMENT, but that only returns TXIDs that the requester doesn't mention and that are already in fluff phase.

Ah yeah, that would be even worse as you don't even need the TXID first

SChernykh commented on July 20, 2024

In this case yes, lying in fluff phase doesn't make sense because it's already possible to get mempool contents (transactions in fluff phase) from each node.

jeffro256 commented on July 20, 2024

Why would it ever make sense to lie in fluff phase theoretically?

SChernykh commented on July 20, 2024

I thought it was impossible to know which node knows which tx until it broadcasts it to you. So lying in fluff phase would maintain this property. But it turned out it's already possible.

vtnerd commented on July 20, 2024

@SChernykh

@vtnerd Do you mean some node could spam "Request TX ID" to everyone, and try to observe TX propagation in real time? Yes, that seems too dangerous for Dandelion++ integrity.

Yes. An attacker node A would get a notification from node B that a new tx was observed, then node A (attacker) asks node C for the tx. This can be mitigated by only responding to a node that you previously sent the txid to, but requires more state tracking (and memory usage). Since the nodes will "lie" in the stem phase, this may be enough to thwart the attack (and make it similar to the already existing case).
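The two behaviours mentioned here, lying about stem txs and the stricter "only answer peers we announced to" rule, might be sketched like this. All names are hypothetical; the proposal in this thread does not define a concrete request message or handler.

```python
class TxRequestPolicy:
    """Decide whether to serve a tx on request (hypothetical model)."""

    def __init__(self) -> None:
        self.stem: set[str] = set()
        self.fluff: set[str] = set()
        self.announced: dict[str, set[str]] = {}  # txid -> peers we sent the txid to

    def announce(self, txid: str, peer: str) -> None:
        self.announced.setdefault(txid, set()).add(peer)

    def should_serve(self, txid: str, peer: str, strict: bool = False) -> bool:
        if txid not in self.fluff:
            # Unknown txs and stem txs get the same answer: "don't have it".
            return False
        if strict and peer not in self.announced.get(txid, set()):
            # Strict mode: only answer peers we announced this txid to.
            return False
        return True


node = TxRequestPolicy()
node.stem.add("s")
node.fluff.add("f")
node.announce("f", "peerB")
print(node.should_serve("s", "attacker"), node.should_serve("f", "attacker", strict=True))
```

The strict variant is where the extra state tracking comes from: it needs the `announced` map, while the stem lie needs nothing beyond the existing pools.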

@Boog900

What about the privacy implications? You'll probably have to "lie" during the stem phase

Yes you would lie, but it's already possible to pull off this kind of attack, as monerod will immediately fluff stem txs it receives twice from any connection. This allows someone to send stem txs that it has received to loads of nodes and see which immediately fluff.

IIRC we lie for some other requests as well.

The target node doesn't respond to the attacker node directly (the tx is never relayed back); the attacker only knows a fluff occurred somewhere in the network. Although a slim chance, another attacker could've targeted another machine in the same way, and so your target wasn't the initiator of the fluffing. The randomized delay timers per link should help hide the source of the fluff too.

After some thought, I think this txid request is no worse than this - iff the node lies during the stem phase.

fluff phase this still might help in identifying the stem path ... no

I don't think so, for a tx to enter the fluff stage it has to be seen twice in stem or once in fluff, so I can't see a way to exploit this.

I will just mention that any attack would first require the tx id.

Yes. I was thinking that an attacking node could fluff, and then try to immediately determine who had the tx. This would be similar to the existing fluff attack you mentioned, except the response would be immediate instead of randomly delayed. I don't think you can do much after some further thought though.

Also this method of tx propagation is what Bitcoin uses, which is what the D++ protocol was made for.

I thought about this, this is another decent argument in favor of the proposal. Although it doesn't mean that the D++ authors overlooked something. We may need to investigate what Bitcoin does here too - does it respond immediately to such a request or does it delay? IIRC, it just does an immediate response.

Do you mean some node could spam "Request TX ID" to everyone, and try to observe TX propagation in real time? Yes, that seems too dangerous for Dandelion++ integrity.

You can just do this with fluffy blocks, send a fluffy block containing a tx id and if they don't request it, they have it

This isn't the same though. You cannot "choose" when to do this attack - you have to wait until you find a valid PoW hash (and if not, the logic probably needs changing).

@jeffro256

Why would it ever make sense to lie in fluff phase theoretically?

It depends on how the RPC is set up. In the most dead-simple way, you ask for a txid, and the node responds if it has the tx. An attacker would then need to be selected in the stem phase (which depends on the D++ settings), and then just probes other nodes for the tx. If the blackhole delay is long enough, the entire stem can be revealed. While it doesn't guarantee finding the source, the D++ paper talks about hiding txes in the stem phase, etc. The attacker should only know the prior hop in the stem, not the entire stem.

jeffro256 commented on July 20, 2024

If the blackhole delay is long enough, the entire stem can be revealed.

Ah, I think I see. If the stem nodes wait too long to tell other nodes they know about the transaction after the fluff period, then they will be the only nodes claiming that they don't know about the transaction?

Boog900 commented on July 20, 2024

We may need to investigate what Bitcoin does here too - does it respond immediately to such a request or does it delay

I don't think we need to delay, the sender would have already added their own delay.

Delaying in my mind doesn't do anything, the node will still know we wanted the tx.

This isn't the same though. You cannot "choose" when to do this attack - you have to wait until you find a valid PoW hash (and if not, the logic probably needs changing).

I am pretty sure you can do this without valid PoW.

vtnerd commented on July 20, 2024

This isn't the same though. You cannot "choose" when to do this attack - you have to wait until you find a valid PoW hash (and if not, the logic probably needs changing).

I am pretty sure you can do this without valid PoW.

The complement request? I forgot about that one, this new proposal is basically equivalent to that.

Boog900 commented on July 20, 2024

The complement request? I forgot about that one, this new proposal is basically equivalent to that.

yes but also you can send a fluffy block and the peer will request missing txs even if the PoW is invalid, the node will lie and say it doesn't have stem txs though.
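The fluffy-block behaviour described in this comment can be sketched as below. It models only what the comment states (the peer requests every tx it does not hold in fluff phase, so stem txs look "missing"); the function name and pool representation are hypothetical and no PoW check is modelled.

```python
def txs_to_request(block_txids: list[str], pool: dict[str, str]) -> list[str]:
    """Txids a node would ask for after receiving a fluffy block.

    `pool` maps txid -> "stem" or "fluff" (hypothetical representation).
    Stem txs are requested as if unknown, so the sender cannot tell a
    stem tx from a tx the node has never seen.
    """
    return [txid for txid in block_txids if pool.get(txid) != "fluff"]


pool = {"a": "fluff", "b": "stem"}
print(txs_to_request(["a", "b", "c"], pool))
```

From the sender's side, `b` (held in stem) and `c` (never seen) are indistinguishable, while the absence of `a` reveals that the node already has it in fluff phase.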

vtnerd commented on July 20, 2024

@Boog900

The complement request? I forgot about that one, this new proposal is basically equivalent to that.

yes but also you can send a fluffy block and the peer will request missing txs even if the PoW is invalid, the node will lie and say it doesn't have stem txs though.

Yes, I see what you are saying now, I would argue this is a bug in the current implementation (the node should check PoW first afaik). Although, I'm not sure of the "fallout" from making this subtle change.

@jeffro256

If the blackhole delay is long enough, the entire stem can be revealed.

Ah, I think I see. If the stem nodes wait too long to tell other nodes they know about the transaction after the fluff period, then they will be the only nodes claiming that they don't know about the transaction?

No, during the stem phase you just probe other nodes for the txid. Only the stem nodes know about the txid and return a response. It requires that the attacker be chosen as a stem node itself though, and the parameters for D++ matter a lot.

A sybil attack on the Grin network showed that this isn't terribly difficult, especially when the fluff probability is low (where the stem is typically longer). In this case, they were trying to undo the MimbleWimble tx combining (where inputs/outputs from multiple txes are combined into a single tx). Since a sybil node was often selected during the stem phase, they were able to undo that privacy feature entirely in many cases. I don't recall the success rate percentage unfortunately, and the writeup was on Twitter (so I am trusting a random Twitter person). However, their technique and numbers were all reasonable.

Boog900 commented on July 20, 2024

It depends on how the RPC is set up. In the most dead-simple way, you ask for a txid, and the node responds if it has the tx. An attacker would then need to be selected in the stem phase (which depends on the D++ settings), and then just probes other nodes for the tx. If the blackhole delay is long enough, the entire stem can be revealed. While it doesn't guarantee finding the source, the D++ paper talks about hiding txes in the stem phase, etc. The attacker should only know the prior hop in the stem, not the entire stem.

I am unsure how this attack would work. Are you suggesting a black hole attack, while constantly probing every peer? This would reveal one of the stems (at random), the first embargo timer to fire, but I don't think it would reveal the others and seems very invasive/ pretty impossible to be constantly probing every peer. I think you could get the same result just by passively connecting to every peer and waiting to see which fluffs first.

The reason I think it would only reveal one stem peer is once the embargo timer fires it will start fluffing to all connections not giving a chance for the other stems embargo timers to fire.

No, during the stem phase you just probe other nodes for the txid. Only the stem nodes know about the txid and return a response. It requires that the attacker be chosen as a stem node itself though, and the parameters for D++ matter a lot.

They shouldn't return a response as the tx would be in their stem pool, right?

The original question was for why we should lie in the fluff stage:

Why would it ever make sense to lie in fluff phase theoretically?

A sybil attack on the Grin network showed that this isn't terribly difficult

I don't know about this particular attack but Grin used (might still use I would have to check) a constant embargo timer, so the original node is always the one to fluff in a black hole.
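Why a constant embargo timer points back at the originator in a black hole can be seen with a small calculation. The node names, hop delays and timer values below are hand-picked for illustration, not measured or taken from any implementation.

```python
def first_to_fluff(receive_times: dict[str, float], embargo: dict[str, float]) -> str:
    """Return the stem node whose embargo timer fires first in a black hole."""
    fire_at = {node: receive_times[node] + embargo[node] for node in receive_times}
    return min(fire_at, key=fire_at.get)


# Seconds at which each stem node first saw the tx (illustrative).
stem_path = {"origin": 0.0, "hop1": 0.1, "hop2": 0.2}

constant = {node: 30.0 for node in stem_path}
randomized = {"origin": 30.0, "hop1": 10.0, "hop2": 20.0}  # hand-picked draws

print(first_to_fluff(stem_path, constant))    # origin: earliest receipt wins
print(first_to_fluff(stem_path, randomized))  # hop1: the shortest draw wins
```

With a constant timer the node that saw the tx first always fires first; with randomized timers the first fluffer is whichever stem node drew the shortest embargo.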

vtnerd commented on July 20, 2024

I am unsure how this attack would work. Are you suggesting a black hole attack, while constantly probing every peer? This would reveal one of the stems (at random) but I don't think it would reveal the others and seems very invasive/ pretty impossible to be constantly probing every peer. I think you could get the same result just by passively connecting to every peer and waiting to see which fluffs first.

I agree this would be a noisy attack. This could be mitigated if you were only interested in certain outputs or IP addresses. FCMP would make this limited approach less viable.

A passive attack should be difficult, that's the purpose of D++.

The original question was for why we should lie in the fluff stage:

Why would it ever make sense to lie in fluff phase theoretically?

There's just never a good reason for a node to request a txid during the stem phase, it will always be from an attacker/spy trying to gain information. A node should only request a txid from a node that sent it a txid fluff first. Enforcing this strict rule is probably too much code and memory, but lying during the stem phase is fairly easy.

As per your other comments about it being similar to a fluff leak: a node that receives a stem tx then fluffs was either part of the stem phase or is fluffing all txes. An attacker/spy could send another tx originating with them to determine which case occurred, but then there is the case that this targeted node recently switched to fluff mode. An attacker could just spam a node to learn about the epoch settings, but this is a way more spammy/active attack.

If this leak via spamming txes is a concern, I have a few ideas, but it will break from the algorithms in the D++ paper a bit.

I don't know about this particular attack but Grin used (might still use I would have to check) a constant embargo timer, so the original node is always the one to fluff in a black hole.

I probably shouldn't have brought up Grin, because this attack wasn't really related to the discussion here.

Boog900 commented on July 20, 2024

I agree this would be a noisy attack. This could be mitigated if you were only interested in certain outputs or IP addresses. FCMP would make this limited approach less viable.

A passive attack should be difficult, that's the purpose of D++.

I should have been more clear, I meant doing a black hole attack and then passive listening for the first node to fluff would give the same result. Nodes who have the tx in the stem stage will lie and say they do not have the tx so I don't think probing would allow you to gain any more information on the stem path than just passively listening for the first to fluff during a black hole.

A node should only request a txid from a node that sent it a txid fluff first. Enforcing this strict rule is probably too much code and memory, but lying during the stem phase is fairly easy.

This is true, however I don't think an attacker would actually gain any information from knowing a node has a tx in the fluff stage.

A node with the tx in the stem phase will lie and a node with the tx in the fluff stage would have to have received it twice or received it from another node that fluffed it.

As per your other comments about it being similar to a fluff leak: a node that receives a stem tx then fluffs was either part of the stem phase or is fluffing all txes. An attacker/spy could send another tx originating with them to determine which case occurred, but then there is the case that this targeted node recently switched to fluff mode. An attacker could just spam a node to learn about the epoch settings, but this is a way more spammy/active attack.

True, although I don't think you would have to send too many txs for this to work. First you would launch a supernode that connects to all peers and monitors which ones are sending fluff txs; epochs last 10 mins with a little bit of randomization added on. Then when you have a stem tx, send it to all nodes you have not seen fluff recently, and for the ones who fluff immediately, send a test tx straight after to see if the node is fluffing txs. This is still invasive to do for every tx but feasible if you only want to target a subset.

If this leak via spamming txes is a concern, I have a few ideas, but it will break from the algorithms in the D++ paper a bit.

In Cuprate we keep track of nodes who have sent us a stem tx and if the same node sends the tx twice we fluff, this should stop the finding stem routes attack.

vtnerd commented on July 20, 2024

This is true, however I don't think an attacker would actually gain any information from knowing a node has a tx in the fluff stage.

I agree.

In Cuprate we keep track of nodes who have sent us a stem tx and if the same node sends the tx twice we fluff, this should stop the finding stem routes attack.

monerod does something similar, except it fluffs when any node sends the tx twice via stem. I believe that came from the original D++ paper - it was the stem loop case.

True, although I don't think you would have to send too many txs for this to work. First you would launch a supernode that connects to all peers and monitors which ones are sending fluff txs; epochs last 10 mins with a little bit of randomization added on. Then when you have a stem tx, send it to all nodes you have not seen fluff recently, and for the ones who fluff immediately, send a test tx straight after to see if the node is fluffing txs. This is still invasive to do for every tx but feasible if you only want to target a subset.

Yes, this is a weak point in D++. The only mitigation I can think of is to switch to fluff mode in an epoch after you've been forced to fluff once. But this probably has some downsides too.

Boog900 commented on July 20, 2024

Yes, this is a weak point in D++. The only mitigation I can think of is to switch to fluff mode in an epoch after you've been forced to fluff once. But this probably has some downsides too.

This attack only works because nodes fluff if they receive a stem tx twice from any connection, if this was no longer possible then the attack doesn't work.

Right now you can blackhole a tx, send it to every node you have not seen fluff recently, then for those that fluff send a test tx to see if they are fluffing. This should reveal the nodes in the stem path.

If we switched to only fluff after the same node sends a tx twice then this should no longer be possible.
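The difference between the two stem-loop rules discussed here (fluff when any peer sends a known stem tx, versus only when the same peer sends it twice) can be sketched as follows. `StemLoopDetector` is a hypothetical model, not code from monerod or Cuprate.

```python
class StemLoopDetector:
    """Decide whether a repeated stem tx should trigger a fluff (hypothetical model)."""

    def __init__(self, same_peer_only: bool) -> None:
        self.same_peer_only = same_peer_only
        self.senders: dict[str, set[str]] = {}  # txid -> peers that stemmed it to us

    def should_fluff(self, txid: str, peer: str) -> bool:
        seen = self.senders.setdefault(txid, set())
        if self.same_peer_only:
            trigger = peer in seen      # fluff only on a repeat from the same peer
        else:
            trigger = len(seen) > 0     # fluff on a repeat from any peer
        seen.add(peer)
        return trigger


any_peer = StemLoopDetector(same_peer_only=False)
per_peer = StemLoopDetector(same_peer_only=True)
for detector in (any_peer, per_peer):
    detector.should_fluff("tx", "upstream")          # legitimate stem hop
print(any_peer.should_fluff("tx", "attacker"))       # any-peer rule
print(per_peer.should_fluff("tx", "attacker"))       # same-peer rule
```

Under the any-peer rule, a probe from a second peer makes a node that is already on the stem path fluff at once. Under the same-peer rule, the attacker's first send triggers nothing, and a second send makes any node fluff whether or not it was on the path, so the reaction no longer separates the two cases.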

monerod does something similar, except it fluffs when any node sends the tx twice via stem. I believe that came from the original D++ paper - it was the stem loop case.

Yeah it was a footnote, one of the D++ authors made this issue related to it though: gfanti/bitcoin#15

vtnerd commented on July 20, 2024

This attack only works because nodes fluff if they receive a stem tx twice from any connection, if this was no longer possible then the attack doesn't work.

I see what you mean, but your proposal also explicitly deviates from the paper - this isn't some unplanned edge case neglected by the paper. I need to think about the "fallout" some more.

Right now you can blackhole a tx, send it to every node you have not seen fluff recently, then for those that fluff send a test tx to see if they are fluffing. This should reveal the nodes in the stem path.

The phrase "send it to every node you have not seen fluff recently" downplays the difficulty in determining whether a node is in stem or fluff epoch. You mentioned an active spam attack previously, and I think this is required to determine the state of a node.

Assuming this active attack is worth changing behavior - I think a better fix would be to change how the fluffing occurs. Make it difficult for an attacker to determine where the initial fluff originated. The rule change would be to never fluff back to an upstream stem node peer. Currently a node will fluff to an upstream stem node if someone else originated the fluff. I think this is the big leak that needs fixing, not the stem-loop attack. We may also want to change the randomized fluff timers, again to help obfuscate the origin of the fluff.
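The suggested rule change, never fluffing back to the peer that stemmed the tx to us, amounts to one extra filter when choosing fluff targets. This is a sketch with hypothetical names; how a node would remember its upstream stem peer is left open.

```python
from typing import Optional


def fluff_targets(
    peers: list[str],
    upstream_stem_peer: Optional[str],
    skip_upstream: bool = True,
) -> list[str]:
    """Peers to fluff a tx to, optionally excluding the peer that stemmed it to us."""
    if not skip_upstream or upstream_stem_peer is None:
        return list(peers)
    return [p for p in peers if p != upstream_stem_peer]


peers = ["a", "b", "c"]
print(fluff_targets(peers, upstream_stem_peer="b"))                       # proposed rule
print(fluff_targets(peers, upstream_stem_peer="b", skip_upstream=False))  # behaviour as described above
```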

Yeah it was a footnote, one of the D++ authors made this issue related to it though: gfanti/bitcoin#15

Again, I think you just confirmed that monerod is following the spec here.

Boog900 commented on July 20, 2024

The phrase "send it to every node you have not seen fluff recently" downplays the difficulty in determining whether a node is in stem or fluff epoch

Yeah my bad, it would not be as easy as I made out there.

Although that part was not necessary for the attack, that was just to weed out some of the nodes, so we have less to test.

Unless I am missing something it would still be possible to send a test tx (unknown to the network) along with the stem tx to every node: nodes in fluff state will fluff both txs, and nodes in the stem path would only fluff the stem tx, not the test. It would be possible that one of the stem path nodes would receive the test tx before their fluff timer fires, but then we can use the "fluff to upstream stem node if someone else originated the fluff" info leak to find out if this was the case.

This requires 1 supernode. If from a peer, the node:

only receives the test -> the peer was in the stem path and we caused it to fluff the tx
gets both test and stem -> the peer was not in stem path, they got fluffed from another node
gets neither -> the peer was not in stem path, this peer is in fluff mode.

I don't think any other state is possible, assuming all fluff txs are propagated to every node and the test tx is unknown to the network.
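The three cases listed above can be written as a small decision function. It only encodes the mapping given in this comment, under the same assumptions (all fluff txs reach every node and the test tx is unknown to the network); the function name and result strings are made up for illustration.

```python
def classify_peer(got_test: bool, got_stem: bool) -> str:
    """Classify a peer by which txs the supernode received back from it."""
    if got_test and not got_stem:
        return "in stem path (we caused it to fluff the tx)"
    if got_test and got_stem:
        return "not in stem path (fluffed from another node)"
    if not got_test and not got_stem:
        return "not in stem path (peer is in fluff mode)"
    # Stem tx without the test tx: not expected under the stated assumptions.
    return "unexpected"


for observation in [(True, False), (True, True), (False, False)]:
    print(observation, "->", classify_peer(*observation))
```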

That should only require one test tx for each tx you want to find the nodes in the stem path for, you could probably batch stem txs together in a single message to make it less than 1 test tx per tx.

I think this is the big leak that needs fixing, not the stem-loop attack

In isolation I would say neither one is a big leak.

If the "fluff to upstream stem node if someone else originated the fluff" info leak is fixed, the attack is just waiting for another way to tell stem-loop peers and fluff peers apart in an efficient way. IMO we should aim to remove all information leaks.

We may also want to change the randomized fluff timers, again to help obfuscate the origin of the fluff.

I would recommend moving to the exponential distribution, it's what Bitcoin uses and should make the fluff times a lot more random. However, it would cause some connections to go a while without fluffing, which means an even longer fluff queue for that connection; while the queue holds full tx bytes I don't think this would be a good idea.

Moving to the exponential distribution may reduce tx propagation time as well.
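A quick way to see both effects mentioned here (more spread-out fluff times, and some connections waiting much longer) is to sample exponentially distributed delays. The 5 second mean is taken from the figure quoted earlier in this thread; the sample size, seed and helper name are arbitrary choices for the sketch.

```python
import random

MEAN_S = 5.0  # average fluff delay in seconds, as quoted earlier in the thread


def exponential_delays(mean_s: float, count: int, rng: random.Random) -> list[float]:
    """Draw `count` delays from an exponential distribution with the given mean."""
    return [rng.expovariate(1.0 / mean_s) for _ in range(count)]


rng = random.Random(1)
delays = exponential_delays(MEAN_S, 10_000, rng)
mean = sum(delays) / len(delays)
long_waits = sum(1 for d in delays if d > 3 * MEAN_S)
print(round(mean, 2), long_waits)
```

The sample mean stays close to 5 seconds, while a few percent of draws exceed three times the mean (the theoretical share is e^-3, about 5%). Those long draws are the connections that would accumulate a longer fluff queue.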

your proposal also explicitly deviates from the paper - this isn't some unplanned edge case neglected by the paper

Again, I think you just confirmed that monerod is following the spec here.

It deviates from the prototype implementation documented in the paper, the footnote in question: "Our implementation enters fluff mode if there is a loop in the stem." I wouldn't say this is part of the D++ spec.
