Comments (14)
Nimbus reports doppelganger detection:
NTC 2024-04-05 14:40:50.011+00:00 Doppelganger detection active - skipping validator duties while observing the network topics="val_pool" validator=867e8956 slot=107 doppelCheck=none() activationEpoch=0
INF 2024-04-05 14:40:54.000+00:00 Slot start slot=108 epoch=3 attestationIn=4s blockIn=1m validators=64 node_status=synced delay=171us45ns
from lodestar.
Is this just the doppelganger? That's intentional. Does --doppelganger-detection=off work?
from lodestar.
Nimbus doesn't use doppelganger protection on sync committees because they're not slashable.
from lodestar.
@tersec regarding the getAggregate error, also reported in #6631, is Nimbus by any chance requesting an aggregate attestation even if no validator is aggregator? We had the same issue with Lighthouse previously and it was fixed on their end sigp/lighthouse#4712
See #6634 (comment)
from lodestar.
even with --doppelganger-detection=off
I'm getting:
ERR 2024-04-05 18:03:31.097+00:00 Beacon node provides unexpected response reason="Serialization error;200;produceBlockV3(best);unexpected-data" node=http://172.16.0.31:8562[Lodestar/v1.17.0/f2ec0d4] node_index=0 node_roles=AGBSDT
WRN 2024-04-05 18:03:31.097+00:00 Unable to retrieve block data reason="Serialization error;200;produceBlockV3(best);unexpected-data" wall_slot=28 validator_index=78 service=block_service slot=28 validator=8890da28@78
from lodestar.
even with --doppelganger-detection=off
I'm getting:
ERR 2024-04-05 18:03:31.097+00:00 Beacon node provides unexpected response reason="Serialization error;200;produceBlockV3(best);unexpected-data" node=http://172.16.0.31:8562[Lodestar/v1.17.0/f2ec0d4] node_index=0 node_roles=AGBSDT
WRN 2024-04-05 18:03:31.097+00:00 Unable to retrieve block data reason="Serialization error;200;produceBlockV3(best);unexpected-data" wall_slot=28 validator_index=78 service=block_service slot=28 validator=8890da28@78
The "unexpected data" might be because we are returning the execution_payload_source metadata field in the response, as we allow producing a blinded local block. This field is not officially part of the block v3 API; it was only proposed in ethereum/beacon-APIs#387 but not included.
See #6634 (comment)
from lodestar.
The "unexpected data" might be because we are returning execution_payload_source metadata field in the response
After reviewing how Nimbus handles this case, it does not seem to be the issue: Nimbus only uses the metadata provided in headers, which Lodestar is setting, and only parses the response.data
beacon_chain/validator_client/api.nim#L2145
res = decodeBytes(ProduceBlockResponseV3, response.data, response.contentType, version, blinded, executionValue, consensusValue)
And even there they seem to allow unknown fields
beacon_chain/spec/eth2_apis/eth2_rest_serialization.nim#L3950-L3952
ok(RestJson.decode(value, T, requireAllFields = true, allowUnknownFields = true))
Based on just reviewing Lodestar and Nimbus code it's hard to tell what the actual issue is; we would need to inspect the data we are sending to investigate this further
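To make the header-versus-body point concrete, here is a minimal sketch of a client-side decoder that takes block metadata only from the response headers and reads only the data field of the body, so an extra field such as execution_payload_source is simply ignored. The type and function names (BlockV3Metadata, requireHeader, decodeProduceBlockV3) are hypothetical and written for illustration; this is not the Nimbus or Lodestar implementation. The header names are the ones the beacon API uses for block v3 responses.

```typescript
// Hypothetical shapes for illustration; not Lodestar's or Nimbus's actual types.
type BlockV3Metadata = {
  version: string;
  blinded: boolean;
  executionValue: string; // decimal string, kept as received
  consensusValue: string; // decimal string, kept as received
};

type DecodedBlockV3 = {meta: BlockV3Metadata; data: unknown};

// Look up a header case-insensitively and fail loudly if it is missing.
function requireHeader(headers: Record<string, string>, name: string): string {
  for (const key in headers) {
    const value = headers[key];
    if (key.toLowerCase() === name.toLowerCase() && value !== undefined) {
      return value;
    }
  }
  throw new Error(`Missing required header: ${name}`);
}

// Metadata comes from headers; only `data` is taken from the JSON body.
// Any other top-level body field (known or unknown) is dropped.
function decodeProduceBlockV3(headers: Record<string, string>, body: string): DecodedBlockV3 {
  const meta: BlockV3Metadata = {
    version: requireHeader(headers, "Eth-Consensus-Version"),
    blinded: requireHeader(headers, "Eth-Execution-Payload-Blinded") === "true",
    executionValue: requireHeader(headers, "Eth-Execution-Payload-Value"),
    consensusValue: requireHeader(headers, "Eth-Consensus-Block-Value"),
  };
  const parsed = JSON.parse(body) as {data?: unknown};
  if (parsed.data === undefined) {
    throw new Error("Response body has no data field");
  }
  return {meta, data: parsed.data};
}
```

With a decoder shaped like this, an unknown body field cannot cause a serialization error by itself, which is consistent with the conclusion above that execution_payload_source is unlikely to be the culprit.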
from lodestar.
@tersec regarding the getAggregate error, also reported in #6631, is Nimbus by any chance requesting an aggregate attestation even if no validator is aggregator? We had the same issue with Lighthouse previously and it was fixed on their end sigp/lighthouse#4712
This is not the case, Nimbus will only request an aggregated attestation if at least one validator is aggregator
beacon_chain/validator_client/attestation_service.nim#L280
if len(aggregateItems) > 0:
But I think I know what the issue here is, Nimbus uses different strategies per API, and if I interpreted them correctly this is why we are seeing the error:
The submitPoolAttestations request is only sent to the first / primary beacon node (unless an error is encountered)
beacon_chain/validator_client/attestation_service.nim#L66
await vc.submitPoolAttestations(@[attestation], ApiStrategyKind.First)
While the getAggregatedAttestation request is sent to all connected beacon nodes and the "best" response is picked
beacon_chain/validator_client/attestation_service.nim#L283-L284
await vc.getAggregatedAttestation(slot, attestationRoot, ApiStrategyKind.Best)
But Lodestar might not have an aggregated attestation in its cache if the validator client did not previously submit an attestation for attestation_data_root. This error might not happen consistently, though, as the beacon node might prepare the aggregate after receiving the attestation via gossip (gossipHandlers.ts#L490-L492).
We have just improved the error handling for the case where no aggregated attestation is available (#6648) to follow the spec and produce a less noisy error, as this is somewhat expected to happen in a setup with multiple nodes.
If my analysis is correct, this is also not an issue as Nimbus would still produce the aggregate attestation if it received the data from the primary node.
@barnabasbusa this should not happen in a 1:1 setup without fallback node, I am assuming in your tests you had multiple bns connected to a single vc
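The interaction between the two strategies can be sketched with a toy model. Everything below is hypothetical and written for illustration: MockBeaconNode is an in-memory stand-in, the function names are made up, gossip is not modeled, and the "best" rule (most participants wins) is an assumption rather than the actual Nimbus scoring.

```typescript
// Hypothetical in-memory stand-in for a beacon node; not real Lodestar code.
class MockBeaconNode {
  readonly name: string;
  // attestation data root -> number of attestations received for it
  private readonly pool: Record<string, number> = {};

  constructor(name: string) {
    this.name = name;
  }

  submitPoolAttestation(dataRoot: string): void {
    this.pool[dataRoot] = (this.pool[dataRoot] ?? 0) + 1;
  }

  // Returns the participant count, or throws when nothing is cached for the root.
  getAggregatedAttestation(slot: number, dataRoot: string): number {
    const count = this.pool[dataRoot];
    if (count === undefined) {
      throw new Error(`No attestation for slot=${slot} dataRoot=${dataRoot}`);
    }
    return count;
  }
}

type AggregateResult =
  | {node: string; ok: true; participants: number}
  | {node: string; ok: false; error: string};

// "First" strategy: only the primary node receives the attestation.
function submitToFirst(nodes: MockBeaconNode[], dataRoot: string): void {
  const primary = nodes[0];
  if (primary === undefined) throw new Error("No beacon nodes configured");
  primary.submitPoolAttestation(dataRoot);
}

// "Best" strategy, step 1: ask every node and record each outcome.
function requestFromAll(nodes: MockBeaconNode[], slot: number, dataRoot: string): AggregateResult[] {
  return nodes.map((node): AggregateResult => {
    try {
      return {node: node.name, ok: true, participants: node.getAggregatedAttestation(slot, dataRoot)};
    } catch (e) {
      return {node: node.name, ok: false, error: e instanceof Error ? e.message : String(e)};
    }
  });
}

// "Best" strategy, step 2: keep the successful response with the most participants
// (assumed scoring rule).
function pickBest(results: AggregateResult[]): number | undefined {
  let best: number | undefined;
  for (const r of results) {
    if (r.ok && (best === undefined || r.participants > best)) best = r.participants;
  }
  return best;
}
```

In this model, after submitToFirst on a two-node list, requestFromAll yields a success from the primary and a "No attestation for slot=..." error from the fallback, while pickBest still returns the primary's aggregate. That mirrors the point above: the error is logged on the secondary node, but the validator client can still aggregate.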
from lodestar.
@nflaig We have 1:1 setups only. We haven't considered any testing for 1:x testing.
from lodestar.
@nflaig We have 1:1 setups only. We haven't considered any testing for 1:x testing.
Then I have no idea why this happens but something to note is that there is also a very strange error which should never happen
Apr-05 14:20:00.472[rest] error: Req req-r produceBlockV2 error - REGEN_ERROR_SLOT_BEFORE_BLOCK_SLOT
Error: REGEN_ERROR_SLOT_BEFORE_BLOCK_SLOT
So maybe there was just a clock issue / time skew between the validator client and beacon node?
from lodestar.
All tests are run on the same (docker backend) machine, and everyone should be able to replicate it with the config I have posted above.
Doubt that there would be any sort of clock issues between two containers hosted on the same physical node.
from lodestar.
and everyone should be able to replicate it with the config I have posted above.
Will have to do this next, thanks for the detailed info
from lodestar.
Apr-05 14:20:26.014[rest] error: Req req-5p getAggregatedAttestation error - No attestation for slot=5 dataRoot=0xd947df35713911cac72a977556a795d217eefc7a66bfded54347e18469c675dd
Error: No attestation for slot=5 dataRoot=0xd947df35713911cac72a977556a795d217eefc7a66bfded54347e18469c675dd
The aggregated attestation issue has been fixed by #6668. There still seems to be strange behavior in the Nimbus VC: it requests an aggregated attestation for the first 1-5 slots even though it does not submit subnet subscriptions with is_aggregator=true for those slots beforehand. But other than that it looks good; attestations, aggregates, and sync committees work as expected.
The only remaining issue is block production, my best guess right now is that Nimbus does not like that Lodestar returns a JSON payload from produceBlockV3 API.
from lodestar.
The only remaining issue is block production, my best guess right now is that Nimbus does not like that Lodestar returns a JSON payload from produceBlockV3 API.
Confirmed: this issue is related to us returning a JSON payload in the response
ERR 2024-05-29 13:37:55.640+00:00 Beacon node provides unexpected response reason="Serialization error;200;produceBlockV3(best);unexpected-data" node=http://172.16.0.15:4000[Lodestar/v1.18.1/8b6ecc4] node_index=0 node_roles=AGBSDT
WRN 2024-05-29 13:37:55.645+00:00 Unable to retrieve block data reason="Serialization error;200;produceBlockV3(best);unexpected-data" wall_slot=12 validator_index=53 service=block_service slot=12 validator=b09cb155@53
Block production with Nimbus VC will be fixed once we merge #6749, but there is another issue with publishing blocks as SSZ (while JSON seems to work fine there). It does not seem to be a problem with other clients, and there is an issue on the Nimbus side to track this: status-im/nimbus-eth2#6205 <-- this is an issue on Nimbus BN
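For context on the JSON versus SSZ question: the beacon API lets a client state its preferred encoding through the Accept header, with application/octet-stream for SSZ and application/json for JSON. The sketch below shows one way a server could pick the response format from that header. It is an illustration only; pickWireFormat and WireFormat are hypothetical names, the fallback to JSON is an assumption, and this is not the change made in #6749.

```typescript
type WireFormat = "application/json" | "application/octet-stream";

// Formats this hypothetical server can produce, in tie-break order.
const SUPPORTED_FORMATS: WireFormat[] = ["application/json", "application/octet-stream"];

// Pick the supported media type with the highest q-value in the Accept header.
// Falls back to JSON when the header is absent or nothing matches (assumed policy;
// a stricter server could answer 406 Not Acceptable instead).
function pickWireFormat(accept: string | undefined): WireFormat {
  if (accept === undefined || accept.trim() === "") return "application/json";

  let best: WireFormat | undefined;
  let bestQ = 0;

  for (const part of accept.split(",")) {
    const [rawType, ...params] = part.split(";");
    const mediaType = (rawType ?? "").trim().toLowerCase();

    // Default weight is 1 unless a valid q parameter says otherwise.
    let q = 1;
    for (const param of params) {
      const [key, value] = param.split("=");
      if ((key ?? "").trim().toLowerCase() === "q" && value !== undefined) {
        const parsed = Number(value.trim());
        if (!isNaN(parsed)) q = parsed;
      }
    }

    for (const format of SUPPORTED_FORMATS) {
      const matches = mediaType === format || mediaType === "*/*" || mediaType === "application/*";
      if (matches && q > bestQ) {
        best = format;
        bestQ = q;
      }
    }
  }

  return best ?? "application/json";
}
```

For example, a client sending application/octet-stream;q=1.0,application/json;q=0.9 would get SSZ from this sketch, while a client sending only application/json (or no Accept header) would get JSON.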
from lodestar.