Comments (8)
I'll try to make the case for why I think this behavior is a bug as clearly as I can.
Currently, when OE is syncing snapshots and network connectivity is disrupted, it remains in a "stalled/hung" state even after connectivity is restored.
Expected behavior would be any form of continuation: either continue snapshot sync from where it was when connectivity dropped, or start over from 0.
I have a hard time seeing behavior where the client "hangs" as working as designed.
from openethereum.
1) and 2) appear related; see the results of further testing. Snapshot max decreases with each restart and sync stays stalled, until after enough restarts OE gives up on snapshots entirely and syncs from 0.
Stopping OE, clearing data directory, and starting OE appears to restart snapshot sync as expected.
I will monitor and update. Current working hypothesis: a network disruption during snapshot sync puts OE into a state that it can only recover from if the contents of the data directory are wiped and it starts fresh.
.. Restart OE, snapshot max goes down once more
thorsten@ethlinux:~$ sudo journalctl -f -u openethereum | grep -i Syncing
Sep 26 09:52:20 ethlinux openethereum[265447]: 2020-09-26 09:52:20 IO Worker #0 INFO import Syncing snapshot 0/560 #0 8/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 09:52:25 ethlinux openethereum[265447]: 2020-09-26 09:52:25 IO Worker #2 INFO import Syncing snapshot 0/560 #0 8/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 09:52:30 ethlinux openethereum[265447]: 2020-09-26 09:52:30 IO Worker #2 INFO import Syncing snapshot 0/560 #0 13/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
.. Restart OE, gives up on snapshots
Sep 26 14:18:01 ethlinux openethereum[315123]: 2020-09-26 14:18:01 Updated conversion rate to Ξ1 = US$353.15 (13484085 wei/gas)
Sep 26 14:18:10 ethlinux openethereum[315123]: 2020-09-26 14:18:10 Syncing #1778 0x7000…9c21 177.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 0 Qed #1778 19/25 peers 2 MiB chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:18:15 ethlinux openethereum[315123]: 2020-09-26 14:18:15 Syncing #3166 0x91b0…3ccc 277.60 blk/s 0.0 tx/s 0.0 Mgas/s 3942+ 0 Qed #7112 19/25 peers 3 MiB chain 6 MiB queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:18:20 ethlinux openethereum[315123]: 2020-09-26 14:18:20 Syncing #6615 0x9adb…bd86 689.80 blk/s 0.0 tx/s 0.0 Mgas/s 493+ 0 Qed #7112 19/25 peers 5 MiB chain 776 KiB queue RPC: 0 conn, 0 req/s, 0 µs
.. Stop OE
.. Delete contents of datadir
.. Start OE, starts syncing snapshots with max in the 5ks like the first attempt, which was interrupted because of network issues
Sep 26 14:20:23 ethlinux openethereum[315320]: 2020-09-26 14:20:23 Syncing #0 0xd4e5…8fa3 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+ 0 Qed #0 2/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:20:28 ethlinux openethereum[315320]: 2020-09-26 14:20:28 Syncing snapshot 0/5706 #0 2/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:20:33 ethlinux openethereum[315320]: 2020-09-26 14:20:33 Syncing snapshot 2/5706 #0 4/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:20:38 ethlinux openethereum[315320]: 2020-09-26 14:20:38 Syncing snapshot 6/5706 #0 6/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:20:43 ethlinux openethereum[315320]: 2020-09-26 14:20:43 Syncing snapshot 17/5706 #0 6/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:20:48 ethlinux openethereum[315320]: 2020-09-26 14:20:48 Syncing snapshot 28/5706 #0 7/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
Sep 26 14:20:53 ethlinux openethereum[315320]: 2020-09-26 14:20:53 Syncing snapshot 45/5706 #0 7/25 peers 832 bytes chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
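To compare the snapshot max across restarts (560 in the stalled run above, 5706 after wiping the datadir), the counter can be pulled out of the log lines. This is only a rough sketch: the function names are made up for illustration, and the regex assumes the "Syncing snapshot N/M" wording seen in these excerpts.

```python
import re

# Matches the progress counter in lines such as
# "... Syncing snapshot 45/5706 #0 7/25 peers ..."
# (assumption: the wording matches the log excerpts in this thread)
SNAPSHOT_RE = re.compile(r"Syncing snapshot\s+(\d+)/(\d+)")


def snapshot_progress(line):
    """Return (chunks_done, chunks_total), or None if the line has no snapshot counter."""
    match = SNAPSHOT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def snapshot_totals(lines):
    """Return the snapshot totals seen, in order, collapsing consecutive repeats."""
    totals = []
    for line in lines:
        progress = snapshot_progress(line)
        if progress is None:
            continue
        total = progress[1]
        if not totals or totals[-1] != total:
            totals.append(total)
    return totals
```

Feeding it the saved output of the journalctl command above would give one total per restart, which makes the "max goes down with each restart" pattern easy to see.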
from openethereum.
@yorickdowne, yes, this is not a bug; this is how the snapshotting feature currently works.
from openethereum.
I can see the snapshot starting over as "working as designed", but wouldn't "interrupted snapshot brings OE into a state that can only be recovered from by wiping the data directory" be a bug?
To be clear, the behavior is not: "After interrupted snapshot, OE starts snapshotting again at 0"
The behavior is: "After interrupted snapshot, OE stalls when network connectivity resumes. Restarting OE leaves it still stalled. Only removing datadir and restarting OE will bring it to start snapshot again from 0"
from openethereum.
I've had snapshot sync "stall" on me while testing the atomic db branch for #69. It stayed at "Syncing snapshot 1847/5893" for hours. Restarting the client got it out of that state and back to "Snapshot initializing (0 chunks restored)", followed by "Syncing snapshot 0/6140", and increasing after that.
I doubt a network outage in this case, as this was a test on a VPS.
There's a situation where snapshot sync can stall, and require a client restart. That feels like a bug, at least: No matter what, the client should continue syncing, not stay at the same snapshot "slot" until restarted.
from openethereum.
Another snapshot sync stalled on PR #149. The client had been syncing for hours, then switched to snapshot. It got "stuck" overnight and needed to be restarted.
eth1_1 | 2020-12-07 00:59:53 UTC Syncing #10887893 0xb2f9…2dd2 0.40 blk/s 74.7 tx/s 4.9 Mgas/s 2953+ 1571 Qed (Ancient:#21384) LI:#10892422 17/25 peers 467 MiB chain 598 MiB queue RPC: 0 conn, 0 req/s, 0 µs
eth1_1 | 2020-12-07 00:59:58 UTC Snapshot initializing (0 chunks restored) (Ancient:#21384) LI:#10892422 22/25 peers 467 MiB chain 603 MiB queue RPC: 0 conn, 0 req/s, 0 µs
...
eth1_1 | 2020-12-07 02:13:45 UTC Syncing snapshot 921/6262 (Ancient:#21384) LI:#10892422 31/50 peers 484 MiB chain 395 MiB queue RPC: 0 conn, 0 req/s, 0 µs
eth1_1 | 2020-12-07 02:13:50 UTC Syncing snapshot 922/6262 (Ancient:#21384) LI:#10892422 31/50 peers 484 MiB chain 395 MiB queue RPC: 0 conn, 0 req/s, 0 µs
eth1_1 | 2020-12-07 02:13:55 UTC Syncing snapshot 923/6262 (Ancient:#21384) LI:#10892422 31/50 peers 484 MiB chain 395 MiB queue RPC: 0 conn, 0 req/s, 0 µs
eth1_1 | 2020-12-07 02:14:00 UTC Syncing snapshot 924/6262 (Ancient:#21384) LI:#10892422 31/50 peers 484 MiB chain 394 MiB queue RPC: 0 conn, 0 req/s, 0 µs
...
eth1_1 | 2020-12-07 11:49:37 UTC Syncing snapshot 924/6262 (Ancient:#21384) LI:#10892422 31/50 peers 489 MiB chain 0 bytes queue RPC: 0 conn, 0 req/s, 0 µs
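A stall like the one above (924/6262 from 02:14:00 until 11:49:37) can be flagged from the logs alone. The following is a minimal sketch, not anything OE ships: `find_stall` and the 30-minute threshold are assumptions picked for illustration, and the regex assumes the "UTC Syncing snapshot N/M" line format shown in this excerpt.

```python
import re
from datetime import datetime, timedelta

# Assumption: lines look like
# "eth1_1 | 2020-12-07 02:13:45 UTC Syncing snapshot 921/6262 ..."
LINE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC Syncing snapshot (\d+)/(\d+)"
)


def find_stall(lines, threshold=timedelta(minutes=30)):
    """Return (chunk, since, until) for the first stall of at least
    `threshold`, or None if the chunk counter kept advancing.

    The 30-minute default is an arbitrary, hypothetical choice.
    """
    last_chunk = None
    since = None
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a snapshot progress line
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        chunk = int(match.group(2))
        if chunk != last_chunk:
            # Counter moved: remember when this chunk number first appeared.
            last_chunk, since = chunk, ts
        elif ts - since >= threshold:
            return chunk, since, ts
    return None
```

A wrapper script could run this over recent log output and restart the client when it returns a result, since a restart is what got the client unstuck here.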
from openethereum.
I'm having the same or similar issues, which makes openethereum unusable for me.
The node had synced and was importing blocks. After a power cut, with no errors printed to the logs, openethereum started syncing from scratch again and has yet to complete the re-sync.
from openethereum.
This is no longer relevant with the sunset imminent.
from openethereum.