Stake Wars First Week Retro
November 11, 2019
Summary:
- On Monday, Nov 4, at 8am PST, we held the first Stake Wars call. 40 people joined, and most of them had filled out the genesis form ahead of time.
- We used version 0.4.4 to launch a new network, with a new genesis created from the form.
- It didn’t take long for participants to observe issues with the network. Nodes got stuck in different syncing states.
- We also had a total of 14 issues reported by 10 participants in the https://github.com/nearprotocol/stakewars/issues repo that qualify for points. REMINDER: to qualify for points, you need to actually submit issues on GitHub so they can be triaged.
- We are also listening to your feedback about the process and aim to improve it and clarify anything that is still unclear. If you have suggestions or questions, reach out on Discord.
- We spent this week investigating and fixing issues.
- It looks like some of you were not receiving welcome emails. This has been resolved.
- We have released v0.4.5 with all of the updates and fixes. This is what we will use to run this week!
Issues:
- Node stuck in block sync
- The major issue was that a node tried to fetch a chunk only from the validator that produced it. If that validator went offline or for some reason didn’t respond, the node would get stuck downloading.
- Chunk management was fully refactored in https://github.com/nearprotocol/nearcore/pull/1624 to address the issues found.
- Node stuck in state sync
- If there is only one validator for a shard and they are offline, the validator for this shard in the next epoch cannot download state for the shard and therefore will get stuck. This should not be an issue if we have enough validators, but could be detrimental if the network does not start properly. Tracking issue: https://github.com/nearprotocol/nearcore/issues/1673
- Memory leaks
- We found two major leaks. Both were oversights at implementation time, where a cache collection was not given a size limit.
- These have been fixed in https://github.com/nearprotocol/nearcore/pull/1624 and https://github.com/nearprotocol/nearcore/pull/1647
- Additionally, we added tooling on our side to monitor memory usage (using Valgrind) and detect other situations like this.
- Collecting genesis has been confusing and error-prone.
- We built a custom form that validates data as you enter your account and keys.
- This coming week, however, we will not be using it yet and will just start with a few validators controlled by us. You can join by signing up at https://wallet.tatooine.nearprotocol.com/ and sending a staking transaction.
- On the regular TestNet, when there is a single validator, all nodes would fetch data from it
- Relaxed most_weight_peers to also use other peers whose chains are very close to the most weight, balancing requests across nodes that are almost up to date. https://github.com/nearprotocol/nearcore/pull/1674
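The relaxed peer selection described in the last item can be sketched roughly as follows. This is an illustrative Python sketch and not the nearcore implementation (which is written in Rust): the Peer type, the select_sync_peers function, and the tolerance parameter are hypothetical names, and the threshold actually used in the PR may be defined differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peer:
    # Hypothetical peer record: an identifier plus the weight of its best known chain.
    peer_id: str
    chain_weight: int


def select_sync_peers(peers: List[Peer], tolerance: int = 0) -> List[Peer]:
    """Return peers whose chain weight is within `tolerance` of the heaviest chain.

    tolerance=0 reproduces the strict behaviour (only the heaviest peers);
    a positive tolerance also admits peers that are almost up to date.
    """
    if not peers:
        return []
    max_weight = max(p.chain_weight for p in peers)
    return [p for p in peers if max_weight - p.chain_weight <= tolerance]


# Example: one validator at the tip, one node slightly behind, one far behind.
peers = [Peer("validator", 1000), Peer("node-a", 998), Peer("node-b", 700)]
strict = select_sync_peers(peers)                # only the single validator
relaxed = select_sync_peers(peers, tolerance=5)  # validator plus node-a
```

With the strict rule every request lands on the single validator. With a small tolerance, nodes that are nearly caught up can serve requests too, while peers that are far behind are still excluded.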
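The memory leaks listed under Issues came from caches with no size limit. Below is a minimal sketch of a size-bounded cache that evicts its least recently used entry once it is full. It is in Python for illustration only: BoundedCache and its capacity parameter are hypothetical and do not reflect the types or eviction policy used in the nearcore fixes.

```python
from collections import OrderedDict


class BoundedCache:
    """Cache that never holds more than `capacity` entries (hypothetical helper)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._entries: OrderedDict = OrderedDict()

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry

    def __len__(self) -> int:
        return len(self._entries)


# Example: inserting five entries into a cache of size two keeps only the last two.
cache = BoundedCache(capacity=2)
for height in range(5):
    cache.put(height, f"block-{height}")
```

An unbounded dictionary in the same loop would keep growing for as long as the node runs. The explicit capacity keeps memory use flat.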