CIP-99: Make PoS Force Retire More Tolerant

cip: 99
title: Make PoS Force Retire More Tolerant
author: Peilun Li (@peilun-conflux)
status: Final
type: Spec Breaking
created: 2022-06-27

Simple Summary

Allow more not-voting terms before we force-retire a node, and make the unlock period of a retiring node shorter to allow the node to rejoin the PoS voting faster.

Abstract

Under the current parameters, routine node operation can easily trigger a force-retire and therefore a loss of interest. This CIP proposes more tolerant parameters: it increases the number of not-voting terms required for a force-retire and shortens the unlock period.

Motivation

The current PoS force-retires a node in the committee if it does not cast any vote in a term, which lasts roughly an hour. However, a normal node restart takes about 30-50 minutes, so if anything goes wrong when the operator restarts or upgrades a node in the committee, the node is likely to be force-retired. And if the host machine encounters a random failure (such as power loss), this 1-hour window makes it almost impossible for the operator to respond in time.

Since force-retires will inevitably happen from time to time, a 7-day unlock period also looks too long a penalty for a host mistake, because the node earns no interest during this period. It also makes debugging the issue that caused the force-retire more tiresome, since we need to wait 7 days before we can try again.

Specification

After the hardfork round, a node is force-retired only if it does not vote in 3 consecutive serving terms. Note that if a node does not vote in the last 2 terms after it has been elected and is not elected into the committee in the next election, the node should not be force-retired.
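The rule above can be sketched as a counter over a node's per-term history. This is a minimal illustration, not the actual implementation: the status strings, the function name, and the choice to reset the counter when the node leaves the committee are assumptions made for the example. The CIP only requires that a node which misses its last 2 terms and is not re-elected is not force-retired.

```python
from typing import Iterable

# From this CIP: consecutive serving terms without a vote before force-retire.
FORCE_RETIRE_THRESHOLD = 3


def should_force_retire(history: Iterable[str],
                        threshold: int = FORCE_RETIRE_THRESHOLD) -> bool:
    """Return True once a node misses `threshold` consecutive serving terms.

    Each entry of `history` describes one term, oldest first
    (hypothetical labels, chosen for this sketch):
      "voted"       - the node served in the committee and cast a vote
      "missed"      - the node served in the committee and cast no vote
      "not_serving" - the node was not in the committee for that term
    """
    missed_in_a_row = 0
    for status in history:
        if status == "missed":
            missed_in_a_row += 1
            if missed_in_a_row >= threshold:
                return True
        elif status in ("voted", "not_serving"):
            # Assumption: both a vote and leaving the committee reset the counter.
            missed_in_a_row = 0
        else:
            raise ValueError(f"unknown term status: {status!r}")
    return False
```

For instance, a history of four voted terms, two missed terms, and then a term outside the committee does not trigger a force-retire, while three missed serving terms in a row does. Passing `threshold=1` reproduces the pre-hardfork behaviour described in the Motivation.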

The unlocking period becomes 1 day (both for normal retirement and force retirement) and the locking period becomes 13 days after the hardfork round.
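As a rough illustration of the new periods, the sketch below computes when a stake can first be retired and when it becomes withdrawable. The function names are hypothetical, and wall-clock days are used only for readability; how the protocol measures these periods internally is not covered here.

```python
from datetime import datetime, timedelta

LOCK_PERIOD = timedelta(days=13)   # from this CIP, after the hardfork round
UNLOCK_PERIOD = timedelta(days=1)  # from this CIP, normal and force retirement alike


def earliest_retire_time(locked_at: datetime) -> datetime:
    """Earliest moment a stake locked at `locked_at` may start retiring."""
    return locked_at + LOCK_PERIOD


def unlock_time(retired_at: datetime) -> datetime:
    """Moment a stake that began retiring at `retired_at` becomes withdrawable."""
    return retired_at + UNLOCK_PERIOD
```

A stake locked on 2022-07-01 could begin retiring on 2022-07-14 at the earliest and would unlock on 2022-07-15, 14 days after it was locked.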

Rationale

The unlocking period for normal retirement should not be longer than that for force retirement; otherwise a node could deliberately trigger force retirement to unlock sooner.

Since every node serves the committee for 6 terms, checking 3 consecutive terms helps regardless of the voting power of a node.

The total time for a node to lock and unlock is still at least 14 days, so this proposal does not introduce more attack chances.

Backwards Compatibility

This is spec-breaking.

Test Cases

N/A.

Implementation

N/A.

Security Considerations

Allowing more not-voting terms makes a crash failure affect the system for longer. But if the voting power of the not-voting node does not prevent the system from making progress, keeping it in the committee for two more terms still leaves enough honest voting power.
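The progress condition can be illustrated with a small check. This sketch assumes a standard BFT quorum in which strictly more than two thirds of the total voting power must vote; the CIP itself does not state the threshold, and the function name is hypothetical.

```python
from fractions import Fraction

# Assumption: standard BFT threshold, not stated in this CIP.
QUORUM = Fraction(2, 3)


def can_make_progress(total_power: int, offline_power: int) -> bool:
    """True if the voting power still online strictly exceeds the quorum."""
    active_power = total_power - offline_power
    return Fraction(active_power, total_power) > QUORUM
```

Under this assumption, a committee with 300 units of voting power tolerates 99 crashed units but not 100, so a crashed node with a small share does not stall the system however many terms it stays in the committee.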

Another issue is that an account that stakes before the hardfork height and withdraws after it will be able to withdraw after less than 14 days. Since this can only happen around the hardfork and not afterwards, it should be acceptable.
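The transition case can be worked through numerically. The sketch assumes the pre-hardfork locking period was 7 days; this CIP does not state that value, and it is inferred here from the 7-day unlock period and the remark that the total is "still at least 14 days". The function name is hypothetical.

```python
OLD_LOCK_DAYS = 7    # assumption: inferred, not stated in this CIP
NEW_LOCK_DAYS = 13   # from this CIP
NEW_UNLOCK_DAYS = 1  # from this CIP


def min_days_to_withdraw_after_hardfork(staked_before_hardfork: bool) -> int:
    """Minimum days from staking to withdrawal when retiring after the hardfork."""
    lock_days = OLD_LOCK_DAYS if staked_before_hardfork else NEW_LOCK_DAYS
    return lock_days + NEW_UNLOCK_DAYS
```

Under this assumption, an account that staked before the hardfork could withdraw after as little as 8 days, while any stake made after the hardfork needs the full 14 days.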

Copyright

Copyright and related rights waived via CC0.
