This is part of the protocol upgrade for 2020-05-15, and in general it
seems to go the direction of "we did this before, lets do this again".
The spec is clear enough, but there is still a lack of questioning and
testing. The problem this attempts to fix has been neutered for years[1].
The spec states:
> The essential idea of SigChecks is to perform counting solely in the
> spending transaction, and count actual executed signature check
> operations.
This, however nobel and logical, ignores that the
check-for-being-too-costly just pulled in a UTXO lookup and the loading
of the output script from the historical chain.
The goal that we protect against CPU over-use may be reached, but the
price is a total system slowdown. You can have multiple CPUs, but the
bus to permanent storage has one, max 2 parallel pipes.
To ensure theHub stays the number one scalable node, I didn't blindly
follow the spec, while making sure that the Hub is correctly going to
follow/reject consensus violations of newly mined blocks.
As a result the implementation in Flowee the Hub:
* does not check sigcheck-counts on historical blocks (more than 1000
blocks in the past).
This may increase the risk of chain-splits ever so slightly, but the cost
of disk-IO would be too high.
* No longer stores the value in the mempool, nor uses it for the
CPU-miner.
* Ties the sigcheck-limits to the user-set block-size-accept-limit.
This is contrary to the spec which mistakenly thinks that BCH has a
max block-size in the consensus rules. The effect is the same, though.
* The per-intput standardness suggestion is not implemented because
standardness checks don't currently fetch the previous outputs and
that would be too expensive to add.
* Standardness rules for the whole transaction are moved to the
mempool-acceptance logic instead. The cost would be too great
otherwise, similar to the previous point.
Again, the effect is the same as likely intented.
---
1) since the intro of the CachingTransactionSignatureChecker
It was only called twice, and in very close proximity. The class didn't
add anything.
This improves readability and with the new state its easier to write
too.
Merkle-block and merkle-tree classes and methods are pretty much stand-
alone and can be moved with no efforts.
Also move the relevent unit test to qtestlib.
Allow user to shutdown a connection, making it instantly invalid.
Allow user to register a callback for errors.
And fix pinging to be disabled on legacyP2P style connections.
This was always the intention, but the satoshi code was stupid and buggy.
First, the 'version has been seen' flag was set even if there was a parsing
error in it.
Second, ignoring messages (up to 100) until a version message was seen
makes no sense. Just disconnect instantly.
The API changes in boost between 1.66 and later was the need
for the boost_compat.h header file.
Its been a long time since Flowee started demanding 1.67 minimum
for boost, making this compat obsolete.
Using the forget() method too many times could lead to an invalid
(negative size) ConstBuffer being created.
This fixes and immediately copies an assert used in many other places in
the code already.
When the UTXO saved checkpoints this change makes sure we also store the
index-db changes.
Since we stopped saving simple state changes fromt the index-db, the
only real data we save is the 'undo-block-index', as such this will be
relatively cheap to save.
Without an undo block position we will currently fail to verify those
blocks and as such it is useful to save all at the same time to actually
have a state we can start from.
This allows the pruner to be used on the 'tip' DB file, at which point it
will set the filesize to be the default 2GB one.
Previously pruning the tip would confuse the Hub with a smaller file
size.
To share the downloaded blocks between instances on Linux it is common
to sym-link the blk files from read-only storage.
The Hub would fail to write to the last file due to the file being on
read-only storage and the Hub would shut down.
This change makes sure we instead create a new file instead of trying to
write to a symlinked one.
Doing a garbage collect of the Sha256 based databases means we remove
all the records that have been deleted from our file.
We also sort the file to have all the jump-tables at the end, making it
much cheaper on memory-locality to find (or not) items in the DB.
The downsides are that this prune step takes time, writes dozens of MBs
and that we lose checkpoints. The latter means we no longer can rollback
to a safe position, simply because we flushed those records.
So we want to do this often enough to avoid fragmentation but not too
often because it creates a greater risk on data consistency.
This algoritm checks the actual data and calculates the fragmentation of
the jump-tables to decide if we want to start a GC.
When we do GC, we try to do as many files as makes sense, to make sure we
can wait quite a long time before we need to do a new GC.
Boost throws an exception when the resize fails, which would cause
a total shutdown of the client. So make sure we catch it in the
scheduled task to avoid this problem.
In some unit tests I noticed that we write a block that was just loaded
from disk, this check avoids this overhead.
Not sure how relevant this is for normal operations.
The recovering of orphans was recursive and that meant there was a max
length of headers we could process with a gap in the chain due to normal
stack-depth for recursivity (approx 50k).
As headers are being provided to us from external peers this could be a
DOS vector.
This implementation avoids this problem by not being recursive.
When, on loading, the blockindex and the UTXO don't agree then try to find an older UTXO
state where they do agree.
The most typical state issue is where a block stored in the blocksdb is not available in
the index due to corruption or similar.