The 'server' library has always been a catch-all and
ideally only the hub links it in (far future goal).
In line with this I move a list of files out of server
into the utils lib.
I choose 'utils' because all these are plain old data
objects that many crypto apps will find useful.
now in utils/primitives/
* CScript
* CPubKey
* CTransaction
* CBlock
* FastTransaction
* FastBlock
* CScript
streams.h is now in utils/streaming/
hash.h is now in utils/
This is really only relevant for people using btrfs on Linux, we will
set the utxo dir to be no-copy-on-write which causes all new databases
to follow this flag and be faster due to lack of copying on write.
The reason there are no standard library versions of lock-free
containers is because you want to always take full advantage of
the details in question.
In this case (read millions of times for each modification) it makes
no sense to use anything other than a standard container, but put in
a copy-on-write block. Simple and easy.
Replace the m_buckets unsorted map with a lock-free version
based on atomic pointers. (BucketMap)
remove the m_leafs and move those into the bucket struct.
Make the access to the jumptable transactional to avoid one big lock
over all datastructures.
On my threadripper 2990WX the entire 150GB BCH blockchain was
parsed and imported in under 3 hours.
At least 50% of the UTXO entries reuse the posOnDisk
because they are outputs on the same tx. This changes
such entries from using 2 bytes to 1 byte only when saved
in a bucket.
On low core-count machines the request to save could end up being
delayed since all threads are busy doing things like validating
transactions.
So when a really big block came in I ended up throttling while there
was no save method running in parallel at all.
This fixes this and also makes throtteling be a per-thread thing
instead of doing it inside the mutex.
The current system writes based on the amount of changes, which is a bit
risky as long as we have small blocks as the amount of changes may for
hours be below the "lets write" limit.
So also write every 5 minutes, just to make sure we won't keep data in
memory longer than needed. Also this allows pruning to happen for people
that don't run their node all the time.
we only save a subsection of stuff in memory on each round of saving,
but I set the counter to zero on how long to wait for the next save.
This makes us save more often till we actually saved most stuff, avoiding
buildups of caches until we do the periodic big flush.
This change makes the limits be static variables, which means they are
only settable once for a process. The only usecase so far is to use much
smaller limits in testing situations.
This allows queries like RPC and mempool-accept to continue
even during a prune (we just use the 'old' DB) and when its
done the last active query will delete the old DB.
This assumes a sane filesystem where you can rename a file that
is in use.
When pruning we sort leafs by txid / output and refrain from
writing the txid repeatedly for outputs of the same tx.
Additionally, use the fact that we only ever get to a leaf via
a bucket and since the first 64 bits of a txid is there, skip
repeating them when writing to disk.
Last, make pruning have different strategies.
This should shrink the utxo by about 40%
* remove unused code
* reorg code to make the access to disk be outside the mutex
* add detection of slow disk-writes and slow down data coming in
* Update and fix the rmHint
together with blockFinished() as 'commit' this introduces the ability
to modify the in-memory utxo which can be rolled-back with ease.
This allows us to "modify" the utxo while validating a block in an
optimistic manner and only spend extra resources doing a rollback
should the block end up not being Ok.
The new UnspentOutputDatabase classes are only very loosly a database, we
purely register and store unspent outputs there. But unline a DB we don't
allow modification (just insert and delete).
This replaces the coin-db (which was based on leveldb) and as first goal
this gives us a higher level of stability. The level-database was known
to give corruption issues.
Notice that users will need to do a manual reindex on first update.
The old was based on levelDB which has scalability issues and stability
issues. (most often cited problem is corrupted database..)
This unspent output database I wrote is based on the idea that we need
never actually update any rows, which makes most old fashioned databases
a bad fit.
All we do is create rows and we forget rows. So lets design a DB to fit
that pattern.