commit 168c53b8233d8c19d54d1f2c68aaa1234126d8b3 Author: Neil Booth Date: Mon Nov 7 05:28:06 2016 +0900 Create docs directory diff --git a/ARCHITECTURE.rst b/ARCHITECTURE.rst new file mode 100644 index 0000000..3af6c99 --- /dev/null +++ b/ARCHITECTURE.rst @@ -0,0 +1,99 @@ +Components +========== + +The components of the server are roughly like this:: + + ------- + - Env - + ------- + + ------- + - IRC - + ------- + < + ------------- ------------ + - ElectrumX -<<<<<- LocalRPC - + ------------- ------------ + < > + ---------- ------------------- ---------- + - Daemon -<<<<<<<<- Block processor ->>>>- Caches - + ---------- ------------------- ---------- + < < > < + -------------- ----------- + - Prefetcher - - Storage - + -------------- ----------- + + +Env +--- + +Holds configuration taken from the environment. Handles defaults +appropriately. Generally passed to the constructor of other +components which take their settings from it. + + +LocalRPC +-------- + +Handles local JSON RPC connections querying ElectrumX server state. +Not started until the block processor has caught up with the daemon. + +ElectrumX +--------- + +Handles JSON Electrum client connections over TCP or SSL. One +instance per client session. Should be the only component concerned +with the details of the Electrum wire protocol. Responsible for +caching of client responses. Not started until the block processor +has caught up with the daemon. Logically, if not yet in practice, a +coin-specific class. + +Daemon +------ + +Used by the block processor, ElectrumX servers and prefetcher. +Encapsulates daemon RPC wire protocol. Logically, if not yet in +practice, a coin-specific class. + +Block Processor +--------------- + +Responsible for managing block chain state (UTXO set, history, +transaction and undo information) and processing towards the chain +tip. Uses the caches for in-memory state caching. Flushes state to +the storage layer. Responsible for handling block chain +reorganisations. 
Once caught up maintains a representation of daemon +mempool state. + +Caches +------ + +The file system cache and the UTXO cache are implementation details of +the block processor, nothing else should interface with them. + +Storage +------- + +Backend database abstraction. Along with the host filesystem, used by +the block processor (and therefore its caches) to store chain state. + +Prefetcher +---------- + +Used by the block processor to asynchronously prefetch blocks from the +daemon. Holds fetched block height. Once it has caught up +additionally obtains daemon mempool tx hashes. Serves blocks and +mempool hashes to the block processor via a queue. + +IRC +--- + +Not currently implemented; will handle IRC communication for the +ElectrumX servers. + +Controller +---------- + +A historical artefact that currently coordinates some of the above +components. Not pictured as it doesn't seem to have a logical +place and so is probably going away. diff --git a/HOWTO.rst b/HOWTO.rst new file mode 100644 index 0000000..fc0997c --- /dev/null +++ b/HOWTO.rst @@ -0,0 +1,278 @@ +Prerequisites +============= + +ElectrumX should run on any flavour of unix. I have run it +successfully on MacOSX and DragonFlyBSD. It won't run out-of-the-box +on Windows, but the changes required to make it do so should be +small - patches welcome. + ++ Python3: ElectrumX uses asyncio. Python version >=3.5 is required. ++ plyvel: Python interface to LevelDB. I am using plyvel-0.9. ++ aiohttp: Python library for asynchronous HTTP. ElectrumX uses it for + communication with the daemon. Version >= 1.0 required; I am + using 1.0.5. + +While not a requirement for running ElectrumX, it is intended to be run +with supervisor software such as Daniel Bernstein's daemontools, +Gerald Pape's runit package or systemd. These make administration of secure +unix servers very easy, and I strongly recommend you install one of these +and familiarise yourself with them. 
The instructions below and sample +run scripts assume daemontools; adapting to runit should be trivial +for someone used to either. + +When building the database from the genesis block, ElectrumX has to +flush large quantities of data to disk and to leveldb. You will have +a much nicer experience if the database directory is on an SSD than on +an HDD. Currently to around height 434,000 of the Bitcoin blockchain +the final size of the leveldb database, and other ElectrumX file +metadata comes to just over 17GB. Leveldb needs a bit more for brief +periods, and the block chain is only getting longer, so I would +recommend having at least 30-40GB free space. + +Database Engine +=============== + +You can choose from RocksDB, LevelDB or LMDB to store transaction +information on disk. Currently, the fastest seems to be RocksDB with +LevelDB being about 10% slower. LMDB is slowest but that is because it +is not yet efficiently abstracted. + +You will need to install one of: + ++ `plyvel `_ for LevelDB ++ `pyrocksdb `_ for RocksDB ++ `lmdb `_ for LMDB + +Running +======= + +Install the prerequisites above. + +Check out the code from Github:: + + git clone https://github.com/kyuupichan/electrumx.git + cd electrumx + +You can install with setup.py, or run the code from the source tree or +a copy of it. + +You should create a standard user account to run the server under; +your own is probably adequate unless paranoid. The paranoid might +also want to create another user account for the daemontools logging +process. The sample scripts and these instructions assume it is all +under one account which I have called 'electrumx'. + +Next create a directory where the database will be stored and make it +writeable by the electrumx account. 
I recommend this directory live +on an SSD:: + + mkdir /path/to/db_directory + chown electrumx /path/to/db_directory + + +Using daemontools +----------------- + +Next create a daemontools service directory; this only holds symlinks +(see daemontools documentation). The 'svscan' program will ensure the +servers in the directory are running by launching a 'supervise' +supervisor for the server and another for its logging process. You +can run 'svscan' under the electrumx account if that is the only one +involved (server and logger) otherwise it will need to run as root so +that the user can be switched to electrumx. + +Assuming this directory is called service, you would do one of:: + + mkdir /service # If running svscan as root + mkdir ~/service # As electrumx if running svscan as that a/c + +Next create a directory to hold the scripts that the 'supervise' +process spawned by 'svscan' will run - this directory must be readable +by the 'svscan' process. Suppose this directory is called scripts, you +might do:: + + mkdir -p ~/scripts/electrumx + +Then copy all the sample scripts from the ElectrumX source tree there:: + + cp -R /path/to/repo/electrumx/samples/scripts ~/scripts/electrumx + +This copies 4 things: the top level server run script, a log/ directory +with the logger run script, an env/ directory, and a NOTES file. + +You need to configure the environment variables under env/ to your +setup, as explained in NOTES. ElectrumX server currently takes no +command line arguments; all of its configuration is taken from its +environment which is set up according to env/ directory (see 'envdir' +man page). Finally you need to change the log/run script to use the +directory where you want the logs to be written by multilog. The +directory need not exist as multilog will create it, but its parent +directory must exist. + +Now start the 'svscan' process. 
This will not do much as the service +directory is still empty:: + + svscan ~/service & disown + +svscan is now waiting for services to be added to the directory:: + + cd ~/service + ln -s ~/scripts/electrumx electrumx + +Creating the symlink will kick off the server process almost immediately. +You can see its logs with:: + + tail -F /path/to/log/dir/current | tai64nlocal + + +Using systemd +------------- + +This repository contains a sample systemd unit file that you can use to +setup ElectrumX with systemd. Simply copy it to :code:`/etc/systemd/system`:: + + cp samples/systemd-unit /etc/systemd/system/electrumx.service + +The sample unit file assumes that the repository is located at +:code:`/home/electrumx/electrumx`. If that differs on your system, you need to +change the unit file accordingly. + +You need to set a few configuration variables in :code:`/etc/electrumx.conf`, +see `samples/NOTES` for the list of required variables. + +Now you can start ElectrumX using :code:`systemctl`:: + + systemctl start electrumx + +You can use :code:`journalctl` to check the log output:: + + journalctl -u electrumx -f + +Once configured, you may want to start ElectrumX at boot:: + + systemctl enable electrumx + + +Sync Progress +============= + +Speed indexing the blockchain depends on your hardware of course. As +Python is single-threaded most of the time only 1 core is kept busy. +ElectrumX uses Python's asyncio to prefill a cache of future blocks +asynchronously; this keeps the CPU busy processing the chain and not +waiting for blocks to be delivered. I therefore doubt there will be +much boost in performance if the daemon is on the same host: indeed it +may even be beneficial to have the daemon on a separate machine so the +machine doing the indexing is focussing on the one task and not the +wider network. + +The HIST_MB and CACHE_MB environment variables control cache sizes +before they spill to disk; see the NOTES file under samples/scripts. 
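As a rough illustration of the prefetching idea described in this section, the sketch below uses a bounded asyncio.Queue so that fetching runs ahead of processing without growing without limit. It is not ElectrumX's actual code: fetch_block, prefetcher, process_blocks, sync and run_sync are hypothetical names, and the daemon is simulated rather than contacted over RPC.

```python
import asyncio


async def fetch_block(height):
    """Hypothetical stand-in for a daemon RPC call; returns fake block data."""
    await asyncio.sleep(0)  # yield to the event loop, as real network I/O would
    return 'block-{}'.format(height).encode()


async def prefetcher(queue, first_height, last_height):
    """Fetch blocks ahead of the processor; put() waits while the queue is full."""
    for height in range(first_height, last_height + 1):
        block = await fetch_block(height)
        await queue.put((height, block))
    await queue.put(None)  # sentinel: no more blocks to deliver


async def process_blocks(queue):
    """Consume prefetched blocks in height order and return the heights seen."""
    heights = []
    while True:
        item = await queue.get()
        if item is None:
            return heights
        height, _block = item
        heights.append(height)  # real code would update chain state here


async def sync(first_height, last_height, max_prefetched=10):
    """Run the prefetcher and the processor concurrently on one event loop."""
    queue = asyncio.Queue(maxsize=max_prefetched)
    producer = asyncio.ensure_future(
        prefetcher(queue, first_height, last_height))
    heights = await process_blocks(queue)
    await producer
    return heights


def run_sync(first_height, last_height):
    """Synchronous entry point for the sketch."""
    return asyncio.run(sync(first_height, last_height))
```

The bounded queue plays the role of a cache size limit in this sketch: when the processor falls behind, the prefetcher simply waits instead of holding ever more blocks in memory.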
+ +Here is my experience with the current codebase, to given heights and +rough wall-time:: + + Machine A Machine B DB + Metadata + 181,000 7m 09s 0.4 GiB + 255,000 1h 02m 2.7 GiB + 289,000 1h 46m 3.3 GiB + 317,000 2h 33m + 351,000 3h 58m + 377,000 6h 06m 6.5 GiB + 403,400 8h 51m + 436,196 14h 03m 17.3 GiB + +Machine A: a low-spec 2011 1.6GHz AMD E-350 dual-core fanless CPU, 8GB +RAM and a DragonFlyBSD HAMMER filesystem on an SSD. It requests blocks +over the LAN from a bitcoind on machine B. + +Machine B: a late 2012 iMac running El-Capitan 10.11.6, 2.9GHz +quad-core Intel i5 CPU with an HDD and 24GB RAM. Running bitcoind on +the same machine. HIST_MB of 350, UTXO_MB of 1,600. LevelDB. + +For chains other than bitcoin-mainnet synchronization should be much +faster. + + +Terminating ElectrumX +===================== + +The preferred way to terminate the server process is to send it the +TERM signal. For a daemontools supervised process this is best done +by bringing it down like so:: + + svc -d ~/service/electrumx + +If processing the blockchain the server will start the process of +flushing to disk. Once that is complete the server will exit. Be +patient as disk flushing can take many minutes. + +ElectrumX flushes to leveldb using its transaction functionality. The +plyvel documentation claims this is atomic. I have written ElectrumX +with the intent that, to the extent this atomicity guarantee holds, +the database should not get corrupted even if the ElectrumX process is +forcibly killed or there is loss of power. The worst case is losing +unflushed in-memory blockchain processing and having to restart from +the state as of the prior successfully completed UTXO flush. + +If you do have any database corruption as a result of terminating the +process (without modifying the code) I would be interested in the +details. 
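The shutdown behaviour described above (receive TERM, stop processing, flush, then exit) might be sketched as follows. This is a simplified illustration under stated assumptions, not ElectrumX code: the Processor class, its flush method and the serve coroutine are hypothetical, and the "disk" is just an in-memory list.

```python
import asyncio
import signal


class Processor:
    """Hypothetical block processor holding unflushed state in memory."""

    def __init__(self):
        self.unflushed = []  # heights processed since the last flush
        self.flushed = []    # stands in for state committed to disk
        self.shutdown = asyncio.Event()

    def flush(self):
        # Real code would commit a single atomic database write here.
        self.flushed.extend(self.unflushed)
        self.unflushed.clear()

    async def run(self, last_height):
        for height in range(last_height + 1):
            if self.shutdown.is_set():
                break
            self.unflushed.append(height)
            await asyncio.sleep(0)  # give pending callbacks a chance to run
        self.flush()  # always flush before returning


async def serve(processor, last_height):
    """Process blocks until done or until SIGTERM arrives (unix only)."""
    loop = asyncio.get_running_loop()
    loop.add_signal_handler(signal.SIGTERM, processor.shutdown.set)
    try:
        await processor.run(last_height)
    finally:
        loop.remove_signal_handler(signal.SIGTERM)
```

The signal handler only sets a flag; the flush itself happens in the normal control flow, so a TERM never interrupts a write half way through. A forcible kill bypasses all of this, which is why the atomicity of the underlying database write matters.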
+ +Once the process has terminated, you can start it up again with:: + + svc -u ~/service/electrumx + +You can see the status of a running service with:: + + svstat ~/service/electrumx + +Of course, svscan can handle multiple services simultaneously from the +same service directory, such as a testnet or altcoin server. See the +man pages of these various commands for more information. + + +Understanding the Logs +====================== + +You can see the logs usefully like so:: + + tail -F /path/to/log/dir/current | tai64nlocal + +Here is typical log output on startup:: + + 2016-10-14 20:22:10.747808500 Launching ElectrumX server... + 2016-10-14 20:22:13.032415500 INFO:root:ElectrumX server starting + 2016-10-14 20:22:13.032633500 INFO:root:switching current directory to /Users/neil/server-btc + 2016-10-14 20:22:13.038495500 INFO:DB:created new database Bitcoin-mainnet + 2016-10-14 20:22:13.038892500 INFO:DB:Bitcoin/mainnet height: -1 tx count: 0 flush count: 0 utxo flush count: 0 sync time: 0d 00h 00m 00s + 2016-10-14 20:22:13.038935500 INFO:DB:flushing all after cache reaches 2,000 MB + 2016-10-14 20:22:13.038978500 INFO:DB:flushing history cache at 400 MB + 2016-10-14 20:22:13.039076500 INFO:BlockCache:using RPC URL http://user:password@192.168.0.2:8332/ + 2016-10-14 20:22:13.039796500 INFO:BlockCache:catching up, block cache limit 10MB... 
+ 2016-10-14 20:22:14.092192500 INFO:DB:cache stats at height 0 daemon height: 434,293 + 2016-10-14 20:22:14.092243500 INFO:DB: entries: UTXO: 1 DB: 0 hist count: 1 hist size: 1 + 2016-10-14 20:22:14.092288500 INFO:DB: size: 0MB (UTXOs 0MB hist 0MB) + 2016-10-14 20:22:32.302394500 INFO:UTXO:duplicate tx hash d5d27987d2a3dfc724e359870c6644b40e497bdc0589a033220fe15429d88599 + 2016-10-14 20:22:32.310441500 INFO:UTXO:duplicate tx hash e3bf3d07d4b0375638d5f1db5255fe07ba2c4cb067cd81b84ee974b6585fb468 + 2016-10-14 20:23:14.094855500 INFO:DB:cache stats at height 125,278 daemon height: 434,293 + 2016-10-14 20:23:14.095026500 INFO:DB: entries: UTXO: 191,155 DB: 0 hist count: 543,455 hist size: 1,394,187 + 2016-10-14 20:23:14.095028500 INFO:DB: size: 172MB (UTXOs 44MB hist 128MB) + +Under normal operation these cache stats repeat roughly every minute. +Flushes can take many minutes and look like this:: + + 2016-10-14 21:30:29.085479500 INFO:DB:flushing UTXOs: 22,910,848 txs and 254,753 blocks + 2016-10-14 21:32:05.383413500 INFO:UTXO:UTXO cache adds: 55,647,862 spends: 48,751,219 + 2016-10-14 21:32:05.383460500 INFO:UTXO:UTXO DB adds: 6,875,315 spends: 0. Collisions: hash168: 268 UTXO: 0 + 2016-10-14 21:32:07.056008500 INFO:DB:6,982,386 history entries in 1,708,991 addrs + 2016-10-14 21:32:08.169468500 INFO:DB:committing transaction... + 2016-10-14 21:33:17.644296500 INFO:DB:flush #11 to height 254,752 took 168s + 2016-10-14 21:33:17.644357500 INFO:DB:txs: 22,910,848 tx/sec since genesis: 5,372, since last flush: 3,447 + 2016-10-14 21:33:17.644536500 INFO:DB:sync time: 0d 01h 11m 04s ETA: 0d 11h 22m 42s + +After flush-to-disk you may see an aiohttp error; this is the daemon +timing out the connection while the disk flush was in progress. This +is harmless. + +The ETA is just a guide and can be quite volatile around flushes. 
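Because the logged ETA is volatile, a rough progress figure can also be derived from the "cache stats" lines shown above. The helper below is a small sketch: sync_progress and CACHE_STATS_RE are hypothetical names, and the regular expression assumes the log format exactly as it appears in the sample output, which may differ in other versions.

```python
import re

# Matches e.g. "INFO:DB:cache stats at height 125,278 daemon height: 434,293"
CACHE_STATS_RE = re.compile(
    r'cache stats at height ([\d,]+)\s+daemon height: ([\d,]+)')


def sync_progress(line):
    """Return (height, daemon_height, fraction), or None if no match."""
    match = CACHE_STATS_RE.search(line)
    if not match:
        return None
    # Strip the thousands separators before converting to integers.
    height, daemon_height = (
        int(group.replace(',', '')) for group in match.groups())
    fraction = height / daemon_height if daemon_height > 0 else 0.0
    return height, daemon_height, fraction
```

Note that the fraction is by height, not by time: as the timings in the Sync Progress section suggest, later blocks take much longer to process than early ones.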
diff --git a/PERFORMANCE-NOTES b/PERFORMANCE-NOTES new file mode 100644 index 0000000..629cc96 --- /dev/null +++ b/PERFORMANCE-NOTES @@ -0,0 +1,26 @@ +Just some notes on performance with Python 3.5. I am taking this into +account in the code. + +- 60% faster to create lists with [] list comprehensions than tuples + or lists with tuple(), list(). Of those list is 10% faster than + tuple. + +- however when not initializing from a generator, a fixed-length tuple + is at least 80% faster than a list. + +- an implicit default argument is ~5% faster than passing the default + explicitly + +- using a local variable x rather than self.x in loops and list + comprehensions is over 50% faster + +- struct.pack, struct.unpack are over 60% faster than int.to_bytes and + int.from_bytes. They are faster little endian (presumably because + it matches the host) than big endian regardless of length. + +- single-item list and tuple unpacking. Suppose b = (1, ) + + a, = b is about 0.4% faster than (a,) = b + and about 45% faster than a = b[0] + +- multiple assignment is faster using tuples only for 3 or more items \ No newline at end of file diff --git a/RELEASE-NOTES b/RELEASE-NOTES new file mode 100644 index 0000000..c1b7d90 --- /dev/null +++ b/RELEASE-NOTES @@ -0,0 +1,71 @@ +Version 0.2 +----------- + +- update sample run script, remove empty addresses from mempool + +Version 0.1 +------------ + +- added setup.py, experimental. Because of this server_main.py renamed + electrumx_server.py, and SERVER_MAIN environment variable was renamed + to ELECTRUMX. The sample run script was updated to match. +- improvements to logging of daemon connection issues +- removal of old reorg test code +- hopefully more accurate sync ETA + +Version 0.07 +------------ + +- fixed a bug introduced in 0.06 at the last minute + +Version 0.06 +------------ + +- mempool support. ElectrumX maintains a representation of the daemon's + mempool and serves unconfirmed transactions and balances to clients. 
+ +Version 0.05 +------------ + +- fixed a bug in 0.04 that stopped ElectrumX serving once synced + +Version 0.04 +------------ + +- made the DB interface a little faster for LevelDB and RocksDB; this was + a small regression in 0.03 +- fixed a bug that prevented block reorgs from working +- implement and enable client connectivity. This is not yet ready for + public use for several reasons. Local RPC, and remote TCP and SSL + connections are all supported in the same way as Electrum-server. + ElectrumX does not begin listening for incoming connections until it + has caught up with the daemon's height. Which ports it is listening + on will appear in the logs when it starts listening. The complete + Electrum wire protocol is implemented, so it is possible to now use + as a server for your own Electrum client. Note that mempools are + not yet handled so unconfirmed transactions will not be notified or + appear; they will appear once they get in a block. Also no + responses are cached, so performance would likely degrade if used by + many clients. I welcome feedback on your experience using this. + + +Version 0.03 +------------ + +- merged bauerj's abstracted DB engine contribution to make it easy to + play with different backends. In addition to LevelDB this adds + support for RocksDB and LMDB. We're interested in your comparative + performance experiences. + + +Version 0.02 +------------ + +- fix bug where tx counts were incorrectly saved +- large clean-up and refactoring of code, breakout into new files +- several efficiency improvements +- initial implementation of chain reorg handling +- work on RPC and TCP server functionality. Code committed but not + functional, so currently disabled +- note that some of the environment variables have been renamed, + see samples/scripts/NOTES for the list \ No newline at end of file