diff --git a/contributors.md b/contributors.md new file mode 100644 index 0000000..9c05d05 --- /dev/null +++ b/contributors.md @@ -0,0 +1,9 @@ +# Contributors + +This specification was initially created by [Software Verde, LLC](https://softwareverde.com/), the creators of Bitcoin Verde, with funding from [Bitcoin Unlimited](https://www.bitcoinunlimited.info/). + +The contributor list below shows a list of those that have contributed content. If you have contributed content and would like to be listed below, feel free to add your name! + + - __Andrew Stone__, Bitcoin Unlimited Lead Developer + - __Joshua Green__, Bitcoin Verde Lead Developer + - __Andrew Groot__, Bitcoin Verde Developer diff --git a/history/protocol-version.md b/history/protocol-version.md new file mode 100644 index 0000000..f7dd7af --- /dev/null +++ b/history/protocol-version.md @@ -0,0 +1,10 @@ +# Network Protocol Version History + +| Version Number | Proposed In | Summary | +|--|--|--| +| 106 | ??? | Added the following fields to the `version` message: `address-from`, `nonce`, `user-agent`, `current block height` | +| 209 | ??? | `address` message may accept a list of network addresses. | +|31 402| ??? | Time field added to `address` messages. | +| | [BIP-0014](/history/bips) | Network Version decoupled from Block Version. User-agent replaced sub-version number. | +| 60 000 | [BIP-0031](/history/bips) | Added pong. 
| +| 70 001 | [BIP-0037](/history/bips) | Added `relay` flag to the `version` message.| \ No newline at end of file diff --git a/home.md b/home.md index bc18add..d5deef2 100644 --- a/home.md +++ b/home.md @@ -1,6 +1,7 @@ - Home -- Contributors -- Target Audience +- [Style Guide](style-guide) +- [Contributors](contributors) +- [Target Audience](target-audience) - Wiki History - Protocol - Blockchain @@ -60,14 +61,19 @@ - Stratum Protocol - Mining Pools - Forks - - Bip16 - - Bip34 - - Bip65 - - Bip66 - - Bip68 - - Bip112 - - Bip113 - - BCH UAHF (BUIP 55) + - Bip-16 + - Bip-34 + - [Bip-37](/protocol/forks/bip-0037) + - [Bip-64](/protocol/forks/bip-0064) + - Bip-65 + - Bip-66 + - Bip-68 + - Bip-112 + - Bip-113 + - [Bip-157](/protocol/forks/bip-0157) + - [Bip-158](/protocol/forks/bip-0158) + - [Bip-159](/protocol/forks/bip-0159) + - BCH UAHF (BUIP-55) - HF20171113 - HF20180515 - HF20181115 @@ -75,7 +81,7 @@ - HF20191115 - Peer-to-Peer Network - [Messages](/protocol/network/messages) - - version + - [Handshake: Version (“version”)](/protocol/network/messages/version) - verack - ping - pong @@ -109,4 +115,7 @@ - Simple Ledger Protocol - Cash Address - Miscellaneous - - “Bitcoin Sign Message” \ No newline at end of file + - “Bitcoin Sign Message” + - History + - Bips + - Protocol Version \ No newline at end of file diff --git a/protocol/blockchain/hash.md b/protocol/blockchain/hash.md index 00967b3..b7230e1 100644 --- a/protocol/blockchain/hash.md +++ b/protocol/blockchain/hash.md @@ -1,5 +1,5 @@ -# Hash +# Hashes - SHA256 - - RIPEMD160 - - Murmur \ No newline at end of file + - RIPEMD-160 + - Murmur diff --git a/protocol/forks/bip-0037.md b/protocol/forks/bip-0037.md new file mode 100644 index 0000000..0da01f2 --- /dev/null +++ b/protocol/forks/bip-0037.md @@ -0,0 +1,191 @@ +
+  BIP: 37
+  Layer: Peer Services
+  Title: Connection Bloom filtering
+  Author: Mike Hearn
+          Matt Corallo
+  Comments-Summary: No comments yet.
+  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0037
+  Status: Final
+  Type: Standards Track
+  Created: 2012-10-24
+  License: PD
+
+## Abstract
+
+This BIP adds new support to the peer-to-peer protocol that allows peers to reduce the amount of transaction data they are sent. Peers have the option of setting *filters* on each connection they make after the version handshake has completed. A filter is defined as a [Bloom filter](http://en.wikipedia.org/wiki/Bloom_filter) on data derived from transactions. A Bloom filter is a probabilistic data structure which allows for testing set membership; it can return false positives but never false negatives.
+
+This document will not go into the details of how Bloom filters work and the reader is referred to Wikipedia for an introduction to the topic.
+
+## Motivation
+
+As Bitcoin grows in usage the amount of bandwidth needed to download blocks and transaction broadcasts increases. Clients implementing *simplified payment verification* do not attempt to fully verify the block chain, instead just checking that block headers connect together correctly and trusting that the transactions in a chain of high difficulty are in fact valid. See the Bitcoin paper for more detail on this mode.
+
+Today, [SPV](https://bitcoin.org/en/developer-guide#simplified-payment-verification-spv) clients have to download the entire contents of blocks and all broadcast transactions, only to throw away the vast majority of the transactions that are not relevant to their wallets. This slows down their synchronization process, wastes users' bandwidth (which on phones is often metered) and increases memory usage. All three problems are triggering real user complaints for the Android "Bitcoin Wallet" app which implements SPV mode. In order to make chain synchronization fast, cheap and able to run on older phones with limited memory we want to have remote peers throw away irrelevant transactions before sending them across the network.
+
+## Design rationale
+
+The most obvious way to implement the stated goal would be for clients to upload lists of their keys to the remote node. We take a more complex approach for the following reasons:
+
+* Privacy: Because Bloom filters are probabilistic, with the false positive rate chosen by the client, nodes can trade off precision vs bandwidth usage. A node with access to lots of bandwidth may choose to have a high FP rate, meaning the remote peer cannot accurately know which transactions belong to the client and which don't. A node with very little bandwidth may choose to use a very accurate filter meaning that they only get sent transactions actually relevant to their wallet, but remote peers may be able to correlate transactions with IP addresses (and each other).
+* Bloom filters are compact and testing membership in them is fast. This results in satisfying performance characteristics with minimal risk of opening up potential for DoS attacks.
+
+## Specification
+
+### New messages
+
+We start by adding three new messages to the protocol:
+
+* filterload, which sets the current Bloom filter on the connection
+* filteradd, which adds the given data element to the connection's current filter without requiring a completely new one to be set
+* filterclear, which deletes the current filter and goes back to regular pre-BIP37 usage.
+
+Note that there is no filterremove command because by their nature, Bloom filters are append-only data structures. Once an element is added it cannot be removed again without rebuilding the entire structure from scratch.
+
+The filterload command is defined as follows:
+
+
+| Field Size | Description | Data type | Comments|
+|--|--|--|--|
+| ? | filter | uint8_t[] | The filter itself is simply a bit field of arbitrary byte-aligned size. The maximum size is 36,000 bytes.|
+| 4 | nHashFuncs | uint32_t | The number of hash functions to use in this filter. The maximum value allowed in this field is 50.|
+| 4 | nTweak | uint32_t | A random value to add to the seed value in the hash function used by the bloom filter.|
+| 1 | nFlags | uint8_t | A set of flags that control how matched items are added to the filter.|
+
+See below for a description of the Bloom filter algorithm and how to select nHashFuncs and filter size for a desired false positive rate.
+
+Upon receiving a filterload command, the remote peer will immediately restrict the broadcast transactions it announces (in inv packets) to transactions matching the filter, where the matching algorithm is specified below. The flags control the update behaviour of the matching algorithm.
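The filterload field layout above could be serialized roughly as follows. This is a minimal sketch rather than the reference implementation: it assumes the usual Bitcoin wire conventions (little-endian integers and a variable-length size prefix in front of the filter bytes), which this section does not spell out, and the helper names `compact_size` and `build_filterload` are hypothetical.

```python
import struct

MAX_FILTER_BYTES = 36_000  # maximum filter size given in the table
MAX_HASH_FUNCS = 50        # maximum nHashFuncs given in the table


def compact_size(n: int) -> bytes:
    """Encode n as a variable-length integer (assumed size-prefix format)."""
    if n < 0xFD:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def build_filterload(filter_bytes: bytes, n_hash_funcs: int,
                     n_tweak: int, n_flags: int) -> bytes:
    """Build a filterload payload: filter, nHashFuncs, nTweak, nFlags."""
    if len(filter_bytes) > MAX_FILTER_BYTES:
        raise ValueError("filter exceeds 36,000 bytes")
    if n_hash_funcs > MAX_HASH_FUNCS:
        raise ValueError("more than 50 hash functions requested")
    # "<IIB" = two little-endian uint32 values followed by one uint8.
    return (compact_size(len(filter_bytes))
            + filter_bytes
            + struct.pack("<IIB", n_hash_funcs, n_tweak, n_flags))
```

Rejecting oversized values before sending mirrors the limits in the table; a receiving peer would be expected to apply the same checks.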
+
+The filteradd command is defined as follows:
+
+| Field Size | Description | Data type | Comments |
+|--|--|--|--|
+| ? | data | uint8_t[] | The data element to add to the current filter.|
+
+The data field must be smaller than or equal to 520 bytes in size (the maximum size of any potentially matched object).
+
+The given data element will be added to the Bloom filter. A filter must have been previously provided using filterload. This command is useful if a new key or script is added to a client's wallet whilst it has connections to the network open: it avoids the need to re-calculate and send an entirely new filter to every peer (though doing so is usually advisable to maintain anonymity).
+
+The filterclear command has no arguments at all.
+
+After a filter has been set, nodes don't merely stop announcing non-matching transactions, they can also serve filtered blocks. A filtered block is sent as a merkleblock message, which is defined like this:
+
+| Field Size | Description | Data type | Comments |
+|--|--|--|--|
+| 4 | version | uint32_t | Block version information, based upon the software version creating this block|
+| 32 | prev_block | char[32] | The hash value of the previous block this particular block references|
+| 32 | merkle_root | char[32] | The reference to a Merkle tree collection which is a hash of all transactions related to this block|
+| 4 | timestamp | uint32_t | A timestamp recording when this block was created (Limited to 2106!)|
+| 4 | bits | uint32_t | The calculated difficulty target being used for this block|
+| 4 | nonce | uint32_t | The nonce used to generate this block… to allow variations of the header and compute different hashes|
+| 4 | total_transactions | uint32_t | Number of transactions in the block (including unmatched ones)|
+| ? | hashes | uint256[] | hashes in depth-first order (including standard varint size prefix)|
+| ? | flags | byte[] | flag bits, packed per 8 in a byte, least significant bit first (including standard varint size prefix)|
+
+See below for the format of the partial merkle tree hashes and flags.
+
+Thus, a merkleblock message is a block header, plus a part of a merkle tree which can be used to extract identifying information for transactions that matched the filter and prove that the matching transaction data really did appear in the solved block. Clients can use this data to be sure that the remote node is not feeding them fake transactions that never appeared in a real block, although lying through omission is still possible.
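The flags field of the merkleblock table is described as bits "packed per 8 in a byte, least significant bit first". A small sketch of that packing, with hypothetical helper names `pack_flag_bits` and `unpack_flag_bits` (the varint size prefix is left out here):

```python
from typing import List


def pack_flag_bits(bits: List[bool]) -> bytes:
    """Pack flag bits into bytes, least significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i >> 3] |= 1 << (i & 7)
    return bytes(out)


def unpack_flag_bits(data: bytes) -> List[bool]:
    """Expand packed bytes back into bits; trailing padding bits are False."""
    return [bool(data[i >> 3] & (1 << (i & 7))) for i in range(len(data) * 8)]
```

Unpacking always yields a multiple of eight bits, so a parser has to tolerate unused padding bits at the end of the list.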
+
+### Extensions to existing messages
+
+The version command is extended with a new field:
+
+| Field Size | Description | Data type | Comments|
+|--|--|--|--|
+| 1 byte | fRelay | bool | If false then broadcast transactions will not be announced until a filter{load,add,clear} command is received. If missing or true, no change in protocol behaviour occurs.|
+
+SPV clients that wish to use Bloom filtering would normally set fRelay to false in the version message, then set a filter based on their wallet (or a subset of it, if they are overlapping different peers). Being able to opt-out of inv messages until the filter is set prevents a client being flooded with traffic in the brief window of time between finishing version handshaking and setting the filter.
+
+The getdata command is extended to allow a new type in the inv submessage. The type field can now be MSG_FILTERED_BLOCK (== 3) rather than MSG_BLOCK. If no filter has been set on the connection, a request for filtered blocks is ignored. If a filter has been set, a merkleblock message is returned for the requested block hash. In addition, because a merkleblock message contains only a list of transaction hashes, transactions matching the filter should also be sent in separate tx messages after the merkleblock is sent. This avoids a slow roundtrip that would otherwise be required (receive hashes, didn't see some of these transactions yet, ask for them). Note that because there is currently no way to request transactions which are already in a block from a node (aside from requesting the full block), the set of matching transactions that the requesting node hasn't either received or announced with an inv must be sent and any additional transactions which match the filter may also be sent. This allows for clients (such as the reference client) to limit the number of invs it must remember a given node to have announced while still providing nodes with, at a minimum, all the transactions it needs.
+
+### Filter matching algorithm
+
+The filter can be tested against arbitrary pieces of data, to see if that data was inserted by the client. Therefore the question arises of what pieces of data should be inserted/tested.
+
+To determine if a transaction matches the filter, the following algorithm is used. Once a match is found the algorithm aborts.
+
+1. Test the hash of the transaction itself.
+2. For each output, test each data element of the output script. This means each hash and key in the output script is tested independently. **Important:** if an output matches whilst testing a transaction, the node might need to update the filter by inserting the serialized COutPoint structure. See below for more details.
+3. For each input, test the serialized COutPoint structure.
+4. For each input, test each data element of the input script (note: input scripts only ever contain data elements).
+5. Otherwise there is no match.
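The five steps above can be sketched as a single function. This is an illustration under stated assumptions: `Tx`, `TxIn` and `TxOut` are hypothetical stand-ins for parsed transactions, `contains` is any callable that tests one data element against the filter, and the filter update mentioned in step 2 is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TxIn:
    outpoint: bytes            # serialized COutPoint being spent
    script_data: List[bytes]   # data elements of the input script


@dataclass
class TxOut:
    script_data: List[bytes]   # data elements of the output script


@dataclass
class Tx:
    txid: bytes
    inputs: List[TxIn]
    outputs: List[TxOut]


def tx_matches(tx: Tx, contains: Callable[[bytes], bool]) -> bool:
    """Return True as soon as any tested element hits the filter."""
    if contains(tx.txid):                                   # step 1
        return True
    for out in tx.outputs:                                  # step 2
        if any(contains(elem) for elem in out.script_data):
            return True
    for txin in tx.inputs:                                  # step 3
        if contains(txin.outpoint):
            return True
    for txin in tx.inputs:                                  # step 4
        if any(contains(elem) for elem in txin.script_data):
            return True
    return False                                            # step 5
```

Passing the membership test as a callable keeps the sketch independent of any particular Bloom filter implementation; a plain Python `set` works for experiments.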
+
+In this way addresses, keys and script hashes (for P2SH outputs) can all be added to the filter. You can also match against classes of transactions that are marked with well known data elements in either inputs or outputs, for example, to implement various forms of [Smart property](https://en.bitcoin.it/wiki/Smart_Property).
+
+The test for outpoints is there to ensure you can find transactions spending outputs in your wallet, even though you don't know anything about their form. As you can see, once set on a connection the filter is **not static** and can change throughout the connection's lifetime. This is done to avoid the following race condition:
+
+1. A client sets a filter matching a key in their wallet. They then start downloading the block chain. The part of the chain that the client is missing is requested using getblocks.
+2. The first block is read from disk by the serving peer. It contains TX 1 which sends money to the client's key. It matches the filter and is thus sent to the client.
+3. The second block is read from disk by the serving peer. It contains TX 2 which spends TX 1. However TX 2 does not contain any of the client's keys and is thus not sent. The client does not know the money they received was already spent.
+
+By updating the bloom filter atomically in step 2 with the discovered outpoint, the filter will match against TX 2 in step 3 and the client will learn about all relevant transactions, even though there is no pause between the node processing the first and second blocks.
+
+The nFlags field of the filter controls the node's precise update behaviour and is a bit field.
+
+* BLOOM_UPDATE_NONE (0) means the filter is not adjusted when a match is found.
+* BLOOM_UPDATE_ALL (1) means if the filter matches any data element in a scriptPubKey the outpoint is serialized and inserted into the filter.
+* BLOOM_UPDATE_P2PUBKEY_ONLY (2) means the outpoint is inserted into the filter only if a data element in the scriptPubKey is matched, and that script is of the standard "pay to pubkey" or "pay to multisig" forms.
+
+These distinctions are useful to avoid too-rapid degradation of the filter due to an increasing false positive rate. We can observe that a wallet which expects to receive only payments of the standard pay-to-address form doesn't need automatic filter updates because any transaction that spends one of its own outputs has a predictable data element in the input (the pubkey that hashes to the address). If a wallet might receive pay-to-address outputs and also pay-to-pubkey or pay-to-multisig outputs then BLOOM_UPDATE_P2PUBKEY_ONLY is appropriate, as it avoids unnecessary expansions of the filter for the most common types of output but still ensures correct behaviour with payments that explicitly specify keys.
+
+Obviously, nFlags == 1 or nFlags == 2 means that the filter will get dirtier as more of the chain is scanned. Clients should monitor the observed false positive rate and periodically refresh the filter with a clean one.
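The three update modes can be summarised as a decision about whether a matched output's outpoint gets inserted into the filter. A minimal sketch, assuming the output script has already been classified; the `script_kind` labels (`"pubkey"`, `"multisig"`, and so on) are hypothetical names for that classification:

```python
BLOOM_UPDATE_NONE = 0
BLOOM_UPDATE_ALL = 1
BLOOM_UPDATE_P2PUBKEY_ONLY = 2


def should_insert_outpoint(n_flags: int, script_kind: str) -> bool:
    """Decide whether a matched output's outpoint is added to the filter.

    script_kind is an assumed label for the matched scriptPubKey form.
    """
    if n_flags == BLOOM_UPDATE_ALL:
        return True
    if n_flags == BLOOM_UPDATE_P2PUBKEY_ONLY:
        # Only pay-to-pubkey and pay-to-multisig outputs trigger an update.
        return script_kind in ("pubkey", "multisig")
    return False  # BLOOM_UPDATE_NONE: the filter is never adjusted
```

A wallet that only receives pay-to-address payments would get `False` in mode 2, which matches the observation above that such wallets do not need automatic updates.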
+
+### Partial Merkle branch format
+
+A *Merkle tree* is a way of arranging a set of items as leaf nodes of a tree in which the interior nodes are hashes of the concatenations of their child hashes. The root node is called the *Merkle root*. Every Bitcoin block contains a Merkle root of the tree formed from the block's transactions. By providing some elements of the tree's interior nodes (called a *Merkle branch*) a proof is formed that the given transaction was indeed in the block when it was being mined, but the size of the proof is much smaller than the size of the original block.
+
+#### Constructing a partial merkle tree object
+
+* Traverse the merkle tree from the root down, and for each encountered node:
+  * Check whether this node corresponds to a leaf node (transaction) that is to be included OR any parent thereof:
+    * If so, append a '1' bit to the flag bits.
+    * Otherwise, append a '0' bit.
+  * Check whether this node is an internal node (non-leaf) AND is the parent of an included leaf node:
+    * If so:
+      * Descend into its left child node, and process the subtree beneath it entirely (depth-first).
+      * If this node has a right child node too, descend into it as well.
+    * Otherwise: append this node's hash to the hash list.
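The construction steps can be sketched as a recursive traversal. This is an illustration, not the reference code: it assumes node hashes are double SHA-256 of the concatenated children (with the left child duplicated when there is no right child), and the names `tree_width`, `node_hash` and `build_partial_tree` are hypothetical.

```python
import hashlib
from typing import List, Tuple


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def tree_width(n_tx: int, height: int) -> int:
    """Number of nodes at a given height (height 0 is the leaf level)."""
    return (n_tx + (1 << height) - 1) >> height


def node_hash(height: int, pos: int, txids: List[bytes]) -> bytes:
    """Hash of the node at (height, pos), computed from the full leaf list."""
    if height == 0:
        return txids[pos]
    left = node_hash(height - 1, pos * 2, txids)
    if pos * 2 + 1 < tree_width(len(txids), height - 1):
        right = node_hash(height - 1, pos * 2 + 1, txids)
    else:
        right = left  # no right child: pair the left hash with itself
    return double_sha256(left + right)


def build_partial_tree(txids: List[bytes],
                       matched: List[bool]) -> Tuple[List[bool], List[bytes]]:
    """Return (flag bits, hashes) in depth-first order."""
    bits: List[bool] = []
    hashes: List[bytes] = []
    height = 0
    while tree_width(len(txids), height) > 1:
        height += 1

    def traverse(h: int, pos: int) -> None:
        # A node gets a '1' bit if any leaf beneath it (or itself) matched.
        first, last = pos << h, min((pos + 1) << h, len(txids))
        covers_match = any(matched[first:last])
        bits.append(covers_match)
        if h == 0 or not covers_match:
            hashes.append(node_hash(h, pos, txids))
        else:
            traverse(h - 1, pos * 2)
            if pos * 2 + 1 < tree_width(len(txids), h - 1):
                traverse(h - 1, pos * 2 + 1)

    traverse(height, 0)
    return bits, hashes
```

For a three-transaction block where only the middle transaction matches, the sketch emits five flag bits and three hashes: the two left leaves plus the hash of the pruned right subtree.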
+
+#### Parsing a partial merkle tree object
+
+As the partial block message contains the number of transactions in the entire block, the shape of the merkle tree is known beforehand. Again, traverse this tree, computing the traversed nodes' hashes along the way:
+* Read a bit from the flag bit list:
+  * If it is '0':
+    * Read a hash from the hashes list, and return it as this node's hash.
+  * If it is '1' and this is a leaf node:
+    * Read a hash from the hashes list, store it as a matched txid, and return it as this node's hash.
+  * If it is '1' and this is an internal node:
+    * Descend into its left child tree, and store its computed hash as L.
+    * If this node has a right child as well:
+      * Descend into its right child, and store its computed hash as R.
+      * If L == R, the partial merkle tree object is invalid.
+      * Return Hash(L || R).
+    * If this node has no right child, return Hash(L || L).
+
+The partial merkle tree object is only valid if:
+* All hashes in the hash list were consumed and no more.
+* All bits in the flag bits list were consumed (except padding to make it into a full byte), and no more.
+* The hash computed for the root node matches the block header's merkle root.
+* The block header is valid, and matches its claimed proof of work.
+* In two-child nodes, the hash of the left and right branches was never equal.
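The parsing procedure and most of the validity rules can be sketched as follows. Assumptions are labeled loudly: Hash is taken to be double SHA-256, the flag bits arrive as an already unpacked list of booleans, and `parse_partial_tree` is a hypothetical name. Comparing the returned root against the block header and checking the header's proof of work are left to the caller.

```python
import hashlib
from typing import List, Tuple


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def tree_width(n_tx: int, height: int) -> int:
    """Number of nodes at a given height (height 0 is the leaf level)."""
    return (n_tx + (1 << height) - 1) >> height


def parse_partial_tree(n_tx: int, bits: List[bool],
                       hashes: List[bytes]) -> Tuple[bytes, List[bytes]]:
    """Return (computed merkle root, matched txids) or raise ValueError."""
    if n_tx <= 0:
        raise ValueError("block must contain at least one transaction")
    height = 0
    while tree_width(n_tx, height) > 1:
        height += 1
    bits_used = 0
    hashes_used = 0
    matched: List[bytes] = []

    def walk(h: int, pos: int) -> bytes:
        nonlocal bits_used, hashes_used
        if bits_used >= len(bits):
            raise ValueError("ran out of flag bits")
        flag = bits[bits_used]
        bits_used += 1
        if h == 0 or not flag:
            if hashes_used >= len(hashes):
                raise ValueError("ran out of hashes")
            node = hashes[hashes_used]
            hashes_used += 1
            if h == 0 and flag:
                matched.append(node)  # a '1' leaf is a matched txid
            return node
        left = walk(h - 1, pos * 2)
        if pos * 2 + 1 < tree_width(n_tx, h - 1):
            right = walk(h - 1, pos * 2 + 1)
            if left == right:
                raise ValueError("left and right hashes are identical")
            return double_sha256(left + right)
        return double_sha256(left + left)

    root = walk(height, 0)
    if hashes_used != len(hashes):
        raise ValueError("not all hashes were consumed")
    # Only padding up to the next full byte may remain unread.
    if (bits_used + 7) // 8 != (len(bits) + 7) // 8:
        raise ValueError("not all flag bits were consumed")
    return root, matched
```

Raising on the first violated rule keeps the function simple; a client would typically treat any `ValueError` as grounds to distrust the peer that sent the message.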
+
+### Bloom filter format
+
+A Bloom filter is a bit-field in which bits are set based on feeding the data element to a set of different hash functions. The number of hash functions used is a parameter of the filter. In Bitcoin we use version 3 of the 32-bit Murmur hash function. To get N "different" hash functions we simply initialize the Murmur algorithm with the following formula:
+
+nHashNum * 0xFBA4C795 + nTweak
+
+For example, if the filter is initialized with 4 hash functions and a tweak of 0x00000005, when the second function (index 1) is needed h1 would be equal to 4221880218.
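The seed formula and its use in a filter can be sketched as below. This is a minimal illustration under assumptions the text does not state: the seed is reduced to 32 bits, a hash value is mapped to a bit position by taking it modulo the filter size in bits, and bits are numbered least significant first within each byte. The names `hash_seed`, `murmur3_32` and `BloomFilter` are hypothetical.

```python
def hash_seed(n_hash_num: int, n_tweak: int) -> int:
    """Seed for hash function number n_hash_num (assumed 32-bit wraparound)."""
    return (n_hash_num * 0xFBA4C795 + n_tweak) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int) -> int:
    """32-bit MurmurHash version 3."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF
    h = seed & mask
    n_blocks = len(data) // 4
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask
        k = (k * c2) & mask
        h ^= k
        h = ((h << 13) | (h >> 19)) & mask
        h = (h * 5 + 0xE6546B64) & mask
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")  # 1 to 3 leftover bytes
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask
        k = (k * c2) & mask
        h ^= k
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


class BloomFilter:
    def __init__(self, size_bytes: int, n_hash_funcs: int, n_tweak: int):
        self.bits = bytearray(size_bytes)
        self.n_hash_funcs = n_hash_funcs
        self.n_tweak = n_tweak

    def _bit_indexes(self, data: bytes):
        n_bits = len(self.bits) * 8
        for i in range(self.n_hash_funcs):
            yield murmur3_32(data, hash_seed(i, self.n_tweak)) % n_bits

    def insert(self, data: bytes) -> None:
        for idx in self._bit_indexes(data):
            self.bits[idx >> 3] |= 1 << (idx & 7)

    def contains(self, data: bytes) -> bool:
        return all(self.bits[idx >> 3] & (1 << (idx & 7))
                   for idx in self._bit_indexes(data))
```

With these definitions `hash_seed(1, 5)` evaluates to 4221880218, the value quoted for the second hash function with a tweak of 0x00000005.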
+
+When loading a filter with the filterload command, there are two parameters that can be chosen. One is the size of the filter in bytes. The other is the number of hash functions to use. To select the parameters you can use the following formulas:
+
+Let N be the number of elements you wish to insert into the set and P be the probability of a false positive, where 1.0 is "match everything" and zero is unachievable.
+
+The size S of the filter in bytes is given by `(-1 / pow(log(2), 2) * N * log(P)) / 8`. Of course you must ensure it does not go over the maximum size (36,000: selected as it represents a filter of 20,000 items with a false positive rate of < 0.1% or 10,000 items with a false positive rate of < 0.0001%).
+
+The number of hash functions required is given by `S * 8 / N * log(2)`.
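The two sizing formulas can be evaluated directly. In this sketch `log` is taken to be the natural logarithm, results are truncated to integers and clamped to the stated maxima, and at least one byte and one hash function are always used; the truncation, the clamping and the function names are assumptions, not part of the text above.

```python
import math

MAX_FILTER_BYTES = 36_000  # maximum filter size stated for filterload
MAX_HASH_FUNCS = 50        # maximum nHashFuncs stated for filterload


def filter_size_bytes(n_elements: int, fp_rate: float) -> int:
    """S = (-1 / ln(2)^2 * N * ln(P)) / 8, clamped to the allowed range."""
    size = (-1 / math.log(2) ** 2 * n_elements * math.log(fp_rate)) / 8
    return max(1, min(int(size), MAX_FILTER_BYTES))


def hash_func_count(size_bytes: int, n_elements: int) -> int:
    """Number of hash functions = S * 8 / N * ln(2), clamped likewise."""
    count = size_bytes * 8 / n_elements * math.log(2)
    return max(1, min(int(count), MAX_HASH_FUNCS))
```

For N = 20,000 and P = 0.001 the computed size lands just under the 36,000-byte limit, which is consistent with how that limit is motivated above.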
+
+## Copyright
+
+This document is placed in the public domain.
\ No newline at end of file
diff --git a/protocol/forks/bip-0064.md b/protocol/forks/bip-0064.md
new file mode 100644
index 0000000..7095031
--- /dev/null
+++ b/protocol/forks/bip-0064.md
@@ -0,0 +1,106 @@
+
+  BIP: 64
+  Layer: Peer Services
+  Title: getutxo message
+  Author: Mike Hearn
+  Comments-Summary: No comments yet.
+  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0064
+  Status: Draft
+  Type: Standards Track
+  Created: 2014-06-10
+
+==Abstract==
+
+This document describes a small P2P protocol extension that performs UTXO lookups given a set of outpoints.
+
+==Motivation==
+
+All full Bitcoin nodes maintain a database called the unspent transaction output set. This set is
+how double spending is checked for: to be valid a transaction must identify unspent outputs in this
+set using an identifier called an "outpoint", which is merely the hash of the output's containing
+transaction plus an index.
+
+The ability to query this can sometimes be useful for a lightweight/SPV client which does not have
+the full UTXO set at hand. For example, it can be useful in applications implementing assurance
+contracts to do a quick check when a new pledge becomes visible to test whether that pledge was
+already revoked via a double spend. Although this message is not strictly necessary because e.g.
+such an app could be implemented by fully downloading and storing the block chain, it is useful for
+obtaining acceptable performance and resolving various UI cases.
+
+Another example of when this data can be useful is for performing floating fee calculations in an
+SPV wallet. This use case requires some other changes to the Bitcoin protocol however, so we will
+not dwell on it here.
+
+==Specification==
+
+Two new messages are defined. The "getutxos" message has the following structure:
+
+{|class="wikitable"
+! Field Size !! Description !! Data type !! Comments
+|-
+| 1 || check mempool || bool || Whether to apply mempool transactions during the calculation, thus exposing their UTXOs and removing outputs that they spend.
+|-
+| ? || outpoints || vector
+  BIP: 157
+  Layer: Peer Services
+  Title: Client Side Block Filtering
+  Author: Olaoluwa Osuntokun
+          Alex Akselrod
+          Jim Posen
+  Comments-Summary: None yet
+  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0157
+  Status: Draft
+  Type: Standards Track
+  Created: 2017-05-24
+  License: CC0-1.0
+
+
+== Abstract ==
+
+This BIP describes a new light client protocol in Bitcoin that improves upon
+currently available options. The standard light client protocol in use today,
+defined in BIP
+37<ref>https://github.com/bitcoin/bips/blob/master/bip-0037.mediawiki</ref>, has
+known flaws that weaken the security and privacy of clients and allow
+denial-of-service attack vectors on full
+nodes<ref>https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-May/012636.html</ref>.
+The new protocol overcomes these issues by allowing light clients to obtain
+compact probabilistic filters of block content from full nodes and download full
+blocks if the filter matches relevant data.
+
+New P2P messages empower light clients to securely sync the blockchain without
+relying on a trusted source. This BIP also defines a filter header, which serves
+as a commitment to all filters for previous blocks and provides the ability to
+efficiently detect malicious or faulty peers serving invalid filters. The
+resulting protocol guarantees that light clients with at least one honest peer
+are able to identify the correct block filters.
+
+== Motivation ==
+
+Bitcoin light clients allow applications to read relevant transactions from the
+blockchain without incurring the full cost of downloading and validating all
+data. Such applications seek to simultaneously minimize the trust in peers and
+the amount of bandwidth, storage space, and computation required. They achieve
+this by downloading all block headers, verifying the proofs of work, and
+following the longest proof-of-work chain. Since block headers are a fixed
+80 bytes and are generated every 10 minutes on average, the bandwidth required
+to sync the block headers is minimal. Light clients then download only the
+blockchain data relevant to them directly from peers and validate inclusion in
+the header chain. Though clients do not check the validity of all blocks in the
+longest proof-of-work chain, they rely on miner incentives for security.
+
+BIP 37 is currently the most widely used light client execution mode for
+Bitcoin. With BIP 37, a client sends a Bloom filter it wants to watch to a full
+node peer, then receives notifications for each new transaction or block that
+matches the filter. The client then requests relevant transactions from the peer
+along with Merkle proofs of inclusion in the blocks containing them, which are
+verified against the block headers. The Bloom filters match data such as client
+addresses and unspent outputs, and the filter size must be carefully tuned to
+balance the false positive rate with the amount of information leaked to the peer. It
+has been shown, however, that most implementations available offer virtually
+''zero privacy'' to wallets and other
+applications<ref>https://eprint.iacr.org/2014/763.pdf</ref><ref>https://jonasnick.github.io/blog/2015/02/12/privacy-in-bitcoinj/</ref>.
+Additionally, malicious full nodes serving light clients can omit critical data
+with little risk of detection, which is unacceptable for some applications
+(such as Lightning Network clients) that must respond to certain on-chain
+events. Finally, honest nodes servicing BIP 37 light clients may incur
+significant I/O and CPU resource usage due to maliciously crafted Bloom filters,
+creating a denial-of-service (DoS) vector and disincentivizing node operators from
+supporting the
+protocol<ref>https://github.com/bitcoin/bips/blob/master/bip-0111.mediawiki</ref>.
+
+The alternative detailed in this document can be seen as the opposite of BIP 37:
+instead of the client sending a filter to a full node peer, full nodes generate
+deterministic filters on block data that are served to the client. A light
+client can then download an entire block if the filter matches the data it is
+watching for. Since filters are deterministic, they only need to be constructed
+once and stored on disk, whenever a new block is connected to the chain. This
+keeps the computation required to serve filters minimal, and eliminates the I/O
+asymmetry that makes BIP 37 enabled nodes vulnerable. Clients also get better
+assurance of seeing all relevant transactions because they can check the
+validity of filters received from peers more easily than they can check
+completeness of filtered blocks. Finally, client privacy is improved because
+blocks can be downloaded from ''any source'', so that no one peer gets complete
+information on the data required by a client. Extremely privacy-conscious light
+clients may opt to anonymously fetch blocks using advanced techniques such as
+Private Information
+Retrieval<ref>https://en.wikipedia.org/wiki/Private_information_retrieval</ref>.
+
+== Definitions ==
+
+[]byte represents a vector of bytes.
+
+[N]byte represents a fixed-size byte array with length N.
+
+''CompactSize'' is a compact encoding of unsigned integers used in the Bitcoin
+P2P protocol.
+
+''double-SHA256'' is a hash algorithm defined by two invocations of SHA-256:
+double-SHA256(x) = SHA256(SHA256(x)).
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in RFC 2119.
+
+== Specification ==
+
+=== Filter Types ===
+
+For the sake of future extensibility and reducing filter sizes, there are
+multiple ''filter types'' that determine which data is included in a block
+filter as well as the method of filter construction/querying. In this model,
+full nodes generate one filter per block per filter type supported.
+
+Each type is identified by a one byte code, and specifies the contents and
+serialization format of the filter. A full node MAY signal support for
+particular filter types using service bits. The initial filter types are defined
+separately in [[bip-0158.mediawiki|BIP 158]], and one service bit is allocated
+to signal support for them.
+
+=== Filter Headers ===
+
+This proposal draws inspiration from the headers-first mechanism that Bitcoin
+nodes use to sync the block
+chain<ref>https://bitcoin.org/en/developer-guide#headers-first</ref>. Similar to
+how block headers have a Merkle commitment to all transaction data in the block,
+we define filter headers that have commitments to the block filters. Also like
+block headers, filter headers each have a commitment to the preceding one.
+Before downloading the block filters themselves, a light client can download all
+filter headers for the current block chain and use them to verify the
+authenticity of the filters. If the filter header chains differ between multiple
+peers, the client can identify the point where they diverge, then download the
+full block and compute the correct filter, thus identifying which peer is
+faulty.
+
+The canonical hash of a block filter is the double-SHA256 of the serialized
+filter. Filter headers are 32-byte hashes derived for each block filter. They
+are computed as the double-SHA256 of the concatenation of the filter hash with
+the previous filter header. The previous filter header used to calculate that of
+the genesis block is defined to be a 32-byte array of zeros.
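The filter hash and filter header definitions translate almost directly into code. A minimal sketch using `hashlib`; the function names are hypothetical, and `serialized_filters` is assumed to hold the serialized filter for each block in chain order starting at the genesis block.

```python
import hashlib
from typing import List


def double_sha256(data: bytes) -> bytes:
    """double-SHA256(x) = SHA256(SHA256(x))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def filter_hash(serialized_filter: bytes) -> bytes:
    """Canonical hash of a block filter."""
    return double_sha256(serialized_filter)


def filter_header(this_filter_hash: bytes, prev_header: bytes) -> bytes:
    """Header = double-SHA256(filter hash || previous filter header)."""
    return double_sha256(this_filter_hash + prev_header)


def filter_header_chain(serialized_filters: List[bytes]) -> List[bytes]:
    """Derive the filter header for every block, starting at genesis."""
    prev = bytes(32)  # the genesis block uses 32 zero bytes as its predecessor
    headers = []
    for serialized in serialized_filters:
        prev = filter_header(filter_hash(serialized), prev)
        headers.append(prev)
    return headers
```

Because every header commits to the previous one, two peers that disagree about any single filter will also disagree about every later header, which is what lets a client locate the point of divergence.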
+
+=== New Messages ===
+
+==== getcfilters ====
+getcfilters is used to request the compact filters of a particular
+type for a particular range of blocks. The message contains the following
+fields:
+
+{| class="wikitable"
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| FilterType
+| byte
+| 1
+| Filter type for which filters are requested
+|-
+| StartHeight
+| uint32
+| 4
+| The height of the first block in the requested range
+|-
+| StopHash
+| [32]byte
+| 32
+| The hash of the last block in the requested range
+|}
+
+# Nodes SHOULD NOT send getcfilters unless the peer has signaled support for this filter type. Nodes receiving getcfilters with an unsupported filter type SHOULD NOT respond.
+# StopHash MUST be known to belong to a block accepted by the receiving peer. This is the case if the peer had previously sent a headers or inv message with that block or any descendants. A node that receives getcfilters with an unknown StopHash SHOULD NOT respond.
+# The height of the block with hash StopHash MUST be greater than or equal to StartHeight, and the difference MUST be strictly less than 1,000.
+# The receiving node MUST respond to valid requests by sending one cfilter message for each block in the requested range, sequentially in order by block height.
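The getcfilters fields and the range rule can be sketched as follows. The byte order is an assumption (little-endian, as is usual for Bitcoin P2P integers) since the table only gives sizes, and the names `build_getcfilters` and `range_is_valid` are hypothetical.

```python
import struct


def build_getcfilters(filter_type: int, start_height: int,
                      stop_hash: bytes) -> bytes:
    """Serialize FilterType (1 byte), StartHeight (4 bytes), StopHash (32 bytes)."""
    if len(stop_hash) != 32:
        raise ValueError("StopHash must be exactly 32 bytes")
    # "<BI" = uint8 followed by a little-endian uint32, with no padding.
    return struct.pack("<BI", filter_type, start_height) + stop_hash


def range_is_valid(start_height: int, stop_height: int,
                   max_span: int = 1000) -> bool:
    """Stop height must be >= start height and less than max_span above it."""
    return 0 <= stop_height - start_height < max_span
```

The same range check applies to getcfheaders if `max_span` is set to that message's own limit.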
+
+==== cfilter ====
+cfilter is sent in response to getcfilters, one for
+each block in the requested range. The message contains the following fields:
+
+{| class="wikitable"
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| FilterType
+| byte
+| 1
+| Byte identifying the type of filter being returned
+|-
+| BlockHash
+| [32]byte
+| 32
+| Block hash of the Bitcoin block for which the filter is being returned
+|-
+| NumFilterBytes
+| CompactSize
+| 1-5
+| A variable length integer representing the size of the filter in the following field
+|-
+| FilterBytes
+| []byte
+| NumFilterBytes
+| The serialized compact filter for this block
+|}
+
+# The FilterType SHOULD match the field in the getcfilters request, and BlockHash MUST correspond to a block that is an ancestor of StopHash with height greater than or equal to StartHeight.
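A receiver could decode a cfilter payload along these lines. This is a sketch under assumptions: CompactSize is decoded with the usual 0xFD/0xFE/0xFF marker bytes and little-endian values, and the names `read_compact_size` and `parse_cfilter` are hypothetical.

```python
import struct
from typing import Tuple


def read_compact_size(buf: bytes, offset: int) -> Tuple[int, int]:
    """Decode a CompactSize at offset; return (value, next offset)."""
    first = buf[offset]
    if first < 0xFD:
        return first, offset + 1
    if first == 0xFD:
        return struct.unpack_from("<H", buf, offset + 1)[0], offset + 3
    if first == 0xFE:
        return struct.unpack_from("<I", buf, offset + 1)[0], offset + 5
    return struct.unpack_from("<Q", buf, offset + 1)[0], offset + 9


def parse_cfilter(payload: bytes) -> Tuple[int, bytes, bytes]:
    """Return (FilterType, BlockHash, FilterBytes) from a cfilter payload."""
    if len(payload) < 34:  # 1 type byte + 32 hash bytes + at least 1 size byte
        raise ValueError("payload too short")
    filter_type = payload[0]
    block_hash = payload[1:33]
    try:
        n_filter_bytes, offset = read_compact_size(payload, 33)
    except struct.error as exc:
        raise ValueError("truncated CompactSize") from exc
    filter_bytes = payload[offset:offset + n_filter_bytes]
    if len(filter_bytes) != n_filter_bytes:
        raise ValueError("truncated filter bytes")
    return filter_type, block_hash, filter_bytes
```

After decoding, the caller would still need to check that the filter type and block hash correspond to an outstanding getcfilters request.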
+
+==== getcfheaders ====
+getcfheaders is used to request verifiable filter headers for a
+range of blocks. The message contains the following fields:
+
+{| class="wikitable"
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| FilterType
+| byte
+| 1
+| Filter type for which headers are requested
+|-
+| StartHeight
+| uint32
+| 4
+| The height of the first block in the requested range
+|-
+| StopHash
+| [32]byte
+| 32
+| The hash of the last block in the requested range
+|}
+
+# Nodes SHOULD NOT send getcfheaders unless the peer has signaled support for this filter type. Nodes receiving getcfheaders with an unsupported filter type SHOULD NOT respond.
+# StopHash MUST be known to belong to a block accepted by the receiving peer. This is the case if the peer had previously sent a headers or inv message with that block or any descendants. A node that receives getcfheaders with an unknown StopHash SHOULD NOT respond.
+# The height of the block with hash StopHash MUST be greater than or equal to StartHeight, and the difference MUST be strictly less than 2,000.
+
+==== cfheaders ====
+cfheaders is sent in response to getcfheaders. Instead
+of including the filter headers themselves, the response includes one filter
+header and a sequence of filter hashes, from which the headers can be derived.
+This has the benefit that the client can verify the binding links between the
+headers. The message contains the following fields:
+
+{| class="wikitable"
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| FilterType
+| byte
+| 1
+| Filter type for which hashes are requested
+|-
+| StopHash
+| [32]byte
+| 32
+| The hash of the last block in the requested range
+|-
+| PreviousFilterHeader
+| [32]byte
+| 32
+| The filter header preceding the first block in the requested range
+|-
+| FilterHashesLength
+| CompactSize
+| 1-3
+| The length of the following vector of filter hashes
+|-
+| FilterHashes
+| [][32]byte
+| FilterHashesLength * 32
+| The filter hashes for each block in the requested range
+|}
+
+# The FilterType and StopHash SHOULD match the fields in the getcfheaders request.
+# FilterHashesLength MUST NOT be greater than 2,000.
+# FilterHashes MUST have one entry for each block on the chain terminating with tip StopHash, starting with the block at height StartHeight. The entries MUST be the filter hashes of the given type for each block in that range, in ascending order by height.
+# PreviousFilterHeader MUST be set to the previous filter header of the first block in the requested range.
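A client can rebuild the filter headers from a cfheaders response roughly as sketched below. This assumes the filter header definition given in the Filter Headers section of BIP 157, which is not reproduced here: each header is the double-SHA256 of the filter hash concatenated with the previous filter header. The function names are hypothetical.

```python
import hashlib
from typing import List

MAX_FILTER_HASHES = 2000  # FilterHashesLength MUST NOT exceed this


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def derive_filter_headers(previous_filter_header: bytes,
                          filter_hashes: List[bytes]) -> List[bytes]:
    """Chain each filter hash onto the preceding header, in height order."""
    if len(filter_hashes) > MAX_FILTER_HASHES:
        raise ValueError("too many filter hashes in cfheaders")
    headers = []
    prev = previous_filter_header
    for filter_hash in filter_hashes:
        prev = double_sha256(filter_hash + prev)
        headers.append(prev)
    return headers
```

Because every header commits to the one before it, the last derived header can be compared against a checkpoint obtained from another peer.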
+
+==== getcfcheckpt ====
+getcfcheckpt is used to request filter headers at evenly spaced
+intervals over a range of blocks. Clients may use filter hashes from
+getcfheaders to connect these checkpoints, as is described in the
+[[#client-operation|Client Operation]] section below. The
+getcfcheckpt message contains the following fields:
+
+{| class="wikitable"
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| FilterType
+| byte
+| 1
+| Filter type for which headers are requested
+|-
+| StopHash
+| [32]byte
+| 32
+| The hash of the last block in the chain that headers are requested for
+|}
+
+# Nodes SHOULD NOT send getcfcheckpt unless the peer has signaled support for this filter type. Nodes receiving getcfcheckpt with an unsupported filter type SHOULD NOT respond.
+# StopHash MUST be known to belong to a block accepted by the receiving peer. This is the case if the peer had previously sent a headers or inv message with any descendant blocks. A node that receives getcfcheckpt with an unknown StopHash SHOULD NOT respond.
+
+==== cfcheckpt ====
+cfcheckpt is sent in response to getcfcheckpt. The
+filter headers included are the set of all filter headers on the requested chain
+where the height is a positive multiple of 1,000. The message contains the
+following fields:
+
+{| class="wikitable"
+! Field Name
+! Data Type
+! Byte Size
+! Description
+|-
+| FilterType
+| byte
+| 1
+| Filter type for which headers are requested
+|-
+| StopHash
+| [32]byte
+| 32
+| The hash of the last block in the chain that headers are requested for
+|-
+| FilterHeadersLength
+| CompactSize
+| 1-3
+| The length of the following vector of filter headers
+|-
+| FilterHeaders
+| [][32]byte
+| FilterHeadersLength * 32
+| The filter headers at intervals of 1,000
+|}
+
+# The FilterType and StopHash SHOULD match the fields in the getcfcheckpt request.
+# FilterHeaders MUST have exactly one entry for each block on the chain terminating in StopHash, where the block height is a multiple of 1,000 greater than 0. The entries MUST be the filter headers of the given type for each such block, in ascending order by height.
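The set of heights covered by a cfcheckpt response follows directly from the rule above. A small sketch, with a hypothetical helper name, where `stop_height` stands for the height of the block identified by StopHash:

```python
from typing import List

CHECKPOINT_INTERVAL = 1000


def checkpoint_heights(stop_height: int) -> List[int]:
    """Heights of all checkpoint headers on a chain ending at stop_height."""
    return list(range(CHECKPOINT_INTERVAL, stop_height + 1, CHECKPOINT_INTERVAL))
```

A receiver can use the length of this list to check FilterHeadersLength before accepting the message.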
+
+=== Node Operation ===
+
+Full nodes MAY opt to support this BIP and generate filters for any of the
+specified filter types. Such nodes SHOULD treat the filters as an additional
+index of the blockchain. For each new block that is connected to the main chain,
+nodes SHOULD generate filters for all supported types and persist them. Nodes
+that are missing filters and are already synced with the blockchain SHOULD
+reindex the chain upon start-up, constructing filters for each block from
+genesis to the current tip. They also SHOULD keep every checkpoint header in
+memory, so that getcfcheckpt requests do not result in many
+random-access disk reads.
+
+Nodes SHOULD NOT generate filters dynamically on request, as malicious peers may
+be able to perform DoS attacks by requesting small filters derived from large
+blocks. This would require an asymmetrical amount of I/O on the node to compute
+and serve, similar to attacks against BIP 37 enabled nodes noted in BIP 111.
+
+Nodes MAY prune block data after generating and storing all filters for a block.
+
+=== Client Operation ===
+
+This section provides recommendations for light clients to download filters with
+maximal security.
+
+Clients SHOULD first sync the entire block header chain from peers using the
+standard headers-first syncing mechanism before downloading any block filters or
+filter headers. Clients configured with trusted checkpoints MAY sync only
+headers starting from the last checkpoint. Clients SHOULD disconnect any outbound
+peers whose best chain has significantly less work than the known longest
+proof-of-work chain.
+
+Once a client's block headers are in sync, it SHOULD download and verify filter
+headers for all blocks and filter types that it might later download. The client
+SHOULD send getcfheaders messages to peers and derive and store the
+filter headers for each block. The client MAY first fetch headers at evenly
+spaced intervals of 1,000 by sending getcfcheckpt. The header
+checkpoints allow the client to download filter headers for different intervals
+from multiple peers in parallel, verifying each range of 1,000 headers against
+the checkpoints.
+
+Unless securely connected to a trusted peer that is serving filter headers, the
+client SHOULD connect to multiple outbound peers that support each filter type
+to mitigate the risk of downloading incorrect headers. If the client receives
+conflicting filter headers from different peers for any block and filter type,
+it SHOULD interrogate them to determine which is faulty. The client SHOULD use
+getcfheaders and/or getcfcheckpt to first identify
+the first filter headers that the peers disagree on. The client then SHOULD
+download the full block from any peer and derive the correct filter and filter
+header. The client SHOULD ban any peers that sent a filter header that does not
+match the computed one.
+
+Once the client has downloaded and verified all filter headers needed, ''and''
+no outbound peers have sent conflicting headers, the client can download the
+actual block filters it needs. The client MAY backfill filter headers before the
+first verified one at this point if it only downloaded them starting at a later
+point. Clients SHOULD persist the verified filter headers for the last 100 blocks in
+the chain (or whatever finality depth is desired), to compare against headers
+received from new peers after restart. They MAY store more filter headers to
+avoid redownloading them if a rescan is later necessary.
+
+Starting from the first block in the desired range, the client now MAY download
+the filters. The client SHOULD test that each filter links to its corresponding
+filter header and ban peers that send incorrect filters. The client MAY download
+multiple filters at once to increase throughput, though it SHOULD test the
+filters sequentially. The client MAY check if a filter is empty before
+requesting it by checking if the filter header commits to the hash of the empty
+filter, saving a round trip if that is the case.
+
+Each time a new valid block header is received, the client SHOULD request the
+corresponding filter headers from all eligible peers. If two peers send
+conflicting filter headers, the client should interrogate them as described
+above and ban any peers that send an invalid header.
+
+If a client is fetching full blocks from the P2P network, they SHOULD be downloaded
+from outbound peers at random to mitigate privacy loss due to transaction
+intersection analysis. Note that blocks may be downloaded from peers that do not
+support this BIP.
+
+== Rationale ==
+
+The filter headers and checkpoints messages are defined to help clients identify
+the correct filter for a block when connected to peers sending conflicting
+information. An alternative solution is to require Bitcoin blocks to include
+commitments to derived block filters, so light clients can verify authenticity
+given block headers and some additional witness data. This would require a
+network-wide change to the Bitcoin consensus rules, however, whereas this
+document proposes a solution purely at the P2P layer.
+
+The constant interval of 1,000 blocks between checkpoints was chosen so that,
+given the current chain height and rate of growth, the size of a
+cfcheckpt message is not drastically different from a
+cfheaders between two checkpoints. Also, 1,000 is a nice round
+number, at least to those of us who think in decimal.
+
+== Compatibility ==
+
+This light client mode is not compatible with current node deployments and
+requires support for the new P2P messages. The node implementation of this
+proposal is not incompatible with the current P2P network rules (i.e., it doesn't
+affect network topology of full nodes). Light clients may adopt protocols based
+on this as an alternative to the existing BIP 37. Adoption of this BIP may
+result in reduced network support for BIP 37.
+
+== Acknowledgments ==
+
+We would like to thank bfd (from the bitcoin-dev mailing list) for bringing the
+basis of this BIP to our attention, Joseph Poon for suggesting the filter header
+chain scheme, and Pedro Martelletto for writing the initial indexing code for
+btcd.
+
+We would also like to thank Dave Collins, JJ Jeffrey, Eric Lombrozo, and Matt
+Corallo for useful discussions.
+
+== Reference Implementation ==
+
+Light client: [https://github.com/lightninglabs/neutrino]
+
+Full-node indexing: https://github.com/Roasbeef/btcd/tree/segwit-cbf
+
+Golomb-Rice Coded sets: https://github.com/Roasbeef/btcutil/tree/gcs/gcs
+
+== References ==
+
+<pre>
+  BIP: 158
+  Layer: Peer Services
+  Title: Compact Block Filters for Light Clients
+  Author: Olaoluwa Osuntokun
+          Alex Akselrod
+  Comments-Summary: None yet
+  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0158
+  Status: Draft
+  Type: Standards Track
+  Created: 2017-05-24
+  License: CC0-1.0
+</pre>
+
+== Abstract ==
+
+This BIP describes a structure for compact filters on block data, for use in the
+BIP 157 light client protocol ([[bip-0157.mediawiki|BIP 157]]). The filter
+construction proposed is an alternative to Bloom filters, as used in BIP 37,
+that minimizes filter size by using Golomb-Rice coding for compression. This
+document specifies one initial filter type based on this construction that
+enables basic wallets and applications with more advanced smart contracts.
+
+== Motivation ==
+
+[[bip-0157.mediawiki|BIP 157]] defines a light client protocol based on
+deterministic filters of block content. The filters are designed to
+minimize the expected bandwidth consumed by light clients, downloading filters
+and full blocks. This document defines the initial filter type ''basic''
+that is designed to reduce the filter size for regular wallets.
+
+== Definitions ==
+
+[]byte represents a vector of bytes.
+
+[N]byte represents a fixed-size byte array with length N.
+
+''CompactSize'' is a compact encoding of unsigned integers used in the Bitcoin
+P2P protocol.
+
+''Data pushes'' are byte vectors pushed to the stack according to the rules of
+Bitcoin script.
+
+''Bit streams'' are readable and writable streams of individual bits. The
+following functions are used in the pseudocode in this document:
+* new_bit_stream instantiates a new writable bit stream
+* new_bit_stream(vector) instantiates a new bit stream reading data from vector
+* write_bit(stream, b) appends the bit b to the end of the stream
+* read_bit(stream) reads the next available bit from the stream
+* write_bits_big_endian(stream, n, k) appends the k least significant bits of integer n to the end of the stream in big-endian bit order
+* read_bits_big_endian(stream, k) reads the next available k bits from the stream and interprets them as the least significant bits of a big-endian integer
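One way to realize the bit stream functions listed above is sketched here. The class names are hypothetical, and packing bits into each byte starting from the most significant bit is an assumption; the text above only fixes the order of bits within the stream.

```python
class BitWriter:
    """Writable bit stream; bits are collected and packed on demand."""

    def __init__(self):
        self.bits = []

    def write_bit(self, b: int) -> None:
        self.bits.append(b & 1)

    def write_bits_big_endian(self, n: int, k: int) -> None:
        # Emit the k least significant bits of n, most significant first.
        for shift in range(k - 1, -1, -1):
            self.write_bit((n >> shift) & 1)

    def to_bytes(self) -> bytes:
        # Pad with 0 bits up to the next byte boundary.
        padded = self.bits + [0] * (-len(self.bits) % 8)
        return bytes(
            int("".join(map(str, padded[i:i + 8])), 2)
            for i in range(0, len(padded), 8)
        )


class BitReader:
    """Readable bit stream over a byte vector."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def read_bit(self) -> int:
        byte = self.data[self.pos // 8]
        bit = (byte >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return bit

    def read_bits_big_endian(self, k: int) -> int:
        value = 0
        for _ in range(k):
            value = (value << 1) | self.read_bit()
        return value
```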
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in RFC 2119.
+
+== Specification ==
+
+=== Golomb-Coded Sets ===
+
+For each block, compact filters are derived containing sets of items associated
+with the block (eg. addresses sent to, outpoints spent, etc.). A set of such
+data objects is compressed into a probabilistic structure called a
+''Golomb-coded set'' (GCS), which matches all items in the set with probability
+1, and matches other items with probability 1/M for some
+integer parameter M. The encoding is also parameterized by
+P, the bit length of the remainder code. Each filter defined
+specifies values for P and M.
+
+At a high level, a GCS is constructed from a set of N items by:
+# hashing all items to 64-bit integers in the range [0, N * M)
+# sorting the hashed values in ascending order
+# computing the differences between each value and the previous one
+# writing the differences sequentially, compressed with Golomb-Rice coding
+
+The following sections describe each step in greater detail.
+
+==== Hashing Data Objects ====
+
+The first step in the filter construction is hashing the variable-sized raw
+items in the set to the range [0, F), where F = N *
+M. Customarily, M is set to 2^P. However, if
+one is able to select both parameters independently, then more optimal values
+can be selected
+[https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845].
+Set membership queries against the hash outputs will have a false positive rate
+of 1/M. To avoid integer overflow, the number of items N
+MUST be <2^32 and M MUST be <2^32.
+
+The items are first passed through the pseudorandom function ''SipHash'', which
+takes a 128-bit key k and a variable-sized byte vector and produces
+a uniformly random 64-bit output. Implementations of this BIP MUST use the
+SipHash parameters c = 2 and d = 4.
+
+The 64-bit SipHash outputs are then mapped uniformly over the desired range by
+multiplying with F and taking the top 64 bits of the 128-bit result. This
+algorithm is a faster alternative to modulo reduction, as it avoids the
+expensive division operation
+[https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/].
+Note that care must be taken when implementing this reduction to ensure the
+upper 64 bits of the integer multiplication are not truncated; certain
+architectures and high level languages may require code that decomposes the
+64-bit multiplication into four 32-bit multiplications and recombines into the
+result.
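The decomposition into four 32-bit multiplications mentioned above can be sketched as follows. Python integers do not overflow, so the masks only mimic what a language with fixed 64-bit integers would need; `fast_range64` is a hypothetical name, and the SipHash step is left out, so `h` stands for an already computed 64-bit hash.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def fast_range64(h: int, f: int) -> int:
    """Upper 64 bits of the 128-bit product h * f, built from 32-bit halves."""
    h_hi, h_lo = h >> 32, h & MASK32
    f_hi, f_lo = f >> 32, f & MASK32

    hi_hi = h_hi * f_hi
    hi_lo = h_hi * f_lo
    lo_hi = h_lo * f_hi
    lo_lo = h_lo * f_lo

    # Sum of everything that lands in bits 32..63; its overflow is the carry
    # into the upper 64 bits.
    mid = (lo_lo >> 32) + (hi_lo & MASK32) + (lo_hi & MASK32)
    return (hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32)) & MASK64
```

The result always lies in [0, f), which is the property the reduction relies on.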
+
+<pre>
+hash_to_range(item: []byte, F: uint64, k: [16]byte) -> uint64:
+    return (siphash(k, item) * F) >> 64
+
+hashed_set_construct(raw_items: [][]byte, k: [16]byte, M: uint) -> []uint64:
+    let N = len(raw_items)
+    let F = N * M
+
+    let set_items = []
+
+    for item in raw_items:
+        let set_value = hash_to_range(item, F, k)
+        set_items.append(set_value)
+
+    return set_items
+</pre>
+
+==== Golomb-Rice Coding ====
+
+Instead of writing the items in the hashed set directly to the filter, greater
+compression is achieved by only writing the differences between successive
+items in sorted order. Since the items are distributed uniformly, it can be
+shown that the differences resemble a geometric distribution
+[https://en.wikipedia.org/wiki/Geometric_distribution]. ''Golomb-Rice coding''
+[https://en.wikipedia.org/wiki/Golomb_coding#Rice_coding]
+is a technique that optimally compresses geometrically distributed values.
+
+With Golomb-Rice, a value is split into a quotient and remainder modulo
+2^P, which are encoded separately. The quotient q is
+encoded as ''unary'', with a string of q 1's followed by one 0. The
+remainder r is represented in big-endian by P bits. For example,
+this is a table of Golomb-Rice coded values using P=2:
+
+{| class="wikitable"
+! n !! (q, r) !! c
+|-
+| 0 || (0, 0) || 0 00
+|-
+| 1 || (0, 1) || 0 01
+|-
+| 2 || (0, 2) || 0 10
+|-
+| 3 || (0, 3) || 0 11
+|-
+| 4 || (1, 0) || 10 00
+|-
+| 5 || (1, 1) || 10 01
+|-
+| 6 || (1, 2) || 10 10
+|-
+| 7 || (1, 3) || 10 11
+|-
+| 8 || (2, 0) || 110 00
+|-
+| 9 || (2, 1) || 110 01
+|}
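The table can be reproduced with a few lines of code. This sketch works on strings of "0" and "1" characters rather than a real bit stream, purely for readability; the function names are hypothetical and P is assumed to be at least 1.

```python
def golomb_rice_encode(n: int, p: int) -> str:
    """Encode n as q ones, a zero, then the p-bit big-endian remainder."""
    q, r = n >> p, n & ((1 << p) - 1)
    return "1" * q + "0" + format(r, "0{}b".format(p))


def golomb_rice_decode(bits: str, p: int) -> int:
    """Decode a single value from the start of a bit string."""
    q = bits.index("0")  # number of leading ones
    r = int(bits[q + 1:q + 1 + p], 2)
    return (q << p) + r


for n in range(10):
    print(n, golomb_rice_encode(n, 2))
```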
+
+<pre>
+golomb_encode(stream, x: uint64, P: uint):
+    let q = x >> P
+
+    while q > 0:
+        write_bit(stream, 1)
+        q--
+    write_bit(stream, 0)
+
+    write_bits_big_endian(stream, x, P)
+
+golomb_decode(stream, P: uint) -> uint64:
+    let q = 0
+    while read_bit(stream) == 1:
+        q++
+
+    let r = read_bits_big_endian(stream, P)
+
+    let x = (q << P) + r
+    return x
+</pre>
+
+==== Set Construction ====
+
+A GCS is constructed from four parameters:
+* L, a vector of N raw items
+* P, the bit parameter of the Golomb-Rice coding
+* M, the inverse of the target false positive rate
+* k, the 128-bit key used to randomize the SipHash outputs
+
+The result is a byte vector with a minimum size of N * (P + 1)
+bits.
+
+The raw items in L are first hashed to 64-bit unsigned integers as
+specified above and sorted. The differences between consecutive values,
+hereafter referred to as ''deltas'', are encoded sequentially to a bit stream
+with Golomb-Rice coding. Finally, the bit stream is padded with 0's to the
+nearest byte boundary and serialized to the output byte vector.
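The delta encoding and padding steps can be sketched on values that have already been hashed into range. SipHash is omitted because it is not available as a public function in the Python standard library, so the inputs below are plain integers chosen for illustration; the function names are hypothetical.

```python
from typing import Iterable, List


def construct_gcs_from_hashes(hashed_items: Iterable[int], p: int) -> bytes:
    """Sort, delta-encode with Golomb-Rice, pad with 0 bits, pack to bytes."""
    bits = []
    last = 0
    for value in sorted(hashed_items):
        delta = value - last
        q, r = delta >> p, delta & ((1 << p) - 1)
        bits.append("1" * q + "0" + format(r, "0{}b".format(p)))
        last = value
    stream = "".join(bits)
    stream += "0" * (-len(stream) % 8)
    return bytes(int(stream[i:i + 8], 2) for i in range(0, len(stream), 8))


def decode_gcs(data: bytes, n: int, p: int) -> List[int]:
    """Recover the n sorted hashed values by summing the decoded deltas."""
    stream = "".join(format(b, "08b") for b in data)
    pos, last, values = 0, 0, []
    for _ in range(n):
        q = 0
        while stream[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the terminating 0 of the unary quotient
        r = int(stream[pos:pos + p], 2)
        pos += p
        last += (q << p) + r
        values.append(last)
    return values
```

With the inputs 9, 3 and 4 and P = 2, the deltas are 3, 1 and 5, which encode to 10 bits and pad out to two bytes.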
+
+<pre>
+construct_gcs(L: [][]byte, P: uint, k: [16]byte, M: uint) -> []byte:
+    let set_items = hashed_set_construct(L, k, M)
+
+    set_items.sort()
+
+    let output_stream = new_bit_stream()
+
+    let last_value = 0
+    for item in set_items:
+        let delta = item - last_value
+        golomb_encode(output_stream, delta, P)
+        last_value = item
+
+    return output_stream.bytes()
+</pre>
+
+==== Set Querying/Decompression ====
+
+To check membership of an item in a compressed GCS, one must reconstruct the
+hashed set members from the encoded deltas. The procedure to do so is the
+reverse of the compression: deltas are decoded one by one and added to a
+cumulative sum. Each intermediate sum represents a hashed value in the original
+set. The queried item is hashed in the same way as the set members and compared
+against the reconstructed values. Note that querying does not require the entire
+decompressed set be held in memory at once.
+
+<pre>
+gcs_match(key: [16]byte, compressed_set: []byte, target: []byte, P: uint, N: uint, M: uint) -> bool:
+    let F = N * M
+    let target_hash = hash_to_range(target, F, key)
+
+    stream = new_bit_stream(compressed_set)
+
+    let last_value = 0
+
+    loop N times:
+        let delta = golomb_decode(stream, P)
+        let set_item = last_value + delta
+
+        if set_item == target_hash:
+            return true
+
+        // Since the values in the set are sorted, terminate the search once
+        // the decoded value exceeds the target.
+        if set_item > target_hash:
+            break
+
+        last_value = set_item
+
+    return false
+</pre>
+
+Some applications may need to check for set intersection instead of membership
+of a single item. This can be performed far more efficiently than checking each
+item individually by leveraging the sorted structure of the compressed GCS.
+First the query elements are all hashed and sorted, then compared in order
+against the decompressed GCS contents. See
+[[#golomb-coded-set-multi-match|Appendix B]] for pseudocode.
+
+=== Block Filters ===
+
+This BIP defines one initial filter type:
+* Basic (0x00)
+** M = 784931
+** P = 19
+
+==== Contents ====
+
+The basic filter is designed to contain everything that a light client needs to
+sync a regular Bitcoin wallet. A basic filter MUST contain exactly the
+following items for each transaction in a block:
+* The previous output script (the script being spent) for each input, except
+ for the coinbase transaction.
+* The scriptPubKey of each output, aside from all OP_RETURN output
+ scripts.
+
+Any "nil" items MUST NOT be included into the final set of filter elements.
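The selection rules above can be sketched as a small helper. The function name and input shape are hypothetical: the caller is assumed to pass the previous output scripts of all non-coinbase inputs and the scriptPubKeys of all outputs in the block. Collecting into a set, so that repeated scripts appear once, is also an assumption rather than something stated above.

```python
from typing import Iterable, Set

OP_RETURN = 0x6A  # opcode value of OP_RETURN in Bitcoin script


def basic_filter_items(prev_output_scripts: Iterable[bytes],
                       output_scripts: Iterable[bytes]) -> Set[bytes]:
    """Collect the scripts that go into a basic filter for one block."""
    items = set()
    for script in prev_output_scripts:
        if script:  # skip empty ("nil") items
            items.add(script)
    for script in output_scripts:
        # Skip empty scripts and anything starting with OP_RETURN.
        if script and script[0] != OP_RETURN:
            items.add(script)
    return items
```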
+
+We exclude all outputs that start with OP_RETURN in order to allow
+filters to easily be committed to in the future via a soft-fork. A likely area
+for future commitments is an additional OP_RETURN output in the
+coinbase transaction similar to the current witness commitment
+https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki. By
+excluding all OP_RETURN outputs we avoid a circular dependency
+between the commitment, and the item being committed to.
+
+==== Construction ====
+
+The basic type is constructed as Golomb-coded sets with the following
+parameters.
+
+The parameter P MUST be set to 19, and the parameter
+M MUST be set to 784931. Analysis has shown that if
+one is able to select P and M independently, then
+setting M=1.497137 * 2^P is close to optimal
+https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845.
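As a quick arithmetic check, the mandated value of M is the stated near-optimal ratio applied to P = 19 and rounded to the nearest integer. The variable names below are illustrative only.

```python
P = 19
M = 784931

# 1.497137 * 2^19 is roughly 784930.96, which rounds to the specified M.
near_optimal_m = 1.497137 * 2 ** P

# Each non-member item matches with probability 1/M, about 1.27e-06.
false_positive_rate = 1 / M

print(round(near_optimal_m), false_positive_rate)
```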
+
+Empirical analysis also shows that these parameters minimize the
+bandwidth utilized, considering both the expected number of blocks downloaded
+due to false positives and the size of the filters themselves.
+
+The parameter k MUST be set to the first 16 bytes of the hash
+(in standard little-endian representation) of the block for which the filter is
+constructed. This ensures the key is deterministic while still varying from
+block to block.
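A sketch of the key derivation, assuming the block hash is supplied in the hexadecimal form usually shown by block explorers, which is the reverse of the serialized byte order. The function name is hypothetical, and the Bitcoin genesis block hash is used only as a well-known sample input.

```python
def filter_key_from_block_hash(block_hash_hex: str) -> bytes:
    """Return the 16-byte SipHash key k for the given block hash."""
    # Reverse the displayed hex to get the serialized (little-endian) bytes.
    serialized = bytes.fromhex(block_hash_hex)[::-1]
    if len(serialized) != 32:
        raise ValueError("block hash must be 32 bytes")
    return serialized[:16]


genesis = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"
key = filter_key_from_block_hash(genesis)
```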
+
+Since the value N is required to decode a GCS, a serialized GCS
+includes it as a prefix, written as a CompactSize. Thus, the
+complete serialization of a filter is:
+* N, encoded as a CompactSize
+* The bytes of the compressed filter itself
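The serialization above amounts to a CompactSize prefix followed by the compressed bytes. A minimal sketch with hypothetical function names, using the standard Bitcoin P2P CompactSize encoding:

```python
import struct


def encode_compact_size(n: int) -> bytes:
    """Encode an unsigned integer as a Bitcoin CompactSize."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def serialize_filter(n: int, compressed: bytes) -> bytes:
    """Prefix the compressed GCS bytes with the item count N."""
    return encode_compact_size(n) + compressed
```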
+
+==== Signaling ====
+
+This BIP allocates a new service bit:
+
+{| class="wikitable"
+|-
+| NODE_COMPACT_FILTERS
+| style="white-space: nowrap;" | 1 << 6
+| If enabled, the node MUST respond to all BIP 157 messages for filter type 0x00
+|}
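Checking a peer's advertised services for this bit is a single mask operation. A minimal sketch; the helper name is hypothetical, and `services` stands for the integer decoded from the version message.

```python
NODE_COMPACT_FILTERS = 1 << 6  # 0x40


def supports_compact_filters(services: int) -> bool:
    """True if the services bitmask has NODE_COMPACT_FILTERS set."""
    return bool(services & NODE_COMPACT_FILTERS)
```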
+
+== Compatibility ==
+
+This block filter construction is not incompatible with existing software,
+though it requires implementation of the new filters.
+
+== Acknowledgments ==
+
+We would like to thank bfd (from the bitcoin-dev mailing list) for bringing the
+basis of this BIP to our attention, Greg Maxwell for pointing us in the
+direction of Golomb-Rice coding and fast range optimization, Pieter Wuille for
+his analysis of optimal GCS parameters, and Pedro
+Martelletto for writing the initial indexing code for btcd.
+
+We would also like to thank Dave Collins, JJ Jeffrey, and Eric Lombrozo for
+useful discussions.
+
+== Reference Implementation ==
+
+Light client: [https://github.com/lightninglabs/neutrino]
+
+Full-node indexing: https://github.com/Roasbeef/btcd/tree/segwit-cbf
+
+Golomb-Rice Coded sets: https://github.com/btcsuite/btcutil/blob/master/gcs
+
+== Appendix A: Alternatives ==
+
+A number of alternative set encodings were considered before Golomb-coded
+sets were settled upon. In this appendix section, we'll list a few of the
+alternatives along with our rationale for not pursuing them.
+
+==== Bloom Filters ====
+
+Bloom Filters are perhaps the best known probabilistic data structure for
+testing set membership, and were introduced into the Bitcoin protocol with BIP
+37. The size of a Bloom filter is larger than the expected size of a GCS with
+the same false positive rate, which is the main reason the option was rejected.
+
+==== Cryptographic Accumulators ====
+
+Cryptographic accumulators
+[https://en.wikipedia.org/wiki/Accumulator_(cryptography)]
+are cryptographic data structures that enable (amongst other operations) a one
+way membership test. One advantage of accumulators is that they are constant
+size, independent of the number of elements inserted into the accumulator.
+However, current constructions of cryptographic accumulators require an initial
+trusted set up. Additionally, accumulators based on the Strong-RSA Assumption
+require mapping set items to prime representatives in the associated group which
+can be prohibitively expensive.
+
+==== Matrix Based Probabilistic Set Data Structures ====
+
+There exist data structures based on matrix solving which are even more space
+efficient compared to Bloom filters
+[https://arxiv.org/pdf/0804.1845.pdf]. We instead opted for our
+GCS-based filters as they have a much lower implementation complexity and are
+easier to understand.
+
+== Appendix B: Pseudocode ==
+
+=== Golomb-Coded Set Multi-Match ===
+
+<pre>
+gcs_match_any(key: [16]byte, compressed_set: []byte, targets: [][]byte, P: uint, N: uint, M: uint) -> bool:
+    let F = N * M
+
+    // Map targets to the same range as the set hashes.
+    let target_hashes = []
+    for target in targets:
+        let target_hash = hash_to_range(target, F, key)
+        target_hashes.append(target_hash)
+
+    // Sort targets so matching can be checked in linear time.
+    target_hashes.sort()
+
+    stream = new_bit_stream(compressed_set)
+
+    let value = 0
+    let target_idx = 0
+    let target_val = target_hashes[target_idx]
+
+    loop N times:
+        let delta = golomb_decode(stream, P)
+        value += delta
+
+        inner loop:
+            if target_val == value:
+                return true
+
+            // Move on to the next set value.
+            else if target_val > value:
+                break inner loop
+
+            // Move on to the next target value.
+            else if target_val < value:
+                target_idx++
+
+                // If there are no targets left, then there are no matches.
+                if target_idx == len(targets):
+                    break outer loop
+
+                target_val = target_hashes[target_idx]
+
+    return false
+</pre>
+
+== Appendix C: Test Vectors ==
+
+Test vectors for basic block filters on five testnet blocks, including the filters and filter headers, can be found [[bip-0158/testnet-19.json|here]]. The code to generate them can be found [[bip-0158/gentestvectors.go|here]].
+
+== References ==
+
+<pre>
+  BIP: 159
+  Layer: Peer Services
+  Title: NODE_NETWORK_LIMITED service bit
+  Author: Jonas Schnelli &lt;dev@jonasschnelli.ch&gt;
+  Comments-Summary: No comments yet.
+  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0159
+  Status: Draft
+  Type: Standards Track
+  Created: 2017-05-11
+  License: BSD-2-Clause
+</pre>
+
+# Abstract
+
+Define a service bit that allows pruned peers to signal their limited services.
+
+# Motivation
+
+Pruned peers can offer the same services as traditional peers, except for serving all historical blocks.
+Bitcoin right now only offers the NODE_NETWORK service bit, which indicates that a peer can serve
+all historical blocks.
+
+1. Pruned peers can relay blocks, headers, transactions, and addresses, and can serve a limited number of historical blocks, so they should have a way to announce their service(s).
+2. Peers no longer in initial block download should consider connecting some of their outbound connections to pruned peers, to allow other peers to bootstrap from non-pruned peers.
+
+# Specification
+
+## New service bit
+
+This BIP proposes a new service bit:
+
+| | | |
+|--|--|--|
+| NODE_NETWORK_LIMITED | bit 10 (0x400) | If signaled, the peer MUST be capable of serving at least the last 288 blocks (~2 days). |
+
+A safety buffer of 144 blocks to handle chain reorganizations SHOULD be taken into account when connecting to a peer signaling the NODE_NETWORK_LIMITED service bit.
+
+### Address relay
+
+Full nodes following this BIP SHOULD relay address/services (addr message) from peers they would connect to (including peers signaling NODE_NETWORK_LIMITED).
+
+### Counter-measures for peer fingerprinting
+
+Peers may have different prune depths (depending on the peer's configuration, disk space, etc.), which can result in a fingerprinting weakness (finding the prune depth through getdata requests). NODE_NETWORK_LIMITED supporting peers SHOULD avoid leaking the prune depth and therefore not serve blocks deeper than the signaled NODE_NETWORK_LIMITED threshold (288 blocks).
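The counter-measure can be sketched as a simple depth check before answering a getdata request. The function name is hypothetical, and counting the tip as depth 0, so that exactly 288 blocks are served, is an assumption about how the threshold is measured.

```python
NODE_NETWORK_LIMITED = 1 << 10  # 0x400
LIMITED_SERVE_DEPTH = 288


def should_serve_block(block_height: int, tip_height: int) -> bool:
    """Serve only the most recent 288 blocks, whatever is stored locally."""
    depth = tip_height - block_height  # the tip itself has depth 0 (assumed)
    return 0 <= depth < LIMITED_SERVE_DEPTH
```

Applying the same cut-off on every limited peer keeps the responses identical across nodes with different prune depths.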
+
+### Risks
+
+Pruned peers following this BIP may consume more outbound bandwidth.
+
+Light clients (and similar) that do not check the nServiceFlags (service bits) from a relayed addr message may unintentionally connect to a pruned peer and ask for (filtered) blocks at a depth below its prune depth. Light clients should therefore check the service bits (and eventually connect to peers signaling NODE_NETWORK_LIMITED if they require [filtered] blocks around the tip). Light clients obtaining peer IPs through DNS seeds should use the DNS filtering option.
+
+## Compatibility
+
+This proposal is backward compatible.
+
+## Reference implementation
+
+* https://github.com/bitcoin/bitcoin/pull/11740 (signaling)
+* https://github.com/bitcoin/bitcoin/pull/10387 (connection and relay)
+
+## Copyright
+
+This BIP is licensed under the 2-clause BSD license.
\ No newline at end of file
diff --git a/protocol/misc/endian.md b/protocol/misc/endian.md
new file mode 100644
index 0000000..8973d7a
--- /dev/null
+++ b/protocol/misc/endian.md
@@ -0,0 +1,525 @@
+# Endianness
+
+## Little
+
+## Big
diff --git a/protocol/network/messages.md b/protocol/network/messages.md
index 3d5bfe1..ca1e7ee 100644
--- a/protocol/network/messages.md
+++ b/protocol/network/messages.md
@@ -1,5 +1,5 @@
# Standard Messages
-## version
+## [Handshake: Version](/protocol/network/messages/version) ("version")
## verack
## ping
## pong
diff --git a/protocol/network/messages/version.md b/protocol/network/messages/version.md
new file mode 100644
index 0000000..28d316c
--- /dev/null
+++ b/protocol/network/messages/version.md
@@ -0,0 +1,89 @@
+
+# Handshake: Version (“version”)
+
+The version message is a part of the node connection [handshake](/protocol/network/node-handshake) and indicates various connection settings, networking information, and the services provided by the sending node (see Services Bitmask [below](#services-bitmask)).
+
+The node connection is not considered established until both nodes have sent and received both a version and [verack](/protocol/network/messages/verack) message.
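The establishment rule above can be tracked with four flags. This is a minimal sketch; the class and attribute names are hypothetical and not taken from any node implementation.

```python
from dataclasses import dataclass


@dataclass
class HandshakeState:
    """Tracks which handshake messages have crossed the wire (hypothetical helper)."""
    sent_version: bool = False
    received_version: bool = False
    sent_verack: bool = False
    received_verack: bool = False

    def is_established(self) -> bool:
        # Established only once both messages have gone in both directions.
        return all((
            self.sent_version,
            self.received_version,
            self.sent_verack,
            self.received_verack,
        ))


state = HandshakeState(sent_version=True, received_version=True, sent_verack=True)
half_open = state.is_established()   # verack not yet received
state.received_verack = True
fully_open = state.is_established()
```

A node would typically defer all other message handling until `is_established()` returns true.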
+
+## Message Format
+
+| Field | Length | Format | Description |
+|--|--|--|--|
+| version | 4 bytes | unsigned integer[(LE)](/protocol/misc/endian/little) | The version number supported by the sending node. |
+| services | 8 bytes | bitmask[(LE)](/protocol/misc/endian/little) | An indication of the services supported by the sending node. See Services Bitmask section below. |
+| timestamp | 8 bytes | unix timestamp[(LE)](/protocol/misc/endian/little) | The time the message was generated on the sending node. |
+| remote address | 26 bytes | [network address](/protocol/formats/network-address) | The network address of the remote node. _NOTE: this does not contain the timestamp normally included with network addresses._ |
+| local address | 26 bytes | [network address](/protocol/formats/network-address) | The network address of the sending node. _NOTE: this does not contain the timestamp normally included with network addresses._ |
+| nonce | 8 bytes | bytes[(LE)](/protocol/misc/endian/little) | Random nonce for the connection, used to detect connections to self. |
+| user agent | variable | [variable length string](/protocol/formats/variable-length-string) | A user agent string identifying the node implementation. |
+| block height | 4 bytes | unsigned integer[(LE)](/protocol/misc/endian/little) | The height of the block with the highest height known to the sending node. |
+| relay flag | 1 byte | boolean | Indicates whether the sending node would like all broadcast transactions relayed to it. See [BIP-37](/protocol/forks/bip-0037). |
+
+## Version
+
+The most recent version of the network protocol is `70015`. The `version` value often correlates to new behavior, parsing formats, and available services; for more details review the network protocol's [version history](/history/protocol-version). Nodes should use the `version` value and the `services` bitmask to decide whether to accept an incoming connection. Related: [node connection handshake](/protocol/network/node-handshake).
+
+## Services Bitmask
+The services field is an 8-byte little-endian-serialized bitfield that describes peer capabilities. The following capabilities are defined, by bit position:
+
+### Standard Services
+* 0: NODE_NETWORK
+ The node is capable of serving the complete block chain. It is currently set by all full nodes, and is unset by SPV clients or other peers that just want network services but don't provide them.
+
+* 2: NODE_BLOOM
+ The node is capable and willing to handle bloom-filtered connections.
+
+* 3: NODE_WITNESS
+ Indicates that a node can be asked for blocks and transactions including witness data.
+ *Bitcoin Cash nodes do not have witness data, so this flag should be ignored on receipt and set to 0 when sent.*
+
+* 5: NODE_BITCOIN_CASH
+ The node supports the BCH chain. This is intended to be just a temporary service bit until the BTC/BCH fork actually happens.
+
+* 24-31: Reserved for experimental changes
+ These bits are reserved for temporary experiments. Just pick a bit that isn't getting used, or one not being used much, and notify the community. Remember that service bits are just unauthenticated advertisements, so your code must be robust against collisions and other cases where nodes may be advertising a service they do not actually support.
+
+### Node-Specific Services
+
+#### Bitcoin ABC
+
+* 10: NODE_NETWORK_LIMITED
+ This means the same as NODE_NETWORK with the limitation of only serving a small subset of the blockchain. See [BIP159](/protocol/forks/bip-0159) for details on how this is implemented.
+
+#### Bitcoin Unlimited
+
+* 4: NODE_XTHIN
+ The node supports Xtreme Thinblocks.
+
+* 6: NODE_GRAPHENE
+ The node supports Graphene blocks. If this is turned off, the node will neither service nor make Graphene requests.
+
+* 10: NODE_NETWORK_LIMITED
+ This means the same as NODE_NETWORK with the limitation of only serving a small subset of the blockchain. See [BIP159](/protocol/forks/bip-0159) for details on how this is implemented.
+
+#### Bitcoin Verde
+
+* 7: BLOCKCHAIN_INDEX_ENABLED
+ Indicates that the node is an indexing node and supports returning information specific to the requesting user's addresses.
+
+* 8: SLP_INDEX_ENABLED
+ Indicates that the node tracks Simple Ledger Protocol validity and supports returning this status for individual transactions.
+
+#### Other Proposed/Previously Used Service Flags
+
+* 1: NODE_GETUTXO
+ The node is capable of responding to the getutxo protocol request. See [BIP 64](/protocol/forks/bip-0064) for details on how this is implemented. _Was previously supported by Bitcoin XT only._
+
+* 7: NODE_WEAKBLOCKS
+ The node supports Storm weak blocks (no node currently supports these in production, so this is a placeholder).
+
+* 8: NODE_CF
+ Indicates the node is capable of serving compact block filters to SPV clients, AKA the "Neutrino" protocol ([BIP157](/protocol/forks/bip-0157) and [BIP158](/protocol/forks/bip-0158)).
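As an illustration of reading the services field, the sketch below decodes the 8-byte little-endian bitfield into names for the standard bits and reports any other set bits by position. The function and table names are hypothetical. Node-specific bits are deliberately left out of the default table because, as listed above, some positions (for example 7 and 8) carry different meanings in different implementations.

```python
# Standard service bits from the list above. Node-specific bits are omitted
# on purpose: their meaning depends on which implementation set them.
STANDARD_SERVICES = {
    0: "NODE_NETWORK",
    2: "NODE_BLOOM",
    3: "NODE_WITNESS",
    5: "NODE_BITCOIN_CASH",
}


def decode_services(raw, names=STANDARD_SERVICES):
    """Return (known service names, unrecognized bit positions)."""
    if len(raw) != 8:
        raise ValueError("services field must be exactly 8 bytes")
    value = int.from_bytes(raw, "little")
    known, unknown = [], []
    for bit in range(64):
        if (value >> bit) & 1:
            if bit in names:
                known.append(names[bit])
            else:
                unknown.append(bit)
    return known, unknown


# Example: a peer advertising bits 0, 2, and 10.
raw = ((1 << 0) | (1 << 2) | (1 << 10)).to_bytes(8, "little")
known, unknown = decode_services(raw)
```

Because service bits are unauthenticated advertisements, a caller would treat both lists as hints rather than guarantees; an implementation-specific table can be passed as `names` to interpret the remaining bits.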
+
+## Node-Specific Behavior
+
+Although node implementations may be aware of services they do not provide, they generally ignore those they do not support. Any notable deviations from that behavior are documented below.
+
+### Bitcoin ABC
+
+Bitcoin ABC nodes may, once they have reached their maximum number of peers, selectively disconnect from nodes that do not support "desired services", though currently this appears to be just NODE_NETWORK and/or NODE_NETWORK_LIMITED. That is, it may prefer nodes that store and serve blocks.
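To tie the Message Format table and the service bits together, the following sketch assembles a version payload field by field. It is not taken from any node implementation, and all function names are hypothetical. Two details are assumptions drawn from the common Bitcoin wire format, since the linked format pages are not reproduced here: the timestamp-less network address is laid out as an 8-byte services field, a 16-byte IPv6 (or IPv4-mapped) address, and a 2-byte big-endian port; and the variable length string is prefixed with a compact-size length. The nonce is treated as a 64-bit integer for convenience.

```python
import ipaddress
import struct


def pack_var_str(text):
    """Length-prefixed string (assumed compact-size prefix)."""
    data = text.encode("utf-8")
    n = len(data)
    if n < 0xFD:
        prefix = struct.pack("<B", n)
    elif n <= 0xFFFF:
        prefix = b"\xfd" + struct.pack("<H", n)
    else:
        prefix = b"\xfe" + struct.pack("<I", n)
    return prefix + data


def pack_net_addr(services, ip, port):
    """26-byte network address without timestamp (assumed layout)."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        packed = b"\x00" * 10 + b"\xff\xff" + addr.packed  # IPv4-mapped IPv6
    else:
        packed = addr.packed
    return struct.pack("<Q", services) + packed + struct.pack(">H", port)


def build_version_payload(version, services, timestamp, remote, local,
                          nonce, user_agent, block_height, relay):
    """Concatenate the fields in the order given by the Message Format table."""
    return b"".join([
        struct.pack("<IQq", version, services, timestamp),
        pack_net_addr(*remote),
        pack_net_addr(*local),
        struct.pack("<Q", nonce),
        pack_var_str(user_agent),
        struct.pack("<I", block_height),
        struct.pack("<?", relay),
    ])


payload = build_version_payload(
    version=70015,
    services=1 << 0,                      # NODE_NETWORK
    timestamp=1_600_000_000,              # fixed value for a reproducible example
    remote=(0, "203.0.113.5", 8333),      # documentation-range address
    local=(1 << 0, "::1", 8333),
    nonce=0x1122334455667788,
    user_agent="/example:0.1/",           # hypothetical user agent
    block_height=600_000,
    relay=True,
)
```

The payload would still need the standard message header before being sent; that framing is outside the scope of this page.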
\ No newline at end of file
diff --git a/style-guide.md b/style-guide.md
new file mode 100644
index 0000000..2ad3b0a
--- /dev/null
+++ b/style-guide.md
@@ -0,0 +1,57 @@
+# Style Guide
+
+This page lays out the rules for contributing to the Bitcoin Cash specification. This is not intended to be a comprehensive list of rules; please use common sense.
+
+## General Content Guidelines
+
+All contributions should be:
+
+ - As accurate as possible (see below)
+ - As impartial as possible
+ - As impersonal as possible
+
+They should not:
+
+- Indicate a preference for a particular feature, configuration, or node implementation
+- Call others out by name or imply your personal involvement in making the change
+
+When referencing accuracy above, the intent is to be as true to the current state of the Bitcoin Cash protocol as possible, though this obviously has many possible interpretations. One basic litmus test is whether a brand new implementation would need to have a certain change in order to operate without utilizing deprecated/fall-back functionality. However, if a feature is implemented, and considered functionally complete, by the majority of active node implementations, that may also serve as a sufficient test of currency. As such, this specification may reflect everything from the basic necessary elements of the Bitcoin Cash protocol to those that are in widespread, but not universal, use.
+
+## Node-Specific Content Guidelines
+
+In addition to documenting the "current" state of Bitcoin Cash, this specification also seeks to track partially implemented and experimental features, particularly where it may benefit node developers to avoid stepping on one another's toes.
+
+When it comes to documenting these changes, there is again a range of definitions for what may be considered worth including. Generally, the set of features that are currently developed and in the master (or similar) branch of a given implementation's codebase is probably the best place to start. However, there may be cases where documenting changes earlier than this phase may be useful. In general, use common sense and try to keep experimental or node-specific content minimal while still including information that may benefit others viewing this specification.
+
+Furthermore, node-specific documentation may be best left to the developers of the implementations. If you have a change you would like to make about node-specific functionality of an implementation you have not contributed to, at least check the master branch of that implementation's codebase. When in doubt, reach out to a node developer to discuss the accuracy of, or best way to phrase, your contribution. In general, content from that node implementation's developers is preferred, but if they are not willing or able to provide it, make the change yourself.
+
+Finally, to distinguish this node-specific content from more standard, node-independent, content, please designate such content with both text and the following icon:
. This icon will serve as a simple visual indicator that the user may have ventured into territory that is experimental or otherwise not yet fully supported.
+
+## Creating Pages and Links
+
+While it is difficult to make hard-and-fast rules regarding the organization of something as complex as the Bitcoin Cash protocol, please take the time to observe the current organization of files (including URL paths and linking) before adding or moving pages.
+
+### Pages
+
+First, consider whether you need to create a new page or whether your content belongs in an existing page. This will probably have to be decided on a case-by-case basis, but some general guidelines may help:
+
+ - If your content is documenting a brand new process that is not directly related to any existing content, create a new page.
+ - If your content is documenting a new object for which similar objects already exist (e.g. a new message type), follow the convention of how other objects of that type are already documented. If they are all already on a single page, add it there. If they all already have their own pages, create a new page alongside the existing ones.
+ - If your content is documenting a new object but no similar objects exist yet, start by making a new single page for this object in a new directory for the set of similar objects. If you have several to add, create separate pages for each under the same new directory. Merging many objects into a single page is a judgement call that can occur once the set is deemed to be complete and small enough for a single page.
+ - If you're still not sure, use your judgement or reach out to someone who may be able to provide guidance (e.g. a node developer).
+
+When creating pages, consider which part of the protocol the content you wish to add belongs to, and determine which directory it belongs under.
+
+As a convention, URL components (i.e. directories and pages) should contain only lower-case letters, numbers, and hyphens. Full words, separated by hyphens, are preferred over abbreviations, acronyms, or the use of punctuation or spaces. For example, the page for the version message exists at [/protocol/network/messages/version](/protocol/network/messages/version). A page relating to SLP (Simple Ledger Protocol) may exist at a path like /protocol/simple-ledger-protocol/[page-name].
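The naming convention above can be checked mechanically. The sketch below is a hypothetical helper, not part of any existing tooling for this specification; it accepts a path only if every component consists solely of lower-case letters, digits, and hyphens.

```python
import re

# One URL component: lower-case letters, digits, and hyphens only.
_COMPONENT = re.compile(r"[a-z0-9-]+")


def is_valid_path(path):
    """Return True if every non-empty component of the path follows the convention."""
    components = [part for part in path.split("/") if part]
    return bool(components) and all(_COMPONENT.fullmatch(part) for part in components)


ok = is_valid_path("/protocol/network/messages/version")
bad_case = is_valid_path("/Protocol/Network")
bad_space = is_valid_path("/protocol/simple ledger")
```

The check does not enforce the preference for full words over abbreviations, which remains a judgement call.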
+
+Once you have created the page, consider which existing pages should link to the new page and add links as appropriate.
+
+### Links
+
+When referencing external content (hosted on other sites/platforms), prefer the following actions, in order:
+
+ 1. Copying the content to the site (please observe content licenses for the source platforms) and linking it internally
+ 2. Paraphrasing all vital content locally and then linking externally
+ 3. Linking externally with sufficient context that, if the link breaks, users still have something to Google (e.g. if documenting a website itself, rather than content on it)
+
+If you're familiar with Stack Overflow's etiquette for posting answers with links, the same logic applies here. The primary goal is for this specification to be the only site a user needs to access to understand the entirety of the Bitcoin Cash protocol.
\ No newline at end of file
diff --git a/target-audience.md b/target-audience.md
new file mode 100644
index 0000000..5d02e6c
--- /dev/null
+++ b/target-audience.md
@@ -0,0 +1,17 @@
+# Target Audience
+
+This specification is intended to meet a variety of needs, depending on the reader's level of comfort with the material involved. In order of precedence, the goals of this specification are to meet the needs of:
+
+ - Node Developers
+ - Provide all the details necessary to write a Bitcoin Cash Node implementation from scratch.
+ - Provide a common space for documenting changes as they occur to continually represent a current model of the Bitcoin Cash protocol
+ - Provide each node implementation with an opportunity to describe what they do differently and why
+ - Script Developers
+ - Provide all the details necessary to gain a complete understanding of script execution and the available scripting operations
+ - Provide examples of commonly used scripts and an explanation of the patterns with which they are created
+ - Bitcoin Cash Service Providers
+ - Provide information on how to create and submit transactions to the Bitcoin Cash network
+ - Provide information on how to collect and parse information about the state of the Bitcoin Cash blockchain and network
+ - Interested Non-Technical Parties
+ - Provide a high-level description of the major terms and concepts required to understand how Bitcoin Cash operates and its current operational state
+ - Provide insight into how Bitcoin Cash differs from Bitcoin Core