diff --git a/protocol/forks/2019-05-15-schnorr.md b/protocol/forks/2019-05-15-schnorr.md new file mode 100644 index 0000000..05f2bbd --- /dev/null +++ b/protocol/forks/2019-05-15-schnorr.md @@ -0,0 +1,252 @@ +# 2019-MAY-15 Schnorr Signature specification + + layout: specification + title: 2019-MAY-15 Schnorr Signature specification + date: 2019-02-15 + category: spec + activation: 1557921600 + version: 0.5 + author: Mark B. Lundeberg + +# Summary + +Four script opcodes that verify single ECDSA signatures will be overloaded to also accept Schnorr signatures: + +* `OP_CHECKSIG`, `OP_CHECKSIGVERIFY` +* `OP_CHECKDATASIG`, `OP_CHECKDATASIGVERIFY` + +The other two ECDSA opcodes, `OP_CHECKMULTISIG` and `OP_CHECKMULTISIGVERIFY`, will *not* be upgraded to allow Schnorr signatures and in fact will be modified to refuse Schnorr-sized signatures. + +* [Summary](#summary) +* [Motivation](#motivation) +* [Specification](#specification) + * [Public keys](#public-keys) + * [Signature verification algorithm](#signature-verification-algorithm) + * [Message m calculation](#message-m-calculation) + * [OP_CHECKMULTISIG/VERIFY](#op_checkmultisigverify) +* [Recommended practices for secure signature generation](#recommended-practices-for-secure-signature-generation) +* [Rationale and commentary on design decisions](#rationale-and-commentary-on-design-decisions) + * [Schnorr variant](#schnorr-variant) + * [Overloading of opcodes](#overloading-of-opcodes) + * [Re-use of keypair encodings](#re-use-of-keypair-encodings) + * [Non-inclusion of OP_CHECKMULTISIG](#non-inclusion-of-op_checkmultisig) + * [Lack of flag byte -- ECDSA / Schnorr ambiguity](#lack-of-flag-byte----ecdsa--schnorr-ambiguity) + * [Miscellaneous](#miscellaneous) +* [Acknowledgements](#acknowledgements) + +# Motivation + +(for more detail, see Motivation and Applications sections of [Pieter Wuille's Schnorr specification](https://github.com/sipa/bips/blob/bip-schnorr/bip-schnorr.mediawiki)) + +Schnorr signatures have some 
slightly improved properties over the ECDSA signatures currently used in bitcoin: + +* Known cryptographic proof of security. +* Proven that there are no unknown third-party malleability mechanisms. +* Linearity allows some simple multi-party signature aggregation protocols. (compactness / privacy / malleability benefits) +* Possibility to do batch validation, resulting in a slight speedup during validation of large transactions or initial block download. + +# Specification + +Current ECDSA opcodes accept DER signatures (format: `0x30 (N+M+4) 0x02 N [N-byte r] 0x02 M [M-byte s] [hashtype byte]`) from the stack. +This upgrade will allow a Schnorr signature to be substituted in any place where an ECDSA DER signature is accepted. +Schnorr signatures taken from stack will have the following 65-byte form for OP_CHECKSIG/VERIFY: + +| 32 bytes | 32 bytes | 1 byte | +|----------|----------|-------------| +| r | s | hashtype | + +and 64 bytes for OP_CHECKDATASIG/VERIFY: + +| 32 bytes | 32 bytes | +|----------|----------| +| r | s | + +* `r` is the unsigned big-endian 256-bit encoding of the Schnorr signature's *r* integer. +* `s` is the unsigned big-endian 256-bit encoding of the Schnorr signature's *s* integer. +* `hashtype` informs OP_CHECKSIG/VERIFY [mechanics](/protocol/forks/replay-protected-sighash). + +These constant length signatures can be contrasted to ECDSA signatures which have variable length (typically 71-72 bytes but in principle may be as short as 8 bytes). + +Upon activation, all 64-byte signatures passed to OP_CHECKDATASIG/VERIFY will be processed as Schnorr signatures, and all 65-byte signatures passed to OP_CHECKSIG/VERIFY will be processed as Schnorr signatures. +65-byte signatures passed to OP_CHECKMULTISIG/VERIFY will trigger script failure (see below for more details). 
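The length-based dispatch described above can be sketched as follows. This is only an illustration under stated assumptions, not the consensus implementation: the function names `classify_signature` and `split_schnorr_checksig` and the string labels are hypothetical, and special cases such as empty signatures are not handled.

```python
def classify_signature(opcode: str, sig: bytes) -> str:
    """Return 'schnorr', 'ecdsa' or 'fail' for a signature blob after activation.

    `opcode` is a hypothetical label: 'checksig', 'checkdatasig' or
    'checkmultisig' (each also standing for its VERIFY variant).
    """
    if opcode == "checksig":
        # 64 bytes of (r, s) plus the hashtype byte.
        return "schnorr" if len(sig) == 65 else "ecdsa"
    if opcode == "checkdatasig":
        # No hashtype byte is present for OP_CHECKDATASIG.
        return "schnorr" if len(sig) == 64 else "ecdsa"
    if opcode == "checkmultisig":
        # Schnorr-sized signatures are refused outright.
        return "fail" if len(sig) == 65 else "ecdsa"
    raise ValueError("unknown opcode label")


def split_schnorr_checksig(sig: bytes):
    """Split a 65-byte OP_CHECKSIG Schnorr signature into (r, s, hashtype)."""
    if len(sig) != 65:
        raise ValueError("expected exactly 65 bytes")
    r = int.from_bytes(sig[:32], "big")    # unsigned big-endian r
    s = int.from_bytes(sig[32:64], "big")  # unsigned big-endian s
    return r, s, sig[64]
```

Any other length falls through to the existing ECDSA path, which is why the sketch never rejects a signature based on length except in the multisig case.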
+ +## Public keys + +All valid ECDSA public keys are also valid Schnorr public keys: compressed (starting byte 2 or 3) and uncompressed (starting byte 4), see [SEC1 §2.3.3](http://www.secg.org/sec1-v2.pdf#subsubsection.2.3.3). +The formerly supported ECDSA hybrid keys (see [X9.62 §4.3.6](citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.202.2977&rep=rep1&type=pdf#page=22)) would also be valid, except that these have already been forbidden by the STRICTENC rule that was activated long ago on BCH. + +(Schnorr private keys are also identical to the ECDSA private keys.) + +## Signature verification algorithm + +We follow essentially what is an older variant of Pieter Wuille's [BIP-Schnorr](https://github.com/sipa/bips/blob/bip-schnorr/bip-schnorr.mediawiki). +Notable design choices: + +* Operates on secp256k1 curve. +* Uses (*R*,*s*) Schnorr variant, not (*e*,*s*) variant. +* Uses pubkey-prefixing in computing the internal hash. +* The Y coordinate of *R* is dropped, so just its X coordinate, *r*, is serialized. +The Y coordinate is uniquely reconstructed from *r* by choosing the quadratic residue. +* Unlike the currently proposed BIP-Schnorr, we use full public keys that do *not* have the Y coordinate removed; this distinction is maintained in the calculation of *e*, below, which makes the resulting signatures from the algorithms incompatible. +We do this so that all existing keys can use Schnorr signatures, and both compressed and uncompressed keys are allowed as inputs (though are converted to compressed when calculating *e*). + +In detail, the Schnorr signature verification algorithm takes a message byte string `m`, public key point *P*, and nonnegative integers *r*, *s* as inputs, and does the following: + +1. Fail if point *P* is not actually on the curve, or if it is the point at infinity. +2. Fail if *r* >= *p*, where *p* is the field size used in secp256k1. +3. Fail if *s* >= *n*, where *n* is the order of the secp256k1 curve. +4. 
Let `BP` be the 33-byte encoding of *P* as a compressed point. +5. Let `Br` be the 32-byte encoding of *r* as an unsigned big-endian 256-bit integer. +6. Compute integer *e* = *H*(`Br | BP | m`) mod *n*. +Here `|` means byte-string concatenation and function *H*() takes the SHA256 hash of its 97-byte input and returns it decoded as a big-endian unsigned integer. +7. Compute elliptic curve point *R*' = *sG* - *eP*, where *G* is the secp256k1 generator point. +8. Fail if *R*' is the point at infinity. +9. Fail if the X coordinate of *R*' is not equal to *r*. +10. Fail if the Jacobi symbol of the Y coordinate of *R*' is not 1. +11. Otherwise, the signature is valid. + +We stress that bytestring `BP` used in calculating *e* shall always be the *compressed* encoding of the public key, which is not necessarily the same as the encoding taken from stack (which could have been uncompressed). + +## Message `m` calculation + +In all cases, `m` is 32 bytes long. + +For OP_CHECKSIG/VERIFY, `m` is obtained according to the [sighash digest algorithm](/protocol/forks/replay-protected-sighash#digest-algorithm) as informed by the `hashtype` byte, and involves hashing **twice** with SHA256. + +For OP_CHECKDATASIG/VERIFY, `m` is obtained by popping `msg` from stack and hashing it **once** with SHA256. + +This maintains the same relative hash-count semantics as with [the ECDSA versions of OP_CHECKSIG and OP_CHECKDATASIG](protocol/forks/op_checkdatasig). +Although there is an additional SHA256 in step 6 above, it can be considered as being internal to the Schnorr algorithm and it is shared by both opcodes. + +## OP_CHECKMULTISIG/VERIFY + +Due to complex conflicts with batch verification (see rationale below), OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY are not permitted to accept Schnorr signatures for the time being. 
+ +After activation, signatures of the same length as Schnorr (=65 bytes: signature plus hashtype byte) will be disallowed and cause script failure, regardless of the signature contents. + +* OP_CHECKDATASIG before upgrade: 64 byte signature is treated as ECDSA. +* OP_CHECKDATASIG after upgrade: 64 byte signature is treated as *Schnorr*. +* OP_CHECKSIG before upgrade: 65 byte signature is treated as ECDSA. +* OP_CHECKSIG after upgrade: 65 byte signature is treated as *Schnorr*. +* OP_CHECKMULTISIG before upgrade: 65 byte signature is treated as ECDSA. +* OP_CHECKMULTISIG after upgrade: 65 byte signature *causes script failure*. + +Signatures shorter or longer than this exact number will continue to be treated as before. +Note that it is very unlikely for a wallet to produce a 65 byte ECDSA signature (see later section "Lack of flag byte..."). + +# Recommended practices for secure signature generation + +Signature generation is not part of the consensus change, however we would like to provide some security guidelines for wallet developers when they opt to implement Schnorr signing. + +In brief, creation of a signature starts with the generation of a unique, unpredictable, secret nonce *k* value (0 < *k* < *n*). +This produces *R* = *k*'*G* where *k*' = ±*k*, the sign chosen so that the Y coordinate of *R* has Jacobi symbol 1. +Its X coordinate, *r*, is now known and in turn *e* is calculable as above. +The signature is completed by calculating *s* = *k*' + *ex* mod *n* where *x* is the private key (i.e., *P* = *xG*). + +As in ECDSA, there are security concerns arising in nonce generation. +Improper nonce generation can in many cases lead to compromise of the private key *x*. +A fully random *k* is secure, but unfortunately in many cases a cryptographically secure random number generator (CSRNG) is not available or not fully trusted/auditable. 
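To make the signing procedure above and the verification steps from the specification concrete, here is a pure-Python sketch. It is an unoptimized illustration and an assumption-laden one: the helper names (`point_add`, `point_mul`, `challenge`, `sign`, `verify`) are hypothetical, the curve arithmetic is affine and not constant-time, and the nonce `k` is simply passed in by the caller, so it must not be used for real keys.

```python
import hashlib

# secp256k1 domain parameters (standard public constants).
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(P1, P2):
    """Affine point addition; None represents the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P1 == P2:
        lam = 3 * x1 * x1 * pow(2 * y1, p - 2, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def point_mul(k, P):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = point_add(R, P)
        P = point_add(P, P)
        k >>= 1
    return R


def jacobi_is_one(y):
    # Euler's criterion: y is a quadratic residue mod p.
    return pow(y, (p - 1) // 2, p) == 1


def challenge(r, P, m):
    """e = H(Br | BP | m) mod n, with BP always the compressed key."""
    BP = bytes([2 + (P[1] & 1)]) + P[0].to_bytes(32, "big")
    data = r.to_bytes(32, "big") + BP + m
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(x, m, k):
    """Sign 32-byte message m with private key x and nonce 0 < k < n."""
    R = point_mul(k, G)
    if not jacobi_is_one(R[1]):
        k = n - k  # k' = -k, so that Y(R) has Jacobi symbol 1
    r = R[0]
    e = challenge(r, point_mul(x, G), m)
    return r, (k + e * x) % n


def verify(P, m, r, s):
    """Follow verification steps 1 to 11 for public key point P."""
    if P is None or (P[1] * P[1] - P[0] ** 3 - 7) % p != 0:
        return False
    if r >= p or s >= n:
        return False
    e = challenge(r, P, m)
    neg_P = (P[0], p - P[1])
    R = point_add(point_mul(s, G), point_mul(e, neg_P))  # R' = sG - eP
    if R is None or R[0] != r:
        return False
    return jacobi_is_one(R[1])
```

Negating `k` works because *p* ≡ 3 mod 4, so exactly one of *y* and -*y* is a quadratic residue.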
+ +A deterministic *k* (pseudorandomly derived from *x* and `m`) may be generated using an algorithm like [RFC6979](https://tools.ietf.org/html/rfc6979)(*modified*) or the algorithm suggested in Pieter Wuille's specification. However: + +* Signers MUST NOT use straight RFC6979, since this is already used in many wallets doing ECDSA. + * Suppose the same unsigned transaction were accidentally passed to both ECDSA and Schnorr wallets holding same key, which in turn were to generate the same RFC6979 *k*. + This would be obvious (same *r* values) and in turn allow recovery of the private key from the distinct Schnorr *s* and ECDSA *s*' values: *x* = (±*ss*'-*z*)/(*r*±*s*'*e*) mod *n*. + * We suggest using the RFC6979 sec 3.6 'additional data' mechanism, by appending the 16-byte ASCII string "Schnorr+SHA256␣␣" (here ␣ represents 0x20 -- ASCII space). + The popular library libsecp256k1 supports passing a parameter `algo16` to `nonce_function_rfc6979` for this purpose. +* When making aggregate signatures, in contrast, implementations MUST NOT naively use deterministic *k* generation approaches, as this creates a vulnerability to nonce-reuse attacks from signing counterparties (see [MuSig paper section 3.2](https://eprint.iacr.org/2018/068)). + +Hardware wallets SHOULD use deterministic nonce due to the lack of CSRNG and also for auditability reasons (to prove that kleptographic key leakage firmware is not installed). +Software implementations are also recommended to use deterministic nonces even when CSRNG are available, as deterministic nonces can be unit tested. + +# Rationale and commentary on design decisions + +## Schnorr variant + +Using the secp256k1 curve means that bitcoin's ECDSA keypairs (P,x) can be re-used as Schnorr keypairs. +This has advantages in reducing the codebase, but also allows the opcode overloading approach described above. 
+ +This Schnorr variant has two advantages inherited from the EdDSA Schnorr algorithms: + +* (R,s) signatures allow batch verification. +* Pubkey prefixing (in the hash) stops some related-key attacks. +This is particularly relevant in situations when additively-derived keys (like in unhardened BIP32) are used in combination with OP_CHECKDATASIG (or with a possible future SIGHASH_NOINPUT). + +The mechanism of Y coordinate stripping and Jacobi symbol symmetry breaking originates from Pieter Wuille and Greg Maxwell: + +* It is important for batch verification that each *r* quickly maps to the intended *R*. +It turns out that a natural choice presents itself during 'decompression' of X coordinate *r*: the default decompressed Y coordinate, *y* = (*r*^3 + 7)^((*p*+1)/4) mod *p* appears, which is a quadratic residue and has Jacobi symbol 1. +(The alternative Y coordinate, -*y*, is always a quadratic nonresidue and has Jacobi symbol -1.) +* During single signature verification, Jacobian coordinates are typically used for curve operations. +In this case it is easier to calculate the Jacobi symbol of the Y coordinate of *R*', than to perform an affine conversion to get its parity or sign. +* As a result this ends up slightly *more* efficient, both in bit size and CPU time, than if the parity or sign of Y were retained in the signature. + +## Overloading of opcodes + +We have chosen to *overload* the OP_CHECKSIG opcode since this means that a "Schnorr P2PKH address" looks just like a regular P2PKH address. + +If we used a new opcode, this would also prevent the advantages of keypair reuse, described below: + +## Re-use of keypair encodings + +An alternative overloading approach might have been to allocate a different public key prefix byte (0x0a, 0x0b) for Schnorr public keys, that distinguishes them from ECDSA public keys (prefixes 2,3,4,6,7). +This would at least allow Schnorr addresses to appear like normal P2PKH addresses. 
+ +The advantage of re-using the same encoding (and potentially same keypairs) is that it makes Schnorr signatures into a 'drop-in-place' alternative to ECDSA: + +* Existing wallet software can trivially switch to Schnorr signatures at their leisure, without even requiring users to generate new wallets. +* Does not create more confusion with restoration of wallet seeds / derivation paths ("was it an ECDSA or Schnorr wallet?"). +* No new "Schnorr WIF private key" version is required. +* No new xpub / xprv versions are required. +* Protocols like BIP47 payment codes and stealth addresses continue to work unchanged. +* No security-weakening interactions exist between the ECDSA and Schnorr schemes, so key-reuse is not a concern. +* It may be possible eventually to remove ECDSA support (and thereby allow fully batched verification), without blocking any old coins. + +There is a theoretical disadvantage in re-using keypairs. +In the case of a severe break in the ECDSA or Schnorr algorithm, all addresses may be vulnerable whether intended solely for Schnorr or ECDSA --- "the security of signing becomes as weak as the weakest algorithm".[ref](https://lists.bitcoinunlimited.info/pipermail/bch-dev/2018-December/000002.html) + +For privacy reasons, it may be beneficial for wallet developers to coordinate a 'Schnorr activation day' where all wallets simultaneously switch to produce Schnorr signatures by default. + +## Non-inclusion of OP_CHECKMULTISIG + +The design of OP_CHECKMULTISIG is strange, in that it requires checking a given signature against possibly multiple public keys in order to find a possible match. This approach unfortunately conflicts with batch verification where it is necessary to know ahead of time, which signature is supposed to match with which public key. + +Going forward we would like to permanently support OP_CHECKMULTISIG, including Schnorr signature support but in a modified form that is compatible with batch verification. 
There are simple ways to do this, however the options are still being weighed and there is insufficient time to bring the new approach to fruition in time for the May 2019 upgrade. + +In this upgrade we have chosen to take a 'wait and see' approach, by simply forbidding Schnorr signatures (and Schnorr-size signatures) in OP_CHECKMULTISIG for the time being. Schnorr multisignatures will still be possible through aggregation, but they are not a complete drop-in replacement for OP_CHECKMULTISIG. + +## Lack of flag byte -- ECDSA / Schnorr ambiguity + +In a previous version of this proposal, a flag byte (distinct from ECDSA's 0x30) was prepended for Schnorr signatures. There are some slight disadvantages in not using such a distinguishing byte: + +* After the upgrade, if a user generates a 65-byte ECDSA signature (64-byte in CHECKDATASIG), then this will be interpreted as a Schnorr signature and thus unexpectedly render the transaction invalid. +* A flag byte could be useful if yet another signature protocol were to be added, to help distinguish a third type of signature. + +However, these considerations were deemed to be of low significance: + +* The probability of a user accidentally generating such a signature is 2^-49, or 1 in a quadrillion (10^15). +It is thus unlikely that such an accident will occur to *any* user. +Even if it happens, that individual can easily move on with a new signature. +* A flag byte distinction would only be relevant if a new protocol were to also use the secp256k1 curve. +The next signature algorithm added to bitcoin will undoubtedly be something of a higher security level, in which case the *public key* would be distinguished, not the signature. +* Omitting the flag byte does save 1 byte per signature. +This can be compared to the overall per-input byte size of P2PKH spending, which is currently ~147.5 for ECDSA signatures, and will be 141 bytes for Schnorr signatures as specified here. 
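The per-input sizes quoted above can be reproduced with a short calculation. This sketch assumes a compressed 33-byte public key, single-byte push opcodes, a one-byte scriptSig length prefix, and that 71-byte and 72-byte ECDSA signatures (hashtype byte included) are equally likely; the function name `p2pkh_input_size` is hypothetical.

```python
def p2pkh_input_size(sig_len: int, pubkey_len: int = 33) -> int:
    """Serialized size in bytes of a P2PKH input, for a signature of
    sig_len bytes (including the hashtype byte)."""
    outpoint = 32 + 4                                 # previous txid + output index
    script_sig = (1 + sig_len) + (1 + pubkey_len)     # two direct pushes
    script_len_prefix = 1                             # scriptSig length (under 253 bytes)
    sequence = 4
    return outpoint + script_len_prefix + script_sig + sequence


schnorr_size = p2pkh_input_size(65)
ecdsa_average = (p2pkh_input_size(71) + p2pkh_input_size(72)) / 2
```

Under these assumptions `schnorr_size` is 141 and `ecdsa_average` is 147.5, matching the figures in the text.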
+ +Without a flag byte, however, implementors must take additional care in how signature byte blobs are treated. +In particular, a malicious actor creating a short valid 64/65-byte ECDSA signature before the upgrade must not cause the creation of a cache entry wherein the same signature data would be incorrectly remembered as a valid Schnorr signature after the upgrade. + +## Miscellaneous + +* Applications that copy OP_CHECKSIG signatures into OP_CHECKDATASIG (such as zero-conf forfeits and self-inspecting transactions/covenants) will be unaffected as the semantics are identical, in terms of hash byte placement and number of hashes involved. +* As with ECDSA, the flexibility in nonce *k* means that Schnorr signatures are not *unique* signatures and are a source of first-party malleability. +Curiously, however, aggregate signatures cannot be "second-party" malleated; producing a distinct signature requires the entire signing process to be restarted, with the involvement of all parties. + +# Implementation / unit tests + +The Bitcoin ABC implementation involved a number of Diffs: https://reviews.bitcoinabc.org/T527 + +Pieter Wuille's specification comes with a handy set of test vectors for checking cryptographic corner cases: https://github.com/sipa/bips/blob/bip-schnorr/bip-schnorr/test-vectors.csv + +# Acknowledgements + +Thanks to Amaury Séchet, Shammah Chancellor, Antony Zegers, Tomas van der Wansem, Greg Maxwell for helpful discussions. 
diff --git a/protocol/forks/2019-05-15-segwit-recovery.md b/protocol/forks/2019-05-15-segwit-recovery.md new file mode 100644 index 0000000..cefd1b0 --- /dev/null +++ b/protocol/forks/2019-05-15-segwit-recovery.md @@ -0,0 +1,138 @@ +# Segwit Recovery Specification + + layout: specification + title: 2019-MAY-15 Segwit Recovery Specification + date: 2019-05-13 + category: spec + activation: 1557921600 + version: 0.4 + +## Motivation + +Prior to the [November 2018 upgrade](protocol/forks/hf-201811115), miners were able to recover coins accidentally sent to segwit pay-to-script-hash [(P2SH)](https://github.com/bitcoin/bips/blob/master/bip-0016.mediawiki) addresses. +These P2SH addresses have a two-push redeem script that contains no signature checks, and they were thus spendable by any miner (though not spendable by normal users due to relay rules). +In practice, such coins were sometimes recovered by the intended recipient with the help of miners, and sometimes recovered by anonymous miners who simply decided to assert ownership of these anyone-can-spend coins. + +In November 2018, the CLEANSTACK consensus rule was activated, with the intent of reducing malleability mechanisms. +This had the unfortunate side effect of also making these segwit scripts *unspendable*, since attempting to spend these coins would always leave two items on the stack. + +Starting in May 2019, transactions spending segwit P2SH coins will be allowed once again to be included in blocks. + +## Specification + +A transaction input + +1. that spends a P2SH coin (scriptPubKey=`OP_HASH160 OP_EQUAL`); and +2. where the scriptSig only pushes one item onto the stack: a redeem script that correctly hashes to the value in the scriptPubKey; and +3. where the redeem script is a witness program; + +shall be considered valid under the consensus rules to be activated in May 2019. 
+ +A witness program has a 1-byte push opcode (for a number between 0 and 16, inclusive) followed by a data push between 2 and 40 bytes (inclusive), both in minimal form. +Equivalently, a witness program can be identified by examining the length and the first two bytes of the redeem script: + +* The redeem script byte-length is at least 4 and at most 42. +* The first byte is 0x00, or in the range 0x51 – 0x60 (OP_0, or OP_1 – OP_16). +* The second byte is equal to the redeem script byte-length, minus two. + +All witness-like scripts will be considered valid, even if their execution would normally result in an invalid transaction (e.g. due to a zero value on the stack). +Note that because the witness program contains only push operations (among other restrictions), the P2SH script matching the provided hash is the only meaningful validation criterion. +The only consequence of this specification is that an intentionally unspendable script resembling a witness program may now be spendable. + +This exemption should not be applied for the acceptance of transactions from network peers (i.e., only to acceptance of new blocks), so that segwit recovery transactions remain non-standard (and thus require a miner's cooperation to perform). 
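The three-part identification rule above translates almost directly into code. The following is a minimal sketch; the function name `is_witness_program` is hypothetical, and the P2SH hash comparison and the one-push scriptSig condition are outside its scope.

```python
def is_witness_program(redeem_script: bytes) -> bool:
    """Check the length and the first two bytes of a redeem script."""
    # Total length: 1 version byte + 1 push opcode + 2..40 bytes of program.
    if not 4 <= len(redeem_script) <= 42:
        return False
    # Version byte must be OP_0 (0x00) or OP_1..OP_16 (0x51..0x60).
    version = redeem_script[0]
    if version != 0x00 and not 0x51 <= version <= 0x60:
        return False
    # The second byte is a direct push covering the rest of the script.
    return redeem_script[1] == len(redeem_script) - 2


# Redeem scripts taken from test cases V1 and I5 of this specification.
v1 = bytes.fromhex("001491b24bf9f5288532960ac687abb035127b1d28a5")
i5 = bytes.fromhex("004c0245aa")
```

Because the second byte must lie between 2 and 40, it is automatically a direct (minimal) push opcode, which is why no separate PUSHDATA check is needed here.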
+ +## Test cases + +### Valid segwit recoveries + +V1) Recovering v0 P2SH-P2WPKH: + + scriptSig: 0x16 0x001491b24bf9f5288532960ac687abb035127b1d28a5 + scriptPubKey: OP_HASH160 0x14 0x17743beb429c55c942d2ec703b98c4d57c2df5c6 OP_EQUAL + +V2) Recovering v0 P2SH-P2WSH: + + scriptSig: 0x22 0x00205a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f + scriptPubKey: OP_HASH160 0x14 0x17a6be2f8fe8e94f033e53d17beefda0f3ac4409 OP_EQUAL + +V3) Max allowed version, v16: + + scriptSig: 0x22 0x60205a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f + scriptPubKey: OP_HASH160 0x14 0x9b0c7017004d3818b7c833ddb3cb5547a22034d0 OP_EQUAL + +V4) Max allowed length, 42 bytes: + + scriptSig: 0x2a 0x00285a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f2021222324252627 + scriptPubKey: OP_HASH160 0x14 0xdf7b93f88e83471b479fb219ae90e5b633d6b750 OP_EQUAL + +V5) Min allowed length, 4 bytes: + + scriptSig: 0x04 0x00025a01 + scriptPubKey: OP_HASH160 0x14 0x86123d8e050333a605e434ecf73128d83815b36f OP_EQUAL + +V6) Valid in spite of a false boolean value being left on stack, 0: + + scriptSig: 0x04 0x00020000 + scriptPubKey: OP_HASH160 0x14 0x0e01bcfe7c6f3fd2fd8f81092299369744684733 OP_EQUAL + +V7) Valid in spite of a false boolean value being left on stack, minus 0: + + scriptSig: 0x04 0x00020080 + scriptPubKey: OP_HASH160 0x14 0x10ddc638cb26615f867dad80efacced9e73766bc OP_EQUAL + +### Invalid segwit recoveries + +I1) Non-P2SH output: + + scriptSig: 0x16 0x001491b24bf9f5288532960ac687abb035127b1d28a5 + scriptPubKey: OP_TRUE + +I2) Redeem script hash does not match P2SH output: + + scriptSig: 0x16 0x001491b24bf9f5288532960ac687abb035127b1d28a5 + scriptPubKey: OP_HASH160 0x14 0x17a6be2f8fe8e94f033e53d17beefda0f3ac4409 OP_EQUAL + +I3) scriptSig pushes two items onto the stack: + + scriptSig: OP_0 0x16 0x001491b24bf9f5288532960ac687abb035127b1d28a5 + scriptPubKey: OP_HASH160 0x14 0x17743beb429c55c942d2ec703b98c4d57c2df5c6 OP_EQUAL + +I4) Invalid witness program, 
non-minimal push in version field: + + scriptSig: 0x17 0x01001491b24bf9f5288532960ac687abb035127b1d28a5 + scriptPubKey: OP_HASH160 0x14 0x0718743e67c1ef4911e0421f206c5ff81755718e OP_EQUAL + +I5) Invalid witness program, non-minimal push in program field: + + scriptSig: 0x05 0x004c0245aa + scriptPubKey: OP_HASH160 0x14 0xd3ec673296c7fd7e1a9e53bfc36f414de303e905 OP_EQUAL + +I6) Invalid witness program, too short, 3 bytes: + + scriptSig: 0x03 0x00015a + scriptPubKey: OP_HASH160 0x14 0x40b6941895022d458de8f4bbfe27f3aaa4fb9a74 OP_EQUAL + +I7) Invalid witness program, too long, 43 bytes: + + scriptSig: 0x2b 0x00295a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728 + scriptPubKey: OP_HASH160 0x14 0x13aa4fcfd630508e0794dca320cac172c5790aea OP_EQUAL + +I8) Invalid witness program, version -1: + + scriptSig: 0x22 0x4f205a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f + scriptPubKey: OP_HASH160 0x14 0x97aa1e96e49ca6d744d7344f649dd9f94bcc35eb OP_EQUAL + +I9) Invalid witness program, version 17: + + scriptSig: 0x23 0x0111205a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f + scriptPubKey: OP_HASH160 0x14 0x4b5321beb1c09f593ff3c02be4af21c7f949e101 OP_EQUAL + +I10) Invalid witness program, OP_RESERVED in version field: + + scriptSig: 0x22 0x50205a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f + scriptPubKey: OP_HASH160 0x14 0xbe02794ceede051da41b420e88a86fff2802af06 OP_EQUAL + +I11) Invalid witness program, more than 2 stack items: + + scriptSig: 0x23 0x00205a0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f51 + scriptPubKey: OP_HASH160 0x14 0x8eb812176c9e71732584123dd06d3246e659b199 OP_EQUAL diff --git a/protocol/forks/2019-11-15-minimaldata.md b/protocol/forks/2019-11-15-minimaldata.md new file mode 100644 index 0000000..6c4eb42 --- /dev/null +++ b/protocol/forks/2019-11-15-minimaldata.md @@ -0,0 +1,182 @@ +# 2019-NOV-15 minimal push and minimal number encoding rules + + layout: 
specification + title: 2019-NOV-15 minimal push and minimal number encoding rules + date: 2019-08-11 + category: spec + activation: 1573819200 + version: 1.0 + author: Mark B. Lundeberg + +## Summary + +In the November 2019 upgrade, two new consensus rules are introduced to Bitcoin Cash: + +- during script execution, executed push opcodes are restricted to be the minimal form for the resultant stack element. +- during script execution, the decoding of stack elements as numbers is restricted to only allow minimal forms, in most cases. + +## Motivation + +Third-party malleation is when anyone, such as an uninvolved miner, is able to modify parts of a transaction while keeping it valid, yet changing the transaction identifier. +The validity of child transactions is contingent on having the correct transaction identifier for the parent, and so third-party malleability threatens to invalidate chains of transactions, whether they are held in secret, in mempool, or even already confirmed (i.e., during blockchain reorganization). +A variety of past consensus rule changes have tried to address third-party malleability vectors: BIP66 strict ECDSA encoding, the ECDSA low-S rule, the strict encoding rule for hashtype, the scriptSig push-only rule, and the cleanstack rule. +This effort is incomplete, as there remains a significant malleability vector that means that currently, *all* transactions on BCH are still third-party malleable: + +* The push opcodes used during scriptSig execution can be modified. +For example, the length-one stack element `{0x81}` can be equivalently pushed using any of the following five script phrases (in hex): `4f`, `0181`, `4c0181`, `4d010081`, `4e0100000081`. +A third party can substitute any of these for each other. 
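The equivalence of the five phrases in the example above can be checked with a small push decoder. This is a sketch only: the function name `decode_single_push` is hypothetical, it handles a script consisting of exactly one push, and it relies on the standard push opcode semantics (direct pushes 1 to 75, PUSHDATA1/2/4 with little-endian lengths, OP_1NEGATE, OP_1 to OP_16).

```python
def decode_single_push(script: bytes) -> bytes:
    """Return the stack element pushed by a script containing one push."""
    op = script[0]
    if op == 0x00:                        # OP_0: empty element
        data, used = b"", 1
    elif op <= 75:                        # direct push of `op` bytes
        data, used = script[1:1 + op], 1 + op
    elif op in (0x4c, 0x4d, 0x4e):        # PUSHDATA1 / PUSHDATA2 / PUSHDATA4
        width = {0x4c: 1, 0x4d: 2, 0x4e: 4}[op]
        length = int.from_bytes(script[1:1 + width], "little")
        data = script[1 + width:1 + width + length]
        used = 1 + width + length
    elif op == 0x4f:                      # OP_1NEGATE
        data, used = b"\x81", 1
    elif 0x51 <= op <= 0x60:              # OP_1 .. OP_16
        data, used = bytes([op - 0x50]), 1
    else:
        raise ValueError("not a push opcode")
    if used != len(script):
        raise ValueError("truncated push or trailing bytes")
    return data


PHRASES = ["4f", "0181", "4c0181", "4d010081", "4e0100000081"]
elements = [decode_single_push(bytes.fromhex(h)) for h in PHRASES]
```

All five entries of `elements` are the same one-byte element `{0x81}`, even though the scripts (and therefore the transaction identifiers) differ.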
+ +For some transactions, an additional malleability mechanism is also present: + +* Some smart contracts perform operations on numbers that are taken from the scriptSig, and numbers in bitcoin's Script language are allowed to have multiple representations on stack. +The number -1, for example, can be represented by `{0x81}`, `{0x01, 0x80}`, `{0x01, 0x00, 0x80}`, `{0x01, 0x00, 0x00, 0x80}`. + +For years now, the "MINIMALDATA" flag, which restricts both of the aforementioned malleability vectors, has been active at the mempool layer of most nodes but not at the consensus layer. +The upgrade converts the existing MINIMALDATA rules to consensus. +For reference, this document contains a full specification of these rules. + +It is of course impossible to completely remove third-party malleability in bitcoin (not even using techniques like SegWit) since a transaction can be made that involves no signature or where the signing key is not a secret, or where permutations are permitted (e.g., [SINGLE|ANYONECANPAY](https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki#specification)). +We can, however, remove it for large classes of transactions, and this has been the goal of the past upgrades. +Bringing MINIMALDATA to the consensus layer, along with the [dummy element restrictions in the OP_CHECKMULTISIG upgrade](/protocol/forks/2019-11-15-schnorrmultisig), finally achieves the goal of removing third-party malleability from the vast majority of transactions performed on BCH. + +## Technical background + +**Push opcodes** — Bitcoin's Script system is a stack-based language. +The stack elements are simply byte arrays of length 0 to 520. +Push opcodes append a byte array onto the stack, and there are a variety of different opcodes for pushing arbitrary data of various lengths, or pushing specific one-byte arrays: + +* Opcode 0 (OP_0) pushes an empty element onto the stack. 
+* Opcodes 1 through to 75 push an arbitrary byte array of the corresponding length onto the stack. +* Opcode 76 (PUSHDATA1) takes a one-byte length as parameter, and then pushes an arbitrary byte-array of that length. +* Opcode 77 (PUSHDATA2) takes a two-byte length as parameter, and then pushes an arbitrary byte-array of that length. +* Opcode 78 (PUSHDATA4) takes a four-byte length as parameter, and then pushes an arbitrary byte-array of that length. +* Opcode 79 (OP_1NEGATE) pushes the one-byte element `{0x81}`. +* Opcode 81 (OP_1) pushes the one-byte element `{0x01}`. +* Opcode 82 (OP_2) pushes the one-byte element `{0x02}`. +* ... +* Opcode 95 (OP_15) pushes the one-byte element `{0x0f}`. +* Opcode 96 (OP_16) pushes the one-byte element `{0x10}`. + +It can be seen from the above list that any given byte array can be pushed in a variety of ways. +However, for any given byte array there is a unique shortest possible way to push the byte array. + +**Number representation** — Although bitcoin's stack is just a sequence of byte arrays, there are numerous Script opcodes that expect to take integers from the stack, which means they decode the byte array to an integer before logically using the integer. +The way Script represents numbers as byte arrays is using a variable-length, little-endian [sign-and-magnitude representation](https://en.wikipedia.org/wiki/Signed_number_representations#Signed_magnitude_representation_(SMR)). +This is typical for a multiprecision or 'bignum' arithmetic computing environment, but may be unfamiliar for programmers who are used to 'bare-metal' integer computing that uses fixed-width two's complement (or rarely, ones' complement) representation. + +Currently, the only consensus restriction is that the byte arrays used during number decoding shall be at most 4 bytes in length (except for four special cases, noted in the specification below). +This restricts the range of numbers to be \[-2^31 + 1 ... 
2^31 - 1\] (inclusive), but does not pose any further restrictions on encoding. +So, there are various ways to encode a given number as a stack element by padding the number with excess groups of zero bits just before the sign bit. +For example, the number -6844 can be represented in three valid ways: `{0xbc, 0x9a}`, `{0xbc, 0x1a, 0x80}`, `{0xbc, 0x1a, 0x00, 0x80}`. +The number 39612 can be represented as `{0xbc, 0x9a, 0x00}` or `{0xbc, 0x9a, 0x00, 0x00}`. +The number 0 has nine valid representations. +While all opcodes that output numbers will minimally encode said output, at the current time they are happy to accept any representation for a numeric input. + +For any given number, there is exactly one minimal (shortest) representation. +A simple test can be applied to a byte array to see whether it is the minimal encoding of the corresponding number: + +* The byte array holds a minimally encoded number if any of the following apply: + * The byte array has length 0. (this is the minimal representation of the number 0) + * The byte array has length of 1 or larger, and the last byte has any bits set besides the high bit (the sign bit). + * The byte array has length of 2 or larger, and the *second-to-last* byte has its high bit set. +* If none of the above apply, the byte array holds a non-minimal encoding of the given number. + +Note that bitcoin's number system treats "negative 0" encodings such as `{0x80}`, `{0x00, 0x80}`, etc. +as a representation of 0, and the minimal encoding of 0 is an empty byte array: `{}`. +The above rules indicate that neither `{0x80}` nor `{0x00}` is a minimal encoding. + +## Specification + +Though conventionally appearing under one flag "MINIMALDATA", there are two unrelated rules that do not interact. The specifications have been accordingly split into two sections. 
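The minimal-encoding test from the technical background can be sketched in a few lines, together with a decoder for the little-endian sign-and-magnitude format. Both function names (`is_minimally_encoded`, `decode_script_num`) are hypothetical, and the sketch does not enforce the 4-byte length limit.

```python
def is_minimally_encoded(data: bytes) -> bool:
    """Apply the three-case test for a minimally encoded Script number."""
    if len(data) == 0:
        return True                       # the minimal encoding of 0
    if data[-1] & 0x7f:
        return True                       # last byte carries magnitude bits
    # Last byte is 0x00 or 0x80: only justified if it exists to hold the
    # sign bit because the previous byte already uses its high bit.
    return len(data) >= 2 and bool(data[-2] & 0x80)


def decode_script_num(data: bytes) -> int:
    """Decode little-endian sign-and-magnitude bytes to an integer."""
    if not data:
        return 0
    magnitude = int.from_bytes(data, "little")
    if data[-1] & 0x80:
        # Clear the sign bit (top bit of the last byte) and negate.
        magnitude &= ~(0x80 << (8 * (len(data) - 1)))
        return -magnitude
    return magnitude
```

For instance, `bytes([0xbc, 0x9a])` and `bytes([0xbc, 0x1a, 0x80])` both decode to -6844, but only the first passes `is_minimally_encoded`.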
+ +### Minimal push rule + +Upon the execution of a push opcode (be it during scriptSig, scriptPubKey, or P2SH redeemScript execution), the data pushed on stack shall be examined in order to decide if the just-executed push opcode was minimal: + +* An empty stack element `{}` must be pushed using OP_0. +* A one-byte element must be pushed using opcode 1 followed by the given byte, *except* for the following 17 special cases where a special opcode must be used instead: + * `{0x81}` must be pushed using OP_1NEGATE + * `{0x01}` must be pushed using OP_1 + * `{0x02}` must be pushed using OP_2 + * ... + * `{0x0f}` must be pushed using OP_15 + * `{0x10}` must be pushed using OP_16 +* An element of length N=2 to length N=75 must be pushed using opcode N. +* An element of length 76 to 255 must be pushed using PUSHDATA1. +* An element of length 256 to 65535 must be pushed using PUSHDATA2. + +In practice, PUSHDATA2 can only push lengths up to 520, but in case script is upgraded one day, the limit for PUSHDATA2 remains at 65535. +Since the above rules cover all possible stack element lengths, this means that PUSHDATA4 cannot appear in executed parts of scripts (it must still, however, be *parsed* correctly in an unexecuted branch). + +It is worth emphasizing that the above rules only apply at the moment when push opcodes are actually *executed*, i.e., when data is actually being placed onto the stack. +Thus: + +* These rules do *not* apply to push opcodes found in unexecuted branches (those behind OP_IF/OP_NOTIF) of executed scripts. +* These rules do *not* apply to scripts appearing in transaction outputs, as they have not yet been executed. +* These rules do *not* apply to coinbase scriptSigs, which are not executed. +Note that BIP34 imposes a (slightly distinct) encoding requirement for the mandatory height push at the start of the coinbase scriptSig. 
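
The push rule above can be sketched as a single check on the opcode that was executed and the data it placed on the stack. This is a minimal sketch under stated assumptions: the function name `is_minimal_push` is hypothetical, the opcode constants use the numeric values listed in the background section, and the final branch (elements longer than 65535 bytes) cannot be reached under the current 520-byte element limit.

```python
OP_0 = 0x00
OP_PUSHDATA1 = 0x4C  # opcode 76
OP_PUSHDATA2 = 0x4D  # opcode 77
OP_PUSHDATA4 = 0x4E  # opcode 78
OP_1NEGATE = 0x4F    # opcode 79
OP_1 = 0x51          # opcode 81; OP_16 is opcode 96


def is_minimal_push(opcode: int, data: bytes) -> bool:
    """Return True if `opcode` is the required way to push `data`."""
    n = len(data)
    if n == 0:
        return opcode == OP_0
    if n == 1 and 1 <= data[0] <= 16:
        return opcode == OP_1 + data[0] - 1   # OP_1 .. OP_16
    if n == 1 and data[0] == 0x81:
        return opcode == OP_1NEGATE
    if n <= 75:
        return opcode == n                    # direct push of n bytes
    if n <= 255:
        return opcode == OP_PUSHDATA1
    if n <= 65535:
        return opcode == OP_PUSHDATA2
    # Assumption: only PUSHDATA4 could push such an element; unreachable today.
    return opcode == OP_PUSHDATA4
```

Such a check would be called only at the moment a push opcode is executed, which is why pushes in unexecuted branches, transaction outputs, and coinbase scriptSigs are never examined.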
+ +### Minimal number encoding + +Most opcodes that take numbers from the stack shall require the stack element to be a minimally encoded representation. +To be specific, these operands must be minimally encoded numbers: + +* The single operand of OP_PICK and OP_ROLL. +* The single operand of OP_1ADD, OP_1SUB, OP_NEGATE, OP_ABS, OP_NOT, OP_0NOTEQUAL. +* Both operands of OP_ADD, OP_SUB, OP_DIV, OP_MOD, OP_BOOLAND, OP_BOOLOR, OP_NUMEQUAL, OP_NUMEQUALVERIFY, OP_NUMNOTEQUAL, OP_LESSTHAN, OP_GREATERTHAN, OP_LESSTHANOREQUAL, OP_GREATERTHANOREQUAL, OP_MIN, OP_MAX. +* All three operands of OP_WITHIN. +* The "keys count" and "signatures count" operands of OP_CHECKMULTISIG, OP_CHECKMULTISIGVERIFY. +* The second operand ("position") of OP_SPLIT. +* The second operand ("size") of OP_NUM2BIN, *but not the first (see below)*. +* In general, all number-accepting opcodes added in future will require minimal encoding as well. + +However, four opcodes are special in the numeric inputs they accept: + +* OP_CHECKLOCKTIMEVERIFY and OP_CHECKSEQUENCEVERIFY both take up to **5-byte** numbers from the stack, a deviation from the usual 4-byte limit. +Regardless, we shall require that these 5-byte numbers also be minimally encoded. +* The first operand of OP_NUM2BIN and the single operand of OP_BIN2NUM will continue to have *no minimal encoding restrictions* and *no length restrictions* (see [their specification](may-2018-reenabled-opcodes.md) for more information). + +The following opcodes notably do not appear in the above lists since they do *not* decode their inputs as numbers, and thus they have no minimal number encoding rules: OP_IF, OP_NOTIF, OP_VERIFY, OP_IFDUP, OP_AND, OP_OR, OP_XOR. + +## Rationale and commentary on design decisions + +### Over-restrictions on minimal push + +To prevent push malleability, it is only necessary to restrict the scriptSig. 
+The push forms used during scriptPubKey and P2SH redeemScript execution cannot be malleated, since they are committed by hashing into the prior transaction's identifier. +Thus it may seem like 'overkill' to restrict these as well. + +Despite this, the MINIMALDATA standardness rule has applied these restrictions to scriptPubKey and redeemScript for quite a while now, and it does not appear to be causing an issue. +In addition, it is technically cleaner in some ways, if the same script interpretation rules can be applied to all executing scripts. + +### Restrictions of number encoding + +By far, the most common usage of numbers is in OP_CHECKMULTISIG where they are provided in the locking script and cannot be malleated. +Only rare smart contracts take numbers from the scriptSig, and in fact, smart contracts that require minimal number encoding could easily enforce this themselves, by using tricks such as `OP_DUP OP_DUP OP_0 OP_ADD OP_EQUALVERIFY` (taking advantage of the fact that adding 0 to a number returns its minimal encoding), or more recently: `OP_DUP OP_DUP OP_BIN2NUM OP_EQUALVERIFY`. + +However, the number encoding rule has been standard for quite some time, and adopting it now should cause no issue. +It also makes it so that smart contract authors can save their limited opcodes for more valuable tasks, and need not use such tricks. + +### Not restricting boolean encodings + +Four opcodes interpret their input as a boolean without any restriction: OP_IF, OP_NOTIF, OP_VERIFY, OP_IFDUP. +Any byte array of any length that is all zeros, or that is all zeros besides a final byte of 0x80, is interpreted as 'false', and any other byte array is interpreted as 'true'. +The script interpreter also accepts such unrestricted boolean representations for the final stack value used to determine pass/fail of a script. + +Two additional 'boolean' opcodes (OP_BOOLAND, OP_BOOLOR) have a semi-restricted input, as they interpret their inputs as numbers. 
+These must be at most 4 bytes long, and as mentioned above they will be restricted according to the number encoding rules. +However, while there will be only one valid representation for 'false' (the number 0, i.e., `{}`), any nonzero number can be used as 'true'. + +In theory, we could restrict all of these boolean-expecting operations to accept only `{}` for 'false', and `{0x01}` for 'true'; this would be analogous to the number encoding restrictions. +However, no such standardness rule exists at this time so it would be too sudden to impose any hard rule for this upgrade. + +Also, it is easier for scripts to avoid malleable boolean inputs without having to use up additional opcodes, as demonstrated by the following example. +Among smart contracts, it is common to see a construction of a form like `OP_IF pubkey_A OP_CHECKSIGVERIFY OP_ELSE pubkey_B OP_CHECKSIGVERIFY OP_ENDIF`. +Transactions spending such smart contracts will remain malleable, since the input to OP_IF comes from scriptSig. +However, it is easy for script programmers to tweak such smart contracts to a non-malleable form: `pubkey_A OP_CHECKSIG OP_IF OP_ELSE pubkey_B OP_CHECKSIGVERIFY OP_ENDIF`. +This takes advantage of the fact that OP_CHECKSIG simply returns false if the provided signature is not valid. +Due to the already-adopted NULLFAIL rule, `{}` is the only permitted invalid signature, and cannot be malleated. + +## Acknowledgements + +Thanks to Antony Zegers and Amaury Sechet for valuable feedback. diff --git a/protocol/forks/2019-11-15-schnorrmultisig.md b/protocol/forks/2019-11-15-schnorrmultisig.md new file mode 100644 index 0000000..6e6cb3b --- /dev/null +++ b/protocol/forks/2019-11-15-schnorrmultisig.md @@ -0,0 +1,243 @@ +# 2019-NOV-15 Schnorr OP_CHECKMULTISIG specification + + layout: specification + title: 2019-NOV-15 Schnorr OP_CHECKMULTISIG specification + date: 2019-08-11 + category: spec + activation: 1573819200 + version: 1.0 + author: Mark B. 
Lundeberg + +## Summary + +OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY will be upgraded to accept Schnorr signatures in a way that increases verification efficiency and is compatible with batch verification. + +*note: this document assumes knowledge of [the prior Schnorr signature upgrade](/protocol/forks/2019-05-15-schnorr).* + +## Motivation + +In [the last upgrade](/protocol/forks/hf-20190515), we added Schnorr support to OP_CHECKSIG and OP_CHECKDATASIG, but not OP_CHECKMULTISIG. + +Although we could have added support to OP_CHECKMULTISIG as well (which would have been overall simpler), this would conflict with the desire to do batch verification in future: Currently with OP_CHECKMULTISIG validation, it is needed to check a signature against multiple public keys in order to find a possible match. +In Schnorr batch verification however, it is required to know ahead of time, which signatures are supposed to match with which public key. +Without a clear path forward on how to resolve this, we postponed the issue and simply prevented Schnorr signatures from being used in OP_CHECKMULTISIG. + +Schnorr aggregated signatures (with OP_CHECKSIG) are one way to do multisignatures, but they have different technical properties than the familiar Bitcoin multisig, and thus are far from being a drop-in replacement for it. +Besides that, it is also desirable that any existing coin can be spent using Schnorr signatures, and there are numerous OP_CHECKMULTISIG-based wallets and coins in existence that we want to be able to take advantage of Schnorr signatures. + +## Specification + +OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY will be upgraded to allow *two* execution modes, based on the value of the dummy element. + +Mode 1 (legacy ECDSA support, M-of-N; consumes N+M+3 items from stack): + + ... M ... N OP_CHECKMULTISIG + +The precise validation mechanics of this are complex and full of corner cases; the source code is the best reference. 
+Most notably, for 2-of-3 (M=2, N=3), `sig0` may be a valid ECDSA transaction signature from `pub0` or from `pub1`; `sig1` may be from `pub1` (if `sig0` is from `pub0`) or `pub2`. +Historical transactions (prior to FORKID, STRICTENC and NULLFAIL rules) had even more freedoms and [weirdness](https://decred.org/research/todd2014.pdf). +Upon activation, the `dummy` element must be null, i.e., an empty byte array. + +Mode 2 (new Schnorr support, M-of-N; consumes N+M+3 items from stack): + + ... M ... N OP_CHECKMULTISIG + +* The `dummy` element has now been repurposed as a bitfield that we call `checkbits`, and indicates which public keys should have a signature checked against them. +* This mode activates when `dummy` (`checkbits`) is non-null, i.e., not an empty byte array. +* Crucially, each of the signature checks requested by `checkbits` *must* be valid, or else the script fails. +* In mode 2, ECDSA signatures are not allowed. + +### Triggering and execution mechanism + +Whether to execute in mode 1 or mode 2 is determined by the size of the dummy / checkbits element. + +* If the checkbits element is NULL (length 0), then Mode 1 is executed +* If the checkbits element is non-NULL (length > 0), then Mode 2 is executed. + +The new mode operates similarly to legacy mode but only checks signatures as requested, according to the `checkbits` field. +If the least significant bit of `checkbits` is set, then the bottom (first-pushed) signature should be checked against the bottom public key, and so on. +For a successful verification in the new mode, `checkbits` must have exactly `M` bits set, and the signatures must be correctly ordered. +On stack, `checkbits` is encoded as a byte array of length `floor((N + 7)/8)`, i.e., the shortest byte array that can hold `N` bits. +It is encoded in little-endian order, i.e., the least significant bit occurs in the first byte. 
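
The `checkbits` encoding described above can be illustrated with a short Python sketch. The function names `encode_checkbits` and `decode_checkbits` are hypothetical and chosen for illustration; key index 0 is assumed to mean the bottom (first-pushed) public key, matching the least significant bit.

```python
def encode_checkbits(signer_indices, n: int) -> bytes:
    """Build the checkbits byte array for an N-key multisig.

    signer_indices: indices of the public keys that have a signature,
    where 0 is the bottom (first-pushed) key.
    """
    bits = 0
    for i in signer_indices:
        if not 0 <= i < n:
            raise ValueError("key index out of range")
        bits |= 1 << i
    # Shortest byte array that holds N bits, least significant byte first.
    return bits.to_bytes((n + 7) // 8, "little")


def decode_checkbits(data: bytes, n: int, m: int) -> list:
    """Return the key indices selected by `data`, or raise if it is invalid."""
    if len(data) != (n + 7) // 8:
        raise ValueError("wrong checkbits length")
    bits = int.from_bytes(data, "little")
    if bits >> n:
        raise ValueError("bits set above the lowest N")
    if bin(bits).count("1") != m:
        raise ValueError("checkbits must have exactly M bits set")
    return [i for i in range(n) if (bits >> i) & 1]
```

For a 2-of-3 spend signed by the first and third keys, `encode_checkbits([0, 2], 3)` gives the single byte `{0x05}`; for N between 17 and 20 the result is always three bytes long.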
+ +In pseudocode, the full OP_CHECKMULTISIG code is: + + Get N (number of pubkeys) from stack ; check bounds 0 <= N <= 20. + Add N to nOpCount; if nOpCount exceeds 201 limit, fail script. + Get M (number of signatures) from stack ; check bounds 0 <= M <= N. + Calculate scriptCode. + If activated, and the dummy element is not null, then: + # New mode (2) + Set a cursor on the bottom signature (first signature pushed on stack). + Set another cursor on the bottom public key (first key pushed on stack). + Fail if the dummy element does not have length in bytes = floor((N+7)/8) + Set checkbits := 0, then iterate over the bytes in the dummy element in reverse order: + For each byte X, checkbits := (checkbits << 8) | X + Loop while the signature and key cursors are not depleted: + If the least significant bit of checkbits is 1, then: + Check public key encoding. + Check signature encoding; exclude non-Schnorr signatures. + Validate the current signature against the current public key; if invalid, fail script. + Move the signature cursor up by one position. + Bitshift checkbits down by one bit. (checkbits := checkbits >> 1) + Move the public key cursor up by one position. + If the final checkbits value is nonzero, fail script. + If the signature cursor has not been depleted, fail script. + Else: + # Legacy mode (1) + Set a cursor on the top signature (last signature pushed on stack). + Set another cursor on the top public key (last key pushed on stack). + If pre-BCH-fork, then run findAndDelete on scriptCode. + Loop while the signature cursor is not depleted: + Check public key encoding. + Check signature encoding; exclude Schnorr signatures (64+1 bytes). + Validate the current signature against the current public key. + If valid, then move signature cursor deeper by one position. + Move the public key cursor deeper by one position. + If more signatures remain than public keys, set success=False and abort loop early. + If loop was not aborted, set success=True. 
+ [non-consensus] Check NULLDUMMY rule. + + If success is False, then ensure all signatures were null. (NULLFAIL rule) + Clean up the used stack items. + + Push success onto stack + If opcode is OP_CHECKMULTISIGVERIFY: + Pop success from stack + If not success: + FAIL + +### Notes + +The mechanics of CHECKMULTISIG are complicated due to the order of signature checking, the timing of when key/signature encodings are checked, the ability to either hard-fail (fail script & invalidate transaction) or soft-fail (return False on stack), and the interaction with previously activated consensus rules. +Some features of the specification are worth emphasizing: + +- The legacy mode has unaltered functionality, except being restricted to only use a null dummy element. +- Compatibility is good, as basically any reasonable smart contract using OP_CHECKMULTISIG can be spent using either legacy or new mode. +(Of course, with effort a script could be deliberately crafted to only allow one mode.) +- In both modes, public keys only have their encoding checked just prior to performing a signature check. +The unchecked public keys may be arbitrary data. + - In legacy mode, the precise order of checking is critical to obtaining a correct implementation, due to the public key encoding rule. + Signature and pubkey iteration always starts at the top public key and signature (the last pushed on stack). + - Some multisig scripts were made unspendable on Aug 1 2017, due to the last-pushed public key having incorrect encoding. + These will now be spendable, but only in the new mode. +- Note that the numbers `N`, `M` will require minimal encoding, upon activation of the minimal number encoding rule (see https://github.com/bitcoincashorg/bitcoincash.org/pull/376/files). 
+- In the new mode, `checkbits` must have exactly `M` of the lower `N` bits set, and all other bits must be clear: + - Only the least significant N bits may be set in `checkbits`, i.e., if `checkbits` taken as an integer exceeds 2^N - 1 then the script will fail. + - If `checkbits` has more than `M` bits set, the script will fail. + - If `checkbits` is nonzero but has fewer than `M` bits set, then the script will fail because too few signature verifications were performed. +- In normal circumstances the new mode cannot be third-party malleated, since the new mode design means that `checkbits` should have only one valid value for a given set of signatures. + - Third-party malleation can still occur in some very unusual cases. + For example, if some public key points are repeated in the list of keys, then signatures can be reordered and/or the `checkbits` can be adjusted. + Also, if `M=0` then two possible values of the dummy element are permitted. + - Likewise the design stops the malleation vector of the legacy mode, since the dummy element now must be null for it to execute. + The non-consensus NULLDUMMY rule will thus be made redundant, after this rule activates. +- The legacy mode can require up to N signature checks in order to complete. +In the new mode, exactly M signature checks occur for a successful operation. +- A soft-failing CHECKMULTISIG (that returns False on stack) can only occur with all null signatures, due to NULLFAIL. +For simplicity and avoiding malleability, the new mode does not allow a failing case, and a soft-failing CHECKMULTISIG must execute in the legacy mode (which will require a NULL dummy element). +Note that even such a soft-failing checkmultisig still requires the top public key to be correctly encoded due to the legacy mechanics. +- For M=0, the opcode returns True without checking any key encodings. +This is true in both new and legacy mode. 
+ +And, some clarifications: + +- As usual, checking public key encoding means permitting only 65-long byte arrays starting with 0x04, or 33-long byte arrays starting with 0x02 or 0x03. +- As usual, checking signature encoding for either ECDSA or Schnorr involves permitting only recognized hashtype bytes; Schnorr signatures must have a given length, while ECDSA signatures must follow DER encoding and Low-S rules, and must not have the length allocated to Schnorr signatures. +Null signatures (empty stack elements) are also treated as 'correctly encoded'. +- The findAndDelete operation only applies to old transactions prior to August 2017, and does not impact current transactions, not even in legacy mode. + +## Wallet implementation guidelines + +(Currently, the common multisig wallet uses P2SH-multisig, i.e., a redeemScript of the form `M ... N OP_CHECKMULTISIG`. +We'll focus on this use case and assume M > 0.) + +In the new Schnorr mode, *all* signatures must be Schnorr; no mixing with ECDSA is supported. +Multisig wallets that wish to use the new Schnorr signatures will need to update their co-signing pools infrastructure to support a new type of signing. +If some parties are unable to generate a Schnorr signature, then it will not be possible to generate a successful transaction except by restarting to make an ECDSA multisig. +This creates some problems in particular when some of the parties use a hardware wallet, which may only support ECDSA for the foreseeable future. + +We suggest the following for wallet software producers that wish to make Schnorr multisig spends while remaining backwards compatible: + +* Add an optional marker to the initial setup process, such as appending `?schnorr=true` to the `xpub`. +* Add a new kind of non-backwards-compatible multisignature request that indicates schnorr signatures are needed. +* If it is not known that all parties can accept Schnorr requests, then only generate ECDSA multisignature requests. 
+* Have the ability to participate in either ECDSA or Schnorr multisignatures, as requested. + +It may also be helpful to include *both* an ECDSA and Schnorr signature in the partially signed transaction format, so that if one cosigner is unable to sign Schnorr, then an ECDSA fallback is possible without needing a retry. +This introduces no additional malleability concerns since already any of the cosigners is able to malleate their own signature. + +### Calculating and pushing checkbits + +In order to complete a multisignature, whether in the new mode or legacy mode, wallets need to keep track of which signatures go with which public keys. +In the new mode, wallets must not just correctly order the signatures, but must also correctly include the `checkbits` parameter. + +Once the `checkbits` parameter is determined, it needs to be encoded to bytes, and then minimally pushed in the scriptSig. +While the encoding to bytes is straightforward, it is worth emphasizing that certain length-1 byte vectors must be pushed using special opcodes. + +* For N <= 8, a length-1 byte array is to be pushed. + * The byte arrays `{0x01}` through `{0x10}` must be pushed using OP_1 through OP_16, respectively. + * The byte array `{0x81}` must be pushed using OP_1NEGATE. + This can only occur for a 2-of-8 multisig, where the checkbits bit pattern is 10000001. + * Other cases will be pushed using no special opcode, i.e., using `0x01 `. +* For 9 <= N <= 16, a length-2 byte array is to be pushed. + * The push will always be `0x02 LL HH`, where `LL` is the least significant byte of `checkbits`, and `HH` is the remaining high bits. +* For 17 <= N <= 20, a length-3 byte array is to be pushed. + * The push will always be `0x03 LL II HH`, where `LL` is the least significant byte of `checkbits`, `II` is the next-least significant byte, and `HH` is the remaining high bits. 
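
The push cases listed above can be sketched as a small Python helper that turns an already-encoded `checkbits` byte array into the scriptSig bytes that push it. The function name `push_checkbits` is hypothetical; the opcode values used (0x51 to 0x60 for OP_1 to OP_16, 0x4f for OP_1NEGATE) are the standard Script values.

```python
def push_checkbits(checkbits: bytes) -> bytes:
    """Return the minimal scriptSig push for a 1- to 3-byte checkbits array."""
    if not 1 <= len(checkbits) <= 3:
        raise ValueError("checkbits must be 1 to 3 bytes long")
    if len(checkbits) == 1:
        value = checkbits[0]
        if 1 <= value <= 16:
            return bytes([0x50 + value])   # OP_1 (0x51) .. OP_16 (0x60)
        if value == 0x81:
            return bytes([0x4F])           # OP_1NEGATE
    # All other cases: direct push opcode (the length) followed by the bytes.
    return bytes([len(checkbits)]) + checkbits
```

For example, the 2-of-3 checkbits `{0x05}` is pushed as the single opcode OP_5 (0x55), whereas `{0x30}` needs the two-byte form `0x01 0x30`.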
+ +### ScriptSig size + +Wallets need to know ahead of time the maximum transaction size, in order to set the transaction fee. + +Let `R` be the length of the redeemScript and its push opcode, combined. + +The legacy mode scriptSig ` ... ` can be as large as 73M + 1 + R bytes, which is the upper limit assuming all max-sized ECDSA signatures. + +In the new mode scriptSig ` ... `, each Schnorr signature will contribute a fixed size of 66 bytes (including push opcode), however the length of `checkbits` will vary somewhat. Wallets should allocate for fees based on the largest possible encoding, which gives a scriptSig size of: + +* N <= 4: `checkbits` will always be pushed using OP_1 through OP_15, so always 66M + R + 1 bytes. +* 5 <= N <= 8: `checkbits` may sometimes be pushed using a single-byte opcode, or may need to be pushed as `0x01 0xnn` -- up to 66M + R + 2 bytes. +* 9 <= N <= 16: `checkbits` will be pushed as `0x02 0xnnnn` -- always 66M + R + 3 bytes. +* 17 <= N <= 20: `checkbits` will be pushed as `0x03 0xnnnnnn` -- always 66M + R + 4 bytes. + +### Pubkey Encoding + +It is strongly recommended that wallets never create scripts with invalid pubkeys, even though this specification allows them to exist in the public key list as long as they are unused. +It is possible that a future rule may stipulate that all pubkeys must be strictly encoded. +If that were to happen, any outputs violating this rule would become unspendable. + +## Rationale and commentary on design decisions + +### Repurposing of dummy element + +In an earlier edition it was proposed to require N signature items (either a signature or NULL) for new mode instead of M items and a dummy element. +The following problems inspired a move away from that approach: + +* Triggering mechanics for the new mode were somewhat of a kluge. +* Some scripts rely on a certain expected stack layout. 
+This is particularly the case for recently introduced high-level smart contracting languages that compile down to script, which reach deep into the stack using OP_PICK and OP_ROLL. + +That said, a scan of the blockchain only found about a hundred instances of scripts that would be impacted by stack layout changes. +All were based on a template as seen in [this spend](https://blockchair.com/bitcoin-cash/transaction/612bd9fc5cb40501f8704028da76c4c64c02eb0ac80e756870dba5cf32650753), where OP_DEPTH was used to choose an OP_IF execution branch. + +### Use of a bitfield instead of a number + +Another draft of this specification proposed decoding the dummy element as a number, using the standard number decoding rules. +The change to using a custom bitfield representation was motivated by the fact that the bitwise operators (OP_AND, OP_OR, OP_XOR) do not cleanly operate on bitcoin's numbers, since numbers are encoded using variable lengths whereas the bitwise operators require the operands to have equal lengths. + +The current specification guarantees that for a successful multisig in the new mode, the dummy element always has a specific length of either 1, 2, or 3, depending only on `N`. +Smart contracts can use this property to perform a multisignature and then do bit inspection on which signatures were actually checked. + +### No mixing ECDSA / Schnorr + +Allowing mixed signature types might help alleviate the issue of supporting mixed wallet versions that do support / don't support Schnorr signatures. +However, this would mean that an all-ECDSA signature list could be easily converted to the new mode, unless extra complicated steps were taken to prevent that conversion. +As this is an undesirable malleability mechanism, we opted to simply exclude ECDSA from the new mode, just as Schnorr signatures are excluded from the legacy mode. 
+ +## Implementation + +https://reviews.bitcoinabc.org/D3474 + +## Acknowledgements + +Thanks to Tendo Pein, Rosco Kalis, Amaury Sechet, and Antony Zegers for valuable feedback. diff --git a/protocol/forks/bch-uahf.md b/protocol/forks/bch-uahf.md index 9f44772..a42f4a3 100644 --- a/protocol/forks/bch-uahf.md +++ b/protocol/forks/bch-uahf.md @@ -1,153 +1,105 @@ -
-  layout: specification
-  title: UAHF Technical Specification
-  category: spec
-  date: 2017-07-24
-  activation: 1501590000
-  version: 1.6
-
+# UAHF + + layout: specification + title: UAHF Technical Specification + category: spec + date: 2017-07-24 + activation: 1501590000 + version: 1.6 ## Introduction This document describes proposed requirements for a block size Hard Fork (HF). -BUIP 55 specified a block height fork. This UAHF specification is -inspired by the idea of a flag day, but changed to a time-based fork due -to miner requests. It should be possible to change easily to a height-based -fork - the sense of the requirements would largely stay the same. - +BUIP 55 specified a block height fork. +This UAHF specification is inspired by the idea of a flag day, but changed to a time-based fork due to miner requests. +It should be possible to change easily to a height-based fork - the sense of the requirements would largely stay the same. ## Definitions -MTP: the "median time past" value of a block, calculated from its nTime -value, and the nTime values of its up to 10 immediate ancestors. +MTP: the "median time past" value of a block, calculated from its nTime value, and the nTime values of its up to 10 immediate ancestors. -"activation time": once the MTP of the chain tip is equal to or greater -than this time, the next block must be a valid fork block. The fork block -and subsequent blocks built on it must satisfy the new consensus rules. +"activation time": once the MTP of the chain tip is equal to or greater than this time, the next block must be a valid fork block. +The fork block and subsequent blocks built on it must satisfy the new consensus rules. -"fork block": the first block built on top of a chain tip whose MTP is -greater than or equal to the activation time. +"fork block": the first block built on top of a chain tip whose MTP is greater than or equal to the activation time. -"fork EB": the user-specified value that EB shall be set to at -activation time. EB can be adjusted post-activation by the user. +"fork EB": the user-specified value that EB shall be set to at activation time. 
+EB can be adjusted post-activation by the user. -"fork MG": the user-specified value that MG shall be set to at activation -time. It must be > 1MB. The user can adjust MG to any value once the -fork has occurred (not limited to > 1MB after the fork). +"fork MG": the user-specified value that MG shall be set to at activation time. +It must be > 1MB. +The user can adjust MG to any value once the fork has occurred (not limited to > 1MB after the fork). -"Large block" means a block satisfying 1,000,000 bytes < block -size <= EB, where EB is as adjusted by REQ-4-1 and a regular block -is a block up to 1,000,000 bytes in size. +"Large block" means a block satisfying 1,000,000 bytes < block size <= EB, where EB is as adjusted by REQ-4-1 and a regular block is a block up to 1,000,000 bytes in size. "Core rules" means all blocks <= 1,000,000 bytes (Base block size). -"Extended BU tx/sigops rules" means the existing additional consensus rules (1) and -(2) below, as formalized by BUIP040 [1] and used by the Bitcoin Unlimited -client's excessive checks for blocks larger than 1MB, extended with rule -(3) below: -1. maximum sigops per block is calculated based on the actual size of -a block using -max_block_sigops = 20000 * ceil((max(blocksize, 1000000) / 1000000)) +"Extended BU tx/sigops rules" means the existing additional consensus rules (1) and (2) below, as formalized by BUIP040 [1] and used by the Bitcoin Unlimited client's excessive checks for blocks larger than 1MB, extended with rule (3) below: + +1. maximum sigops per block is calculated based on the actual size of a block using max_block_sigops = 20000 * ceil((max(blocksize, 1000000) / 1000000)) 2. maximum allowed size of a single transaction is 1,000,000 bytes (1MB) -3. maximum allowed number of sigops for a single transaction is 20k . +3. maximum allowed number of sigops for a single transaction is 20k. 
-NOTE 1: In plain English, the maximum allowed sigops per block is -20K sigops per the size of the block, rounded up to nearest integer in MB. +NOTE 1: In plain English, the maximum allowed sigops per block is 20K sigops per the size of the block, rounded up to nearest integer in MB. i.e. 20K if <= 1MB, 40K for the blocks > 1MB and up to 2MB, etc. - ## Requirements ### REQ-1 (fork by default) -The client (with UAHF implementation) shall default to activating -a hard fork with new consensus rules as specified by the remaining -requirements. +The client (with UAHF implementation) shall default to activating a hard fork with new consensus rules as specified by the remaining requirements. -RATIONALE: It is better to make the HF active by default in a -special HF release version. Users have to download a version capable -of HF anyway, it is more convenient for them if the default does not -require them to make additional configuration. - -NOTE 1: It will be possible to disable the fork behavior (see -REQ-DISABLE) +RATIONALE: It is better to make the HF active by default in a special HF release version. +Users have to download a version capable of HF anyway, it is more convenient for them if the default does not require them to make additional configuration. +NOTE 1: It will be possible to disable the fork behavior (see REQ-DISABLE) ### REQ-2 (configurable activation time) -The client shall allow a "activation time" to be configured by the user, -with a default value of 1501590000 (epoch time corresponding to Tue -1 Aug 2017 12:20:00 UTC) +The client shall allow a "activation time" to be configured by the user, with a default value of 1501590000 (epoch time corresponding to Tue 1 Aug 2017 12:20:00 UTC). -RATIONALE: Make it configurable to adapt easily to UASF activation -time changes. 
- -NOTE 1: Configuring a "activation time" value of zero (0) shall disable -any UAHF hard fork special rules (see REQ-DISABLE) +RATIONALE: Make it configurable to adapt easily to UASF activation time changes. +NOTE 1: Configuring a "activation time" value of zero (0) shall disable any UAHF hard fork special rules (see REQ-DISABLE). ### REQ-3 (fork block must be > 1MB) -The client shall enforce a block size larger than 1,000,000 bytes -for the fork block. - -RATIONALE: This enforces the hard fork from the original 1MB -chain and prevents a re-organization of the forked chain to -the original chain. +The client shall enforce a block size larger than 1,000,000 bytes for the fork block. +RATIONALE: This enforces the hard fork from the original 1MB chain and prevents a re-organization of the forked chain to the original chain. ### REQ-4-1 (require "fork EB" configured to at least 8MB at startup) -If UAHF is not disabled (see REQ-DISABLE), the client shall enforce -that the "fork EB" is configured to at least 8,000,000 (bytes) by raising -an error during startup requesting the user to ensure adequate configuration. - -RATIONALE: Users need to be able to run with their usual EB prior to the -fork (e.g. some are running EB1 currently). The fork code needs to adjust -this EB automatically to a > 1MB value. 8MB is chosen as a minimum since -miners have indicated in the past that they would be willing to support -such a size, and the current network is capable of handling it. +If UAHF is not disabled (see REQ-DISABLE), the client shall enforce that the "fork EB" is configured to at least 8,000,000 (bytes) by raising an error during startup requesting the user to ensure adequate configuration. +RATIONALE: Users need to be able to run with their usual EB prior to the fork (e.g. some are running EB1 currently). +The fork code needs to adjust this EB automatically to a > 1MB value. 
+8MB is chosen as a minimum since miners have indicated in the past that they would be willing to support such a size, and the current network is capable of handling it. ### REQ-4-2 (require user to specify suitable *new* MG at startup) -If UAHF is not disabled (see REQ-DISABLE), the client shall require -the user to specify a "fork MG" (mining generation size) greater than -1,000,000 bytes. +If UAHF is not disabled (see REQ-DISABLE), the client shall require the user to specify a "fork MG" (mining generation size) greater than 1,000,000 bytes. -RATIONALE: This ensures a suitable MG is set at the activation time so -that a mining node would produce a fork block compatible with REQ-3. -It also forces the user (miner) to decide on what size blocks they want to -produce immediately after the fork. - -NOTE 1: The DEFAULT_MAX_GENERATED_BLOCK_SIZE in the released client needs -to remain 1,000,000 bytes so that the client will not generate invalid -blocks before the fork activates. At activation time, however, the "fork MG" -specified by the user (default: 2MB) will take effect. +RATIONALE: This ensures a suitable MG is set at the activation time so that a mining node would produce a fork block compatible with REQ-3. +It also forces the user (miner) to decide on what size blocks they want to produce immediately after the fork. +NOTE 1: The DEFAULT_MAX_GENERATED_BLOCK_SIZE in the released client needs to remain 1,000,000 bytes so that the client will not generate invalid blocks before the fork activates. +At activation time, however, the "fork MG" specified by the user (default: 2MB) will take effect. ### REQ-5 (max tx / max block sigops rules for blocks > 1 MB) -Blocks larger than 1,000,000 shall be subject to "Extended BU tx/sigops rules" -as follows: - -1. 
maximum sigops per block shall be calculated based on the actual size of -a block using -`max_block_sigops = 20000 * ceil((max(blocksize_bytes, 1000000) / 1000000))` +Blocks larger than 1,000,000 shall be subject to "Extended BU tx/sigops rules" as follows: +1. maximum sigops per block shall be calculated based on the actual size of a block using `max_block_sigops = 20000 * ceil((max(blocksize_bytes, 1000000) / 1000000))` 2. maximum allowed size of a single transaction shall be 1,000,000 bytes -NOTE 1: Blocks up to and including 1,000,000 bytes in size shall be subject -to existing pre-fork Bitcoin consensus rules. +NOTE 1: Blocks up to and including 1,000,000 bytes in size shall be subject to existing pre-fork Bitcoin consensus rules. -NOTE 2: Transactions exceeding 100,000 bytes (100KB) shall remain -non-standard after the activation time, meaning they will not be relayed. - -NOTE 3: BU treats both rules (1) and (2) as falling under the Emergent -Consensus rules (AD). Other clients may choose to implement them as -firm rules at their own risk. +NOTE 2: Transactions exceeding 100,000 bytes (100KB) shall remain non-standard after the activation time, meaning they will not be relayed. +NOTE 3: BU treats both rules (1) and (2) as falling under the Emergent Consensus rules (AD). +Other clients may choose to implement them as firm rules at their own risk. ### REQ-6-1 (disallow special OP_RETURN-marked transactions with sunset clause) @@ -155,120 +107,78 @@ Once the fork has activated, transactions consisting exclusively of a single OP_ Bitcoin: A Peer-to-Peer Electronic Cash System -(46 characters, including the single spaces separating the words, and -without any terminating null character) shall be considered invalid until -block 530,000 inclusive. +(46 characters, including the single spaces separating the words, and without any terminating null character) shall be considered invalid until block 530,000 inclusive. 
-RATIONALE: (DEPRECATED - see NOTE 2) To give users on the legacy chain (or other fork chains) -an opt-in way to exclude their transactions from processing on the UAHF -fork chain. The sunset clause block height is calculated as approximately -1 year after currently planned UASF activation time (Aug 1 2017 00:00:00 GMT), -rounded down to a human friendly number. +RATIONALE: (DEPRECATED - see NOTE 2) To give users on the legacy chain (or other fork chains) an opt-in way to exclude their transactions from processing on the UAHF fork chain. +The sunset clause block height is calculated as approximately 1 year after currently planned UASF activation time (Aug 1 2017 00:00:00 GMT), rounded down to a human friendly number. -NOTE 1: Transactions with such OP_RETURNs shall be considered valid again -for block 530,001 and onwards. - -NOTE 2: With the changes in v1.6 of this specification, mandatory use -of SIGHASH_FORKID replay protection on UAHF chain makes the use of this -opt-out protection unnecessary. Clients should nevertheless implement this -requirement, as removing it would constitute a hard fork vis-a-vis the -existing network. The sunset clause in this requirement will take care -of its expiry by itself. +NOTE 1: Transactions with such OP_RETURNs shall be considered valid again for block 530,001 and onwards. +NOTE 2: With the changes in v1.6 of this specification, mandatory use of SIGHASH_FORKID replay protection on UAHF chain makes the use of this opt-out protection unnecessary. +Clients should nevertheless implement this requirement, as removing it would constitute a hard fork vis-a-vis the existing network. +The sunset clause in this requirement will take care of its expiry by itself. 
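
As a rough illustration of the REQ-5 sigops limit quoted earlier (`max_block_sigops = 20000 * ceil((max(blocksize_bytes, 1000000) / 1000000))`), the following Python sketch evaluates the formula with integer arithmetic. The function and constant names are hypothetical and do not come from any client; this is a reading of the formula, not a reference implementation.

```python
ONE_MB = 1_000_000          # bytes, as used throughout the requirements
SIGOPS_PER_MB = 20_000      # "20K sigops per MB" from NOTE 1


def max_block_sigops(blocksize_bytes: int) -> int:
    """Sigops limit for a block of the given size, per the REQ-5 formula."""
    # Blocks at or below 1 MB are treated as exactly 1 MB.
    effective = max(blocksize_bytes, ONE_MB)
    # Integer ceiling division avoids floating-point rounding surprises.
    megabytes_rounded_up = (effective + ONE_MB - 1) // ONE_MB
    return SIGOPS_PER_MB * megabytes_rounded_up


if __name__ == "__main__":
    for size in (1, 1_000_000, 1_000_001, 2_000_000, 8_000_000):
        print(size, max_block_sigops(size))
```

A block of 1,000,001 bytes already falls into the 2 MB bucket, which matches the "40K for the blocks > 1MB and up to 2MB" wording of NOTE 1.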
### REQ-6-2 (mandatory signature shift via hash type) -Once the fork has activated, a transaction shall be deemed valid only if -the following are true in combination: +Once the fork has activated, a transaction shall be deemed valid only if the following are true in combination: + - its nHashType has bit 6 set (SIGHASH_FORKID, mask 0x40) -- a magic 'fork id' value is added to the nHashType before the hash is - calculated (see note 4) +- a magic 'fork id' value is added to the nHashType before the hash is calculated (see note 4) - it is digested using the new algorithm described in REQ-6-3 -RATIONALE: To provide strong protection against replay of existing -transactions on the UAHF chain, only transactions signed with the new -hash algorithm and having SIGHASH_FORKID set will be accepted, by consensus. +RATIONALE: To provide strong protection against replay of existing transactions on the UAHF chain, only transactions signed with the new hash algorithm and having SIGHASH_FORKID set will be accepted, by consensus. -NOTE 1: It is possible for other hard forks to allow SIGHASH_FORKID-protected -transactions on their chain by implementing a compatible signature. +NOTE 1: It is possible for other hard forks to allow SIGHASH_FORKID-protected transactions on their chain by implementing a compatible signature. However, this does require a counter hard fork by legacy chains. 
-NOTE 2: (DEPRECATED) ~~The client shall still accept transactions whose signatures~~ -~~verify according to pre-fork rules, subject to the additional OP_RETURN~~ -~~constraint introduced by REQ-6-1.~~ +NOTE 2: (DEPRECATED) ~~The client shall still accept transactions whose signatures verify according to pre-fork rules, subject to the additional OP_RETURN constraint introduced by REQ-6-1.~~ -NOTE 3: (DEPRECATED) ~~If bit 6 is not set, only the unmodified nHashType will be used~~ -~~to compute the hash and verify the signature.~~ +NOTE 3: (DEPRECATED) ~~If bit 6 is not set, only the unmodified nHashType will be used to compute the hash and verify the signature.~~ NOTE 4: The magic 'fork id' value used by UAHF-compatible clients is zero. -This means that the change in hash when bit 6 is set is effected only by -the adapted signing algorithm (see REQ-6-3). - -NOTE 5: See also REQ-6-4 which introduces a requirement for use of -SCRIPT_VERIFY_STRICTENC. +This means that the change in hash when bit 6 is set is effected only by the adapted signing algorithm (see REQ-6-3). +NOTE 5: See also REQ-6-4 which introduces a requirement for use of SCRIPT_VERIFY_STRICTENC. ### REQ-6-3 (use adapted BIP143 hash algorithm for protected transactions) -Once the fork has activated, any transaction that has bit 6 set in its -hash type shall have its signature hash computed using a minimally revised -form of the transaction digest algorithm specified in BIP143. +Once the fork has activated, any transaction that has bit 6 set in its hash type shall have its signature hash computed using a minimally revised form of the transaction digest algorithm specified in BIP143. RATIONALE: see Motivation section of BIP143 [2]. -NOTE 1: refer to [3] for the specificaton of the revised transaction -digest based on BIP143. Revisions were made to account for non-Segwit -deployment. - +NOTE 1: refer to [3] for the specification of the revised transaction digest based on BIP143.
+Revisions were made to account for non-Segwit deployment. ### REQ-6-4 (mandatory use of SCRIPT_VERIFY_STRICTENC) -Once the fork has activated, transactions shall be validated with -SCRIPT_VERIFY_STRICTENC flag set. +Once the fork has activated, transactions shall be validated with SCRIPT_VERIFY_STRICTENC flag set. -RATIONALE: Use of SCRIPT_VERIFY_STRICTENC also ensures that the -nHashType is validated properly. - -NOTE: As SCRIPT_VERIFY_STRICTENC is not clearly defined by BIP, -implementations seeking to be compliant should consult the Bitcoin C++ -source code to emulate the checks enforced by this flag. +RATIONALE: Use of SCRIPT_VERIFY_STRICTENC also ensures that the nHashType is validated properly. +NOTE: As SCRIPT_VERIFY_STRICTENC is not clearly defined by BIP, implementations seeking to be compliant should consult the Bitcoin C++ source code to emulate the checks enforced by this flag. ### REQ-7 Difficulty adjustement in case of hashrate drop -In case the MTP of the tip of the chain is 12h or more after the MTP 6 block -before the tip, the proof of work target is increased by a quarter, or 25%, -which corresponds to a difficulty reduction of 20% . +In case the MTP of the tip of the chain is 12h or more after the MTP 6 block before the tip, the proof of work target is increased by a quarter, or 25%, which corresponds to a difficulty reduction of 20%. -RATIONALE: The hashrate supporting the chain is dependent on market price and -hard to predict. In order to make sure the chain remains viable no matter what -difficulty needs to adjust down in case of abrupt hashrate drop. +RATIONALE: The hashrate supporting the chain is dependent on market price and hard to predict. +In order to make sure the chain remains viable no matter what difficulty needs to adjust down in case of abrupt hashrate drop. 
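
The REQ-7 rule can be sketched in Python as follows, mainly to show why a 25% increase of the target corresponds to a 20% reduction in difficulty. The function name and the use of plain integers for the target are assumptions made for illustration; compact-target encoding and any clamping to a maximum target are outside this sketch.

```python
TWELVE_HOURS = 12 * 60 * 60  # seconds


def adjust_target_for_hashrate_drop(target: int, mtp_tip: int, mtp_six_back: int) -> int:
    """Return the proof-of-work target after applying the REQ-7 rule.

    mtp_tip      -- median time past of the chain tip (epoch seconds)
    mtp_six_back -- median time past of the block 6 blocks before the tip
    """
    if mtp_tip - mtp_six_back >= TWELVE_HOURS:
        # Increase the target by a quarter (25%).
        return target + target // 4
    return target


if __name__ == "__main__":
    old_target = 1000
    new_target = adjust_target_for_hashrate_drop(old_target, 43_200, 0)
    # Difficulty is inversely proportional to the target,
    # so the new difficulty is old_target / new_target of the old one.
    print(new_target, old_target / new_target)
```

Since difficulty scales as the inverse of the target, multiplying the target by 5/4 multiplies the difficulty by 4/5, which is the 20% reduction stated in the requirement.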
### REQ-DISABLE (disable fork by setting fork time to 0) -If the activation time is configured to 0, the client shall not enforce -the new consensus rules of UAHF, including the activation of the fork, -the size constraint at a certain time, and the enforcing of EB/AD -constraints at startup. - -RATIONALE: To make it possible to use such a release as a compatible -client with legacy chain / i.e. to decide to not follow the HF on one's -node / make a decision at late stage without needing to change client. +If the activation time is configured to 0, the client shall not enforce the new consensus rules of UAHF, including the activation of the fork, the size constraint at a certain time, and the enforcing of EB/AD constraints at startup. +RATIONALE: To make it possible to use such a release as a compatible client with legacy chain / i.e. to decide to not follow the HF on one's node / make a decision at late stage without needing to change client. ### OPT-SERVICEBIT (NODE_BITCOIN_CASH service bit) A UAHF-compatible client should set service bit 5 (value 0x20). -RATIONALE: This service bit allows signaling that the node is a UAHF -supporting node, which helps DNS seeders distinguish UAHF implementations. +RATIONALE: This service bit allows signaling that the node is a UAHF supporting node, which helps DNS seeders distinguish UAHF implementations. -NOTE 1: This is an optional feature which clients do not strictly have to -implement. - -NOTE 2: This bit is currently referred to as NODE_BITCOIN_CASH and displayed -as "CASH" in user interfaces of some Bitcoin clients (BU, ABC). +NOTE 1: This is an optional feature which clients do not strictly have to implement. +NOTE 2: This bit is currently referred to as NODE_BITCOIN_CASH and displayed as "CASH" in user interfaces of some Bitcoin clients (BU, ABC). ## References @@ -279,6 +189,3 @@ as "CASH" in user interfaces of some Bitcoin clients (BU, ABC). 
[3] [Digest for replay protected signature verification accross hard forks](https://github.com/bitcoincashorg/bitcoincash.org/blob/master/spec/replay-protected-sighash.md) [4] https://github.com/bitcoincashorg/bitcoincash.org/blob/master/spec/uahf-test-plan.md - - -END \ No newline at end of file diff --git a/protocol/forks/bip-0016.md b/protocol/forks/bip-0016.md index 92abcec..36307c4 100644 --- a/protocol/forks/bip-0016.md +++ b/protocol/forks/bip-0016.md @@ -1,14 +1,14 @@ -
-  BIP: 16
-  Layer: Consensus (soft fork)
-  Title: Pay to Script Hash
-  Author: Gavin Andresen <gavinandresen@gmail.com>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0016
-  Status: Final
-  Type: Standards Track
-  Created: 2012-01-03
-
+# BIP-0016 + + BIP: 16 + Layer: Consensus (soft fork) + Title: Pay to Script Hash + Author: Gavin Andresen <gavinandresen@gmail.com> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0016 + Status: Final + Type: Standards Track + Created: 2012-01-03 ## Abstract @@ -17,7 +17,6 @@ This BIP describes a new "standard" transaction type for the Bitcoin scripting s ## Motivation The purpose of pay-to-script-hash is to move the responsibility for supplying the conditions to redeem a transaction from the sender of the funds to the redeemer. - The benefit is allowing a sender to fund any arbitrary transaction, no matter how complicated, using a fixed-length 20-byte hash that is short enough to scan from a QR code or easily copied and pasted. ## Specification @@ -26,28 +25,31 @@ A new standard transaction type that is relayed and included in mined blocks is OP_HASH160 [20-byte-hash-value] OP_EQUAL -[20-byte-hash-value] shall be the push-20-bytes-onto-the-stack opcode (0x14) followed by exactly 20 bytes. +`[20-byte-hash-value]` shall be the push-20-bytes-onto-the-stack opcode (0x14) followed by exactly 20 bytes. This new transaction type is redeemed by a standard scriptSig: ...signatures... {serialized script} -Transactions that redeem these pay-to-script outpoints are only considered standard if the ''serialized script'' - also referred to as the ''redeemScript'' - is, itself, one of the other standard transaction types. +Transactions that redeem these pay-to-script outpoints are only considered standard if the `serialized script` - also referred to as the `redeemScript` - is, itself, one of the other standard transaction types. The rules for validating these outpoints when relaying transactions or considering them for inclusion in a new block are as follows: - 1. Validation fails if there are any operations other than "push data" operations in the scriptSig. - 2. 
Normal validation is done: an initial stack is created from the signatures and {serialized script}, and the hash of the script is computed and validation fails immediately if it does not match the hash in the outpoint. - 3. {serialized script} is popped off the initial stack, and the transaction is validated again using the popped stack and the deserialized script as the scriptPubKey. +1. Validation fails if there are any operations other than "push data" operations in the scriptSig. +2. Normal validation is done: an initial stack is created from the signatures and `{serialized script}`, and the hash of the script is computed and validation fails immediately if it does not match the hash in the outpoint. +3. `{serialized script}` is popped off the initial stack, and the transaction is validated again using the popped stack and the deserialized script as the scriptPubKey. -These new rules should only be applied when validating transactions in blocks with timestamps >= 1333238400 (Apr 1 2012) [Remove -bip16 and -paytoscripthashtime command-line arguments](https://github.com/bitcoin/bitcoin/commit/8f188ece3c82c4cf5d52a3363e7643c23169c0ff). There are transactions earlier than 1333238400 in the block chain that fail these new validation rules. [Transaction 6a26d2ecb67f27d1fa5524763b49029d7106e91e3cc05743073461a719776192](http://blockexplorer.com/tx/6a26d2ecb67f27d1fa5524763b49029d7106e91e3cc05743073461a719776192). Older transactions must be validated under the old rules. (see the Backwards Compatibility section for details). +These new rules should only be applied when validating transactions in blocks with `timestamps >= 1333238400` (Apr 1 2012). [1][1] +There are transactions earlier than 1333238400 in the block chain that fail these new validation rules. [2][2] +Older transactions must be validated under the old rules. +(see the Backwards Compatibility section for details). 
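
Two of the checks described above lend themselves to a short sketch: rule 1 (the scriptSig may contain only "push data" operations) and the timestamp gate of 1333238400. The Python below is a simplified illustration, not the reference client's code; the function names are hypothetical, and treating every opcode up to `OP_16` (0x60) as a push is an assumption of this sketch.

```python
BIP16_SWITCH_TIME = 1333238400  # Apr 1 2012, the timestamp given in the text

OP_PUSHDATA1, OP_PUSHDATA2, OP_PUSHDATA4 = 0x4C, 0x4D, 0x4E
OP_16 = 0x60


def is_push_only(script: bytes) -> bool:
    """True if the script consists solely of data-push operations."""
    i, n = 0, len(script)
    while i < n:
        op = script[i]
        i += 1
        if op > OP_16:
            return False                      # e.g. OP_CHECKSIG (0xAC)
        if op < OP_PUSHDATA1:
            size = op                         # 0x00..0x4B push `op` bytes directly
        elif op in (OP_PUSHDATA1, OP_PUSHDATA2, OP_PUSHDATA4):
            width = {OP_PUSHDATA1: 1, OP_PUSHDATA2: 2, OP_PUSHDATA4: 4}[op]
            if i + width > n:
                return False                  # truncated length field
            size = int.from_bytes(script[i:i + width], "little")
            i += width
        else:
            size = 0                          # single-byte opcodes up to OP_16
        if i + size > n:
            return False                      # truncated push data
        i += size
    return True


def p2sh_rules_apply(block_time: int) -> bool:
    """The new validation rules apply only from the switch-over timestamp."""
    return block_time >= BIP16_SWITCH_TIME
```

A caller would apply `is_push_only` to the scriptSig only when `p2sh_rules_apply` is true for the containing block, leaving older transactions under the old rules.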
For example, the scriptPubKey and corresponding scriptSig for a one-signature-required transaction is: scriptSig: [signature] {[pubkey] OP_CHECKSIG} scriptPubKey: OP_HASH160 [20-byte-hash of {[pubkey] OP_CHECKSIG} ] OP_EQUAL -Signature operations in the {serialized script} shall contribute to the maximum number allowed per block (20,000) as follows: +Signature operations in the `{serialized script}` shall contribute to the maximum number allowed per block (20,000) as follows: 1. OP_CHECKSIG and OP_CHECKSIGVERIFY count as 1 signature operation, whether or not they are evaluated. 2. OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY immediately preceded by OP_1 through OP_16 are counted as 1 to 16 signature operation, whether or not they are evaluated. @@ -56,22 +58,29 @@ Signature operations in the {serialized script} shall contribute to the maximum Examples: +3 signature operations: + {2 [pubkey1] [pubkey2] [pubkey3] 3 OP_CHECKMULTISIG} +22 signature operations + {OP_CHECKSIG OP_IF OP_CHECKSIGVERIFY OP_ELSE OP_CHECKMULTISIGVERIFY OP_ENDIF} ## Rationale -This BIP replaces BIP 12, which proposed a new Script opcode ("OP_EVAL") to accomplish everything in this BIP and more. +This BIP replaces BIP 12, which proposed a new Script opcode (`OP_EVAL`) to accomplish everything in this BIP and more. -The Motivation for this BIP (and BIP 13, the pay-to-script-hash address type) is somewhat controversial; several people feel that it is unnecessary, and complex/multisignature transaction types should be supported by simply giving the sender the complete {serialized script}. The author believes that this BIP will minimize the changes needed to all of the supporting infrastructure that has already been created to send funds to a base58-encoded-20-byte bitcoin addresses, allowing merchants and exchanges and other software to start supporting multisignature transactions sooner. 
+The Motivation for this BIP (and BIP 13, the pay-to-script-hash address type) is somewhat controversial; several people feel that it is unnecessary, and complex/multisignature transaction types should be supported by simply giving the sender the complete `{serialized script}`. +The author believes that this BIP will minimize the changes needed to all of the supporting infrastructure that has already been created to send funds to a base58-encoded-20-byte bitcoin addresses, allowing merchants and exchanges and other software to start supporting multisignature transactions sooner. -Recognizing one 'special' form of scriptPubKey and performing extra validation when it is detected is ugly. However, the consensus is that the alternatives are either uglier, are more complex to implement, and/or expand the power of the expression language in dangerous ways. +Recognizing one 'special' form of scriptPubKey and performing extra validation when it is detected is ugly. +However, the consensus is that the alternatives are either uglier, are more complex to implement, and/or expand the power of the expression language in dangerous ways. -The signature operation counting rules are intended to be easy and quick to implement by statically scanning the {serialized script}. Bitcoin imposes a maximum-number-of-signature-operations per block to prevent denial-of-service attacks on miners. If there was no limit, a rogue miner might broadcast a block that required hundreds of thousands of ECDSA signature operations to validate, and it might be able to get a head start computing the next block while the rest of the network worked to validate the current one. +The signature operation counting rules are intended to be easy and quick to implement by statically scanning the `{serialized script}`. +Bitcoin imposes a maximum-number-of-signature-operations per block to prevent denial-of-service attacks on miners. 
+If there was no limit, a rogue miner might broadcast a block that required hundreds of thousands of ECDSA signature operations to validate, and it might be able to get a head start computing the next block while the rest of the network worked to validate the current one. -There is a 1-confirmation attack on old implementations, but it is expensive and difficult in practice. The attack is: +There is a 1-confirmation attack on old implementations, but it is expensive and difficult in practice. +The attack is: 1. Attacker creates a pay-to-script-hash transaction that is valid as seen by old software, but invalid for new implementation, and sends themselves some coins using it. 2. Attacker also creates a standard transaction that spends the pay-to-script transaction, and pays the victim who is running old software. @@ -79,13 +88,14 @@ There is a 1-confirmation attack on old implementations, but it is expensive and If the victim accepts the 1-confirmation payment, then the attacker wins because both transactions will be invalidated when the rest of the network overwrites the attacker's invalid block. -The attack is expensive because it requires the attacker create a block that they know will be invalidated by the rest of the network. It is difficult because creating blocks is difficult and users should not accept 1-confirmation transactions for higher-value transactions. +The attack is expensive because it requires the attacker create a block that they know will be invalidated by the rest of the network. +It is difficult because creating blocks is difficult and users should not accept 1-confirmation transactions for higher-value transactions. ## Backwards Compatibility These transactions are non-standard to old implementations, which will (typically) not relay them or include them in blocks. 
-Old implementations will validate that the {serialize script}'s hash value matches when they validate blocks created by software that fully support this BIP, but will do no other validation. +Old implementations will validate that the `{serialize script}`'s hash value matches when they validate blocks created by software that fully support this BIP, but will do no other validation. Avoiding a block-chain split by malicious pay-to-script transactions requires careful handling of one case: @@ -95,14 +105,17 @@ To gracefully upgrade and ensure no long-lasting block-chain split occurs, more To judge whether or not more than 50% of hashing power supports this BIP, miners are asked to upgrade their software and put the string "/P2SH/" in the input of the coinbase transaction for blocks that they create. -On February 1, 2012, the block-chain will be examined to determine the number of blocks supporting pay-to-script-hash for the previous 7 days. If 550 or more contain "/P2SH/" in their coinbase, then all blocks with timestamps after 15 Feb 2012, 00:00:00 GMT shall have their pay-to-script-hash transactions fully validated. Approximately 1,000 blocks are created in a week; 550 should, therefore, be approximately 55% of the network supporting the new feature. +On February 1, 2012, the block-chain will be examined to determine the number of blocks supporting pay-to-script-hash for the previous 7 days. +If 550 or more contain "/P2SH/" in their coinbase, then all blocks with timestamps after 15 Feb 2012, 00:00:00 GMT shall have their pay-to-script-hash transactions fully validated. +Approximately 1,000 blocks are created in a week; 550 should, therefore, be approximately 55% of the network supporting the new feature. If a majority of hashing power does not support the new validation rules, then rollout will be postponed (or rejected if it becomes clear that a majority will never be achieved). 
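
The miner-support check described above amounts to counting coinbase inputs that contain the `/P2SH/` marker and comparing the count against 550. A minimal Python sketch, assuming the caller has already collected the coinbase scriptSig bytes for the blocks of the 7-day window (the function names are hypothetical):

```python
from typing import Iterable

P2SH_MARKER = b"/P2SH/"
SUPPORT_THRESHOLD = 550  # blocks, from the text above


def count_p2sh_support(coinbase_script_sigs: Iterable[bytes]) -> int:
    """Number of blocks whose coinbase input contains the marker string."""
    return sum(1 for script_sig in coinbase_script_sigs if P2SH_MARKER in script_sig)


def p2sh_threshold_met(coinbase_script_sigs: Iterable[bytes]) -> bool:
    """True if 550 or more blocks in the window signal support."""
    return count_p2sh_support(coinbase_script_sigs) >= SUPPORT_THRESHOLD
```

With roughly 1,000 blocks in a week, meeting the threshold corresponds to about 55% of the blocks signalling support.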
### 520-byte limitation on serialized script size -As a consequence of the requirement for backwards compatibility the serialized script is itself subject to the same rules as any other PUSHDATA operation, including the rule that no data greater than 520 bytes may be pushed to the stack. Thus it is not possible to spend a P2SH output if the redemption script it refers to is >520 bytes in length. For instance while the OP_CHECKMULTISIG opcode can itself accept up to 20 pubkeys, with 33-byte compressed pubkeys it is only possible to spend a P2SH output requiring a maximum of 15 pubkeys to redeem: 3 bytes + 15 pubkeys * 34 bytes/pubkey = 513 bytes. - +As a consequence of the requirement for backwards compatibility the serialized script is itself subject to the same rules as any other `PUSHDATA` operation, including the rule that no data greater than 520 bytes may be pushed to the stack. +Thus it is not possible to spend a P2SH output if the redemption script it refers to is > 520 bytes in length. +For instance while the `OP_CHECKMULTISIG` opcode can itself accept up to 20 pubkeys, with 33-byte compressed pubkeys it is only possible to spend a P2SH output requiring a maximum of 15 pubkeys to redeem: 3 bytes + 15 pubkeys * 34 bytes/pubkey = 513 bytes. ## Reference Implementation @@ -117,4 +130,6 @@ https://gist.github.com/gavinandresen/3966071 ## References -(inlined above) \ No newline at end of file +[1]: [Remove -bip16 and -paytoscripthashtime command-line arguments](https://github.com/bitcoin/bitcoin/commit/8f188ece3c82c4cf5d52a3363e7643c23169c0ff) + +[2]: [Transaction 6a26d2ecb67f27d1fa5524763b49029d7106e91e3cc05743073461a719776192](http://blockexplorer.com/tx/6a26d2ecb67f27d1fa5524763b49029d7106e91e3cc05743073461a719776192) diff --git a/protocol/forks/bip-0034.md b/protocol/forks/bip-0034.md index b7f9610..b8d00da 100644 --- a/protocol/forks/bip-0034.md +++ b/protocol/forks/bip-0034.md @@ -1,39 +1,48 @@ -
-  BIP: 34
-  Layer: Consensus (soft fork)
-  Title: Block v2, Height in Coinbase
-  Author: Gavin Andresen <gavinandresen@gmail.com>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0034
-  Status: Final
-  Type: Standards Track
-  Created: 2012-07-06
-
+# BIP-0034 -# Abstract + BIP: 34 + Layer: Consensus (soft fork) + Title: Block v2, Height in Coinbase + Author: Gavin Andresen <gavinandresen@gmail.com> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0034 + Status: Final + Type: Standards Track + Created: 2012-07-06 -Bitcoin blocks and transactions are versioned binary structures. Both currently use version 1. This BIP introduces an upgrade path for versioned transactions and blocks. A unique value is added to newly produced coinbase transactions, and blocks are updated to version 2. +## Abstract -# Motivation +Bitcoin blocks and transactions are versioned binary structures. +Both currently use version 1. +This BIP introduces an upgrade path for versioned transactions and blocks. +A unique value is added to newly produced coinbase transactions, and blocks are updated to version 2. + +## Motivation 1. Clarify and exercise the mechanism whereby the bitcoin network collectively consents to upgrade transaction or block binary structures, rules and behaviors. 2. Enforce block and transaction uniqueness, and assist unconnected block validation. -# Specification +## Specification 1. Treat transactions with a version greater than 1 as non-standard (official Satoshi client will not mine or relay them). -2. Add height as the first item in the coinbase transaction's scriptSig, and increase block version to 2. The format of the height is "serialized CScript" -- first byte is number of bytes in the number (will be 0x03 on main net for the next 150 or so years with 223-1 blocks), following bytes are little-endian representation of the number (including a sign bit). Height is the height of the mined block in the block chain, where the genesis block is height zero (0). -3. 75% rule: If 750 of the last 1,000 blocks are version 2 or greater, reject invalid version 2 blocks. (testnet3: 51 of last 100) -4. 
95% rule ("Point of no return"): If 950 of the last 1,000 blocks are version 2 or greater, reject all version 1 blocks. (testnet3: 75 of last 100) +2. Add height as the first item in the coinbase transaction's scriptSig, and increase block version to 2. +The format of the height is "serialized CScript" -- first byte is number of bytes in the number (will be 0x03 on main net for the next 150 or so years with 223-1 blocks), following bytes are little-endian representation of the number (including a sign bit). +Height is the height of the mined block in the block chain, where the genesis block is height zero (0). +3. 75% rule: If 750 of the last 1,000 blocks are version 2 or greater, reject invalid version 2 blocks. +(testnet3: 51 of last 100) +4. 95% rule ("Point of no return"): If 950 of the last 1,000 blocks are version 2 or greater, reject all version 1 blocks. +(testnet3: 75 of last 100) -# Backward compatibility +## Backward compatibility -All older clients are compatible with this change. Users and merchants should not be impacted. Miners are strongly recommended to upgrade to version 2 blocks. Once 95% of the miners have upgraded to version 2, the remainder will be orphaned if they fail to upgrade. +All older clients are compatible with this change. +Users and merchants should not be impacted. Miners are strongly recommended to upgrade to version 2 blocks. +Once 95% of the miners have upgraded to version 2, the remainder will be orphaned if they fail to upgrade. -# Implementation +## Implementation https://github.com/bitcoin/bitcoin/pull/1526 -# Result +## Result -Block number 227,835 (timestamp 2013-03-24 15:49:13 GMT) was the last version 1 block. \ No newline at end of file +Block number 227,835 (timestamp 2013-03-24 15:49:13 GMT) was the last version 1 block. diff --git a/protocol/forks/bip-0037.md b/protocol/forks/bip-0037.md index b972752..975ca56 100644 --- a/protocol/forks/bip-0037.md +++ b/protocol/forks/bip-0037.md @@ -1,35 +1,47 @@ -
-  BIP: 37
-  Layer: Peer Services
-  Title: Connection Bloom filtering
-  Author: Mike Hearn <hearn@google.com>
-          Matt Corallo <bip37@bluematt.me>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0037
-  Status: Final
-  Type: Standards Track
-  Created: 2012-10-24
-  License: PD
-
+# BIP-0037 + + BIP: 37 + Layer: Peer Services + Title: Connection Bloom filtering + Author: Mike Hearn <hearn@google.com> + Matt Corallo <bip37@bluematt.me> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0037 + Status: Final + Type: Standards Track + Created: 2012-10-24 + License: PD ## Abstract -This BIP adds new support to the peer-to-peer protocol that allows peers to reduce the amount of transaction data they are sent. Peers have the option of setting ''filters'' on each connection they make after the version handshake has completed. A filter is defined as a [Bloom filter](http://en.wikipedia.org/wiki/Bloom_filter) on data derived from transactions. A Bloom filter is a probabilistic data structure which allows for testing set membership - they can have false positives but not false negatives. +This BIP adds new support to the peer-to-peer protocol that allows peers to reduce the amount of transaction data they are sent. +Peers have the option of setting ''filters'' on each connection they make after the version handshake has completed. +A filter is defined as a [Bloom filter](http://en.wikipedia.org/wiki/Bloom_filter) on data derived from transactions. +A Bloom filter is a probabilistic data structure which allows for testing set membership - they can have false positives but not false negatives. This document will not go into the details of how Bloom filters work and the reader is referred to Wikipedia for an introduction to the topic. ## Motivation -As Bitcoin grows in usage the amount of bandwidth needed to download blocks and transaction broadcasts increases. Clients implementing ''simplified payment verification'' do not attempt to fully verify the block chain, instead just checking that block headers connect together correctly and trusting that the transactions in a chain of high difficulty are in fact valid. See the Bitcoin paper for more detail on this mode. 
+As Bitcoin grows in usage the amount of bandwidth needed to download blocks and transaction broadcasts increases.
+Clients implementing ''simplified payment verification'' do not attempt to fully verify the block chain, instead just checking that block headers connect together correctly and trusting that the transactions in a chain of high difficulty are in fact valid.
+See the Bitcoin paper for more detail on this mode.

-Today, [clients](https://bitcoin.org/en/developer-guide#simplified-payment-verification-spv) have to download the entire contents of blocks and all broadcast transactions, only to throw away the vast majority of the transactions that are not relevant to their wallets. This slows down their synchronization process, wastes users bandwidth (which on phones is often metered) and increases memory usage. All three problems are triggering real user complaints for the Android "Bitcoin Wallet" app which implements SPV mode. In order to make chain synchronization fast, cheap and able to run on older phones with limited memory we want to have remote peers throw away irrelevant transactions before sending them across the network.
+Today, [clients](https://bitcoin.org/en/developer-guide#simplified-payment-verification-spv) have to download the entire contents of blocks and all broadcast transactions, only to throw away the vast majority of the transactions that are not relevant to their wallets.
+This slows down their synchronization process, wastes users bandwidth (which on phones is often metered) and increases memory usage.
+All three problems are triggering real user complaints for the Android "Bitcoin Wallet" app which implements SPV mode.
+In order to make chain synchronization fast, cheap and able to run on older phones with limited memory we want to have remote peers throw away irrelevant transactions before sending them across the network.

 ## Design rationale

-The most obvious way to implement the stated goal would be for clients to upload lists of their keys to the remote node. We take a more complex approach for the following reasons:
+The most obvious way to implement the stated goal would be for clients to upload lists of their keys to the remote node.
+We take a more complex approach for the following reasons:

-* Privacy: Because Bloom filters are probabilistic, with the false positive rate chosen by the client, nodes can trade off precision vs bandwidth usage. A node with access to lots of bandwidth may choose to have a high FP rate, meaning the remote peer cannot accurately know which transactions belong to the client and which don't. A node with very little bandwidth may choose to use a very accurate filter meaning that they only get sent transactions actually relevant to their wallet, but remote peers may be able to correlate transactions with IP addresses (and each other).
-* Bloom filters are compact and testing membership in them is fast. This results in satisfying performance characteristics with minimal risk of opening up potential for DoS attacks.
+* Privacy: Because Bloom filters are probabilistic, with the false positive rate chosen by the client, nodes can trade off precision vs bandwidth usage.
+A node with access to lots of bandwidth may choose to have a high FP rate, meaning the remote peer cannot accurately know which transactions belong to the client and which don't.
+A node with very little bandwidth may choose to use a very accurate filter meaning that they only get sent transactions actually relevant to their wallet, but remote peers may be able to correlate transactions with IP addresses (and each other).
+* Bloom filters are compact and testing membership in them is fast.
+This results in satisfying performance characteristics with minimal risk of opening up potential for DoS attacks.
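
The precision versus bandwidth trade-off described in the design rationale can be illustrated with a small sketch. It assumes the sizing formulas and the limits (36,000 bytes, 50 hash functions) that this BIP gives in its filterload and Bloom filter format sections. The function name `bloom_parameters` and the clamping to at least one byte and one hash function are illustrative assumptions, not part of the protocol.

```python
import math

# Limits quoted in this BIP's filterload field table.
MAX_FILTER_BYTES = 36_000
MAX_HASH_FUNCS = 50


def bloom_parameters(n_elements: int, fp_rate: float) -> tuple[int, int]:
    """Return (filter size in bytes, number of hash functions).

    Hypothetical helper: applies the BIP's formulas
    S = (-1 / ln(2)^2 * N * ln(P)) / 8 and k = S * 8 / N * ln(2),
    then clamps both values to the protocol limits.
    """
    if n_elements <= 0:
        raise ValueError("n_elements must be positive")
    if not 0.0 < fp_rate < 1.0:
        raise ValueError("fp_rate must be strictly between 0 and 1")

    size_bytes = int(-1 / math.log(2) ** 2 * n_elements * math.log(fp_rate) / 8)
    # Clamping to a minimum of 1 is an assumption made for this sketch.
    size_bytes = max(1, min(size_bytes, MAX_FILTER_BYTES))

    n_hash = int(size_bytes * 8 / n_elements * math.log(2))
    n_hash = max(1, min(n_hash, MAX_HASH_FUNCS))
    return size_bytes, n_hash


if __name__ == "__main__":
    # A wallet with 1,000 elements at several false positive rates.
    for rate in (0.1, 0.01, 0.001, 0.0001):
        size, hashes = bloom_parameters(1000, rate)
        print(f"P={rate}: {size} bytes, {hashes} hash functions")
```

A looser false positive rate gives a smaller filter that matches more unrelated transactions, which costs bandwidth but makes it harder for the remote peer to tell which matches are real.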

 ## Specification
@@ -41,83 +53,112 @@ We start by adding three new messages to the protocol:
 * filteradd, which adds the given data element to the connections current filter without requiring a completely new one to be set
 * filterclear, which deletes the current filter and goes back to regular pre-BIP37 usage.

-Note that there is no filterremove command because by their nature, Bloom filters are append-only data structures. Once an element is added it cannot be removed again without rebuilding the entire structure from scratch.
+Note that there is no filterremove command because by their nature, Bloom filters are append-only data structures.
+Once an element is added it cannot be removed again without rebuilding the entire structure from scratch.

 The filterload command is defined as follows:

-| Field Size | Description | Data type | Comments|
-|--|--|--|--|
-| ? | filter | uint8_t[] | The filter itself is simply a bit field of arbitrary byte-aligned size. The maximum size is 36,000 bytes.|
-| 4 | nHashFuncs | uint32_t | The number of hash functions to use in this filter. The maximum value allowed in this field is 50.|
-| 4 | nTweak | uint32_t | A random value to add to the seed value in the hash function used by the bloom filter.|
-| 1 | nFlags | uint8_t | A set of flags that control how matched items are added to the filter.|
+| Field Size | Description | Data type | Comments |
+|------------|-------------|-----------|-----------------------------------------------------------------------------------------------------------|
+| ? | filter | uint8_t[] | The filter itself is simply a bit field of arbitrary byte-aligned size. The maximum size is 36,000 bytes. |
+| 4 | nHashFuncs | uint32_t | The number of hash functions to use in this filter. The maximum value allowed in this field is 50. |
+| 4 | nTweak | uint32_t | A random value to add to the seed value in the hash function used by the bloom filter. |
+| 1 | nFlags | uint8_t | A set of flags that control how matched items are added to the filter. |

 See below for a description of the Bloom filter algorithm and how to select nHashFuncs and filter size for a desired false positive rate.

-Upon receiving a filterload command, the remote peer will immediately restrict the broadcast transactions it announces (in inv packets) to transactions matching the filter, where the matching algorithm is specified below. The flags control the update behaviour of the matching algorithm.
+Upon receiving a filterload command, the remote peer will immediately restrict the broadcast transactions it announces (in `inv` packets) to transactions matching the filter, where the matching algorithm is specified below.
+The flags control the update behaviour of the matching algorithm.

 The filteradd command is defined as follows:

-| Field Size | Description | Data type | Comments
-|--|--|--|--|
-| ? | data | uint8_t[] | The data element to add to the current filter.|
+| Field Size | Description | Data type | Comments |
+|------------|-------------|-----------|------------------------------------------------
+| ? | data | uint8_t[] | The data element to add to the current filter. |

 The data field must be smaller than or equal to 520 bytes in size (the maximum size of any potentially matched object).

-The given data element will be added to the Bloom filter. A filter must have been previously provided using filterload. This command is useful if a new key or script is added to a clients wallet whilst it has connections to the network open, it avoids the need to re-calculate and send an entirely new filter to every peer (though doing so is usually advisable to maintain anonymity).
+The given data element will be added to the Bloom filter.
+A filter must have been previously provided using filterload.
+This command is useful if a new key or script is added to a clients wallet whilst it has connections to the network open, it avoids the need to re-calculate and send an entirely new filter to every peer (though doing so is usually advisable to maintain anonymity).

 The filterclear command has no arguments at all.

-After a filter has been set, nodes don't merely stop announcing non-matching transactions, they can also serve filtered blocks. A filtered block is defined by the merkleblock message and is defined like this:
+After a filter has been set, nodes don't merely stop announcing non-matching transactions, they can also serve filtered blocks.
+A filtered block is defined by the merkleblock message and is defined like this:

-| Field Size | Description | Data type | Comments |
-|--|--|--|--|
-| 4 | version | uint32_t | Block version information, based upon the software version creating this block|
-| 32 | prev_block | char[32] | The hash value of the previous block this particular block references|
-| 32 | merkle_root | char[32] | The reference to a Merkle tree collection which is a hash of all transactions related to this block|
-| 4 | timestamp | uint32_t | A timestamp recording when this block was created (Limited to 2106!)|
-| 4 | bits | uint32_t | The calculated difficulty target being used for this block|
-| 4 | nonce | uint32_t | The nonce used to generate this block… to allow variations of the header and compute different hashes|
-| 4 | total_transactions | uint32_t | Number of transactions in the block (including unmatched ones)|
-| ? | hashes | uint256[] | hashes in depth-first order (including standard varint size prefix)|
-| ? | flags | byte[] | flag bits, packed per 8 in a byte, least significant bit first (including standard varint size prefix)|
+| Field Size | Description | Data type | Comments |
+|------------|--------------------|-----------|--------------------------------------------------------------------------------------------------------|
+| 4 | version | uint32_t | Block version information, based upon the software version creating this block |
+| 32 | prev_block | char[32] | The hash value of the previous block this particular block references |
+| 32 | merkle_root | char[32] | The reference to a Merkle tree collection which is a hash of all transactions related to this block |
+| 4 | timestamp | uint32_t | A timestamp recording when this block was created (Limited to 2106!) |
+| 4 | bits | uint32_t | The calculated difficulty target being used for this block |
+| 4 | nonce | uint32_t | The nonce used to generate this block… to allow variations of the header and compute different hashes |
+| 4 | total_transactions | uint32_t | Number of transactions in the block (including unmatched ones) |
+| ? | hashes | uint256[] | hashes in depth-first order (including standard varint size prefix) |
+| ? | flags | byte[] | flag bits, packed per 8 in a byte, least significant bit first (including standard varint size prefix) |

 See below for the format of the partial merkle tree hashes and flags.

-Thus, a merkleblock message is a block header, plus a part of a merkle tree which can be used to extract identifying information for transactions that matched the filter and prove that the matching transaction data really did appear in the solved block. Clients can use this data to be sure that the remote node is not feeding them fake transactions that never appeared in a real block, although lying through omission is still possible.
+Thus, a merkleblock message is a block header, plus a part of a merkle tree which can be used to extract identifying information for transactions that matched the filter and prove that the matching transaction data really did appear in the solved block.
+Clients can use this data to be sure that the remote node is not feeding them fake transactions that never appeared in a real block, although lying through omission is still possible.

 ### Extensions to existing messages

 The version command is extended with a new field:

-| Field Size | Description | Data type | Comments|
-|--|--|--|--|
-| 1 byte | fRelay | bool | If false then broadcast transactions will not be announced until a filter{load,add,clear} command is received. If missing or true, no change in protocol behaviour occurs.|
+| Field Size | Description | Data type | Comments |
+|------------|-------------|-----------|----------|
+| 1 byte | fRelay | bool | If false then broadcast transactions will not be announced until a filter{load,add,clear} command is received. If missing or true, no change in protocol behaviour occurs. |

-SPV clients that wish to use Bloom filtering would normally set fRelay to false in the version message, then set a filter based on their wallet (or a subset of it, if they are overlapping different peers). Being able to opt-out of inv messages until the filter is set prevents a client being flooded with traffic in the brief window of time between finishing version handshaking and setting the filter.
+SPV clients that wish to use Bloom filtering would normally set fRelay to false in the version message, then set a filter based on their wallet (or a subset of it, if they are overlapping different peers).
+Being able to opt-out of inv messages until the filter is set prevents a client being flooded with traffic in the brief window of time between finishing version handshaking and setting the filter.

-The getdata command is extended to allow a new type in the inv submessage. The type field can now be MSG_FILTERED_BLOCK (== 3) rather than MSG_BLOCK. If no filter has been set on the connection, a request for filtered blocks is ignored. If a filter has been set, a merkleblock message is returned for the requested block hash. In addition, because a merkleblock message contains only a list of transaction hashes, transactions matching the filter should also be sent in separate tx messages after the merkleblock is sent. This avoids a slow roundtrip that would otherwise be required (receive hashes, didn't see some of these transactions yet, ask for them). Note that because there is currently no way to request transactions which are already in a block from a node (aside from requesting the full block), the set of matching transactions that the requesting node hasn't either received or announced with an inv must be sent and any additional transactions which match the filter may also be sent. This allows for clients (such as the reference client) to limit the number of invs it must remember a given node to have announced while still providing nodes with, at a minimum, all the transactions it needs.
+The getdata command is extended to allow a new type in the inv submessage.
+The type field can now be MSG_FILTERED_BLOCK (== 3) rather than MSG_BLOCK.
+If no filter has been set on the connection, a request for filtered blocks is ignored. If a filter has been set, a merkleblock message is returned for the requested block hash.
+In addition, because a merkleblock message contains only a list of transaction hashes, transactions matching the filter should also be sent in separate tx messages after the merkleblock is sent.
+This avoids a slow roundtrip that would otherwise be required (receive hashes, didn't see some of these transactions yet, ask for them).
+Note that because there is currently no way to request transactions which are already in a block from a node (aside from requesting the full block), the set of matching transactions that the requesting node hasn't either received or announced with an inv must be sent and any additional transactions which match the filter may also be sent.
+This allows for clients (such as the reference client) to limit the number of invs it must remember a given node to have announced while still providing nodes with, at a minimum, all the transactions it needs.

 ### Filter matching algorithm

-The filter can be tested against arbitrary pieces of data, to see if that data was inserted by the client. Therefore the question arises of what pieces of data should be inserted/tested.
+The filter can be tested against arbitrary pieces of data, to see if that data was inserted by the client.
+Therefore the question arises of what pieces of data should be inserted/tested.

-To determine if a transaction matches the filter, the following algorithm is used. Once a match is found the algorithm aborts.
+To determine if a transaction matches the filter, the following algorithm is used.
+Once a match is found the algorithm aborts.

 1. Test the hash of the transaction itself.
-2. For each output, test each data element of the output script. This means each hash and key in the output script is tested independently. '''Important:''' if an output matches whilst testing a transaction, the node might need to update the filter by inserting the serialized COutPoint structure. See below for more details.
+2. For each output, test each data element of the output script.
+This means each hash and key in the output script is tested independently.
+**Important:** if an output matches whilst testing a transaction, the node might need to update the filter by inserting the serialized COutPoint structure.
+See below for more details.
 3. For each input, test the serialized COutPoint structure.
-4. For each input, test each data element of the input script (note: input scripts only ever contain data elements).
+4. For each input, test each data element of the input script
+(note: input scripts only ever contain data elements).
 5. Otherwise there is no match.

-In this way addresses, keys and script hashes (for P2SH outputs) can all be added to the filter. You can also match against classes of transactions that are marked with well known data elements in either inputs or outputs, for example, to implement various forms of [Smart property](https://en.bitcoin.it/wiki/Smart_Property).
+In this way addresses, keys and script hashes (for P2SH outputs) can all be added to the filter.
+You can also match against classes of transactions that are marked with well known data elements in either inputs or outputs, for example, to implement various forms of [Smart property](https://en.bitcoin.it/wiki/Smart_Property).

-The test for outpoints is there to ensure you can find transactions spending outputs in your wallet, even though you don't know anything about their form. As you can see, once set on a connection the filter is '''not static''' and can change throughout the connections lifetime. This is done to avoid the following race condition:
+The test for outpoints is there to ensure you can find transactions spending outputs in your wallet, even though you don't know anything about their form.
+As you can see, once set on a connection the filter is **not static** and can change throughout the connections lifetime.
+This is done to avoid the following race condition:

-1. A client sets a filter matching a key in their wallet. They then start downloading the block chain. The part of the chain that the client is missing is requested using getblocks.
-2. The first block is read from disk by the serving peer. It contains TX 1 which sends money to the clients key. It matches the filter and is thus sent to the client.
-3. The second block is read from disk by the serving peer. It contains TX 2 which spends TX 1. However TX 2 does not contain any of the clients keys and is thus not sent. The client does not know the money they received was already spent.
+1. A client sets a filter matching a key in their wallet.
+They then start downloading the block chain.
+The part of the chain that the client is missing is requested using getblocks.
+2. The first block is read from disk by the serving peer.
+It contains TX 1 which sends money to the clients key.
+It matches the filter and is thus sent to the client.
+3. The second block is read from disk by the serving peer.
+It contains TX 2 which spends TX 1.
+However TX 2 does not contain any of the clients keys and is thus not sent.
+The client does not know the money they received was already spent.

 By updating the bloom filter atomically in step 2 with the discovered outpoint, the filter will match against TX 2 in step 3 and the client will learn about all relevant transactions, despite that there is no pause between the node processing the first and second blocks.
@@ -127,13 +168,19 @@ The nFlags field of the filter controls the nodes precise update behaviour and i
 * BLOOM_UPDATE_ALL (1) means if the filter matches any data element in a scriptPubKey the outpoint is serialized and inserted into the filter.
 * BLOOM_UPDATE_P2PUBKEY_ONLY (2) means the outpoint is inserted into the filter only if a data element in the scriptPubKey is matched, and that script is of the standard "pay to pubkey" or "pay to multisig" forms.

-These distinctions are useful to avoid too-rapid degradation of the filter due to an increasing false positive rate. We can observe that a wallet which expects to receive only payments of the standard pay-to-address form doesn't need automatic filter updates because any transaction that spends one of its own outputs has a predictable data element in the input (the pubkey that hashes to the address). If a wallet might receive pay-to-address outputs and also pay-to-pubkey or pay-to-multisig outputs then BLOOM_UPDATE_P2PUBKEY_ONLY is appropriate, as it avoids unnecessary expansions of the filter for the most common types of output but still ensures correct behaviour with payments that explicitly specify keys.
+These distinctions are useful to avoid too-rapid degradation of the filter due to an increasing false positive rate.
+We can observe that a wallet which expects to receive only payments of the standard pay-to-address form doesn't need automatic filter updates because any transaction that spends one of its own outputs has a predictable data element in the input (the pubkey that hashes to the address).
+If a wallet might receive pay-to-address outputs and also pay-to-pubkey or pay-to-multisig outputs then `BLOOM_UPDATE_P2PUBKEY_ONLY` is appropriate, as it avoids unnecessary expansions of the filter for the most common types of output but still ensures correct behaviour with payments that explicitly specify keys.

-Obviously, nFlags \=\= 1 or nFlags \=\= 2 mean that the filter will get dirtier as more of the chain is scanned. Clients should monitor the observed false positive rate and periodically refresh the filter with a clean one.
+Obviously, `nFlags == 1` or `nFlags == 2` mean that the filter will get dirtier as more of the chain is scanned.
+Clients should monitor the observed false positive rate and periodically refresh the filter with a clean one.

 ### Partial Merkle branch format

-A ''Merkle tree'' is a way of arranging a set of items as leaf nodes of tree in which the interior nodes are hashes of the concatenations of their child hashes. The root node is called the ''Merkle root''. Every Bitcoin block contains a Merkle root of the tree formed from the blocks transactions. By providing some elements of the trees interior nodes (called a ''Merkle branch'') a proof is formed that the given transaction was indeed in the block when it was being mined, but the size of the proof is much smaller than the size of the original block.
+A ''Merkle tree'' is a way of arranging a set of items as leaf nodes of tree in which the interior nodes are hashes of the concatenations of their child hashes.
+The root node is called the ''Merkle root''.
+Every Bitcoin block contains a Merkle root of the tree formed from the blocks transactions.
+By providing some elements of the trees interior nodes (called a ''Merkle branch'') a proof is formed that the given transaction was indeed in the block when it was being mined, but the size of the proof is much smaller than the size of the original block.

 #### Constructing a partial merkle tree object

@@ -149,7 +196,9 @@ A ''Merkle tree'' is a way of arranging a set of items as leaf nodes of tree in

 #### Parsing a partial merkle tree object

-As the partial block message contains the number of transactions in the entire block, the shape of the merkle tree is known before hand. Again, traverse this tree, computing traversed node's hashes along the way:
+As the partial block message contains the number of transactions in the entire block, the shape of the merkle tree is known before hand.
+Again, traverse this tree, computing traversed node's hashes along the way:
+
 * Read a bit from the flag bit list:
   * If it is '0':
     * Read a hash from the hashes list, and return it as this node's hash.
@@ -159,11 +208,12 @@ As the partial block message contains the number of transactions in the entire b
     * Descend into its left child tree, and store its computed hash as L.
     * If this node has a right child as well:
       * Descend into its right child, and store its computed hash as R.
-      * If L == R, the partial merkle tree object is invalid.
-      * Return Hash(L || R).
+      * If `L == R`, the partial merkle tree object is invalid.
+      * Return `Hash(L || R)`.
     * If this node has no right child, return Hash(L || L).

 The partial merkle tree object is only valid if:
+
 * All hashes in the hash list were consumed and no more.
 * All bits in the flag bits list were consumed (except padding to make it into a full byte), and no more.
 * The hash computed for the root node matches the block header's merkle root.
@@ -172,20 +222,27 @@ The partial merkle tree object is only valid if:

 ### Bloom filter format

-A Bloom filter is a bit-field in which bits are set based on feeding the data element to a set of different hash functions. The number of hash functions used is a parameter of the filter. In Bitcoin we use version 3 of the 32-bit Murmur hash function. To get N "different" hash functions we simply initialize the Murmur algorithm with the following formula:
+A Bloom filter is a bit-field in which bits are set based on feeding the data element to a set of different hash functions.
+The number of hash functions used is a parameter of the filter.
+In Bitcoin we use version 3 of the 32-bit Murmur hash function.
+To get N "different" hash functions we simply initialize the Murmur algorithm with the following formula:

 nHashNum * 0xFBA4C795 + nTweak

 i.e. if the filter is initialized with 4 hash functions and a tweak of 0x00000005, when the second function (index 1) is needed h1 would be equal to 4221880218.

-When loading a filter with the filterload command, there are two parameters that can be chosen. One is the size of the filter in bytes. The other is the number of hash functions to use. To select the parameters you can use the following formulas:
+When loading a filter with the filterload command, there are two parameters that can be chosen.
+One is the size of the filter in bytes.
+The other is the number of hash functions to use.
+To select the parameters you can use the following formulas:

 Let N be the number of elements you wish to insert into the set and P be the probability of a false positive, where 1.0 is "match everything" and zero is unachievable.

-The size S of the filter in bytes is given by (-1 / pow(log(2), 2) * N * log(P)) / 8. Of course you must ensure it does not go over the maximum size (36,000: selected as it represents a filter of 20,000 items with false positive rate of < 0.1% or 10,000 items and a false positive rate of < 0.0001%).
+The size S of the filter in bytes is given by (-1 / pow(log(2), 2) * N * log(P)) / 8.
+Of course you must ensure it does not go over the maximum size (36,000: selected as it represents a filter of 20,000 items with false positive rate of < 0.1% or 10,000 items and a false positive rate of < 0.0001%).

 The number of hash functions required is given by S * 8 / N * log(2).

 ## Copyright

-This document is placed in the public domain.
\ No newline at end of file
+This document is placed in the public domain.
diff --git a/protocol/forks/bip-0064.md b/protocol/forks/bip-0064.md
index d52085b..d48ee8d 100644
--- a/protocol/forks/bip-0064.md
+++ b/protocol/forks/bip-0064.md
@@ -1,14 +1,14 @@
-
-  BIP: 64
-  Layer: Peer Services
-  Title: getutxo message
-  Author: Mike Hearn <hearn@vinumeris.com>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0064
-  Status: Draft
-  Type: Standards Track
-  Created: 2014-06-10
-
+# BIP-0064
+
+    BIP: 64
+    Layer: Peer Services
+    Title: getutxo message
+    Author: Mike Hearn <hearn@vinumeris.com>
+    Comments-Summary: No comments yet.
+    Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0064
+    Status: Draft
+    Type: Standards Track
+    Created: 2014-06-10

 ## Abstract

@@ -16,31 +16,35 @@ This document describes a small P2P protocol extension that performs UTXO lookup

 ## Motivation

-All full Bitcoin nodes maintain a database called the unspent transaction output set. This set is
+All full Bitcoin nodes maintain a database called the unspent transaction output set.
+This set is
 how double spending is checked for: to be valid a transaction must identify unspent outputs in this
 set using an identifier called an "outpoint", which is merely the hash of the output's containing
 transaction plus an index.

 The ability to query this can sometimes be useful for a lightweight/SPV client which does not have
-the full UTXO set at hand. For example, it can be useful in applications implementing assurance
+the full UTXO set at hand.
+For example, it can be useful in applications implementing assurance
 contracts to do a quick check when a new pledge becomes visible to test whether that pledge was
-already revoked via a double spend. Although this message is not strictly necessary because e.g.
+already revoked via a double spend.
+Although this message is not strictly necessary because e.g.
 such an app could be implemented by fully downloading and storing the block chain, it is useful for
 obtaining acceptable performance and resolving various UI cases.

 Another example of when this data can be useful is for performing floating fee calculations in an
-SPV wallet. This use case requires some other changes to the Bitcoin protocol however, so we will
+SPV wallet.
+This use case requires some other changes to the Bitcoin protocol however, so we will
 not dwell on it here.

 ## Specification

-Two new messages are defined. The "getutxos" message has the following structure:
-
+Two new messages are defined.
+The "getutxos" message has the following structure:

 | Field Size | Description | Data type | Comments |
 |--|--|--|--|
-| 1 | check mempool | bool | Whether to apply mempool transactions during the calculation, thus exposing their UTXOs and removing outputs that they spend.
-| ? | outpoints | vector<COutPoint> | The list of outpoints to be queried. Each outpoint is serialized in the same way it is in a tx message.
+| 1 | check mempool | bool | Whether to apply mempool transactions during the calculation, thus exposing their UTXOs and removing outputs that they spend. |
+| ? | outpoints | vector<COutPoint> | The list of outpoints to be queried. Each outpoint is serialized in the same way it is in a tx message. |

 The response message "utxos" has the following structure:

@@ -66,21 +70,26 @@ NODE_GETUTXO flag in their nServices field, which has a value of 2 (the second b

 ## Authentication

-The UTXO set is not currently authenticated by anything. There are proposals to resolve this by
+The UTXO set is not currently authenticated by anything.
+There are proposals to resolve this by
 introducing a new consensus rule that commits to a root hash of the UTXO set in blocks, however this
-feature is not presently available in the Bitcoin protocol. Once it is, the utxos message could be
+feature is not presently available in the Bitcoin protocol.
+Once it is, the utxos message could be
 upgraded to include Merkle branches showing inclusion of the UTXOs in the committed sets.

 If the requesting client is looking up outputs for a signed transaction that they have locally, the
-client can partly verify the returned output by running the input scripts with it. Currently this
-verifies only that the script is correct. A future version of the Bitcoin protocol is likely to also
-allow the value to be checked in this way. It does not show that the output is really unspent or was
-ever actually created in the block chain however. Additionally, the form of the provided scriptPubKey
-should be checked before execution to ensure the remote peer doesn't just set the script to OP_TRUE.
+client can partly verify the returned output by running the input scripts with it.
+Currently this verifies only that the script is correct.
+A future version of the Bitcoin protocol is likely to also
+allow the value to be checked in this way.
+It does not show that the output is really unspent or was
+ever actually created in the block chain however.
+Additionally, the form of the provided scriptPubKey should be checked before execution to ensure the remote peer doesn't just set the script to `OP_TRUE`.

 If the requesting client has a mapping of chain heights to block hashes in the best chain e.g.
 obtained via getheaders, then they can obtain a proof that the output did at one point exist by
-requesting the block and searching for the output within it. When combined with Bloom filtering this
+requesting the block and searching for the output within it.
+When combined with Bloom filtering this
 can be reasonably efficient.

 Note that even when the outputs are being checked against something this protocol has the same
@@ -92,4 +101,4 @@ results.

 ## Implementation

-https://github.com/bitcoin/bitcoin/pull/4351/files
\ No newline at end of file
+https://github.com/bitcoin/bitcoin/pull/4351/files
diff --git a/protocol/forks/bip-0065.md b/protocol/forks/bip-0065.md
index ac627b4..aab1b83 100644
--- a/protocol/forks/bip-0065.md
+++ b/protocol/forks/bip-0065.md
@@ -1,27 +1,26 @@
-
-  BIP: 65
-  Layer: Consensus (soft fork)
-  Title: OP_CHECKLOCKTIMEVERIFY
-  Author: Peter Todd <pete@petertodd.org>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0065
-  Status: Final
-  Type: Standards Track
-  Created: 2014-10-01
-  License: PD
-
+# BIP-0065 + + BIP: 65 + Layer: Consensus (soft fork) + Title: OP_CHECKLOCKTIMEVERIFY + Author: Peter Todd <pete@petertodd.org> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0065 + Status: Final + Type: Standards Track + Created: 2014-10-01 + License: PD ## Abstract -This BIP describes a new opcode (OP_CHECKLOCKTIMEVERIFY) for the Bitcoin -scripting system that allows a transaction output to be made unspendable until -some point in the future. +This BIP describes a new opcode (OP_CHECKLOCKTIMEVERIFY) for the Bitcoin scripting system that allows +a transaction output to be made unspendable until some point in the future. ## Summary -CHECKLOCKTIMEVERIFY redefines the existing NOP2 opcode. When executed, if -any of the following conditions are true, the script interpreter will terminate +CHECKLOCKTIMEVERIFY redefines the existing NOP2 opcode. +When executed, if any of the following conditions are true, the script interpreter will terminate with an error: * the stack is empty; or @@ -33,13 +32,12 @@ with an error: Otherwise, script execution will continue as if a NOP had been executed. The nLockTime field in a transaction prevents the transaction from being mined -until either a certain block height, or block time, has been reached. By -comparing the argument to CHECKLOCKTIMEVERIFY against the nLockTime field, we +until either a certain block height, or block time, has been reached. +By comparing the argument to CHECKLOCKTIMEVERIFY against the nLockTime field, we indirectly verify that the desired block height or block time has been reached; until that block height or block time has been reached the transaction output remains unspendable. - ## Motivation The nLockTime field in transactions can be used to prove that it is @@ -51,19 +49,18 @@ transaction output until some time in the future, as there is no way to know if a valid signature for a different transaction spending that output has been created. 
- ### Escrow If Alice and Bob jointly operate a business they may want to ensure that all funds are kept in 2-of-2 multisig transaction outputs that -require the co-operation of both parties to spend. However, they recognise that -in exceptional circumstances such as either party getting "hit by a bus" they -need a backup plan to retrieve the funds. So they appoint their lawyer, Lenny, -to act as a third-party. +require the co-operation of both parties to spend. +However, they recognise that in exceptional circumstances such as either party getting "hit by a bus" they +need a backup plan to retrieve the funds. +So they appoint their lawyer, Lenny, to act as a third-party. With a standard 2-of-3 CHECKMULTISIG at any time Lenny could conspire with -either Alice or Bob to steal the funds illegitimately. Equally Lenny may prefer -not to have immediate access to the funds to discourage bad actors from +either Alice or Bob to steal the funds illegitimately. +Equally Lenny may prefer not to have immediate access to the funds to discourage bad actors from attempting to get the secret keys from him by force. However, with CHECKLOCKTIMEVERIFY the funds can be stored in scriptPubKeys of @@ -87,32 +84,34 @@ funds with the following scriptSig: 0 1 - ### Non-interactive time-locked refunds There exist a number of protocols where a transaction output is created that -requires the co-operation of both parties to spend the output. To ensure the -failure of one party does not result in the funds becoming lost, refund -transactions are setup in advance using nLockTime. These refund transactions -need to be created interactively, and additionally, are currently vulnerable to -transaction malleability. CHECKLOCKTIMEVERIFY can be used in these protocols, +requires the co-operation of both parties to spend the output. +To ensure the failure of one party does not result in the funds becoming lost, refund +transactions are setup in advance using nLockTime. 
+These refund transactions need to be created interactively, and additionally, are currently vulnerable to +transaction malleability. +CHECKLOCKTIMEVERIFY can be used in these protocols, replacing the interactive setup with a non-interactive setup, and additionally, making transaction malleability a non-issue. - #### Two-factor wallets Services like GreenAddress store bitcoins with 2-of-2 multisig scriptPubKey's such that one keypair is controlled by the user, and the other keypair is -controlled by the service. To spend funds the user uses locally installed +controlled by the service. +To spend funds the user uses locally installed wallet software that generates one of the required signatures, and then uses a 2nd-factor authentication method to authorize the service to create the second SIGHASH_NONE signature that is locked until some time in the future and sends -the user that signature for storage. If the user needs to spend their funds and +the user that signature for storage. +If the user needs to spend their funds and the service is not available, they wait until the nLockTime expires. The problem is there exist numerous occasions the user will not have a valid -signature for some or all of their transaction outputs. With +signature for some or all of their transaction outputs. +With CHECKLOCKTIMEVERIFY rather than creating refund signatures on demand scriptPubKeys of the following form are used instead: @@ -126,25 +125,27 @@ scriptPubKeys of the following form are used instead: Now the user is always able to spend their funds without the co-operation of the service by waiting for the expiry time to be reached. - #### Payment Channels Jeremy Spilman style payment channels first setup a deposit controlled by 2-of-2 multisig, tx1, and then adjust a second transaction, tx2, that spends -the output of tx1 to payor and payee. Prior to publishing tx1 a refund +the output of tx1 to payor and payee. 
+Prior to publishing tx1 a refund transaction is created, tx3, to ensure that should the payee vanish the payor -can get their deposit back. The process by which the refund transaction is +can get their deposit back. +The process by which the refund transaction is created is currently vulnerable to transaction malleability attacks, and -additionally, requires the payor to store the refund. Using the same +additionally, requires the payor to store the refund. +Using the same scriptPubKey form as in the Two-factor wallets example solves both these issues. - ### Trustless Payments for Publishing Data The PayPub protocol makes it possible to pay for information in a trustless way by first proving that an encrypted file contains the desired data, and secondly crafting scriptPubKeys used for payment such that spending them reveals the -encryption keys to the data. However the existing implementation has a +encryption keys to the data. +However the existing implementation has a significant flaw: the publisher can delay the release of the keys indefinitely. This problem can be solved interactively with the refund transaction technique; @@ -159,36 +160,37 @@ scriptPubKeys of the following form: CHECKSIG ENDIF -The buyer of the data is now making a secure offer with an expiry time. If the +The buyer of the data is now making a secure offer with an expiry time. +If the publisher fails to accept the offer before the expiry time is reached the buyer can cancel the offer by spending the output. - ### Proving sacrifice to miners' fees Proving the sacrifice of some limited resource is a common technique in a -variety of cryptographic protocols. Proving sacrifices of coins to mining fees +variety of cryptographic protocols. +Proving sacrifices of coins to mining fees has been proposed as a ''universal public good'' to which the sacrifice could -be directed, rather than simply destroying the coins. 
However doing so is -non-trivial, and even the best existing technique - announce-commit sacrifices -- could encourage mining centralization. CHECKLOCKTIMEVERIFY can be used to +be directed, rather than simply destroying the coins. +However doing so is +non-trivial, and even the best existing technique - announce-commit sacrifices - could encourage mining centralization. +CHECKLOCKTIMEVERIFY can be used to create outputs that are provably spendable by anyone (thus to mining fees assuming miners behave optimally and rationally) but only at a time sufficiently far into the future that large miners can't profitably sell the sacrifices at a discount. - ### Freezing Funds In addition to using cold storage, hardware wallets, and P2SH multisig outputs to control funds, now funds can be frozen in UTXOs directly on the blockchain. With the following scriptPubKey, nobody will be able to spend the encumbered -output until the provided expiry time. This ability to freeze funds reliably may +output until the provided expiry time. +This ability to freeze funds reliably may be useful in scenarios where reducing duress or confiscation risk is desired. CHECKLOCKTIMEVERIFY DROP DUP HASH160 EQUALVERIFY CHECKSIG - ### Replacing the nLockTime field entirely As an aside, note how if the SignatureHash() algorithm could optionally cover @@ -199,13 +201,11 @@ Bitcoin) This per-signature capability could replace the per-transaction nLockTime field entirely as a valid signature would now be the proof that a transaction output ''can'' be spent. - ## Detailed Specification Refer to the reference implementation, reproduced below, for the precise semantics and detailed rationale for those semantics. - case OP_NOP2: { // CHECKLOCKTIMEVERIFY @@ -277,14 +277,15 @@ semantics and detailed rationale for those semantics.
https://github.com/petertodd/bitcoin/commit/ab0f54f38e08ee1e50ff72f801680ee84d0f1bf4 - ## Deployment We reuse the double-threshold IsSuperMajority() switchover mechanism used in -BIP66 with the same thresholds, but for nVersion = 4. The new rules are +BIP66 with the same thresholds, but for nVersion = 4. +The new rules are in effect for every block (at height H) with nVersion = 4 and at least 750 out of 1000 blocks preceding it (with heights H-1000..H-1) also -have nVersion >= 4. Furthermore, when 950 out of the 1000 blocks +have nVersion >= 4. +Furthermore, when 950 out of the 1000 blocks preceding a block do have nVersion >= 4, nVersion < 4 blocks become invalid, and all further blocks enforce the new rules. @@ -292,24 +293,22 @@ It should be noted that BIP9 involves permanently setting a high-order bit to 1 which results in nVersion >= all prior IsSuperMajority() soft-forks and thus no bits in nVersion are permanently lost. - ### SPV Clients While SPV clients are (currently) unable to validate blocks in general, trusting miners to do validation for them, they are able to validate block -headers and thus can validate a subset of the deployment rules. SPV clients +headers and thus can validate a subset of the deployment rules. +SPV clients should reject nVersion < 4 blocks if 950 out of 1000 preceding blocks have nVersion >= 4 to prevent false confirmations from the remaining 5% of non-upgraded miners when the 95% threshold has been reached. - ## Credits Thanks goes to Gregory Maxwell for suggesting that the argument be compared against the per-transaction nLockTime, rather than the current block height and time. - ## References PayPub @@ -320,7 +319,6 @@ Jeremy Spilman Payment Channels * https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2013-April/002433.html - ## Implementations Python / python-bitcoinlib @@ -331,7 +329,6 @@ JavaScript / Node.js / bitcore * https://github.com/mruddy/bip65-demos - ## Copyright -This document is placed in the public domain.
\ No newline at end of file +This document is placed in the public domain. diff --git a/protocol/forks/bip-0066.md b/protocol/forks/bip-0066.md index 04ec595..c49e409 100644 --- a/protocol/forks/bip-0066.md +++ b/protocol/forks/bip-0066.md @@ -1,15 +1,15 @@ -
-  BIP: 66
-  Layer: Consensus (soft fork)
-  Title: Strict DER signatures
-  Author: Pieter Wuille <pieter.wuille@gmail.com>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0066
-  Status: Final
-  Type: Standards Track
-  Created: 2015-01-10
-  License: BSD-2-Clause
-
+# BIP-0066 + + BIP: 66 + Layer: Consensus (soft fork) + Title: Strict DER signatures + Author: Pieter Wuille <pieter.wuille@gmail.com> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0066 + Status: Final + Type: Standards Track + Created: 2015-01-10 + License: BSD-2-Clause ## Abstract @@ -21,21 +21,29 @@ This BIP is licensed under the 2-clause BSD license. ## Motivation -Bitcoin's reference implementation currently relies on OpenSSL for signature validation, which means it is implicitly defining Bitcoin's block validity rules. Unfortunately, OpenSSL is not designed for consensus-critical behaviour (it does not guarantee bug-for-bug compatibility between versions), and thus changes to it can - and have - affected Bitcoin software. +Bitcoin's reference implementation currently relies on OpenSSL for signature validation, which means it is implicitly defining Bitcoin's block validity rules. +Unfortunately, OpenSSL is not designed for consensus-critical behaviour (it does not guarantee bug-for-bug compatibility between versions), and thus changes to it can - and have - affected Bitcoin software. -One specifically critical area is the encoding of signatures. Until recently, OpenSSL's releases would accept various deviations from the DER standard and accept signatures as valid. When this changed in OpenSSL 1.0.0p and 1.0.1k, it made some nodes reject the chain. +One specifically critical area is the encoding of signatures. +Until recently, OpenSSL's releases would accept various deviations from the DER standard and accept signatures as valid. +When this changed in OpenSSL 1.0.0p and 1.0.1k, it made some nodes reject the chain. -This document proposes to restrict valid signatures to exactly what is mandated by DER, to make the consensus rules not depend on OpenSSL's signature parsing. A change like this is required if implementations would want to remove all of OpenSSL from the consensus code. 
+This document proposes to restrict valid signatures to exactly what is mandated by DER, to make the consensus rules not depend on OpenSSL's signature parsing. +A change like this is required if implementations would want to remove all of OpenSSL from the consensus code. ## Specification -Every signature passed to OP_CHECKSIG, OP_CHECKSIGVERIFY, OP_CHECKMULTISIG, or OP_CHECKMULTISIGVERIFY, to which ECDSA verification is applied, must be encoded using strict DER encoding (see further). +Every signature passed to `OP_CHECKSIG`, `OP_CHECKSIGVERIFY`, `OP_CHECKMULTISIG`, or `OP_CHECKMULTISIGVERIFY`, to which ECDSA verification is applied, must be encoded using strict DER encoding (see further). -These operators all perform ECDSA verifications on pubkey/signature pairs, iterating from the top of the stack backwards. For each such verification, if the signature does not pass the IsValidSignatureEncoding check below, the entire script evaluates to false immediately. If the signature is valid DER, but does not pass ECDSA verification, opcode execution continues as it used to, causing opcode execution to stop and push false on the stack (but not immediately fail the script) in some cases, which potentially skips further signatures (and thus does not subject them to IsValidSignatureEncoding). +These operators all perform ECDSA verifications on pubkey/signature pairs, iterating from the top of the stack backwards. +For each such verification, if the signature does not pass the IsValidSignatureEncoding check below, the entire script evaluates to false immediately. +If the signature is valid DER, but does not pass ECDSA verification, opcode execution continues as it used to, causing opcode execution to stop and push false on the stack (but not immediately fail the script) in some cases, which potentially skips further signatures (and thus does not subject them to IsValidSignatureEncoding). 
### DER encoding reference -The following code specifies the behaviour of strict DER checking. Note that this function tests a signature byte vector which includes the 1-byte sighash flag that Bitcoin adds, even though that flag falls outside of the DER specification, and is unaffected by this proposal. The function is also not called for cases where the length of sig is 0, in order to provide a simple, short and efficiently-verifiable encoding for deliberately invalid signatures. +The following code specifies the behaviour of strict DER checking. +Note that this function tests a signature byte vector which includes the 1-byte sighash flag that Bitcoin adds, even though that flag falls outside of the DER specification, and is unaffected by this proposal. +The function is also not called for cases where the length of sig is 0, in order to provide a simple, short and efficiently-verifiable encoding for deliberately invalid signatures. DER is specified in https://www.itu.int/rec/T-REC-X.690/en . @@ -108,7 +116,11 @@ bool static IsValidSignatureEncoding(const std::vector &sig) { ### Examples -Notation: P1 and P2 are valid, serialized, public keys. S1 and S2 are valid signatures using respective keys P1 and P2. S1' and S2' are non-DER but otherwise valid signatures using those same keys. F is any invalid but DER-compliant signature (including 0, the empty string). F' is any invalid and non-DER-compliant signature. +Notation: P1 and P2 are valid, serialized, public keys. +S1 and S2 are valid signatures using respective keys P1 and P2. +S1' and S2' are non-DER but otherwise valid signatures using those same keys. +F is any invalid but DER-compliant signature (including 0, the empty string). +F' is any invalid and non-DER-compliant signature. 1. S1' P1 CHECKSIG fails (changed) 2. 
S1' P1 CHECKSIG NOT fails (unchanged) @@ -127,20 +139,24 @@ Note that the examples above show that only additional failures are required by ## Deployment -We reuse the double-threshold switchover mechanism from BIP 34, with the same thresholds, but for nVersion = 3. The new rules are in effect for every block (at height H) with nVersion = 3 and at least 750 out of 1000 blocks preceding it (with heights H-1000..H-1) also have nVersion = 3. Furthermore, when 950 out of the 1000 blocks preceding a block do have nVersion = 3, nVersion = 2 blocks become invalid, and all further blocks enforce the new rules. +We reuse the double-threshold switchover mechanism from BIP 34, with the same thresholds, but for nVersion = 3. +The new rules are in effect for every block (at height H) with nVersion = 3 and at least 750 out of 1000 blocks preceding it (with heights H-1000..H-1) also have nVersion = 3. +Furthermore, when 950 out of the 1000 blocks preceding a block do have nVersion = 3, nVersion = 2 blocks become invalid, and all further blocks enforce the new rules. ## Compatibility -The requirement to have signatures that comply strictly with DER has been enforced as a relay policy by the reference client since v0.8.0, and very few transactions violating it are being added to the chain as of January 2015. In addition, every non-compliant signature can trivially be converted into a compliant one, so there is no loss of functionality by this requirement. This proposal has the added benefit of reducing transaction malleability (see BIP 62). +The requirement to have signatures that comply strictly with DER has been enforced as a relay policy by the reference client since v0.8.0, and very few transactions violating it are being added to the chain as of January 2015. +In addition, every non-compliant signature can trivially be converted into a compliant one, so there is no loss of functionality by this requirement. 
+This proposal has the added benefit of reducing transaction malleability (see BIP 62). ## Implementation -An implementation for the reference client is available at https://github.com/bitcoin/bitcoin/pull/5713 +An implementation for the reference client is available at https://github.com/bitcoin/bitcoin/pull/5713 . ## Acknowledgements -This document is extracted from the previous BIP62 proposal, which had input from various people, in particular Greg Maxwell and Peter Todd, who gave feedback about this document as well. +This document is extracted from the previous BIP62 proposal, which had input from various people, in particular Greg Maxwell and Peter Todd, who gave feedback about this document as well. ## Disclosures -* Subsequent to the network-wide adoption and enforcement of this BIP, the author [disclosed](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-July/009697.html) that strict DER signatures provided an indirect solution to a consensus bug he had previously discovered. \ No newline at end of file +* Subsequent to the network-wide adoption and enforcement of this BIP, the author [disclosed](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-July/009697.html) that strict DER signatures provided an indirect solution to a consensus bug he had previously discovered. diff --git a/protocol/forks/bip-0068.md b/protocol/forks/bip-0068.md index 76c1912..cec584c 100644 --- a/protocol/forks/bip-0068.md +++ b/protocol/forks/bip-0068.md @@ -1,17 +1,17 @@ -
-  BIP: 68
-  Layer: Consensus (soft fork)
-  Title: Relative lock-time using consensus-enforced sequence numbers
-  Author: Mark Friedenbach <mark@friedenbach.org>
-          BtcDrak <btcdrak@gmail.com>
-          Nicolas Dorier <nicolas.dorier@gmail.com>
-          kinoshitajona <kinoshitajona@gmail.com>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0068
-  Status: Final
-  Type: Standards Track
-  Created: 2015-05-28
-
+# BIP-0068 + + BIP: 68 + Layer: Consensus (soft fork) + Title: Relative lock-time using consensus-enforced sequence numbers + Author: Mark Friedenbach <mark@friedenbach.org> + BtcDrak <btcdrak@gmail.com> + Nicolas Dorier <nicolas.dorier@gmail.com> + kinoshitajona <kinoshitajona@gmail.com> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0068 + Status: Final + Type: Standards Track + Created: 2015-05-28 ## Abstract @@ -19,9 +19,16 @@ This BIP introduces relative lock-time (RLT) consensus-enforced semantics of the ## Motivation -Bitcoin transactions have a sequence number field for each input. The original idea appears to have been that a transaction in the mempool would be replaced by using the same input with a higher sequence value. Although this was not properly implemented, it assumes miners would prefer higher sequence numbers even if the lower ones were more profitable to mine. However, a miner acting on profit motives alone would break that assumption completely. The change described by this BIP repurposes the sequence number for new use cases without breaking existing functionality. It also leaves room for future expansion and other use cases. +Bitcoin transactions have a sequence number field for each input. +The original idea appears to have been that a transaction in the mempool would be replaced by using the same input with a higher sequence value. +Although this was not properly implemented, it assumes miners would prefer higher sequence numbers even if the lower ones were more profitable to mine. +However, a miner acting on profit motives alone would break that assumption completely. +The change described by this BIP repurposes the sequence number for new use cases without breaking existing functionality. +It also leaves room for future expansion and other use cases. -The transaction nLockTime is used to prevent the mining of a transaction until a certain date. 
nSequence will be repurposed to prevent mining of a transaction until a certain age of the spent output in blocks or timespan. This, among other uses, allows bi-directional payment channels as used in [Hashed Timelock Contracts (HTLCs)](https://github.com/ElementsProject/lightning/raw/master/doc/deployable-lightning.pdf) and [BIP112](/protocol/forks/bip-0112#bidirectional-payment-channels). +The transaction nLockTime is used to prevent the mining of a transaction until a certain date. +nSequence will be repurposed to prevent mining of a transaction until a certain age of the spent output in blocks or timespan. +This, among other uses, allows bi-directional payment channels as used in [Hashed Timelock Contracts (HTLCs)](https://github.com/ElementsProject/lightning/raw/master/doc/deployable-lightning.pdf) and [BIP112](/protocol/forks/bip-0112#bidirectional-payment-channels). ## Specification @@ -33,24 +40,33 @@ If bit (1 << 31) of the sequence number is set, then no consensus meaning is app If bit (1 << 31) of the sequence number is not set, then the sequence number is interpreted as an encoded relative lock-time. -The sequence number encoding is interpreted as follows: +The sequence number encoding is interpreted as follows: -Bit (1 << 22) determines if the relative lock-time is time-based or block based: If the bit is set, the relative lock-time specifies a timespan in units of 512 seconds granularity. The timespan starts from the median-time-past of the output’s previous block, and ends at the MTP of the previous block. If the bit is not set, the relative lock-time specifies a number of blocks. +Bit (1 << 22) determines if the relative lock-time is time-based or block based: If the bit is set, the relative lock-time specifies a timespan in units of 512 seconds granularity. +The timespan starts from the median-time-past of the output’s previous block, and ends at the MTP of the previous block. 
+If the bit is not set, the relative lock-time specifies a number of blocks. The flag (1<<22) is the highest order bit in a 3-byte signed integer for use in bitcoin scripts as a 3-byte PUSHDATA with OP_CHECKSEQUENCEVERIFY (BIP 112). -This specification only interprets 16 bits of the sequence number as relative lock-time, so a mask of 0x0000ffff MUST be applied to the sequence field to extract the relative lock-time. The 16-bit specification allows for a year of relative lock-time and the remaining bits allow for future expansion. +This specification only interprets 16 bits of the sequence number as relative lock-time, so a mask of 0x0000ffff MUST be applied to the sequence field to extract the relative lock-time. +The 16-bit specification allows for a year of relative lock-time and the remaining bits allow for future expansion. -For time based relative lock-time, 512 second granularity was chosen because bitcoin blocks are generated every 600 seconds. So when using block-based or time-based, the same amount of time can be encoded with the available number of bits. Converting from a sequence number to seconds is performed by multiplying by 512 = 2^9, or equivalently shifting up by 9 bits. +For time based relative lock-time, 512 second granularity was chosen because bitcoin blocks are generated every 600 seconds. +So when using block-based or time-based, the same amount of time can be encoded with the available number of bits. +Converting from a sequence number to seconds is performed by multiplying by 512 = 2^9, or equivalently shifting up by 9 bits. -When the relative lock-time is time-based, it is interpreted as a minimum block-time constraint over the input's age. A relative time-based lock-time of zero indicates an input which can be included in any block. More generally, a relative time-based lock-time n can be included into any block produced 512 * n seconds after the mining date of the output it is spending, or any block thereafter. 
+When the relative lock-time is time-based, it is interpreted as a minimum block-time constraint over the input's age. +A relative time-based lock-time of zero indicates an input which can be included in any block. +More generally, a relative time-based lock-time n can be included into any block produced 512 * n seconds after the mining date of the output it is spending, or any block thereafter. The mining date of the output is equal to the median-time-past of the previous block which mined it. The block produced time is equal to the median-time-past of its previous block. -When the relative lock-time is block-based, it is interpreted as a minimum block-height constraint over the input's age. A relative block-based lock-time of zero indicates an input which can be included in any block. More generally, a relative block lock-time n can be included n blocks after the mining date of the output it is spending, or any block thereafter. +When the relative lock-time is block-based, it is interpreted as a minimum block-height constraint over the input's age. +A relative block-based lock-time of zero indicates an input which can be included in any block. +More generally, a relative block lock-time n can be included n blocks after the mining date of the output it is spending, or any block thereafter. The new rules are not applied to the nSequence field of the input of the coinbase transaction. @@ -233,11 +249,14 @@ This BIP must be deployed simultaneously with BIP112 and BIP113 using the same d ## Compatibility -The only use of sequence numbers by the Bitcoin Core reference client software is to disable checking the nLockTime constraints in a transaction. The semantics of that application are preserved by this BIP. +The only use of sequence numbers by the Bitcoin Core reference client software is to disable checking the nLockTime constraints in a transaction. +The semantics of that application are preserved by this BIP. 
-As can be seen from the specification section, a number of bits are undefined by this BIP to allow for other use cases by setting bit (1 << 31) as the remaining 31 bits have no meaning under this BIP. Additionally, bits (1 << 23) through (1 << 30) inclusive have no meaning at all when bit (1 << 31) is unset. +As can be seen from the specification section, a number of bits are undefined by this BIP to allow for other use cases by setting bit (1 << 31) as the remaining 31 bits have no meaning under this BIP. +Additionally, bits (1 << 23) through (1 << 30) inclusive have no meaning at all when bit (1 << 31) is unset. -Additionally, this BIP specifies only 16 bits to actually encode relative lock-time meaning a further 6 are unused (1 << 16 through 1 << 21 inclusive). This allows the possibility to increase granularity by soft-fork, or for increasing the maximum possible relative lock-time in the future. +Additionally, this BIP specifies only 16 bits to actually encode relative lock-time meaning a further 6 are unused (1 << 16 through 1 << 21 inclusive). +This allows the possibility to increase granularity by soft-fork, or for increasing the maximum possible relative lock-time in the future. The most efficient way to calculate sequence number from relative lock-time is with bit masks and shifts: @@ -261,4 +280,4 @@ Bitcoin mailing list discussion: https://www.mail-archive.com/bitcoin-developmen [BIP113](/protocol/forks/bip-0113): https://github.com/bitcoin/bips/blob/master/bip-0113.mediawiki -Hashed Timelock Contracts (HTLCs): https://github.com/ElementsProject/lightning/raw/master/doc/deployable-lightning.pdf \ No newline at end of file +Hashed Timelock Contracts (HTLCs): https://github.com/ElementsProject/lightning/raw/master/doc/deployable-lightning.pdf diff --git a/protocol/forks/bip-0112.md b/protocol/forks/bip-0112.md index 3a84a45..b1e1862 100644 --- a/protocol/forks/bip-0112.md +++ b/protocol/forks/bip-0112.md @@ -1,25 +1,21 @@ -
-  BIP: 112
-  Layer: Consensus (soft fork)
-  Title: CHECKSEQUENCEVERIFY
-  Author: BtcDrak <btcdrak@gmail.com>
-          Mark Friedenbach <mark@friedenbach.org>
-          Eric Lombrozo <elombrozo@gmail.com>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0112
-  Status: Final
-  Type: Standards Track
-  Created: 2015-08-10
-  License: PD
-
+BIP-0112 + + BIP: 112 + Layer: Consensus (soft fork) + Title: CHECKSEQUENCEVERIFY + Author: BtcDrak <btcdrak@gmail.com> + Mark Friedenbach <mark@friedenbach.org> + Eric Lombrozo <elombrozo@gmail.com> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0112 + Status: Final + Type: Standards Track + Created: 2015-08-10 + License: PD ## Abstract -This BIP describes a new opcode (CHECKSEQUENCEVERIFY) for the Bitcoin -scripting system that in combination with BIP 68 allows execution -pathways of a script to be restricted based on the age of the output -being spent. - +This BIP describes a new opcode (CHECKSEQUENCEVERIFY) for the Bitcoin scripting system that in combination with BIP 68 allows execution pathways of a script to be restricted based on the age of the output being spent. ## Summary @@ -36,31 +32,24 @@ When executed, if any of the following conditions are true, the script interpret Otherwise, script execution will continue as if a NOP had been executed. -BIP 68 prevents a non-final transaction from being selected for inclusion in a block until the corresponding input has reached the specified age, as measured in block-height or block-time. By comparing the argument to CHECKSEQUENCEVERIFY against the nSequence field, we indirectly verify a desired minimum age of the -the output being spent; until that relative age has been reached any script execution pathway including the CHECKSEQUENCEVERIFY will fail to validate, causing the transaction not to be selected for inclusion in a block. - +BIP 68 prevents a non-final transaction from being selected for inclusion in a block until the corresponding input has reached the specified age, as measured in block-height or block-time. 
+By comparing the argument to CHECKSEQUENCEVERIFY against the nSequence field, we indirectly verify a desired minimum age of the output being spent; +until that relative age has been reached any script execution pathway including the CHECKSEQUENCEVERIFY will fail to validate, causing the transaction not to be selected for inclusion in a block. ## Motivation -BIP 68 repurposes the transaction nSequence field meaning by giving -sequence numbers new consensus-enforced semantics as a relative -lock-time. However, there is no way to build Bitcoin scripts to make -decisions based on this field. - -By making the nSequence field accessible to script, it becomes -possible to construct code pathways that only become accessible some -minimum time after proof-of-publication. This enables a wide variety -of applications in phased protocols such as escrow, payment channels, -or bidirectional pegs. +BIP 68 repurposes the transaction nSequence field meaning by giving sequence numbers new consensus-enforced semantics as a relative lock-time. +However, there is no way to build Bitcoin scripts to make decisions based on this field. +By making the nSequence field accessible to script, it becomes possible to construct code pathways that only become accessible some minimum time after proof-of-publication. +This enables a wide variety of applications in phased protocols such as escrow, payment channels, or bidirectional pegs. ### Contracts With Expiration Deadlines #### Escrow with Timeout -An escrow that times out automatically 30 days after being funded can be -established in the following way. Alice, Bob and Escrow create a 2-of-3 -address with the following redeemscript. +An escrow that times out automatically 30 days after being funded can be established in the following way. +Alice, Bob and Escrow create a 2-of-3 address with the following redeemscript. IF 2 3 CHECKMULTISIG @@ -69,74 +58,49 @@ address with the following redeemscript. 
CHECKSIG ENDIF -At any time funds can be spent using signatures from any two of Alice, -Bob or the Escrow. - +At any time funds can be spent using signatures from any two of Alice, Bob or the Escrow. After 30 days Alice can sign alone. - The clock does not start ticking until the payment to the escrow address -confirms. - +confirms. ### Retroactive Invalidation -In many instances, we would like to create contracts that can be revoked in case -of some future event. However, given the immutable nature of the blockchain, it -is practically impossible to retroactively invalidate a previous commitment that -has already confirmed. The only mechanism we really have for retroactive -invalidation is blockchain reorganization which, for fundamental security -reasons, is designed to be very hard and very expensive to do. +In many instances, we would like to create contracts that can be revoked in case of some future event. +However, given the immutable nature of the blockchain, it is practically impossible to retroactively invalidate a previous commitment that has already confirmed. +The only mechanism we really have for retroactive invalidation is blockchain reorganization which, for fundamental security reasons, is designed to be very hard and very expensive to do. -Despite this limitation, we do have a way to provide something functionally similar to retroactive invalidation while preserving irreversibility of past commitments -using CHECKSEQUENCEVERIFY. By constructing scripts with multiple branches of -execution where one or more of the branches are delayed we provide -a time window in which someone can supply an invalidation condition that allows the -output to be spent, effectively invalidating the would-be delayed branch and potentially discouraging -another party from broadcasting the transaction in the first place. If the invalidation -condition does not occur before the timeout, the delayed branch becomes spendable, -honoring the original contract. 
+Despite this limitation, we do have a way to provide something functionally similar to retroactive invalidation while preserving irreversibility of past commitments using `CHECKSEQUENCEVERIFY`. +By constructing scripts with multiple branches of execution where one or more of the branches are delayed we provide a time window in which someone can supply an invalidation condition that allows the output to be spent, effectively invalidating the would-be delayed branch and potentially discouraging another party from broadcasting the transaction in the first place. +If the invalidation condition does not occur before the timeout, the delayed branch becomes spendable, honoring the original contract. Some more specific applications of this idea: #### Hash Time-Locked Contracts -Hash Time-Locked Contracts (HTLCs) provide a general mechanism for off-chain contract negotiation. An execution pathway can be made to require knowledge of a secret (a hash preimage) that can be presented within an invalidation time window. By sharing the secret it is possible to guarantee to the counterparty that the transaction will never be broadcast since this would allow the counterparty to claim the output immediately while one would have to wait for the time window to pass. If the secret has not been shared, the counterparty will be unable to use the instant pathway and the delayed pathway must be used instead. +Hash Time-Locked Contracts (HTLCs) provide a general mechanism for off-chain contract negotiation. +An execution pathway can be made to require knowledge of a secret (a hash preimage) that can be presented within an invalidation time window. +By sharing the secret it is possible to guarantee to the counterparty that the transaction will never be broadcast since this would allow the counterparty to claim the output immediately while one would have to wait for the time window to pass. 
+If the secret has not been shared, the counterparty will be unable to use the instant pathway and the delayed pathway must be used instead. #### Bidirectional Payment Channels -Scriptable relative locktime provides a predictable amount of time to respond in -the event a counterparty broadcasts a revoked transaction: Absolute locktime -necessitates closing the channel and reopen it when getting close to the timeout, -whereas with relative locktime, the clock starts ticking the moment the -transactions confirms in a block. It also provides a means to know exactly how -long to wait (in number of blocks) before funds can be pulled out of the channel -in the event of a noncooperative counterparty. - +Scriptable relative locktime provides a predictable amount of time to respond in the event a counterparty broadcasts a revoked transaction: +Absolute locktime necessitates closing the channel and reopening it when getting close to the timeout, whereas with relative locktime, the clock starts ticking the moment the transaction confirms in a block. +It also provides a means to know exactly how long to wait (in number of blocks) before funds can be pulled out of the channel in the event of a noncooperative counterparty. #### Lightning Network The lightning network extends the bidirectional payment channel idea to allow for payments to be routed over multiple bidirectional payment channel hops. -These channels are based on an anchor transaction that requires a 2-of-2 -multisig from Alice and Bob, and a series of revocable commitment -transactions that spend the anchor transaction. The commitment -transaction splits the funds from the anchor between Alice and Bob and -the latest commitment transaction may be published by either party at -any time, finalising the channel. +These channels are based on an anchor transaction that requires a 2-of-2 multisig from Alice and Bob, and a series of revocable commitment transactions that spend the anchor transaction. 
+The commitment transaction splits the funds from the anchor between Alice and Bob and the latest commitment transaction may be published by either party at any time, finalising the channel. -Ideally then, a revoked commitment transaction would never be able to -be successfully spent; and the latest commitment transaction would be -able to be spent very quickly. +Ideally then, a revoked commitment transaction would never be able to be successfully spent; and the latest commitment transaction would be able to be spent very quickly. -To allow a commitment transaction to be effectively revoked, Alice -and Bob have slightly different versions of the latest commitment -transaction. In Alice's version, any outputs in the commitment -transaction that pay Alice also include a forced delay, and an -alternative branch that allows Bob to spend the output if he knows that -transaction's revocation code. In Bob's version, payments to Bob are -similarly encumbered. When Alice and Bob negotiate new balances and -new commitment transactions, they also reveal the old revocation code, -thus committing to not relaying the old transaction. +To allow a commitment transaction to be effectively revoked, Alice and Bob have slightly different versions of the latest commitment transaction. +In Alice's version, any outputs in the commitment transaction that pay Alice also include a forced delay, and an alternative branch that allows Bob to spend the output if he knows that transaction's revocation code. +In Bob's version, payments to Bob are similarly encumbered. +When Alice and Bob negotiate new balances and new commitment transactions, they also reveal the old revocation code, thus committing to not relaying the old transaction. 
A simple output, paying to Alice might then look like: @@ -149,11 +113,9 @@ A simple output, paying to Alice might then look like: ENDIF CHECKSIG -This allows Alice to publish the latest commitment transaction at any -time and spend the funds after 24 hours, but also ensures that if Alice -relays a revoked transaction, that Bob has 24 hours to claim the funds. +This allows Alice to publish the latest commitment transaction at any time and spend the funds after 24 hours, but also ensures that if Alice relays a revoked transaction, that Bob has 24 hours to claim the funds. -With CHECKLOCKTIMEVERIFY, this would look like: +With `CHECKLOCKTIMEVERIFY`, this would look like: HASH160 EQUAL IF @@ -164,22 +126,13 @@ With CHECKLOCKTIMEVERIFY, this would look like: ENDIF CHECKSIG -This form of transaction would mean that if the anchor is unspent on -2015/12/16, Alice can use this commitment even if it has been revoked, -simply by spending it immediately, giving no time for Bob to claim it. +This form of transaction would mean that if the anchor is unspent on 2015/12/16, Alice can use this commitment even if it has been revoked, simply by spending it immediately, giving no time for Bob to claim it. -This means that the channel has a deadline that cannot be pushed -back without hitting the blockchain; and also that funds may not be -available until the deadline is hit. CHECKSEQUENCEVERIFY allows you -to avoid making such a tradeoff. +This means that the channel has a deadline that cannot be pushed back without hitting the blockchain; and also that funds may not be available until the deadline is hit. +`CHECKSEQUENCEVERIFY` allows you to avoid making such a tradeoff. 
-Hashed Time-Lock Contracts (HTLCs) make this slightly more complicated, -since in principle they may pay either Alice or Bob, depending on whether -Alice discovers a secret R, or a timeout is reached, but the same principle -applies -- the branch paying Alice in Alice's commitment transaction gets a -delay, and the entire output can be claimed by the other party if the -revocation secret is known. With CHECKSEQUENCEVERIFY, a HTLC payable to -Alice might look like the following in Alice's commitment transaction: +Hashed Time-Lock Contracts (HTLCs) make this slightly more complicated, since in principle they may pay either Alice or Bob, depending on whether Alice discovers a secret R, or a timeout is reached, but the same principle applies -- the branch paying Alice in Alice's commitment transaction gets a delay, and the entire output can be claimed by the other party if the revocation secret is known. +With `CHECKSEQUENCEVERIFY`, a HTLC payable to Alice might look like the following in Alice's commitment transaction: HASH160 DUP EQUAL IF @@ -209,17 +162,14 @@ and correspondingly in Bob's commitment transaction: ENDIF CHECKSIG -Note that both CHECKSEQUENCEVERIFY and CHECKLOCKTIMEVERIFY are used in the -final branch of above to ensure Bob cannot spend the output until after both -the timeout is complete and Alice has had time to reveal the revocation -secret. - -See the [https://github.com/ElementsProject/lightning/blob/master/doc/deployable-lightning.pdf Deployable Lightning] paper. +Note that both CHECKSEQUENCEVERIFY and CHECKLOCKTIMEVERIFY are used in the final branch of above to ensure Bob cannot spend the output until after both the timeout is complete and Alice has had time to reveal the revocation secret. +See the [Deployable Lightning](https://github.com/ElementsProject/lightning/blob/master/doc/deployable-lightning.pdf) paper. 
#### 2-Way Pegged Sidechains -The 2-way pegged sidechain requires a new REORGPROOFVERIFY opcode, the semantics of which are outside the scope of this BIP. CHECKSEQUENCEVERIFY is used to make sure that sufficient time has passed since the return peg was posted to publish a reorg proof: +The 2-way pegged sidechain requires a new REORGPROOFVERIFY opcode, the semantics of which are outside the scope of this BIP. +CHECKSEQUENCEVERIFY is used to make sure that sufficient time has passed since the return peg was posted to publish a reorg proof: IF lockTxHeight nlocktxOut [] reorgBounty Hash160(<...>) REORGPROOFVERIFY @@ -227,11 +177,9 @@ The 2-way pegged sidechain requires a new REORGPROOFVERIFY opcode, the semantics withdrawLockTime CHECKSEQUENCEVERIFY DROP HASH160 p2shWithdrawDest EQUAL ENDIF - ## Specification -Refer to the reference implementation, reproduced below, for the precise -semantics and detailed rationale for those semantics. +Refer to the reference implementation, reproduced below, for the precise semantics and detailed rationale for those semantics.
 /* Below flags apply in the context of BIP 68 */
@@ -343,7 +291,6 @@ A reference implementation is provided by the following pull request:
 
 https://github.com/bitcoin/bitcoin/pull/7524
 
-
 ## Deployment
 
 This BIP is to be deployed by "versionbits" BIP9 using bit 0.
@@ -367,7 +314,6 @@ BtcDrak authored this BIP document.
 
 Thanks to Eric Lombrozo and Anthony Towns for contributing example use cases.
 
-
 ## References
 
 [BIP 9](/protocol/forks/bip-0009) Versionbits
@@ -392,7 +338,6 @@ Thanks to Eric Lombrozo and Anthony Towns for contributing example use cases.
 
 [Jeremy Spilman Micropayment Channels](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2013-April/002433.html)
 
-
 ## Copyright
 
-This document is placed in the public domain.
\ No newline at end of file
+This document is placed in the public domain.
diff --git a/protocol/forks/bip-0113.md b/protocol/forks/bip-0113.md
index 031c717..85f6617 100644
--- a/protocol/forks/bip-0113.md
+++ b/protocol/forks/bip-0113.md
@@ -1,54 +1,37 @@
-
-  BIP: 113
-  Layer: Consensus (soft fork)
-  Title: Median time-past as endpoint for lock-time calculations
-  Author: Thomas Kerin <me@thomaskerin.io>
-          Mark Friedenbach <mark@friedenbach.org>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0113
-  Status: Final
-  Type: Standards Track
-  Created: 2015-08-10
-  License: PD
-
+# BIP-0113 + BIP: 113 + Layer: Consensus (soft fork) + Title: Median time-past as endpoint for lock-time calculations + Author: Thomas Kerin <me@thomaskerin.io> + Mark Friedenbach <mark@friedenbach.org> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0113 + Status: Final + Type: Standards Track + Created: 2015-08-10 + License: PD ## Abstract -This BIP is a proposal to redefine the semantics used in determining a -time-locked transaction's eligibility for inclusion in a block. The -median of the last 11 blocks is used instead of the block's timestamp, -ensuring that it increases monotonically with each block. - +This BIP is a proposal to redefine the semantics used in determining a time-locked transaction's eligibility for inclusion in a block. +The median of the last 11 blocks is used instead of the block's timestamp, ensuring that it increases monotonically with each block. ## Motivation -At present, transactions are excluded from inclusion in a block if the -present time or block height is less than or equal to that specified -in the locktime. Since the consensus rules do not mandate strict -ordering of block timestamps, this has the unfortunate outcome of -creating a perverse incentive for miners to lie about the time of -their blocks in order to collect more fees by including transactions -that by wall clock determination have not yet matured. +At present, transactions are excluded from inclusion in a block if the present time or block height is less than or equal to that specified in the locktime. +Since the consensus rules do not mandate strict ordering of block timestamps, this has the unfortunate outcome of creating a perverse incentive for miners to lie about the time of their blocks in order to collect more fees by including transactions that by wall clock determination have not yet matured. 
-This BIP proposes comparing the locktime against the median of the -past 11 block's timestamps, rather than the timestamp of the block -including the transaction. Existing consensus rules guarantee this -value to monotonically advance, thereby removing the capability for -miners to claim more transaction fees by lying about the timestamps of -their block. - -This proposal seeks to ensure reliable behaviour in locktime calculations -as required by BIP65 (CHECKLOCKTIMEVERIFY) and matching the behavior of -BIP68 (sequence numbers) and BIP112 (CHECKSEQUENCEVERIFY). +This BIP proposes comparing the locktime against the median of the past 11 block's timestamps, rather than the timestamp of the block including the transaction. +Existing consensus rules guarantee this value to monotonically advance, thereby removing the capability for miners to claim more transaction fees by lying about the timestamps of their block. +This proposal seeks to ensure reliable behaviour in locktime calculations as required by BIP65 (`CHECKLOCKTIMEVERIFY`) and matching the behavior of BIP68 (sequence numbers) and BIP112 (`CHECKSEQUENCEVERIFY`). ## Specification -The values for transaction locktime remain unchanged. The difference is only in -the calculation determining whether a transaction can be included. Instead of -an unreliable timestamp, the following function is used to determine the current -block time for the purpose of checking lock-time constraints: +The values for transaction locktime remain unchanged. +The difference is only in the calculation determining whether a transaction can be included. +Instead of an unreliable timestamp, the following function is used to determine the current block time for the purpose of checking lock-time constraints: enum { nMedianTimeSpan=11 }; @@ -64,47 +47,37 @@ block time for the purpose of checking lock-time constraints: } Lock-time constraints are checked by the consensus method IsFinalTx(). -This method takes the block time as one parameter. 
This BIP proposes -that after activation calls to IsFinalTx() within consensus code use -the return value of `GetMedianTimePast(pindexPrev)` instead. +This method takes the block time as one parameter. +This BIP proposes that after activation calls to IsFinalTx() within consensus code use the return value of `GetMedianTimePast(pindexPrev)` instead. The new rule applies to all transactions, including the coinbase transaction. -A reference implementation of this proposal is provided by the -following pull request: +A reference implementation of this proposal is provided by the following pull request: https://github.com/bitcoin/bitcoin/pull/6566 - ## Deployment This BIP is to be deployed by "versionbits" BIP9 using bit 0. For Bitcoin '''mainnet''', the BIP9 '''starttime''' will be midnight 1st May 2016 UTC (Epoch timestamp 1462060800) and BIP9 '''timeout''' will be midnight 1st May 2017 UTC (Epoch timestamp 1493596800). -For Bitcoin '''testnet''', the BIP9 '''starttime''' will be midnight 1st March 2016 UTC (Epoch timestamp 1456790400) and BIP9 '''timeout''' will be midnight 1st May 2017 UTC (Epoch timestamp 1493596800). +For Bitcoin '''testnet''', the BIP9 '''starttime''' will be midnight 1st March 2016 UTC (Epoch timestamp 1456790400) and BIP9 '''timeout''' will be midnight 1st May 2017 UTC (Epoch timestamp 1493596800). This BIP must be deployed simultaneously with BIP68 and BIP112 using the same deployment mechanism. - ## Acknowledgements -Mark Friedenbach for designing and authoring the reference -implementation of this BIP. +Mark Friedenbach for designing and authoring the reference implementation of this BIP. -Thanks go to Gregory Maxwell who came up with the original idea, -in #bitcoin-wizards on 2013-07-16. +Thanks go to Gregory Maxwell who came up with the original idea, in #bitcoin-wizards on 2013-07-16. Thomas Kerin authored this BIP document. 
- ## Compatibility -Transactions generated using time-based lock-time will take -approximately an hour longer to confirm than would be expected under -the old rules. This is not known to introduce any compatibility -concerns with existing protocols. - +Transactions generated using time-based lock-time will take approximately an hour longer to confirm than would be expected under the old rules. +This is not known to introduce any compatibility concerns with existing protocols. ## References @@ -120,7 +93,6 @@ concerns with existing protocols. [Version bits](https://gist.github.com/sipa/bf69659f43e763540550) - ## Copyright -This document is placed in the public domain. \ No newline at end of file +This document is placed in the public domain. diff --git a/protocol/forks/bip-0157.md b/protocol/forks/bip-0157.md index bf48ab3..6b6e232 100644 --- a/protocol/forks/bip-0157.md +++ b/protocol/forks/bip-0157.md @@ -1,38 +1,54 @@ -
-  BIP: 157
-  Layer: Peer Services
-  Title: Client Side Block Filtering
-  Author: Olaoluwa Osuntokun <laolu32@gmail.com>
-          Alex Akselrod <alex@akselrod.org>
-          Jim Posen <jimpo@coinbase.com>
-  Comments-Summary: None yet
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0157
-  Status: Draft
-  Type: Standards Track
-  Created: 2017-05-24
-  License: CC0-1.0
-
+# BIP-0157 + BIP: 157 + Layer: Peer Services + Title: Client Side Block Filtering + Author: Olaoluwa Osuntokun <laolu32@gmail.com> + Alex Akselrod <alex@akselrod.org> + Jim Posen <jimpo@coinbase.com> + Comments-Summary: None yet + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0157 + Status: Draft + Type: Standards Track + Created: 2017-05-24 + License: CC0-1.0 ## Abstract -This BIP describes a new light client protocol in Bitcoin that improves upon currently available options. The standard light client protocol in use today, defined in [BIP 37](/protocol/forks/bip-0037), has known flaws that weaken the security and privacy of clients and [allow denial-of-service attack vectors on full nodes](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-May/012636.html). The new protocol overcomes these issues by allowing light clients to obtain compact probabilistic filters of block content from full nodes and download full blocks if the filter matches relevant data. +This BIP describes a new light client protocol in Bitcoin that improves upon currently available options. +The standard light client protocol in use today, defined in [BIP 37](/protocol/forks/bip-0037), has known flaws that weaken the security and privacy of clients and [allow denial-of-service attack vectors on full nodes](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-May/012636.html). +The new protocol overcomes these issues by allowing light clients to obtain compact probabilistic filters of block content from full nodes and download full blocks if the filter matches relevant data. -New P2P messages empower light clients to securely sync the blockchain without relying on a trusted source. This BIP also defines a filter header, which serves as a commitment to all filters for previous blocks and provides the ability to efficiently detect malicious or faulty peers serving invalid filters. 
The resulting protocol guarantees that light clients with at least one honest peer are able to identify the correct block filters. +New P2P messages empower light clients to securely sync the blockchain without relying on a trusted source. +This BIP also defines a filter header, which serves as a commitment to all filters for previous blocks and provides the ability to efficiently detect malicious or faulty peers serving invalid filters. +The resulting protocol guarantees that light clients with at least one honest peer are able to identify the correct block filters. ## Motivation -Bitcoin light clients allow applications to read relevant transactions from the blockchain without incurring the full cost of downloading and validating all data. Such applications seek to simultaneously minimize the trust in peers and the amount of bandwidth, storage space, and computation required. They achieve this by downloading all block headers, verifying the proofs of work, and following the longest proof-of-work chain. Since block headers are a fixed 80-bytes and are generated every 10 minutes on average, the bandwidth required -to sync the block headers is minimal. Light clients then download only the blockchain data relevant to them directly from peers and validate inclusion in the header chain. Though clients do not check the validity of all blocks in the longest proof-of-work chain, they rely on miner incentives for security. +Bitcoin light clients allow applications to read relevant transactions from the blockchain without incurring the full cost of downloading and validating all data. +Such applications seek to simultaneously minimize the trust in peers and the amount of bandwidth, storage space, and computation required. +They achieve this by downloading all block headers, verifying the proofs of work, and following the longest proof-of-work chain. 
+Since block headers are a fixed 80-bytes and are generated every 10 minutes on average, the bandwidth required +to sync the block headers is minimal. +Light clients then download only the blockchain data relevant to them directly from peers and validate inclusion in the header chain. +Though clients do not check the validity of all blocks in the longest proof-of-work chain, they rely on miner incentives for security. -BIP 37 is currently the most widely used light client execution mode for -Bitcoin. With BIP 37, a client sends a Bloom filter it wants to watch to a full node peer, then receives notifications for each new transaction or block that matches the filter. The client then requests relevant transactions from the peer along with Merkle proofs of inclusion in the blocks containing them, which are verified against the block headers. The Bloom filters match data such as client addresses and unspent outputs, and the filter size must be carefully tuned to balance the false positive rate with the amount of information leaked to peer. It -has been shown, however, that most implementations available offer virtually ''zero privacy'' to wallets and [other](https://eprint.iacr.org/2014/763.pdf) -[applications](https://jonasnick.github.io/blog/2015/02/12/privacy-in-bitcoinj/). Additionally, malicious full nodes serving light clients can omit critical data with little risk of detection, which is unacceptable for some applications (such as Lightning Network clients) that must respond to certain on-chain events. Finally, honest nodes servicing BIP 37 light clients may incur significant I/O and CPU resource usage due to maliciously crafted Bloom filters, creating a denial-of-service (DoS) vector and disincentizing node operators from [supporting the protocol](/protocol/forks/bip-0111). +BIP 37 is currently the most widely used light client execution mode for Bitcoin. 
+With BIP 37, a client sends a Bloom filter it wants to watch to a full node peer, then receives notifications for each new transaction or block that matches the filter. +The client then requests relevant transactions from the peer along with Merkle proofs of inclusion in the blocks containing them, which are verified against the block headers. +The Bloom filters match data such as client addresses and unspent outputs, and the filter size must be carefully tuned to balance the false positive rate with the amount of information leaked to peer. +It has been shown, however, that most implementations available offer virtually ''zero privacy'' to wallets and [other](https://eprint.iacr.org/2014/763.pdf) +[applications](https://jonasnick.github.io/blog/2015/02/12/privacy-in-bitcoinj/). +Additionally, malicious full nodes serving light clients can omit critical data with little risk of detection, which is unacceptable for some applications (such as Lightning Network clients) that must respond to certain on-chain events. +Finally, honest nodes servicing BIP 37 light clients may incur significant I/O and CPU resource usage due to maliciously crafted Bloom filters, creating a denial-of-service (DoS) vector and disincentivizing node operators from [supporting the protocol](/protocol/forks/bip-0111). -The alternative detailed in this document can be seen as the opposite of BIP 37: instead of the client sending a filter to a full node peer, full nodes generate deterministic filters on block data that are served to the client. A light client can then download an entire block if the filter matches the data it is watching for. Since filters are deterministic, they only need to be constructed once and stored on disk, whenever a new block is connected to the chain. This keeps the computation required to serve filters minimal, and eliminates the I/O asymmetry that makes BIP 37 enabled nodes vulnerable. 
Clients also get better assurance of seeing all relevant transactions because they can check the validity of filters received from peers more easily than they can check completeness of filtered blocks. Finally, client privacy is improved because -blocks can be downloaded from ''any source'', so that no one peer gets complete information on the data required by a client. Extremely privacy conscious light clients may opt to anonymously fetch blocks using advanced techniques such as [Private Information -Retrieval](https://en.wikipedia.org/wiki/Private_information_retrieval). +The alternative detailed in this document can be seen as the opposite of BIP 37: instead of the client sending a filter to a full node peer, full nodes generate deterministic filters on block data that are served to the client. +A light client can then download an entire block if the filter matches the data it is watching for. +Since filters are deterministic, they only need to be constructed once and stored on disk, whenever a new block is connected to the chain. +This keeps the computation required to serve filters minimal, and eliminates the I/O asymmetry that makes BIP 37 enabled nodes vulnerable. +Clients also get better assurance of seeing all relevant transactions because they can check the validity of filters received from peers more easily than they can check completeness of filtered blocks. +Finally, client privacy is improved because blocks can be downloaded from ''any source'', so that no one peer gets complete information on the data required by a client. +Extremely privacy conscious light clients may opt to anonymously fetch blocks using advanced techniques such as [Private Information Retrieval](https://en.wikipedia.org/wiki/Private_information_retrieval). ## Definitions @@ -52,106 +68,145 @@ interpreted as described in RFC 2119. 
### Filter Types -For the sake of future extensibility and reducing filter sizes, there are -multiple ''filter types'' that determine which data is included in a block filter as well as the method of filter construction/querying. In this model, full nodes generate one filter per block per filter type supported. +For the sake of future extensibility and reducing filter sizes, there are multiple ''filter types'' that determine which data is included in a block filter as well as the method of filter construction/querying. +In this model, full nodes generate one filter per block per filter type supported. -Each type is identified by a one byte code, and specifies the contents and serialization format of the filter. A full node MAY signal support for -particular filter types using service bits. The initial filter types are defined separately in [BIP 158](/protocol/forks/bip-0158), and one service bit is allocated to signal support for them. +Each type is identified by a one byte code, and specifies the contents and serialization format of the filter. +A full node MAY signal support for particular filter types using service bits. +The initial filter types are defined separately in [BIP 158](/protocol/forks/bip-0158), and one service bit is allocated to signal support for them. ### Filter Headers -This proposal draws inspiration from the headers-first mechanism that Bitcoin nodes use to sync the [block chain](https://bitcoin.org/en/developer-guide#headers-first). Similar to -how block headers have a Merkle commitment to all transaction data in the block, we define filter headers that have commitments to the block filters. Also like block headers, filter headers each have a commitment to the preceding one. Before downloading the block filters themselves, a light client can download all filter headers for the current block chain and use them to verify the authenticity of the filters. 
If the filter header chains differ between multiple peers, the client can identify the point where they diverge, then download the full block and compute the correct filter, thus identifying which peer is faulty. +This proposal draws inspiration from the headers-first mechanism that Bitcoin nodes use to sync the [block chain](https://bitcoin.org/en/developer-guide#headers-first). +Similar to +how block headers have a Merkle commitment to all transaction data in the block, we define filter headers that have commitments to the block filters. +Also like block headers, filter headers each have a commitment to the preceding one. +Before downloading the block filters themselves, a light client can download all filter headers for the current block chain and use them to verify the authenticity of the filters. +If the filter header chains differ between multiple peers, the client can identify the point where they diverge, then download the full block and compute the correct filter, thus identifying which peer is faulty. -The canonical hash of a block filter is the double-SHA256 of the serialized filter. Filter headers are 32-byte hashes derived for each block filter. They are computed as the double-SHA256 of the concatenation of the filter hash with the previous filter header. The previous filter header used to calculate that of the genesis block is defined to be the 32-byte array of 0's. +The canonical hash of a block filter is the double-SHA256 of the serialized filter. +Filter headers are 32-byte hashes derived for each block filter. +They are computed as the double-SHA256 of the concatenation of the filter hash with the previous filter header. +The previous filter header used to calculate that of the genesis block is defined to be the 32-byte array of 0's. ### New Messages #### getcfilters -getcfilters is used to request the compact filters of a particular type for a particular range of blocks. 
The message contains the following fields: -| Field Name | Data Type | Byte Size | Description | -|--|--|--|--| -| FilterType | byte | 1 | Filter type for which headers are requested | -| StartHeight | uint32 | 4 | The height of the first block in the requested range | -| StopHash | [32]byte | 32 | The hash of the last block in the requested range | +getcfilters is used to request the compact filters of a particular type for a particular range of blocks. +The message contains the following fields: -1. Nodes SHOULD NOT send getcfilters unless the peer has signaled support for this filter type. Nodes receiving getcfilters with an unsupported filter type SHOULD NOT respond. -2. StopHash MUST be known to belong to a block accepted by the receiving peer. This is the case if the peer had previously sent a headers or inv message with that block or any descendents. A node that receives getcfilters with an unknown StopHash SHOULD NOT respond. +| Field Name | Data Type | Byte Size | Description | +|-------------|-----------|-----------|------------------------------------------------------| +| FilterType | byte | 1 | Filter type for which headers are requested | +| StartHeight | uint32 | 4 | The height of the first block in the requested range | +| StopHash | [32]byte | 32 | The hash of the last block in the requested range | + +1. Nodes SHOULD NOT send getcfilters unless the peer has signaled support for this filter type. +Nodes receiving getcfilters with an unsupported filter type SHOULD NOT respond. +2. StopHash MUST be known to belong to a block accepted by the receiving peer. +This is the case if the peer had previously sent a headers or inv message with that block or any descendents. +A node that receives getcfilters with an unknown StopHash SHOULD NOT respond. 3. The height of the block with hash StopHash MUST be greater than or equal to StartHeight, and the difference MUST be strictly less than 1000. 4. 
The receiving node MUST respond to valid requests by sending one cfilter message for each block in the requested range, sequentially in order by block height. #### cfilter -cfilter is sent in response to getcfilters, one for each block in the requested range. The message contains the following fields: -| Field Name | Data Type | Byte Size | Description | -|--|--|--|--| -| FilterType | byte | 1 | Byte identifying the type of filter being returned | -| BlockHash | [32]byte | 32 | Block hash of the Bitcoin block for which the filter is being returned | -| NumFilterBytes | CompactSize | 1-5 | A variable length integer representing the size of the filter in the following field | -| FilterBytes | []byte | NumFilterBytes | The serialized compact filter for this block | +cfilter is sent in response to getcfilters, one for each block in the requested range. +The message contains the following fields: + +| Field Name | Data Type | Byte Size | Description | +|----------------|-------------|----------------|--------------------------------------------------------------------------------------| +| FilterType | byte | 1 | Byte identifying the type of filter being returned | +| BlockHash | [32]byte | 32 | Block hash of the Bitcoin block for which the filter is being returned | +| NumFilterBytes | CompactSize | 1-5 | A variable length integer representing the size of the filter in the following field | +| FilterBytes | []byte | NumFilterBytes | The serialized compact filter for this block | 1. The FilterType SHOULD match the field in the getcfilters request, and BlockHash must correspond to a block that is an ancestor of StopHash with height greater than or equal to StartHeight. #### getcfheaders -getcfheaders is used to request verifiable filter headers for a range of blocks. 
The message contains the following fields: -| Field Name | Data Type | Byte Size | Description | -|--|--|--|--| -| FilterType | byte | 1 | Filter type for which headers are requested | -| StartHeight | uint32 | 4 | The height of the first block in the requested range | -| StopHash | [32]byte | 32 | The hash of the last block in the requested range | +getcfheaders is used to request verifiable filter headers for a range of blocks. +The message contains the following fields: -1. Nodes SHOULD NOT send getcfheaders unless the peer has signaled support for this filter type. Nodes receiving getcfheaders with an unsupported filter type SHOULD NOT respond. -2. StopHash MUST be known to belong to a block accepted by the receiving peer. This is the case if the peer had previously sent a headers or inv message with that block or any descendents. A node that receives getcfheaders with an unknown StopHash SHOULD NOT respond. +| Field Name | Data Type | Byte Size | Description | +|-------------|-----------|-----------|------------------------------------------------------| +| FilterType | byte | 1 | Filter type for which headers are requested | +| StartHeight | uint32 | 4 | The height of the first block in the requested range | +| StopHash | [32]byte | 32 | The hash of the last block in the requested range | + +1. Nodes SHOULD NOT send getcfheaders unless the peer has signaled support for this filter type. +Nodes receiving getcfheaders with an unsupported filter type SHOULD NOT respond. +2. StopHash MUST be known to belong to a block accepted by the receiving peer. +This is the case if the peer had previously sent a headers or inv message with that block or any descendents. +A node that receives getcfheaders with an unknown StopHash SHOULD NOT respond. 3. The height of the block with hash StopHash MUST be greater than or equal to StartHeight, and the difference MUST be strictly less than 2,000. #### cfheaders -cfheaders is sent in response to getcfheaders. 
Instead of including the filter headers themselves, the response includes one filter header and a sequence of filter hashes, from which the headers can be derived. This has the benefit that the client can verify the binding links between the -headers. The message contains the following fields: -| Field Name | Data Type | Byte Size | Description | -|--|--|--|--| -| FilterType | byte | 1 | Filter type for which hashes are requested | -| StopHash | [32]byte | 32 | The hash of the last block in the requested range | -| PreviousFilterHeader | [32]byte | 32 | The filter header preceding the first block in the requested range | -| FilterHashesLength | CompactSize | 1-3 | The length of the following vector of filter hashes | -| FilterHashes | [][32]byte | FilterHashesLength * 32 | The filter hashes for each block in the requested range | +cfheaders is sent in response to getcfheaders. +Instead of including the filter headers themselves, the response includes one filter header and a sequence of filter hashes, from which the headers can be derived. +This has the benefit that the client can verify the binding links between the headers. +The message contains the following fields: + +| Field Name | Data Type | Byte Size | Description | +|----------------------|-------------|-------------------------|--------------------------------------------------------------------| +| FilterType | byte | 1 | Filter type for which hashes are requested | +| StopHash | [32]byte | 32 | The hash of the last block in the requested range | +| PreviousFilterHeader | [32]byte | 32 | The filter header preceding the first block in the requested range | +| FilterHashesLength | CompactSize | 1-3 | The length of the following vector of filter hashes | +| FilterHashes | [][32]byte | FilterHashesLength * 32 | The filter hashes for each block in the requested range | 1. The FilterType and StopHash SHOULD match the fields in the getcfheaders request. 2. FilterHashesLength MUST NOT be greater than 2,000. -3. 
FilterHashes MUST have one entry for each block on the chain terminating with tip StopHash, starting with the block at height StartHeight. The entries MUST be the filter hashes of the given type for each block in that range, in ascending order by height. +3. FilterHashes MUST have one entry for each block on the chain terminating with tip StopHash, starting with the block at height StartHeight. +The entries MUST be the filter hashes of the given type for each block in that range, in ascending order by height. 4. PreviousFilterHeader MUST be set to the previous filter header of first block in the requested range. #### getcfcheckpt -getcfcheckpt is used to request filter headers at evenly spaced intervals over a range of blocks. Clients may use filter hashes from getcfheaders to connect these checkpoints, as is described in the [Client Operation](#client-operation) section below. The getcfcheckpt message contains the following fields: -| Field Name | Data Type | Byte Size | Description | -|--|--|--|--| -| FilterType | byte | 1 | Filter type for which headers are requested | -| StopHash | [32]byte | 32 | The hash of the last block in the chain that headers are requested for | +getcfcheckpt is used to request filter headers at evenly spaced intervals over a range of blocks. +Clients may use filter hashes from getcfheaders to connect these checkpoints, as is described in the [Client Operation](#client-operation) section below. +The getcfcheckpt message contains the following fields: -1. Nodes SHOULD NOT send getcfcheckpt unless the peer has signaled support for this filter type. Nodes receiving getcfcheckpt with an unsupported filter type SHOULD NOT respond. -2. StopHash MUST be known to belong to a block accepted by the receiving peer. This is the case if the peer had previously sent a headers or inv message with any descendent blocks. A node that receives getcfcheckpt with an unknown StopHash SHOULD NOT respond. 
+| Field Name | Data Type | Byte Size | Description | +|------------|-----------|-----------|------------------------------------------------------------------------| +| FilterType | byte | 1 | Filter type for which headers are requested | +| StopHash | [32]byte | 32 | The hash of the last block in the chain that headers are requested for | + +1. Nodes SHOULD NOT send getcfcheckpt unless the peer has signaled support for this filter type. +Nodes receiving getcfcheckpt with an unsupported filter type SHOULD NOT respond. +2. StopHash MUST be known to belong to a block accepted by the receiving peer. +This is the case if the peer had previously sent a headers or inv message with any descendent blocks. +A node that receives getcfcheckpt with an unknown StopHash SHOULD NOT respond. #### cfcheckpt -cfcheckpt is sent in response to getcfcheckpt. The filter headers included are the set of all filter headers on the requested chain where the height is a positive multiple of 1,000. The message contains the following fields: -| Field Name | Data Type | Byte Size | Description | -|--|--|--|--| -| FilterType | byte | 1 | Filter type for which headers are requested | -| StopHash | [32]byte | 32 | The hash of the last block in the chain that headers are requested for | -| FilterHeadersLength | CompactSize | 1-3 | The length of the following vector of filter headers | -| FilterHeaders | [][32]byte | FilterHeadersLength * 32 | The filter headers at intervals of 1,000 | +cfcheckpt is sent in response to getcfcheckpt. +The filter headers included are the set of all filter headers on the requested chain where the height is a positive multiple of 1,000. 
+The message contains the following fields: + +| Field Name | Data Type | Byte Size | Description | +|---------------------|-------------|--------------------------|------------------------------------------------------------------------| +| FilterType | byte | 1 | Filter type for which headers are requested | +| StopHash | [32]byte | 32 | The hash of the last block in the chain that headers are requested for | +| FilterHeadersLength | CompactSize | 1-3 | The length of the following vector of filter headers | +| FilterHeaders | [][32]byte | FilterHeadersLength * 32 | The filter headers at intervals of 1,000 | 1. The FilterType and StopHash SHOULD match the fields in the getcfcheckpt request. -2. FilterHeaders MUST have exactly one entry for each block on the chain terminating in StopHash, where the block height is a multiple of 1,000 greater than 0. The entries MUST be the filter headers of the given type for each such block, in ascending order by height. +2. FilterHeaders MUST have exactly one entry for each block on the chain terminating in StopHash, where the block height is a multiple of 1,000 greater than 0. +The entries MUST be the filter headers of the given type for each such block, in ascending order by height. ### Node Operation -Full nodes MAY opt to support this BIP and generate filters for any of the specified filter types. Such nodes SHOULD treat the filters as an additional index of the blockchain. For each new block that is connected to the main chain, nodes SHOULD generate filters for all supported types and persist them. Nodes that are missing filters and are already synced with the blockchain SHOULD reindex the chain upon start-up, constructing filters for each block from genesis to the current tip. They also SHOULD keep every checkpoint header in memory, so that getcfcheckpt requests do not result in many random-access disk reads. +Full nodes MAY opt to support this BIP and generate filters for any of the specified filter types. 
+Such nodes SHOULD treat the filters as an additional index of the blockchain. +For each new block that is connected to the main chain, nodes SHOULD generate filters for all supported types and persist them. +Nodes that are missing filters and are already synced with the blockchain SHOULD reindex the chain upon start-up, constructing filters for each block from genesis to the current tip. +They also SHOULD keep every checkpoint header in memory, so that getcfcheckpt requests do not result in many random-access disk reads. -Nodes SHOULD NOT generate filters dynamically on request, as malicious peers may be able to perform DoS attacks by requesting small filters derived from large blocks. This would require an asymmetical amount of I/O on the node to compute and serve, similar to attacks against BIP 37 enabled nodes noted in BIP 111. +Nodes SHOULD NOT generate filters dynamically on request, as malicious peers may be able to perform DoS attacks by requesting small filters derived from large blocks. +This would require an asymmetrical amount of I/O on the node to compute and serve, similar to attacks against BIP 37 enabled nodes noted in BIP 111. Nodes MAY prune block data after generating and storing all filters for a block. @@ -159,32 +214,52 @@ Nodes MAY prune block data after generating and storing all filters for a block. This section provides recommendations for light clients to download filters with maximal security. -Clients SHOULD first sync the entire block header chain from peers using the standard headers-first syncing mechanism before downloading any block filters or filter headers. Clients configured with trusted checkpoints MAY only sync headers started from the last checkpoint. Clients SHOULD disconnect any outbound -peers whose best chain has significantly less work than the known longest proof-of-work chain.
+Clients SHOULD first sync the entire block header chain from peers using the standard headers-first syncing mechanism before downloading any block filters or filter headers. +Clients configured with trusted checkpoints MAY only sync headers starting from the last checkpoint. +Clients SHOULD disconnect any outbound peers whose best chain has significantly less work than the known longest proof-of-work chain. -Once a client's block headers are in sync, it SHOULD download and verify filter headers for all blocks and filter types that it might later download. The client SHOULD send getcfheaders messages to peers and derive and store the filter headers for each block. The client MAY first fetch headers at evenly spaced intervals of 1,000 by sending getcfcheckpt. The header checkpoints allow the client to download filter headers for different intervals from multiple peers in parallel, verifying each range of 1,000 headers against -the checkpoints. +Once a client's block headers are in sync, it SHOULD download and verify filter headers for all blocks and filter types that it might later download. +The client SHOULD send getcfheaders messages to peers and derive and store the filter headers for each block. +The client MAY first fetch headers at evenly spaced intervals of 1,000 by sending getcfcheckpt. +The header checkpoints allow the client to download filter headers for different intervals from multiple peers in parallel, verifying each range of 1,000 headers against the checkpoints. -Unless securely connected to a trusted peer that is serving filter headers, the client SHOULD connect to multiple outbound peers that support each filter type to mitigate the risk of downloading incorrect headers. If the client receives conflicting filter headers from different peers for any block and filter type, it SHOULD interrogate them to determine which is faulty.
The client SHOULD use getcfheaders and/or getcfcheckpt to first identify the first filter headers that the peers disagree on. The client then SHOULD download the full block from any peer and derive the correct filter and filter header. The client SHOULD ban any peers that sent a filter header that does not match the computed one. +Unless securely connected to a trusted peer that is serving filter headers, the client SHOULD connect to multiple outbound peers that support each filter type to mitigate the risk of downloading incorrect headers. +If the client receives conflicting filter headers from different peers for any block and filter type, it SHOULD interrogate them to determine which is faulty. +The client SHOULD use getcfheaders and/or getcfcheckpt to first identify the first filter headers that the peers disagree on. +The client then SHOULD download the full block from any peer and derive the correct filter and filter header. +The client SHOULD ban any peers that sent a filter header that does not match the computed one. -Once the client has downloaded and verified all filter headers needed, ''and'' no outbound peers have sent conflicting headers, the client can download the actual block filters it needs. The client MAY backfill filter headers before the first verified one at this point if it only downloaded them starting at a later point. Clients SHOULD persist the verified filter headers for last 100 blocks in the chain (or whatever finality depth is desired), to compare against headers received from new peers after restart. They MAY store more filter headers to avoid redownloading them if a rescan is later necessary. +Once the client has downloaded and verified all filter headers needed, ''and'' no outbound peers have sent conflicting headers, the client can download the actual block filters it needs. +The client MAY backfill filter headers before the first verified one at this point if it only downloaded them starting at a later point. 
+Clients SHOULD persist the verified filter headers for the last 100 blocks in the chain (or whatever finality depth is desired), to compare against headers received from new peers after restart. +They MAY store more filter headers to avoid redownloading them if a rescan is later necessary. -Starting from the first block in the desired range, the client now MAY download the filters. The client SHOULD test that each filter links to its corresponding filter header and ban peers that send incorrect filters. The client MAY download multiple filters at once to increase throughput, though it SHOULD test the filters sequentially. The client MAY check if a filter is empty before requesting it by checking if the filter header commits to the hash of the empty filter, saving a round trip if that is the case. +Starting from the first block in the desired range, the client now MAY download the filters. +The client SHOULD test that each filter links to its corresponding filter header and ban peers that send incorrect filters. +The client MAY download multiple filters at once to increase throughput, though it SHOULD test the filters sequentially. +The client MAY check if a filter is empty before requesting it by checking if the filter header commits to the hash of the empty filter, saving a round trip if that is the case. -Each time a new valid block header is received, the client SHOULD request the corresponding filter headers from all eligible peers. If two peers send conflicting filter headers, the client should interrogate them as described above and ban any peers that send an invalid header. +Each time a new valid block header is received, the client SHOULD request the corresponding filter headers from all eligible peers. +If two peers send conflicting filter headers, the client should interrogate them as described above and ban any peers that send an invalid header.
-If a client is fetching full blocks from the P2P network, they SHOULD be downloaded from outbound peers at random to mitigate privacy loss due to transaction intersection analysis. Note that blocks may be downloaded from peers that do not support this BIP. +If a client is fetching full blocks from the P2P network, they SHOULD be downloaded from outbound peers at random to mitigate privacy loss due to transaction intersection analysis. +Note that blocks may be downloaded from peers that do not support this BIP. ## Rationale -The filter headers and checkpoints messages are defined to help clients identify the correct filter for a block when connected to peers sending conflicting information. An alternative solution is to require Bitcoin blocks to include commitments to derived block filters, so light clients can verify authenticity given block headers and some additional witness data. This would require a network-wide change to the Bitcoin consensus rules, however, whereas this document proposes a solution purely at the P2P layer. +The filter headers and checkpoints messages are defined to help clients identify the correct filter for a block when connected to peers sending conflicting information. +An alternative solution is to require Bitcoin blocks to include commitments to derived block filters, so light clients can verify authenticity given block headers and some additional witness data. +This would require a network-wide change to the Bitcoin consensus rules, however, whereas this document proposes a solution purely at the P2P layer. -The constant interval of 1,000 blocks between checkpoints was chosen so that, given the current chain height and rate of growth, the size of a -cfcheckpt message is not drastically from a cfheaders between two checkpoints. Also, 1,000 is a nice round number, at least to those of us who think in decimal. 
+The constant interval of 1,000 blocks between checkpoints was chosen so that, given the current chain height and rate of growth, the size of a cfcheckpt message is not drastically different from that of a cfheaders message between two checkpoints. +Also, 1,000 is a nice round number, at least to those of us who think in decimal. ## Compatibility -This light client mode is not compatible with current node deployments and requires support for the new P2P messages. The node implementation of this proposal is not incompatible with the current P2P network rules (ie. doesn't affect network topology of full nodes). Light clients may adopt protocols based on this as an alternative to the existing BIP 37. Adoption of this BIP may result in reduced network support for BIP 37. +This light client mode is not compatible with current node deployments and requires support for the new P2P messages. +The node implementation of this proposal is not incompatible with the current P2P network rules (i.e. doesn't affect network topology of full nodes). +Light clients may adopt protocols based on this as an alternative to the existing BIP 37. +Adoption of this BIP may result in reduced network support for BIP 37. ## Acknowledgments @@ -206,4 +281,4 @@ Golomb-Rice Coded sets: https://github.com/Roasbeef/btcutil/tree/gcs/gcs ## Copyright -This document is licensed under the Creative Commons CC0 1.0 Universal license. \ No newline at end of file +This document is licensed under the Creative Commons CC0 1.0 Universal license. diff --git a/protocol/forks/bip-0158.md b/protocol/forks/bip-0158.md index 472cd76..d822dd8 100644 --- a/protocol/forks/bip-0158.md +++ b/protocol/forks/bip-0158.md @@ -1,34 +1,28 @@ -
-  BIP: 158
-  Layer: Peer Services
-  Title: Compact Block Filters for Light Clients
-  Author: Olaoluwa Osuntokun <laolu32@gmail.com>
-          Alex Akselrod <alex@akselrod.org>
-  Comments-Summary: None yet
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0158
-  Status: Draft
-  Type: Standards Track
-  Created: 2017-05-24
-  License: CC0-1.0
-
+# BIP-0158 + BIP: 158 + Layer: Peer Services + Title: Compact Block Filters for Light Clients + Author: Olaoluwa Osuntokun <laolu32@gmail.com> + Alex Akselrod <alex@akselrod.org> + Comments-Summary: None yet + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0158 + Status: Draft + Type: Standards Track + Created: 2017-05-24 + License: CC0-1.0 ## Abstract -This BIP describes a structure for compact filters on block data, for use in the -[BIP 157 light client protocol](/protocol/forks/bip-0157). The filter -construction proposed is an alternative to Bloom filters, as used in BIP 37, -that minimizes filter size by using Golomb-Rice coding for compression. This -document specifies one initial filter type based on this construction that -enables basic wallets and applications with more advanced smart contracts. +This BIP describes a structure for compact filters on block data, for use in the [BIP 157 light client protocol](/protocol/forks/bip-0157). +The filter construction proposed is an alternative to Bloom filters, as used in BIP 37, that minimizes filter size by using Golomb-Rice coding for compression. +This document specifies one initial filter type based on this construction that enables basic wallets and applications with more advanced smart contracts. ## Motivation -[BIP 157](/protocol/forks/bip-0157) defines a light client protocol based on -deterministic filters of block content. The filters are designed to -minimize the expected bandwidth consumed by light clients, downloading filters -and full blocks. This document defines the initial filter type ''basic'' -that is designed to reduce the filter size for regular wallets. +[BIP 157](/protocol/forks/bip-0157) defines a light client protocol based on deterministic filters of block content. +The filters are designed to minimize the expected bandwidth consumed by light clients, downloading filters and full blocks. 
+This document defines the initial filter type ''basic'' that is designed to reduce the filter size for regular wallets. ## Definitions @@ -42,8 +36,9 @@ P2P protocol. ''Data pushes'' are byte vectors pushed to the stack according to the rules of Bitcoin script. -''Bit streams'' are readable and writable streams of individual bits. The -following functions are used in the pseudocode in this document: +''Bit streams'' are readable and writable streams of individual bits. +The following functions are used in the pseudocode in this document: + * new_bit_stream instantiates a new writable bit stream * new_bit_stream(vector) instantiates a new bit stream reading data from vector * write_bit(stream, b) appends the bit b to the end of the stream @@ -51,58 +46,39 @@ following functions are used in the pseudocode in this document: * write_bits_big_endian(stream, n, k) appends the k least significant bits of integer n to the end of the stream in big-endian bit order * read_bits_big_endian(stream, k) reads the next available k bits from the stream and interprets them as the least significant bits of a big-endian integer -The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", -"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be -interpreted as described in RFC 2119. +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. ## Specification ### Golomb-Coded Sets -For each block, compact filters are derived containing sets of items associated -with the block (eg. addresses sent to, outpoints spent, etc.). A set of such -data objects is compressed into a probabilistic structure called a -''Golomb-coded set'' (GCS), which matches all items in the set with probability -1, and matches other items with probability 1/M for some -integer parameter M. 
The encoding is also parameterized by
-P, the bit length of the remainder code. Each filter defined
-specifies values for P and M.
+For each block, compact filters are derived containing sets of items associated with the block (e.g. addresses sent to, outpoints spent, etc.).
+A set of such data objects is compressed into a probabilistic structure called a ''Golomb-coded set'' (GCS), which matches all items in the set with probability 1, and matches other items with probability 1/M for some integer parameter M.
+The encoding is also parameterized by P, the bit length of the remainder code.
+Each filter defined specifies values for P and M.
 
 At a high level, a GCS is constructed from a set of N items by:
- 1. hashing all items to 64-bit integers in the range [0, N * M)
- 2. sorting the hashed values in ascending order
- 3. computing the differences between each value and the previous one
- 4. writing the differences sequentially, compressed with Golomb-Rice coding
+
+1. hashing all items to 64-bit integers in the range [0, N * M)
+2. sorting the hashed values in ascending order
+3. computing the differences between each value and the previous one
+4. writing the differences sequentially, compressed with Golomb-Rice coding
 
 The following sections describe each step in greater detail.
 
 #### Hashing Data Objects
 
-The first step in the filter construction is hashing the variable-sized raw
-items in the set to the range [0, F), where F = N *
-M. Customarily, M is set to 2^P. However, if
-one is able to select both Parameters independently, then [more optimal values
-can be
-selected](https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845).
-Set membership queries against the hash outputs will have a false positive rate
-of M. To avoid integer overflow, the number of items N
-MUST be <2^32 and M MUST be <2^32.
+The first step in the filter construction is hashing the variable-sized raw items in the set to the range [0, F), where F = N * M. Customarily, M is set to 2^P.
+However, if one is able to select both parameters independently, then [more optimal values can be selected](https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845).
+Set membership queries against the hash outputs will have a false positive rate of 1/M.
+To avoid integer overflow, the number of items N MUST be <2^32 and M MUST be <2^32.
+The items are first passed through the pseudorandom function ''SipHash'', which takes a 128-bit key k and a variable-sized byte vector and produces a uniformly random 64-bit output.
+Implementations of this BIP MUST use the SipHash parameters c = 2 and d = 4.
 
-The items are first passed through the pseudorandom function ''SipHash'', which
-takes a 128-bit key k and a variable-sized byte vector and produces
-a uniformly random 64-bit output. Implementations of this BIP MUST use the
-SipHash parameters c = 2 and d = 4.
-
-The 64-bit SipHash outputs are then mapped uniformly over the desired range by
-multiplying with F and taking the top 64 bits of the 128-bit result. This
-algorithm is a faster alternative to modulo reduction, as it avoids the
-[expensive division
-operation](https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/).
-Note that care must be taken when implementing this reduction to ensure the
-upper 64 bits of the integer multiplication are not truncated; certain
-architectures and high level languages may require code that decomposes the
-64-bit multiplication into four 32-bit multiplications and recombines into the
-result.
+The 64-bit SipHash outputs are then mapped uniformly over the desired range by multiplying with F and taking the top 64 bits of the 128-bit result.
+This algorithm is a faster alternative to modulo reduction, as it avoids the [expensive division operation](https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/).
+Note that care must be taken when implementing this reduction to ensure the upper 64 bits of the integer multiplication are not truncated;
+certain architectures and high level languages may require code that decomposes the 64-bit multiplication into four 32-bit multiplications and recombines into the result.
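
The range reduction described above can be sketched in Python as follows. This is an illustration only: the names `fast_range` and `fast_range_32bit_limbs` are made up for this example and do not appear in the BIP. Python integers never overflow, so the second function exists purely to show the four-multiplication decomposition a language with 64-bit integers would need.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def fast_range(h: int, f: int) -> int:
    """Map a 64-bit value h into [0, f): top 64 bits of the 128-bit product."""
    return ((h & MASK64) * (f & MASK64)) >> 64


def fast_range_32bit_limbs(h: int, f: int) -> int:
    """Same result, built from four 32x32-bit multiplications.

    Hypothetical helper showing how the upper 64 bits can be recovered
    when no 128-bit integer type is available.
    """
    h_hi, h_lo = (h >> 32) & MASK32, h & MASK32
    f_hi, f_lo = (f >> 32) & MASK32, f & MASK32

    lo_lo = h_lo * f_lo
    hi_lo = h_hi * f_lo
    lo_hi = h_lo * f_hi
    hi_hi = h_hi * f_hi

    # Carry out of the low 64 bits comes from the two middle products
    # plus the high half of the low product.
    mid = (lo_lo >> 32) + (hi_lo & MASK32) + (lo_hi & MASK32)
    return (hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32)) & MASK64
```

Both functions return a value strictly below `f` for any 64-bit `h`, which is the property the filter construction relies on.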
 hash_to_range(item: []byte, F: uint64, k: [16]byte) -> uint64:
@@ -123,33 +99,27 @@ hashed_set_construct(raw_items: [][]byte, k: [16]byte, M: uint) -> []uint64:
 
 #### Golomb-Rice Coding
 
-Instead of writing the items in the hashed set directly to the filter, greater
-compression is achieved by only writing the differences between successive
-items in sorted order. Since the items are distributed uniformly, it can be
-shown that the differences resemble a [geometric
-distribution](https://en.wikipedia.org/wiki/Geometric_distribution).
-[''Golomb-Rice''
-''coding''](https://en.wikipedia.org/wiki/Golomb_coding#Rice_coding)
-is a technique that optimally compresses geometrically distributed values.
+Instead of writing the items in the hashed set directly to the filter, greater compression is achieved by only writing the differences between successive items in sorted order.
+Since the items are distributed uniformly, it can be shown that the differences resemble a [geometric distribution](https://en.wikipedia.org/wiki/Geometric_distribution).
+[''Golomb-Rice coding''](https://en.wikipedia.org/wiki/Golomb_coding#Rice_coding) is a technique that optimally compresses geometrically distributed values.
 
-With Golomb-Rice, a value is split into a quotient and remainder modulo
-2^P, which are encoded separately. The quotient q is
-encoded as ''unary'', with a string of q 1's followed by one 0. The
-remainder r is represented in big-endian by P bits. For example,
-this is a table of Golomb-Rice coded values using P=2:
+With Golomb-Rice, a value is split into a quotient and remainder modulo 2^P, which are encoded separately.
+The quotient q is encoded as ''unary'', with a string of q 1's followed by one 0.
+The remainder r is represented in big-endian by P bits.
+For example, this is a table of Golomb-Rice coded values using P=2:
 
-| n | (q, r) | c
-|--|--|--|
-| 0 | (0, 0) | 0 00
-| 1 | (0, 1) | 0 01
-| 2 | (0, 2) | 0 10
-| 3 | (0, 3) | 0 11
-| 4 | (1, 0) | 10 00
-| 5 | (1, 1) | 10 01
-| 6 | (1, 2) | 10 10
-| 7 | (1, 3) | 10 11
-| 8 | (2, 0) | 110 00
-| 9 | (2, 1) | 110 01
+| n | (q, r) | c      |
+|---|--------|--------|
+| 0 | (0, 0) | 0 00   |
+| 1 | (0, 1) | 0 01   |
+| 2 | (0, 2) | 0 10   |
+| 3 | (0, 3) | 0 11   |
+| 4 | (1, 0) | 10 00  |
+| 5 | (1, 1) | 10 01  |
+| 6 | (1, 2) | 10 10  |
+| 7 | (1, 3) | 10 11  |
+| 8 | (2, 0) | 110 00 |
+| 9 | (2, 1) | 110 01 |
 
 
 golomb_encode(stream, x: uint64, P: uint):
@@ -176,19 +146,17 @@ golomb_decode(stream, P: uint) -> uint64:
 #### Set Construction
 
 A GCS is constructed from four parameters:
+
 * L, a vector of N raw items
 * P, the bit parameter of the Golomb-Rice coding
* M, the inverse of the target false positive rate (items outside the set match with probability 1/M)
 * k, the 128-bit key used to randomize the SipHash outputs
 
-The result is a byte vector with a minimum size of N * (P + 1)
-bits.
+The result is a byte vector with a minimum size of N * (P + 1) bits.
 
-The raw items in L are first hashed to 64-bit unsigned integers as
-specified above and sorted. The differences between consecutive values,
-hereafter referred to as ''deltas'', are encoded sequentially to a bit stream
-with Golomb-Rice coding. Finally, the bit stream is padded with 0's to the
-nearest byte boundary and serialized to the output byte vector.
+The raw items in L are first hashed to 64-bit unsigned integers as specified above and sorted.
+The differences between consecutive values, hereafter referred to as ''deltas'', are encoded sequentially to a bit stream with Golomb-Rice coding.
+Finally, the bit stream is padded with 0's to the nearest byte boundary and serialized to the output byte vector.
 
 
 construct_gcs(L: [][]byte, P: uint, k: [16]byte, M: uint) -> []byte:
@@ -209,13 +177,11 @@ construct_gcs(L: [][]byte, P: uint, k: [16]byte, M: uint) -> []byte:
 
 #### Set Querying/Decompression
 
-To check membership of an item in a compressed GCS, one must reconstruct the
-hashed set members from the encoded deltas. The procedure to do so is the
-reverse of the compression: deltas are decoded one by one and added to a
-cumulative sum. Each intermediate sum represents a hashed value in the original
-set. The queried item is hashed in the same way as the set members and compared
-against the reconstructed values. Note that querying does not require the entire
-decompressed set be held in memory at once.
+To check membership of an item in a compressed GCS, one must reconstruct the hashed set members from the encoded deltas.
+The procedure to do so is the reverse of the compression: deltas are decoded one by one and added to a cumulative sum.
+Each intermediate sum represents a hashed value in the original set.
+The queried item is hashed in the same way as the set members and compared against the reconstructed values.
+Note that querying does not require the entire decompressed set be held in memory at once.
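
A streaming membership check along these lines might look like the following Python sketch. It takes a target that has already been hashed into the filter's range, so the SipHash step is left out, and it assumes bits are packed most-significant-bit first. The names `iter_bits` and `gcs_match_hashed` are illustrative only.

```python
def iter_bits(data: bytes):
    """Yield the bits of data, most significant bit of each byte first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def golomb_decode(bit_iter, p: int) -> int:
    """Read one Golomb-Rice coded value (parameter p) from a bit iterator."""
    q = 0
    while next(bit_iter) == 1:
        q += 1
    r = 0
    for _ in range(p):
        r = (r << 1) | next(bit_iter)
    return (q << p) + r


def gcs_match_hashed(compressed: bytes, target_hash: int, p: int, n: int) -> bool:
    """Check whether target_hash is one of the n values in the compressed set."""
    bit_iter = iter_bits(compressed)
    value = 0
    for _ in range(n):
        value += golomb_decode(bit_iter, p)    # running sum of deltas
        if value == target_hash:
            return True
        if value > target_hash:                # values are sorted: stop early
            return False
    return False
```

Only the running sum is kept between iterations, which is why the whole decompressed set never needs to be held in memory.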
 
 
 gcs_match(key: [16]byte, compressed_set: []byte, target: []byte, P: uint, N: uint, M: uint) -> bool:
@@ -243,61 +209,48 @@ gcs_match(key: [16]byte, compressed_set: []byte, target: []byte, P: uint, N: uin
     return false
 
-Some applications may need to check for set intersection instead of membership -of a single item. This can be performed far more efficiently than checking each -item individually by leveraging the sorted structure of the compressed GCS. -First the query elements are all hashed and sorted, then compared in order -against the decompressed GCS contents. See -[Appendix B](#golomb-coded-set-multi-match) for pseudocode. +Some applications may need to check for set intersection instead of membership of a single item. +This can be performed far more efficiently than checking each item individually by leveraging the sorted structure of the compressed GCS. +First the query elements are all hashed and sorted, then compared in order against the decompressed GCS contents. +See [Appendix B](#golomb-coded-set-multi-match) for pseudocode. ### Block Filters This BIP defines one initial filter type: + * Basic (0x00) -** M = 784931 -** P = 19 + * M = 784931 + * P = 19 #### Contents -The basic filter is designed to contain everything that a light client needs to -sync a regular Bitcoin wallet. A basic filter MUST contain exactly the -following items for each transaction in a block: -* The previous output script (the script being spent) for each input, except - for the coinbase transaction. -* The scriptPubKey of each output, aside from all OP_RETURN output - scripts. +The basic filter is designed to contain everything that a light client needs to sync a regular Bitcoin wallet. +A basic filter MUST contain exactly the following items for each transaction in a block: + +* The previous output script (the script being spent) for each input, except for the coinbase transaction. +* The scriptPubKey of each output, aside from all OP_RETURN output scripts. Any "nil" items MUST NOT be included into the final set of filter elements. -We exclude all outputs that start with OP_RETURN in order to allow -filters to easily be committed to in the future via a soft-fork. 
A likely area
-for future commitments is an additional OP_RETURN output in the
-coinbase transaction similar to the [current witness commitment](/protocol/forks/bip-0141). By
-excluding all OP_RETURN outputs we avoid a circular dependency
-between the commitment, and the item being committed to.
+We exclude all outputs that start with OP_RETURN in order to allow filters to easily be committed to in the future via a soft-fork.
+A likely area for future commitments is an additional OP_RETURN output in the coinbase transaction similar to the [current witness commitment](/protocol/forks/bip-0141).
+By excluding all OP_RETURN outputs we avoid a circular dependency between the commitment and the item being committed to.
 
 #### Construction
 
-The basic type is constructed as Golomb-coded sets with the following
-parameters.
+The basic type is constructed as Golomb-coded sets with the following parameters.
 
-The parameter P MUST be set to 19, and the parameter
-M MUST be set to 784931. Analysis has shown that if
-one is able to select P and M independently, then
-setting M=1.497137 * 2^P is [close to optimal](https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845).
+The parameter P MUST be set to 19, and the parameter M MUST be set to 784931.
+Analysis has shown that if one is able to select P and M independently, then setting M=1.497137 * 2^P is [close to optimal](https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845).
 
-Empirical analysis also shows that was chosen as these parameters minimize the
-bandwidth utilized, considering both the expected number of blocks downloaded
-due to false positives and the size of the filters themselves.
+Empirical analysis also shows that these parameters minimize the bandwidth utilized, considering both the expected number of blocks downloaded due to false positives and the size of the filters themselves.
-The parameter k MUST be set to the first 16 bytes of the hash -(in standard little-endian representation) of the block for which the filter is -constructed. This ensures the key is deterministic while still varying from -block to block. +The parameter k MUST be set to the first 16 bytes of the hash (in standard little-endian representation) of the block for which the filter is constructed. +This ensures the key is deterministic while still varying from block to block. + +Since the value N is required to decode a GCS, a serialized GCS includes it as a prefix, written as a CompactSize. +Thus, the complete serialization of a filter is: -Since the value N is required to decode a GCS, a serialized GCS -includes it as a prefix, written as a CompactSize. Thus, the -complete serialization of a filter is: * N, encoded as a CompactSize * The bytes of the compressed filter itself @@ -305,25 +258,19 @@ complete serialization of a filter is: This BIP allocates a new service bit: -|||| -|--|--|--| -| NODE_COMPACT_FILTERS | 1 << 6 | If enabled, the node MUST respond to all BIP 157 messages for filter type 0x00 +| | | | +|----------------------|---------------------|---------------------------------------------------------------------------------------------| +| NODE_COMPACT_FILTERS | 1 << 6 | If enabled, the node MUST respond to all BIP 157 messages for filter type 0x00 | ## Compatibility -This block filter construction is not incompatible with existing software, -though it requires implementation of the new filters. +This block filter construction is not incompatible with existing software, though it requires implementation of the new filters. 
 ## Acknowledgments
 
-We would like to thank bfd (from the bitcoin-dev mailing list) for bringing the
-basis of this BIP to our attention, Greg Maxwell for pointing us in the
-direction of Golomb-Rice coding and fast range optimization, Pieter Wullie for
-his analysis of optimal GCS parameters, and Pedro
-Martelletto for writing the initial indexing code for btcd.
+We would like to thank bfd (from the bitcoin-dev mailing list) for bringing the basis of this BIP to our attention, Greg Maxwell for pointing us in the direction of Golomb-Rice coding and fast range optimization, Pieter Wuille for his analysis of optimal GCS parameters, and Pedro Martelletto for writing the initial indexing code for btcd.
 
-We would also like to thank Dave Collins, JJ Jeffrey, and Eric Lombrozo for
-useful discussions.
+We would also like to thank Dave Collins, JJ Jeffrey, and Eric Lombrozo for useful discussions.
 
 ## Reference Implementation
 
@@ -335,36 +282,25 @@ Golomb-Rice Coded sets: https://github.com/btcsuite/btcutil/blob/master/gcs
 
 ## Appendix A: Alternatives
 
-A number of alternative set encodings were considered before Golomb-coded
-sets were settled upon. In this appendix section, we'll list a few of the
-alternatives along with our rationale for not pursuing them.
+A number of alternative set encodings were considered before Golomb-coded sets were settled upon.
+In this appendix section, we'll list a few of the alternatives along with our rationale for not pursuing them.
 
-#### Bloom Filters
+### Bloom Filters
 
-Bloom Filters are perhaps the best known probabilistic data structure for
-testing set membership, and were introduced into the Bitcoin protocol with BIP
-37. The size of a Bloom filter is larger than the expected size of a GCS with
-the same false positive rate, which is the main reason the option was rejected.
+Bloom Filters are perhaps the best known probabilistic data structure for testing set membership, and were introduced into the Bitcoin protocol with BIP 37.
+The size of a Bloom filter is larger than the expected size of a GCS with the same false positive rate, which is the main reason the option was rejected.
 
-#### Cryptographic Accumulators
+### Cryptographic Accumulators
 
-[Cryptographic
-accumulators](https://en.wikipedia.org/wiki/Accumulator_(cryptography))
-are a cryptographic data structures that enable (amongst other operations) a one
-way membership test. One advantage of accumulators are that they are constant
-size, independent of the number of elements inserted into the accumulator.
-However, current constructions of cryptographic accumulators require an initial
-trusted set up. Additionally, accumulators based on the Strong-RSA Assumption
-require mapping set items to prime representatives in the associated group which
-can be preemptively expensive.
+[Cryptographic accumulators](https://en.wikipedia.org/wiki/Accumulator_(cryptography)) are cryptographic data structures that enable (amongst other operations) a one-way membership test.
+One advantage of accumulators is that they are constant size, independent of the number of elements inserted into the accumulator.
+However, current constructions of cryptographic accumulators require an initial trusted setup.
+Additionally, accumulators based on the Strong-RSA Assumption require mapping set items to prime representatives in the associated group, which can be prohibitively expensive.
 
-#### Matrix Based Probabilistic Set Data Structures
+### Matrix Based Probabilistic Set Data Structures
 
-There exist data structures based on matrix solving which are even more space
-efficient compared to [Bloom
-filters](https://arxiv.org/pdf/0804.1845.pdf). We instead opted for our
-GCS-based filters as they have a much lower implementation complexity and are
-easier to understand.
+There exist data structures based on matrix solving which are even more space efficient compared to [Bloom filters](https://arxiv.org/pdf/0804.1845.pdf).
+We instead opted for our GCS-based filters as they have a much lower implementation complexity and are easier to understand. ## Appendix B: Pseudocode @@ -416,7 +352,8 @@ gcs_match_any(key: [16]byte, compressed_set: []byte, targets: [][]byte, P: uint, ## Appendix C: Test Vectors -Test vectors for basic block filters on five testnet blocks, including the filters and filter headers, can be found [here](https://github.com/bitcoin/bips/blob/master/bip-0158/testnet-19.json). The code to generate them can be found [here](https://github.com/bitcoin/bips/blob/master/bip-0158/gentestvectors.go). +Test vectors for basic block filters on five testnet blocks, including the filters and filter headers, can be found [here](https://github.com/bitcoin/bips/blob/master/bip-0158/testnet-19.json). +The code to generate them can be found [here](https://github.com/bitcoin/bips/blob/master/bip-0158/gentestvectors.go). ## References @@ -424,4 +361,4 @@ Test vectors for basic block filters on five testnet blocks, including the filte ## Copyright -This document is licensed under the Creative Commons CC0 1.0 Universal license. \ No newline at end of file +This document is licensed under the Creative Commons CC0 1.0 Universal license. diff --git a/protocol/forks/bip-0159.md b/protocol/forks/bip-0159.md index 63115de..36e5c56 100644 --- a/protocol/forks/bip-0159.md +++ b/protocol/forks/bip-0159.md @@ -1,35 +1,35 @@ -
-  BIP: 159
-  Layer: Peer Services
-  Title: NODE_NETWORK_LIMITED service bit
-  Author: Jonas Schnelli <dev@jonasschnelli.ch>
-  Comments-Summary: No comments yet.
-  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0159
-  Status: Draft
-  Type: Standards Track
-  Created: 2017-05-11
-  License: BSD-2-Clause
-
+# BIP-0159 -# Abstract + BIP: 159 + Layer: Peer Services + Title: NODE_NETWORK_LIMITED service bit + Author: Jonas Schnelli <dev@jonasschnelli.ch> + Comments-Summary: No comments yet. + Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0159 + Status: Draft + Type: Standards Track + Created: 2017-05-11 + License: BSD-2-Clause -Define a service bit that allow pruned peers to signal their limited services +## Abstract -# Motivation +Define a service bit that allow pruned peers to signal their limited services. + +## Motivation Pruned peers can offer the same services as traditional peer except of serving all historical blocks. -Bitcoin right now only offers the NODE_NETWORK service bit which indicates that a peer can serve -all historical blocks. +Bitcoin right now only offers the NODE_NETWORK service bit which indicates that a peer can serve all historical blocks. + 1. Pruned peers can relay blocks, headers, transactions, addresses and can serve a limited number of historical blocks, thus they should have a way how to announce their service(s) 2. Peers no longer in initial block download should consider connecting some of its outbound connections to pruned peers to allow other peers to bootstrap from non-pruned peers -# Specification +## Specification -## New service bit +### New service bit This BIP proposes a new service bit -| | | | -|--|--|--| +| | | | +|----------------------|----------------|-------------------------------------------------------------------------------------------------| | NODE_NETWORK_LIMITED | bit 10 (0x400) | If signaled, the peer MUST be capable of serving at least the last 288 blocks (~2 days). | A safety buffer of 144 blocks to handle chain reorganizations SHOULD be taken into account when connecting to a peer signaling the NODE_NETWORK_LIMITED service bit. 
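
One way a client might act on the new service bit is sketched below in Python. The function name `can_request_block` and the choice to subtract the whole 144-block safety buffer from the 288-block window are assumptions of this example, not requirements stated in the BIP. `NODE_NETWORK` as bit 0 comes from the existing Bitcoin P2P protocol rather than from this document.

```python
NODE_NETWORK = 1 << 0            # existing service bit: serves all history
NODE_NETWORK_LIMITED = 1 << 10   # 0x400, as proposed in this BIP

MIN_BLOCKS_SERVED = 288          # blocks a limited peer must be able to serve
REORG_SAFETY_BUFFER = 144        # buffer suggested for chain reorganizations


def can_request_block(services: int, tip_height: int, block_height: int) -> bool:
    """Decide whether a block may reasonably be requested from a peer.

    Hypothetical policy: full peers may be asked for anything, limited
    peers only for blocks within (288 - 144) blocks of the tip.
    """
    if services & NODE_NETWORK:
        return True
    if services & NODE_NETWORK_LIMITED:
        depth = tip_height - block_height
        return 0 <= depth < MIN_BLOCKS_SERVED - REORG_SAFETY_BUFFER
    return False
```

A peer that sets neither bit is treated here as unable to serve blocks at all, which is a conservative choice rather than something the specification mandates.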
@@ -40,13 +40,16 @@ Full nodes following this BIP SHOULD relay address/services (addrSHOULD avoid leaking the prune depth and therefore not serve blocks deeper than the signaled NODE_NETWORK_LIMITED threshold (288 blocks).
+Peers may have different prune depths (depending on the peer's configuration, disk space, etc.) which can result in a fingerprinting weakness (finding the prune depth through getdata requests).
+NODE_NETWORK_LIMITED supporting peers SHOULD avoid leaking the prune depth and therefore not serve blocks deeper than the signaled NODE_NETWORK_LIMITED threshold (288 blocks).
 
 ### Risks
 
 Pruned peers following this BIP may consume more outbound bandwidth.
 
-Light clients (and such) who are not checking the nServiceFlags (service bits) from a relayed addr-message may unwillingly connect to a pruned peer and ask for (filtered) blocks at a depth below their pruned depth. Light clients should therefore check the service bits (and eventually connect to peers signaling NODE_NETWORK_LIMITED if they require [filtered] blocks around the tip). Light clients obtaining peer IPs though DNS seed should use the DNS filtering option.
+Light clients (and such) who are not checking the nServiceFlags (service bits) from a relayed addr-message may unwillingly connect to a pruned peer and ask for (filtered) blocks at a depth below their pruned depth.
+Light clients should therefore check the service bits (and eventually connect to peers signaling NODE_NETWORK_LIMITED if they require [filtered] blocks around the tip).
+Light clients obtaining peer IPs through DNS seeds should use the DNS filtering option.
 
 ## Compatibility
 
@@ -59,4 +62,4 @@ This proposal is backward compatible.
 
 ## Copyright
 
-This BIP is licensed under the 2-clause BSD license.
\ No newline at end of file
+This BIP is licensed under the 2-clause BSD license.
diff --git a/protocol/forks/hf-20171113.md b/protocol/forks/hf-20171113.md index 6fb44af..3c90e09 100644 --- a/protocol/forks/hf-20171113.md +++ b/protocol/forks/hf-20171113.md @@ -1,3 +1,5 @@ +# HF-20171113 + ``` layout: specification title: November 13th Bitcoin Cash Hardfork Technical Details @@ -7,16 +9,18 @@ activation: 1510600000 version: 1.3 ``` -# Summary +## Summary When the median time past[1] of the most recent `11` blocks (`MTP - 11`) is greater than or equal to UNIX timestamp `1510600000` Bitcoin Cash will execute a hardfork according to this specification. Starting from the next block these three consensus rules changes will take effect: * Enforcement of LOW_S signatures ([BIP 0146](//github.com/bitcoin/bips/blob/master/bip-0146.mediawiki#low_s)) * Enforcement of NULLFAIL ([BIP 0146](//github.com/bitcoin/bips/blob/master/bip-0146.mediawiki#nullfail)) -* A replacement for the emergency difficulty adjustment. The algorithm for the new difficulty adjustment is described below +* A replacement for the emergency difficulty adjustment. -# Difficulty Adjustment Algorithm Description +The algorithm for the new difficulty adjustment is described below + +## Difficulty Adjustment Algorithm Description To calculate the difficulty of a given block (`Bn + 1`), with an `MTP-11`[1] greater than or equal to the unix timestamp `1510600000`, perform the following steps: @@ -25,22 +29,25 @@ _NOTE: Implementations must use integer arithmetic only_ 1. Let `Bn` be the Nth block in a Bitcoin Cash Blockchain. 1. Let `Blast` be chosen[2] from `[Bn - 2, Bn - 1, Bn]`. 1. Let `Bfirst` be chosen[2] from `[Bn - 146, Bn - 145, Bn - 144]`. -1. Let the Timespan (`TS`) be equal to the difference in UNIX timestamps (in seconds) between `Blast` and `Bfirst` within the range `[72 * 600, 288 * 600]`. Values outside should be treated as their respective limit. -1. Let the Work Performed (`W`) be equal to the difference in chainwork[3] between Blast and Bfirst. +1. 
Let the Timespan (`TS`) be equal to the difference in UNIX timestamps (in seconds) between `Blast` and `Bfirst` within the range `[72 * 600, 288 * 600]`.
+Values outside should be treated as their respective limit.
+1. Let the Work Performed (`W`) be equal to the difference in chainwork[3] between Blast and Bfirst.
 1. Let the Projected Work (`PW`) be equal to `(W * 600) / TS`.
 1. Let Target (`T`) be equal to `(2^256 - PW) / PW`. This is calculated by taking the two’s complement of `PW` (`-PW`) and dividing it by `PW` (`-PW / PW`).
 1. The target difficulty for block `Bn + 1` is then equal to the lesser of `T` and `0x00000000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF`
 
-# Test Case
+## Test Case
 
 1. Create a genesis block with the following data:
+
 ```
 nHeight = 0;
 nTime = 1269211443;
 nBits = 0x1C0FFFFF;
 ```
+
 2. Add `2049` blocks at `600` second intervals with the same `nBits`.
-1. Add another `10` blocks at `600` second intervals. `nBits` should remain constant.
+1. Add another `10` blocks at `600` second intervals. `nBits` should remain constant.
 1. Add a block `6000` seconds in the future with `nBits` remaining the same.
 1. Add a block `-4800` seconds from the previous block. `nBits` should remain constant.
 1. Add `20` blocks at `600` second intervals. `nBits` should remain constant.
@@ -59,27 +66,28 @@ _NOTE: Implementations must use integer arithmetic only_
 1. `nBits` should be `0x1D00FFFF`.
 1. Add `5` blocks at `6000` second intervals. Target should stay constant at the maximum value.
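
The arithmetic of the difficulty adjustment steps can be sketched in Python as follows. The sketch takes the numerator in the Target step to be 2^256, which is what the two's-complement description implies, and it takes the work difference and the two timestamps as plain inputs rather than selecting `Blast` and `Bfirst` from a chain. The name `next_target` is hypothetical.

```python
# Eight zero hex digits followed by 56 F digits, i.e. 2^224 - 1.
MAX_TARGET = (1 << 224) - 1

MIN_TIMESPAN = 72 * 600
MAX_TIMESPAN = 288 * 600


def next_target(work_diff: int, time_first: int, time_last: int) -> int:
    """Compute the target from chainwork and timestamp differences.

    work_diff is W (chainwork of Blast minus chainwork of Bfirst) and is
    assumed to be positive. Integer arithmetic only.
    """
    ts = time_last - time_first
    ts = max(MIN_TIMESPAN, min(MAX_TIMESPAN, ts))   # clamp to the allowed range

    pw = (work_diff * 600) // ts                    # projected work per block
    target = ((1 << 256) - pw) // pw                # (-PW) / PW in 256-bit math
    return min(target, MAX_TARGET)
```

Because Python integers are unbounded, the two's complement is written out explicitly as `(1 << 256) - pw`; a fixed-width 256-bit type would obtain the same value by negating `pw`.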
-# References +## References - [Algorithm](//github.com/Bitcoin-ABC/bitcoin-abc/commit/be51cf295c239ff6395a0aa67a3e13906aca9cb2) - [Activation](//github.com/Bitcoin-ABC/bitcoin-abc/commit/18dc8bb907091d69f4887560ab2e4cfbc19bae77) - [Activation Time](//github.com/Bitcoin-ABC/bitcoin-abc/commit/8eed7939c72781a812fdf3fb8c36d4e3a428d268) - [Test Case](//github.com/Bitcoin-ABC/bitcoin-abc/blob/d8eac91f8d16716eed0ad11ccac420122280bb13/src/test/pow_tests.cpp#L193) -FAQ +## FAQ --- > Q: Does this imply that if the blocks are timestamped sequentially, the last block has no effect since it will look at the block before that one? > > A: Yes -Footnotes ---------- +## Footnotes + +--- 1. The `MTP-11` of a block is defined as the median timestamp of the last `11` blocks prior to, and including, a specific block. 1. A block is chosen via the following mechanism: ->Given a list: `S = [Bn - 2, Bn - 1, Bn]` - +> Given a list: `S = [Bn - 2, Bn - 1, Bn]` +> >> a. If timestamp (`S0`) greater than timestamp (`S2`) then swap `S0` and `S2`. >> >> b. If timestamp (`S0`) greater than timestamp (`S1`) then swap `S0` and `S1`. @@ -87,7 +95,8 @@ Footnotes >> c. If timestamp (`S1`) greater than timestamp (`S2`) then swap `S1` and `S2`. >> >> d. Return `S1`. - +> > See [GetSuitableBlock](https://github.com/Bitcoin-ABC/bitcoin-abc/commit/be51cf295c239ff6395a0aa67a3e13906aca9cb2#diff-ba91592f703a9d0badf94e67144bc0aaR208) -3. Chainwork for a Block (B) is the sum of block proofs from the genesis block up to and including block `B`. `Block proof` is defined in [chain.cpp](https://github.com/Bitcoin-ABC/bitcoin-abc/blob/d8eac91f8d16716eed0ad11ccac420122280bb13/src/chain.cpp#L132) +3. Chainwork for a Block (B) is the sum of block proofs from the genesis block up to and including block `B`. 
+ `Block proof` is defined in [chain.cpp](https://github.com/Bitcoin-ABC/bitcoin-abc/blob/d8eac91f8d16716eed0ad11ccac420122280bb13/src/chain.cpp#L132) diff --git a/protocol/forks/hf-20180515.md b/protocol/forks/hf-20180515.md new file mode 100644 index 0000000..5e70336 --- /dev/null +++ b/protocol/forks/hf-20180515.md @@ -0,0 +1,40 @@ +# HF-20180515 + + layout: specification + title: May 2018 Hardfork Specification + category: spec + date: 2018-04-09 + activation: 1526400000 + +## Summary + +When the median time past of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1526400000 Bitcoin Cash will execute a hardfork according to this specification. +Starting from the next block these consensus rules changes will take effect: + +* Blocksize increase to 32,000,000 bytes +* Re-enabling of several opcodes + +The following are not consensus changes, but are recommended changes for Bitcoin Cash implementations: + +* Automatic replay protection for future hardforks +* Increase OP_RETURN relay size to 223 total bytes + +## Blocksize increase + +The blocksize hard capacity limit will be increased to 32MB (32000000 bytes). + +## OpCodes + +Several opcodes will be re-enabled per [may-2018-reenabled-opcodes](/protocol/forks/may-2018-reenabled-opcodes) + +## Automatic Replay Protection + +When the median time past of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1542300000 (November 2018 hardfork) Bitcoin Cash full nodes implementing the May 2018 consensus rules SHOULD enforce the following change: + +* Update `forkid`[1] to be equal to 0xFF0001. + +ForkIDs beginning with 0xFF will be reserved for future protocol upgrades. + +This particular consensus rule MUST NOT be implemented by Bitcoin Cash wallet software. + +[1] The `forkId` is defined as per the [replay protected sighash](/protocol/forks/replay-protected-sighash) specification. 
diff --git a/protocol/forks/hf-20181115.md b/protocol/forks/hf-20181115.md new file mode 100644 index 0000000..40d8ea6 --- /dev/null +++ b/protocol/forks/hf-20181115.md @@ -0,0 +1,76 @@ +# HF-20181115 + + layout: specification + title: 2018 November 15 Network Upgrade Specification + date: 2018-10-10 + category: spec + activation: 1542300000 + version: 0.5 + +## Summary + +When the median time past [1] of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1542300000, Bitcoin Cash will execute an upgrade of the network consensus rules according to this specification. +Starting from the next block these consensus rules changes will take effect: + +* Remove topological transaction order constraint, and enforce canonical transaction order +* Enable OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY opcodes +* Enforce minimum transaction size +* Enforce "push only" rule for scriptSig +* Enforce "clean stack" rule + +The following are not consensus changes, but are recommended changes for Bitcoin Cash implementations: + +* Automatic replay protection for future upgrade + +## Canonical Transaction Order + +With the exception of the coinbase transaction, transactions within a block MUST be sorted in numerically ascending order of the transaction id, interpreted as 256-bit little endian integers. +The coinbase transaction MUST be the first transaction in a block. + +## OpCodes + +New opcodes OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY will be enabled as specified in [op_checkdatasig](/protocol/forks/op_checkdatasig) [2]. + +## Minimum Transaction Size + +Transactions that are smaller than 100 bytes shall be considered invalid. +This protects against a Merkle tree vulnerability that allows attackers to spoof transactions against SPV wallets [3]. + +## Push Only + +Transactions shall be considered invalid if an opcode with number greater than 96 (hex encoding 0x60) appears in a scriptSig. +This is the same as Bitcoin BIP 62 rule #2 [4]. 
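The "push only" rule above can be sketched as a scan over the raw scriptSig bytes. This is an illustrative sketch rather than node code: `IsPushOnly` is a hypothetical name, and treating a truncated push as a failure is an assumption made here, not something the text above states.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if every opcode in the script is at most 0x60 (OP_16).
// Push payloads are skipped so that data bytes are not mistaken for opcodes.
bool IsPushOnly(const std::vector<std::uint8_t> &script) {
    std::size_t pos = 0;
    while (pos < script.size()) {
        const std::uint8_t opcode = script[pos++];
        if (opcode > 0x60) return false;

        std::size_t len = 0;
        if (opcode >= 0x01 && opcode <= 0x4b) {
            len = opcode; // direct push of 1..75 bytes
        } else if (opcode >= 0x4c && opcode <= 0x4e) {
            // OP_PUSHDATA1/2/4: the length follows as a little-endian integer.
            const std::size_t width = (opcode == 0x4c) ? 1 : (opcode == 0x4d) ? 2 : 4;
            if (script.size() - pos < width) return false;
            for (std::size_t i = 0; i < width; ++i) {
                len |= static_cast<std::size_t>(script[pos + i]) << (8 * i);
            }
            pos += width;
        }
        // Assumption: a push that runs past the end of the script fails.
        if (script.size() - pos < len) return false;
        pos += len;
    }
    return true;
}
```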
+ +## Clean Stack + +For a transaction to be valid, only a single non-zero item must remain on the stack upon completion of Script evaluation. +If any extra data elements remain on the stack, the script evaluates to false. +This is the same as Bitcoin BIP 62 rule #6 [4]. + +## Automatic Replay Protection + +When the median time past [2] of the most recent 11 blocks (MTP-11) is less than UNIX timestamp 1557921600 (May 2019 upgrade) Bitcoin Cash full nodes MUST enforce the following rule: + +* `forkid` [5] to be equal to 0. + +When the median time past [1] of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1557921600 (May 2019 upgrade) Bitcoin Cash full nodes implementing the November 2018 consensus rules SHOULD enforce the following change: + +* Update `forkid` [5] to be equal to 0xFF0001. + +ForkIDs beginning with 0xFF will be reserved for future protocol upgrades. + +This particular consensus rule MUST NOT be implemented by Bitcoin Cash wallet software. +Wallets that follow the upgrade should not have to change anything. + +## References + +[1] Median Time Past is described in [bitcoin.it wiki](https://en.bitcoin.it/wiki/Block_timestamp). +It is guaranteed by consensus rules to be monotonically increasing. + +[2] https://github.com/bitcoincashorg/bitcoincash.org/blob/master/spec/op_checkdatasig.md + +[3] [Leaf-Node weakness in Bitcoin Merkle Tree Design](https://bitslog.wordpress.com/2018/06/09/leaf-node-weakness-in-bitcoin-merkle-tree-design/) + +[4] [BIP 62](https://github.com/bitcoin/bips/blob/master/bip-0062.mediawiki) + +[5] The `forkId` is defined as per the [replay protected sighash](/protocol/forks/replay-protected-sighash) specification. 
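The automatic replay protection rule of hf-20181115 reduces to a choice of fork ID based on the MTP-11 of the chain tip. A minimal sketch follows; `ExpectedForkId` and the constant name are hypothetical and only illustrate the two cases stated in the specification.

```cpp
#include <cstdint>

// Activation time of the May 2019 upgrade, as given in hf-20181115.
constexpr std::int64_t kMay2019ActivationTime = 1557921600;

// Fork ID a full node implementing the November 2018 rules would expect,
// given the MTP-11 of the current chain tip.
constexpr std::uint32_t ExpectedForkId(std::int64_t medianTimePast) {
    return medianTimePast >= kMay2019ActivationTime ? 0xFF0001u : 0u;
}
```

A node that never upgrades starts requiring `0xFF0001` at the activation time, while upgraded nodes and wallets keep using fork ID 0, so the two sets of nodes stop accepting each other's transactions.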
diff --git a/protocol/forks/hf-20190515.md b/protocol/forks/hf-20190515.md new file mode 100644 index 0000000..9a05d0d --- /dev/null +++ b/protocol/forks/hf-20190515.md @@ -0,0 +1,54 @@ +# HF-20190515 + + layout: specification + title: 2019-MAY-15 Network Upgrade Specification + date: 2019-02-28 + category: spec + activation: 1557921600 + version: 0.5 + +## Summary + +When the median time past [1] of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1557921600, Bitcoin Cash will execute an upgrade of the network consensus rules according to this specification. +Starting from the next block these consensus rules changes will take effect: + +* Enable Schnorr signatures. +* Allow Segwit recovery. + +The following are not consensus changes, but are recommended changes for Bitcoin Cash implementations: + +* Automatic replay protection for future upgrade + +## Enable Schnorr signatures + +Support Schnorr signatures in CHECKSIG and CHECKDATASIG per [2019-05-15-schnorr](/protocol/forks/2019-05-15-schnorr). + +## Allow Segwit recovery + +In the last upgrade, coins accidentally sent to Segwit P2SH addresses were made unspendable by the CLEANSTACK rule. +This upgrade will make an exemption for these coins and return them to the previous situation, where they are spendable. +This means that once the P2SH redeem script pre-image is revealed (for example by spending coins from the corresponding BTC address), any miner can take the coins. + +Details: [2019-05-15-segwit-recovery](/protocol/forks/2019-05-15-segwit-recovery) + +## Automatic Replay Protection + +When the median time past [1] of the most recent 11 blocks (MTP-11) is less than UNIX timestamp 1573819200 (Nov 2019 upgrade) Bitcoin Cash full nodes MUST enforce the following rule: + +* `forkid` [2] to be equal to 0. 
+ +When the median time past [1] of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1573819200 (Nov 2019 upgrade) Bitcoin Cash full nodes implementing the May 2019 consensus rules SHOULD enforce the following change: + +* Update `forkid` [2] to be equal to 0xFF0002. + +ForkIDs beginning with 0xFF will be reserved for future protocol upgrades. + +This particular consensus rule MUST NOT be implemented by Bitcoin Cash wallet software. +Wallets that follow the upgrade should not have to change anything. + +## References + +[1] Median Time Past is described in [bitcoin.it wiki](https://en.bitcoin.it/wiki/Block_timestamp). +It is guaranteed by consensus rules to be monotonically increasing. + +[2] The `forkId` is defined as per the [replay protected sighash](/protocol/forks/replay-protected-sighash) specification. diff --git a/protocol/forks/hf-20191115.md b/protocol/forks/hf-20191115.md new file mode 100644 index 0000000..f952b9b --- /dev/null +++ b/protocol/forks/hf-20191115.md @@ -0,0 +1,61 @@ +# HF-20191115 + + layout: specification + title: 2019-NOV-15 Network Upgrade Specification + date: 2019-10-23 + category: spec + activation: 1573819200 + version: 0.4 + +## Summary + +When the median time past [1] of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1573819200, Bitcoin Cash will execute an upgrade of the network consensus rules according to this specification. +Starting from the next block these consensus rules changes will take effect: + +* Enable Schnorr signatures for OP_CHECKMULTISIG(VERIFY). +* Enforce minimal push and minimal number encoding rules in Script. + +The following are not consensus changes, but are recommended changes for Bitcoin Cash implementations: + +* Automatic replay protection for future upgrade + +## Schnorr Signatures for OP_CHECKMULTISIG(VERIFY) + +Use of Schnorr signatures is enabled in OP_CHECKMULTISIG(VERIFY).
+The dummy element is repurposed to flag Schnorr mode when it is non-null, and the order of signatures in Schnorr mode is constrained according to the bitfield encoded in the repurposed dummy element. + +Details can be found in the [full specification: 2019-11-15-schnorrmultisig](/protocol/forks/2019-11-15-schnorrmultisig). + +NOTE: The repurposing of the dummy element as a flag and bitfield supersedes the need for NULLDUMMY. + +## Enforce MINIMALDATA in Script. + +Enforce existing standardness checks that all executed data pushes use minimal push operators, and all numbers are encoded minimally, together known as the "MINIMALDATA" rule. +This goes into effect at the consensus layer. + +Details can be found in the [full specification: 2019-11-15-minimaldata](/protocol/forks/2019-11-15-minimaldata). + +## Automatic Replay Protection + +The purpose of Automatic Replay Protection is to serve as a full node version-deprecation mechanism. +It is intended to cause full validating nodes which do not upgrade, to automatically separate themselves from the main network after the next upgrade on 15 May 2020. +Nodes which implement the next upgrade will remove this automatic replay protection, and thus all regular wallets can continue using the default ForkID with no change to follow the main upgraded chain. + +When the median time past [1] of the most recent 11 blocks (MTP-11) is less than UNIX timestamp 1589544000 (May 2020 upgrade) Bitcoin Cash full nodes MUST enforce the following rule: + +* `forkid` [2] to be equal to 0. + +When the median time past [1] of the most recent 11 blocks (MTP-11) is greater than or equal to UNIX timestamp 1589544000 (May 2020 upgrade) Bitcoin Cash full nodes implementing the Nov 2019 consensus rules SHOULD enforce the following change: + +* Update `forkid` [2] to be equal to `0xFFXXXX`, where `XXXX` is some arbitrary hex value. +ForkIDs beginning with 0xFF will be reserved for future protocol upgrades. 
+ +This particular consensus rule MUST NOT be implemented by Bitcoin Cash wallet software. +Wallets that follow the upgrade should not have to change anything. + +## References + +[1] Median Time Past is described in [bitcoin.it wiki](https://en.bitcoin.it/wiki/Block_timestamp). +It is guaranteed by consensus rules to be monotonically increasing. + +[2] The `forkId` is defined as per the [replay protected sighash](/protocol/forks/replay-protected-sighash) specification. diff --git a/protocol/forks/may-2018-reenabled-opcodes.md b/protocol/forks/may-2018-reenabled-opcodes.md new file mode 100644 index 0000000..3151f01 --- /dev/null +++ b/protocol/forks/may-2018-reenabled-opcodes.md @@ -0,0 +1,494 @@ +# Restore disabled script opcodes, May 2018 + + layout: specification + title: Restore disabled script opcodes, May 2018 + category: spec + date: 2018-04-05 + activation: 1526400000 + version: 0.4 + updated: 2018-05-23 + +## Introduction + +In 2010 and 2011 the discovery of serious bugs prompted the deactivation of many opcodes in the Bitcoin script language. +It is our intention to restore the functionality that some of these opcodes provided in Bitcoin Cash. +Rather than simply re-enable the opcodes, the functionality that they provide has been re-examined and in some cases the opcodes have been re-designed or new opcodes have been added to address specific issues. + +This document contains the specifications for the opcodes that are to be added in the May 2018 protocol upgrade. +We anticipate that additional opcodes will be proposed for the November 2018, or later, protocol upgrades. 
+ +The opcodes that are to be added are: + +|Word |OpCode |Hex |Input |Output | Description | +|-----------|-------|----|--------------|--------|------------------------------------------------------------------| +|OP_CAT |126 |0x7e|x1 x2 |out |Concatenates two byte sequences | +|OP_SPLIT |127 |0x7f|x n |x1 x2 |Split byte sequence *x* at position *n* | +|OP_AND |132 |0x84|x1 x2 |out |Boolean *AND* between each bit of the inputs | +|OP_OR |133 |0x85|x1 x2 |out |Boolean *OR* between each bit of the inputs | +|OP_XOR |134 |0x86|x1 x2 |out |Boolean *EXCLUSIVE OR* between each bit of the inputs | +|OP_DIV |150 |0x96|a b |out |*a* is divided by *b* | +|OP_MOD |151 |0x97|a b |out |return the remainder after *a* is divided by *b* | +|OP_NUM2BIN |128 |0x80|a b |out |convert numeric value *a* into byte sequence of length *b* | +|OP_BIN2NUM |129 |0x81|x |out |convert byte sequence *x* into a numeric value | + +Splice operations: `OP_CAT`, `OP_SPLIT`** + +Bitwise logic: `OP_AND`, `OP_OR`, `OP_XOR` + +Arithmetic: `OP_DIV`, `OP_MOD` + +New operations: + +* `x OP_BIN2NUM -> n`, convert a byte sequence `x` into a numeric value +* `n m OP_NUM2BIN -> out`, convert a numeric value `n` into a byte sequence of length `m` + +Further discussion of the purpose of these new operations can be found below under *bitwise operations*. + +** A new operation, `OP_SPLIT`, has been designed as a replacement for `OP_SUBSTR`, `OP_LEFT`and `OP_RIGHT`. +The original operations can be implemented with varying combinations of `OP_SPLIT`, `OP_SWAP` and `OP_DROP`. + +## Script data types + +It should be noted that in script operation data values on the stack are interpreted as either byte sequences or numeric values. +**All data on the stack is interpreted as a byte sequence unless specifically stated as being interpreted as a numeric value.** + +For accuracy in this specification, a byte sequences is presented as {0x01, 0x02, 0x03}. 
+This sequence is three bytes long; it begins with a byte of value 1 and ends with a byte of value 3. + +The numeric value type has specific limitations: + +1. The encoding used is little-endian with an explicit sign bit (the highest bit of the last byte). +2. They cannot exceed 4 bytes in length. +3. They must be encoded using the shortest possible byte length (no zero padding) + 1. There is one exception to rule 3: if there is more than one byte and the most significant bit of the second-most-significant-byte is set it would conflict with the sign bit. + In this case a single 0x00 or 0x80 byte is allowed to the left. +4. Zero is encoded as a zero length byte sequence. +Single byte positive or negative zero (0x00 or 0x80) are not allowed. + +The new opcode `x OP_BIN2NUM -> out` can be used to convert a byte sequence into a numeric value where required. + +The new opcode `x n OP_NUM2BIN` can be used to convert a numeric value into a zero padded byte sequence of length `n` whilst preserving the sign bit. + +## Definitions + +* *Stack memory use*. +This is the sum of the size of the elements on the stack. +It gives an indication of impact on memory use by the interpreter. +* *Operand order*. +In keeping with convention where multiple operands are specified the top most stack item is the last operand. +e.g. `x1 x2 OP_CAT` --> `x2` is the top stack item and `x1` is the next from the top. +* *empty byte sequence*. +Throughout this document `OP_0` is used as a convenient representation of an empty byte sequence. +Whilst it is a push data op code, its effect is to push an empty byte sequence on to the stack. + +## Specification + +Global conditions apply to all operations.
+These conditions must be checked by the implementation when it is possible that they will occur: + +* for all e : elements on the stack, `0 <= len(e) <= MAX_SCRIPT_ELEMENT_SIZE` +* for each operator, the required number of operands are present on the stack when the operator is executed + +These unit tests should be included for every operation: + +1. executing the operation with an input element of length greater than `MAX_SCRIPT_ELEMENT_SIZE` will fail +2. executing the operation with an insufficient number of operands on the stack causes a failure + +### Operand consumption + +In all cases where not explicitly stated otherwise the operand stack elements are consumed by the operation and replaced with the output. + +## Splice operations + +### OP_CAT + + Opcode (decimal): 126 + Opcode (hex): 0x7e + +Concatenates two operands. + + x1 x2 OP_CAT -> out + +Examples: + +* `{0x11} {0x22, 0x33} OP_CAT -> {0x11, 0x22, 0x33}` + +The operator must fail if `len(out) > MAX_SCRIPT_ELEMENT_SIZE`. +The operation cannot output elements that violate the constraint on the element size. + +Note that the concatenation of a zero length operand is valid. + +Impact of successful execution: + +* stack memory use is constant +* number of elements on stack is reduced by one + +The limit on the length of the output prevents the memory exhaustion attack and results in the operation having less impact on stack size than existing OP_DUP operators. + +Unit tests: + +1. `maxlen_x y OP_CAT -> failure`. +Concatenating any operand except an empty vector, including a single byte value (e.g. `OP_1`), onto a maximum sized array causes failure +2. `large_x large_y OP_CAT -> failure`. +Concatenating two operands, where the total length is greater than `MAX_SCRIPT_ELEMENT_SIZE`, causes failure +3. `OP_0 OP_0 OP_CAT -> OP_0`. +Concatenating two empty arrays results in an empty array +4. `x OP_0 OP_CAT -> x`.
+Concatenating an empty array onto any operand results in the operand, including when `len(x) = MAX_SCRIPT_ELEMENT_SIZE` +5. `OP_0 x OP_CAT -> x`. +Concatenating any operand onto an empty array results in the operand, including when `len(x) = MAX_SCRIPT_ELEMENT_SIZE` +6. `x y OP_CAT -> concat(x,y)`. +Concatenating two operands generates the correct result + +### OP_SPLIT + +*`OP_SPLIT` replaces `OP_SUBSTR` and uses its opcode.* + + Opcode (decimal): 127 + Opcode (hex): 0x7f + +Split the operand at the given position. +This operation is the exact inverse of OP_CAT. + + x n OP_SPLIT -> x1 x2 + + where n is interpreted as a numeric value + +Examples: + +* `{0x00, 0x11, 0x22} 0 OP_SPLIT -> OP_0 {0x00, 0x11, 0x22}` +* `{0x00, 0x11, 0x22} 1 OP_SPLIT -> {0x00} {0x11, 0x22}` +* `{0x00, 0x11, 0x22} 2 OP_SPLIT -> {0x00, 0x11} {0x22}` +* `{0x00, 0x11, 0x22} 3 OP_SPLIT -> {0x00, 0x11, 0x22} OP_0` + +Notes: + +* this operator has been introduced as a replacement for the previous `OP_SUBSTR`, `OP_LEFT` and `OP_RIGHT`. +All three operators can be simulated with varying combinations of `OP_SPLIT`, `OP_SWAP` and `OP_DROP`. +This is in keeping with the minimalist philosophy where a single primitive can be used to simulate multiple more complex operations. +* `x` is split at position `n`, where `n` is the number of bytes from the beginning +* `x1` will be the first `n` bytes of `x` and `x2` will be the remaining bytes +* if `n == 0`, then `x1` is the empty array and `x2 == x` +* if `n == len(x)` then `x1 == x` and `x2` is the empty array. +* if `n > len(x)`, then the operator must fail. +* `x n OP_SPLIT OP_CAT -> x`, for all `x` and for all `0 <= n <= len(x)` + +The operator must fail if: + +* `!isnum(n)`. +Fail if `n` is not a numeric value. +* `n < 0`. +Fail if `n` is negative. +* `n > len(x)`. +Fail if `n` exceeds the length of `x`.
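The failure conditions for `OP_CAT` and `OP_SPLIT`, and the inverse relationship between the two, can be sketched as follows. The function names are hypothetical and operate on plain byte vectors rather than on an interpreter stack. The value 520 for `MAX_SCRIPT_ELEMENT_SIZE` is an assumption for illustration, since this specification only names the constant, and the `!isnum(n)` check is assumed to have been done when `n` was decoded.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

using ByteVec = std::vector<std::uint8_t>;

// Assumed value for illustration; the specification only names the constant.
constexpr std::size_t MAX_SCRIPT_ELEMENT_SIZE = 520;

// x1 x2 OP_CAT -> out; fails (nullopt) when the result would be too large.
std::optional<ByteVec> OpCat(const ByteVec &x1, const ByteVec &x2) {
    if (x1.size() + x2.size() > MAX_SCRIPT_ELEMENT_SIZE) return std::nullopt;
    ByteVec out(x1);
    out.insert(out.end(), x2.begin(), x2.end());
    return out;
}

// x n OP_SPLIT -> x1 x2; fails (nullopt) when n < 0 or n > len(x).
std::optional<std::pair<ByteVec, ByteVec>> OpSplit(const ByteVec &x, std::int64_t n) {
    if (n < 0 || static_cast<std::uint64_t>(n) > x.size()) return std::nullopt;
    const auto mid = x.begin() + static_cast<std::ptrdiff_t>(n);
    return std::make_pair(ByteVec(x.begin(), mid), ByteVec(mid, x.end()));
}
```

Splitting at any valid position and concatenating the two halves gives back the original operand, which is the `x n OP_SPLIT OP_CAT -> x` property listed in the notes.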
+ +Impact of successful execution: + +* stack memory use is constant (slight reduction by `len(n)`) +* number of elements on stack is constant + +Unit tests: + +* `OP_0 0 OP_SPLIT -> OP_0 OP_0`. +Execution of OP_SPLIT on empty array results in two empty arrays. +* `x 0 OP_SPLIT -> OP_0 x` +* `x len(x) OP_SPLIT -> x OP_0` +* `x (len(x) + 1) OP_SPLIT -> FAIL` +* include successful unit tests + +## Bitwise logic + +The bitwise logic operators expect 'byte sequence' operands. +The operands must be the same length. + +* In the case of 'byte sequence' operands `OP_CAT` can be used to pad a shorter byte sequence to an appropriate length. +* In the case of 'byte sequence' operands where the length of operands is not known until runtime a sequence of 0x00 bytes (for use with `OP_CAT`) can be produced using `OP_0 n OP_NUM2BIN` +* In the case of numeric value operands `x n OP_NUM2BIN` can be used to pad a numeric value to length `n` whilst preserving the sign bit. + +### OP_AND + + Opcode (decimal): 132 + Opcode (hex): 0x84 + +Boolean *and* between each bit in the operands. + + x1 x2 OP_AND -> out + +Notes: + +* where `len(x1) == 0` and `len(x2) == 0` the output will be an empty array. + +The operator must fail if: + +1. `len(x1) != len(x2)`. +The two operands must be the same size. + +Impact of successful execution: + +* stack memory use reduced by `len(x1)` +* number of elements on stack is reduced by one + +Unit tests: + +1. `x1 x2 OP_AND -> failure`, where `len(x1) != len(x2)`. +The two operands must be the same size. +2. `x1 x2 OP_AND -> x1 & x2`. +Check valid results. + +### OP_OR + + Opcode (decimal): 133 + Opcode (hex): 0x85 + +Boolean *or* between each bit in the operands. + + x1 x2 OP_OR -> out + +The operator must fail if: + +1. `len(x1) != len(x2)`. +The two operands must be the same size. + +Impact of successful execution: + +* stack memory use reduced by `len(x1)` +* number of elements on stack is reduced by one + +Unit tests: + +1. 
`x1 x2 OP_OR -> failure`, where `len(x1) != len(x2)`. The two operands must be the same size. +2. `x1 x2 OP_OR -> x1 | x2`. Check valid results. + +### OP_XOR + + Opcode (decimal): 134 + Opcode (hex): 0x86 + +Boolean *xor* between each bit in the operands. + + x1 x2 OP_XOR -> out + +The operator must fail if: + +1. `len(x1) != len(x2)`. +The two operands must be the same size. + +Impact of successful execution: + +* stack memory use reduced by `len(x1)` +* number of elements on stack is reduced by one + +Unit tests: + +1. `x1 x2 OP_XOR -> failure`, where `len(x1) != len(x2)`. +The two operands must be the same size. +2. `x1 x2 OP_XOR -> x1 xor x2`. +Check valid results. + +## Arithmetic + +### Note about canonical form and floor division + +Operands for all arithmetic operations are assumed to be numeric values and must be in canonical form +See [data types](#Script%20data%20types) for more information. + +#### Floor division + +Note that when considering integer division and modulo operations with negative operands, the rules applied in the C language and most languages (with Python being a notable exception) differ from the strict mathematical definition. +Script follows the C language set of rules. +Namely: + +1. Non-integer quotients are rounded towards zero +2. The equation `(a/b)*b + a%b == a` is satisfied by the results +3. From the above equation it follows that: `a%b == a - (a/b)*b` +4. In practice if `a` is negative for the modulo operator the result will be negative or zero. + +### OP_DIV + + Opcode (decimal): 150 + Opcode (hex): 0x96 + +Return the integer quotient of `a` and `b`. +If the result would be a non-integer it is rounded *towards* zero. + + a b OP_DIV -> out + +where a and b are interpreted as numeric values + +The operator must fail if: + +1. `!isnum(a) || !isnum(b)`. +Fail if either operand is not a numeric value. +1. `b == 0`. +Fail if `b` is equal to any type of zero. 
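The rounding rules described under "Floor division" match the behaviour of the built-in `/` and `%` operators in C++11 and later, so the arithmetic can be sketched directly. The function names are hypothetical. The sketch works on already-decoded integers and assumes the operands fit in the 4-byte numeric range, so the quotient cannot overflow.

```cpp
#include <cstdint>
#include <optional>

// a b OP_DIV -> out; fails (nullopt) when b is zero.
std::optional<std::int64_t> OpDiv(std::int64_t a, std::int64_t b) {
    if (b == 0) return std::nullopt;
    return a / b; // quotient is truncated toward zero
}

// a b OP_MOD -> out; fails (nullopt) when b is zero.
std::optional<std::int64_t> OpMod(std::int64_t a, std::int64_t b) {
    if (b == 0) return std::nullopt;
    return a % b; // remainder is zero or takes the sign of a
}
```

For every non-zero `b` the two results satisfy `(a/b)*b + a%b == a`, which is rule 2 in the list above.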
+ +Impact of successful execution: + +* stack memory use reduced +* number of elements on stack is reduced by one + +Unit tests: + +1. `a b OP_DIV -> failure` where `!isnum(a)` or `!isnum(b)`. +Both operands must be numeric values +2. `a 0 OP_DIV -> failure`. +Division by positive zero (all sizes), negative zero (all sizes), `OP_0` +3. `27 7 OP_DIV -> 3`, `27 -7 OP_DIV -> -3`, `-27 7 OP_DIV -> -3`, `-27 -7 OP_DIV -> 3`. +Check negative operands. +*Pay attention to sign*. +4. check valid results for operands of different lengths `0..4` + +### OP_MOD + + Opcode (decimal): 151 + Opcode (hex): 0x97 + +Returns the remainder after dividing a by b. +The output will be represented using the least number of bytes required. + + a b OP_MOD -> out + +where a and b are interpreted as numeric values + +The operator must fail if: + +1. `!isnum(a) || !isnum(b)`. +Fail if either operand is not a numeric value. +1. `b == 0`. +Fail if `b` is equal to any type of zero. + +Impact of successful execution: + +* stack memory use reduced (one element removed) +* number of elements on stack is reduced by one + +Unit tests: + +1. `a b OP_MOD -> failure` where `!isnum(a)` or `!isnum(b)`. +Both operands must be numeric values. +2. `a 0 OP_MOD -> failure`. +Division by positive zero (all sizes), negative zero (all sizes), `OP_0` +3. `27 7 OP_MOD -> 6`, `27 -7 OP_MOD -> 6`, `-27 7 OP_MOD -> -6`, `-27 -7 OP_MOD -> -6`. +Check negative operands. +*Pay attention to sign*. +4. check valid results for operands of different lengths `0..4` and returning result zero + +## New operations + +### OP_NUM2BIN + +*`OP_NUM2BIN` replaces `OP_LEFT` and uses it's opcode* + + Opcode (decimal): 128 + Opcode (hex): 0x80 + +Convert the numeric value into a byte sequence of a certain size, taking account of the sign bit. +The byte sequence produced uses the little-endian encoding. + + a b OP_NUM2BIN -> x + +where `a` and `b` are interpreted as numeric values. 
+`a` is the value to be converted to a byte sequence, +it can be up to `MAX_SCRIPT_ELEMENT_SIZE` long and does not need to be minimally encoded. +`b` is the desired size of the result, it must be minimally encoded and <= 4 bytes long. +It must be possible for the +value `a` to be encoded in a byte sequence of length `b` without loss of data. + +See also `OP_BIN2NUM`. + +Examples: + +* `2 4 OP_NUM2BIN -> {0x02, 0x00, 0x00, 0x00}` +* `-5 4 OP_NUM2BIN -> {0x05, 0x00, 0x00, 0x80}` + +The operator must fail if: + +1. `b` is not a minimally encoded numeric value. +2. `b < len(minimal_encoding(a))`. +`a` must be able to fit into `b` bytes. +3. `b > MAX_SCRIPT_ELEMENT_SIZE`. +The result would be too large. + +Impact of successful execution: + +* stack memory use will be increased by `b - len(a) - len(b)`, maximum increase is when `b = MAX_SCRIPT_ELEMENT_SIZE` +* number of elements on stack is reduced by one + +Unit tests: + +1. `a b OP_NUM2BIN -> failure` where `!isnum(b)`. +`b` must be a minimally encoded numeric value. +2. `256 1 OP_NUM2BIN -> failure`. +Trying to produce a byte sequence which is smaller than the minimum size needed to + contain the numeric value. +3. `1 (MAX_SCRIPT_ELEMENT_SIZE+1) OP_NUM2BIN -> failure`. +Trying to produce an array which is too large. +4. other valid parameters with various results + +### OP_BIN2NUM + +*`OP_BIN2NUM` replaces `OP_RIGHT` and uses it's opcode* + + Opcode (decimal): 129 + Opcode (hex): 0x81 + +Convert the byte sequence into a numeric value, including minimal encoding. +The byte sequence must encode the value in little-endian encoding. + + a OP_BIN2NUM -> x + +See also `OP_NUM2BIN`. + +Notes: + +* if `a` is any form of zero, including negative zero, then `OP_0` must be the result + +Examples: + +* `{0x02, 0x00, 0x00, 0x00, 0x00} OP_BIN2NUM -> 2`. +`0x0200000000` in little-endian encoding has value 2. +* `{0x05, 0x00, 0x80} OP_BIN2NUM -> -5` - `0x050080` in little-endian encoding has value -5. 
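The two conversions can be sketched on top of a helper that produces the minimal little-endian, sign-bit encoding. All names below are hypothetical and the code is not the reference implementation. The value 520 for `MAX_SCRIPT_ELEMENT_SIZE` is an assumption, and `Bin2Num` returns the decoded integer, so the stack element that `OP_BIN2NUM` would push corresponds to `EncodeMinimal` applied to that integer.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using ByteVec = std::vector<std::uint8_t>;

// Assumed value for illustration; the specification only names the constant.
constexpr std::size_t MAX_SCRIPT_ELEMENT_SIZE = 520;

// Minimal little-endian encoding with the sign in the top bit of the last byte.
ByteVec EncodeMinimal(std::int64_t value) {
    ByteVec out;
    if (value == 0) return out; // zero is the empty byte sequence
    const bool negative = value < 0;
    std::uint64_t magnitude = negative ? 0 - static_cast<std::uint64_t>(value)
                                       : static_cast<std::uint64_t>(value);
    while (magnitude != 0) {
        out.push_back(static_cast<std::uint8_t>(magnitude & 0xff));
        magnitude >>= 8;
    }
    if (out.back() & 0x80) {
        out.push_back(negative ? 0x80 : 0x00); // extra byte carries the sign
    } else if (negative) {
        out.back() |= 0x80;
    }
    return out;
}

// a b OP_NUM2BIN -> x; fails when a does not fit or b is too large.
std::optional<ByteVec> Num2Bin(std::int64_t value, std::size_t size) {
    ByteVec out = EncodeMinimal(value);
    if (size > MAX_SCRIPT_ELEMENT_SIZE || out.size() > size) return std::nullopt;
    if (out.size() == size) return out;
    std::uint8_t sign = 0;
    if (!out.empty()) {
        sign = out.back() & 0x80; // move the sign bit to the new last byte
        out.back() &= 0x7f;
    }
    out.resize(size, 0x00);
    out.back() |= sign;
    return out;
}

// a OP_BIN2NUM -> x; fails when the value exceeds the 4-byte numeric range.
std::optional<std::int64_t> Bin2Num(const ByteVec &bytes) {
    if (bytes.empty()) return std::int64_t{0};
    const bool negative = (bytes.back() & 0x80) != 0;
    std::uint64_t magnitude = 0;
    for (std::size_t i = bytes.size(); i-- > 0;) {
        std::uint8_t b = bytes[i];
        if (i + 1 == bytes.size()) b &= 0x7f; // drop the sign bit
        magnitude = (magnitude << 8) | b;
        if (magnitude > 0x7fffffffULL) return std::nullopt;
    }
    const std::int64_t value = static_cast<std::int64_t>(magnitude);
    return negative ? -value : value; // negative zero collapses to zero
}
```

Checking the range inside the loop keeps the accumulator small even when the input is a long, zero-padded byte sequence.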
+ +The operator must fail if: + +1. the numeric value is out of the range of acceptable numeric values (currently size is limited to 4 bytes) + +Impact of successful execution: + +* stack memory use is equal to or less than before. +Minimal encoding of the byte sequence can produce a result which is shorter. +* the number of elements on the stack remains constant + +Unit tests: + +1. `a OP_BIN2NUM -> failure`, when `a` is a byte sequence whose numeric value is too large to fit into the numeric value type, for both positive and negative values. +2. `{0x00} OP_BIN2NUM -> OP_0`. +Byte sequences, of various lengths, consisting only of zeros should produce an OP_0 (zero length array). +3. `{0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00} OP_BIN2NUM -> 1`. +A large byte sequence, whose numeric value would fit in the numeric value type, is a valid operand. +4. The same test as above, where the length of the input byte sequence is equal to MAX_SCRIPT_ELEMENT_SIZE. +5. `{0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80} OP_BIN2NUM -> -1`. +Same as above, for negative values. +6. `{0x80} OP_BIN2NUM -> OP_0`. +Negative zero, in a byte sequence, should produce zero. +7. `{0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80} OP_BIN2NUM -> OP_0`. +Large negative zero, in a byte sequence, should produce zero. +8.
other valid parameters with various results + +## Reference implementation + +* OP_AND, OP_OR, OP_XOR: https://reviews.bitcoinabc.org/D1211 + +* OP_DIV and OP_MOD: https://reviews.bitcoinabc.org/D1212 + +* OP_CAT: https://reviews.bitcoinabc.org/D1227 + +* OP_SPLIT: https://reviews.bitcoinabc.org/D1228 + +* OP_BIN2NUM: https://reviews.bitcoinabc.org/D1220 + +* OP_NUM2BIN: https://reviews.bitcoinabc.org/D1222 + +## References + +OP_CODES: https://en.bitcoin.it/wiki/Script#Opcodes diff --git a/protocol/forks/op_checkdatasig.md b/protocol/forks/op_checkdatasig.md index 95cabf7..5f2d0f8 100644 --- a/protocol/forks/op_checkdatasig.md +++ b/protocol/forks/op_checkdatasig.md @@ -1,34 +1,37 @@ ---- -layout: specification -title: OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY Specification -category: spec -date: 2018-08-20 -activation: 1542300000 -version: 0.6 ---- +# HF-20180820 -OP_CHECKDATASIG -=============== + layout: specification + title: OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY Specification + category: spec + date: 2018-08-20 + activation: 1542300000 + version: 0.6 + +## Summary OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY check whether a signature is valid with respect to a message and a public key. OP_CHECKDATASIG permits data to be imported into a script, and have its validity checked against some signing authority such as an "Oracle". -OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY are designed to be implemented similarly to OP_CHECKSIG [1]. Conceptually, one could imagine OP_CHECKSIG functionality being replaced by OP_CHECKDATASIG, along with a separate Op Code to create a hash from the transaction based on the SigHash algorithm. +OP_CHECKDATASIG and OP_CHECKDATASIGVERIFY are designed to be implemented similarly to OP_CHECKSIG [1][1]. +Conceptually, one could imagine OP_CHECKSIG functionality being replaced by OP_CHECKDATASIG, along with a separate Op Code to create a hash from the transaction based on the SigHash algorithm. 
-OP_CHECKDATASIG Specification ------------------------------ +## OP_CHECKDATASIG Specification ### Semantics -OP_CHECKDATASIG fails immediately if the stack is not well formed. To be well formed, the stack must contain at least three elements [``, ``, ``] in this order where `` is the top element and - * `` must be a validly encoded public key - * `` can be any string - * `` must follow the strict DER encoding as described in [2] and the S-value of `` must be at most the curve order divided by 2 as described in [3] +OP_CHECKDATASIG fails immediately if the stack is not well formed. +To be well formed, the stack must contain at least three elements [``, ``, ``] in this order where `` is the top element and -If the stack is well formed, then OP_CHECKDATASIG pops the top three elements [``, ``, ``] from the stack and pushes true onto the stack if `` is valid with respect to the raw single-SHA256 hash of `` and `` using the secp256k1 elliptic curve. Otherwise, it pops three elements and pushes false onto the stack in the case that `` is the empty string and fails in all other cases. +* `` must be a validly encoded public key +* `` can be any string +* `` must follow the strict DER encoding as described in [2][2] and the S-value of `` must be at most the curve order divided by 2 as described in [3][3] -Nullfail is enforced the same as for OP_CHECKSIG [3]. If the signature does not match the supplied public key and message hash, and the signature is not an empty byte array, the entire script fails. +If the stack is well formed, then OP_CHECKDATASIG pops the top three elements [``, ``, ``] from the stack and pushes true onto the stack if `` is valid with respect to the raw single-SHA256 hash of `` and `` using the secp256k1 elliptic curve. +Otherwise, it pops three elements and pushes false onto the stack in the case that `` is the empty string and fails in all other cases. + +Nullfail is enforced the same as for OP_CHECKSIG [3][3]. 
+If the signature does not match the supplied public key and message hash, and the signature is not an empty byte array, the entire script fails. ### Opcode Number @@ -36,7 +39,8 @@ OP_CHECKDATASIG uses the previously unused opcode number 186 (0xba in hex encodi ### SigOps -Signature operations accounting for OP_CHECKDATASIG shall be calculated the same as OP_CHECKSIG. This means that each OP_CHECKDATASIG shall be counted as one (1) SigOp. +Signature operations accounting for OP_CHECKDATASIG shall be calculated the same as OP_CHECKSIG. +This means that each OP_CHECKDATASIG shall be counted as one (1) SigOp. ### Activation @@ -53,12 +57,12 @@ Use of OP_CHECKDATASIG, unless occuring in an unexecuted OP_IF branch, will make - ` OP_CHECKDATASIG` pops three elements and pushes false onto the stack if `` is an empty byte array. - ` OP_CHECKDATASIG` pops three elements and pushes true onto the stack if `` is a valid signature of `` with respect to ``. -OP_CHECKDATASIGVERIFY Specification ------------------------------------ +## OP_CHECKDATASIGVERIFY Specification ### Semantics -OP_CHECKDATASIGVERIFY is equivalent to OP_CHECKDATASIG followed by OP_VERIFY. It leaves nothing on the stack, and will cause the script to fail immediately if the signature check does not pass. +OP_CHECKDATASIGVERIFY is equivalent to OP_CHECKDATASIG followed by OP_VERIFY. +It leaves nothing on the stack, and will cause the script to fail immediately if the signature check does not pass. ### Opcode Number @@ -66,7 +70,8 @@ OP_CHECKDATASIGVERIFY uses the previously unused opcode number 187 (0xbb in hex ### SigOps -Signature operations accounting for OP_CHECKDATASIGVERIFY shall be calculated the same as OP_CHECKSIGVERIFY. This means that each OP_CHECKDATASIGVERIFY shall be counted as one (1) SigOp. +Signature operations accounting for OP_CHECKDATASIGVERIFY shall be calculated the same as OP_CHECKSIGVERIFY. +This means that each OP_CHECKDATASIGVERIFY shall be counted as one (1) SigOp. 
### Activation @@ -82,86 +87,89 @@ Use of OP_CHECKDATASIGVERIFY, unless occuring in an unexecuted OP_IF branch, wil - `<sig> <msg> <pubKey> OP_CHECKDATASIGVERIFY` fails if `<sig>` is not a valid signature of `<msg>` with respect to `<pubKey>`. - `<sig> <msg> <pubKey> OP_CHECKDATASIGVERIFY` pops the top three stack elements if `<sig>` is a valid signature of `<msg>` with respect to `<pubKey>`. -Sample Implementation [4, 5] ----------------------------- +## Sample Implementation [4][4], [5][5] -```c++ - case OP_CHECKDATASIG: - case OP_CHECKDATASIGVERIFY: { - // Make sure this remains an error before activation. - if ((flags & SCRIPT_ENABLE_CHECKDATASIG) == 0) { - return set_error(serror, SCRIPT_ERR_BAD_OPCODE); - } +``` +case OP_CHECKDATASIG: +case OP_CHECKDATASIGVERIFY: { + // Make sure this remains an error before activation. + if ((flags & SCRIPT_ENABLE_CHECKDATASIG) == 0) { + return set_error(serror, SCRIPT_ERR_BAD_OPCODE); + } - // (sig message pubkey -- bool) - if (stack.size() < 3) { - return set_error( - serror, SCRIPT_ERR_INVALID_STACK_OPERATION); - } + // (sig message pubkey -- bool) + if (stack.size() < 3) { + return set_error( + serror, SCRIPT_ERR_INVALID_STACK_OPERATION); + } - valtype &vchSig = stacktop(-3); - valtype &vchMessage = stacktop(-2); - valtype &vchPubKey = stacktop(-1); + valtype &vchSig = stacktop(-3); + valtype &vchMessage = stacktop(-2); + valtype &vchPubKey = stacktop(-1); - if (!CheckDataSignatureEncoding(vchSig, flags, - serror) || - !CheckPubKeyEncoding(vchPubKey, flags, serror)) { - // serror is set - return false; - } + if (!CheckDataSignatureEncoding(vchSig, flags, + serror) || + !CheckPubKeyEncoding(vchPubKey, flags, serror)) { + // serror is set + return false; + } - bool fSuccess = false; - if (vchSig.size()) { - valtype vchHash(32); - CSHA256() - .Write(vchMessage.data(), vchMessage.size()) - .Finalize(vchHash.data()); - uint256 message(vchHash); - CPubKey pubkey(vchPubKey); - fSuccess = pubkey.Verify(message, vchSig); - } + bool fSuccess = false; + if (vchSig.size()) { + valtype vchHash(32); + 
CSHA256() + .Write(vchMessage.data(), vchMessage.size()) + .Finalize(vchHash.data()); + uint256 message(vchHash); + CPubKey pubkey(vchPubKey); + fSuccess = pubkey.Verify(message, vchSig); + } - if (!fSuccess && (flags & SCRIPT_VERIFY_NULLFAIL) && - vchSig.size()) { - return set_error(serror, SCRIPT_ERR_SIG_NULLFAIL); - } + if (!fSuccess && (flags & SCRIPT_VERIFY_NULLFAIL) && + vchSig.size()) { + return set_error(serror, SCRIPT_ERR_SIG_NULLFAIL); + } - popstack(stack); - popstack(stack); - popstack(stack); - stack.push_back(fSuccess ? vchTrue : vchFalse); - if (opcode == OP_CHECKDATASIGVERIFY) { - if (fSuccess) { - popstack(stack); - } else { - return set_error(serror, - SCRIPT_ERR_CHECKDATASIGVERIFY); - } - } - } break; + popstack(stack); + popstack(stack); + popstack(stack); + stack.push_back(fSuccess ? vchTrue : vchFalse); + if (opcode == OP_CHECKDATASIGVERIFY) { + if (fSuccess) { + popstack(stack); + } else { + return set_error(serror, + SCRIPT_ERR_CHECKDATASIGVERIFY); + } + } +} break; ``` -Sample Usage ------------- +## Sample Usage -The following example shows a spend and redeem script for a basic use of CHECKDATASIG. This example validates the signature of some data, provides a placeholder where you would then process that data, and finally allows one of 2 signatures to spend based on the outcome of the data processing. +The following example shows a spend and redeem script for a basic use of CHECKDATASIG. +This example validates the signature of some data, provides a placeholder where you would then process that data, and finally allows one of 2 signatures to spend based on the outcome of the data processing. 
+ +### spend script -### spend script: ``` push txsignature push txpubkey push msg push sig ``` -### redeem script: + +### redeem script + ``` (txsig, txpubkey msg, sig) OP_OVER (txsig, txpubkey, msg, sig, msg) push data pubkey (txsig, txpubkey, msg, sig, msg, pubkey) OP_CHECKDATASIGVERIFY (txsig, txpubkey, msg) ``` -Now that msg is on the stack top, the script can write predicates on it, -resulting in the message being consumed and a true/false condition left on the stack: (txpubkey, txsig, boolean) + +Now that msg is on the stack top, the script can write predicates on it, resulting in the message being consumed and a true/false condition left on the stack: `(txpubkey, txsig, boolean)` + ``` OP_IF (txsig, txpubkey) OP_DUP (txsig, txpubkey, txpubkey) @@ -174,13 +182,12 @@ OP_ELSE OP_ENDIF ``` -History -------- +## History -This specification is based on Andrew Stone’s OP_DATASIGVERIFY proposal [6, 7]. It is modified from Stone's original proposal based on a synthesis of all the peer-review and feedback received [8]. +This specification is based on Andrew Stone’s OP_DATASIGVERIFY proposal [6][6], [7][7]. +It is modified from Stone's original proposal based on a synthesis of all the peer-review and feedback received [8][8]. -References ----------- +## References [1] [OP_CHECKSIG](https://en.bitcoin.it/wiki/OP_CHECKSIG) diff --git a/protocol/forks/replay-protected-sighash.md b/protocol/forks/replay-protected-sighash.md new file mode 100644 index 0000000..3898668 --- /dev/null +++ b/protocol/forks/replay-protected-sighash.md @@ -0,0 +1,220 @@ +# Replay Protected Sighash + + layout: specification + title: BUIP-HF Digest for replay protected signature verification across hard forks + category: spec + date: 2017-07-16 + activation: 1501590000 + version: 1.2 + +## Abstract + +This document describes proposed requirements and design for a reusable signing mechanism ensuring replay protection in the event of a chain split. 
+It provides a way for users to create transactions which are invalid on forks lacking support for the mechanism and a fork-specific ID. + +The proposed digest algorithm is adapted from BIP143[1][1] as it minimizes redundant data hashing in verification, covers the input value by the signature and is already implemented in a wide variety of applications[2][2]. + +The proposed digest algorithm is used when the `SIGHASH_FORKID` bit is set in the signature's sighash type. +The verification of signatures which do not set this bit is not affected. + +## Specification + +### Activation + +The proposed digest algorithm is only used when the `SIGHASH_FORKID` bit in the signature's sighash type is set. +The bit is defined as follows: + +````cpp + // ... + SIGHASH_SINGLE = 3, + SIGHASH_FORKID = 0x40, + SIGHASH_ANYONECANPAY = 0x80, + // ... +```` + +In the presence of the `SIGHASH_FORKID` flag in the signature's sighash type, the proposed algorithm is used. + +Signatures using the `SIGHASH_FORKID` digest method must be rejected before [UAHF](/protocol/forks/bch-uahf.md) is activated. + +In order to ensure proper activation, the reference implementation uses the `SCRIPT_ENABLE_SIGHASH_FORKID` flag when executing `EvalScript`. + +### Digest algorithm + +The proposed digest algorithm computes the double SHA256 of the serialization of: + +1. nVersion of the transaction (4-byte little endian) +2. hashPrevouts (32-byte hash) +3. hashSequence (32-byte hash) +4. outpoint (32-byte hash + 4-byte little endian) +5. scriptCode of the input (serialized as scripts inside CTxOuts) +6. value of the output spent by this input (8-byte little endian) +7. nSequence of the input (4-byte little endian) +8. hashOutputs (32-byte hash) +9. nLocktime of the transaction (4-byte little endian) +10. sighash type of the signature (4-byte little endian) + +Items 1, 4, 7 and 9 have the same meaning as in the original algorithm[3][3]. 
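As a rough illustration of the ten-item layout above, the following C++ sketch computes the length of the serialized preimage before the double SHA256 is applied. It assumes that "serialized as scripts inside CTxOuts" means the script bytes prefixed with a Bitcoin CompactSize length. The helper names `CompactSizeLen` and `PreimageSize` are hypothetical and do not appear in the reference implementation.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes occupied by a CompactSize length prefix.
constexpr std::uint64_t CompactSizeLen(std::uint64_t n) {
    return n < 0xfd ? 1 : (n <= 0xffff ? 3 : (n <= 0xffffffffULL ? 5 : 9));
}

// Hypothetical helper: total preimage length for a scriptCode of the given size.
constexpr std::uint64_t PreimageSize(std::uint64_t scriptCodeLen) {
    return 4                                                // 1. nVersion
           + 32                                             // 2. hashPrevouts
           + 32                                             // 3. hashSequence
           + 32 + 4                                         // 4. outpoint (hash + index)
           + CompactSizeLen(scriptCodeLen) + scriptCodeLen  // 5. scriptCode
           + 8                                              // 6. value
           + 4                                              // 7. nSequence
           + 32                                             // 8. hashOutputs
           + 4                                              // 9. nLocktime
           + 4;                                             // 10. sighash type
}
```

Under these assumptions, a 25-byte scriptCode (the size of a standard P2PKH script) yields a 182-byte preimage: 156 fixed bytes, a 1-byte length prefix and the 25 script bytes.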
+ +#### hashPrevouts + +* If the `ANYONECANPAY` flag is not set, `hashPrevouts` is the double SHA256 of the serialization of all input outpoints; +* Otherwise, `hashPrevouts` is a `uint256` of `0x0000......0000`. + +#### hashSequence + +* If none of the `ANYONECANPAY`, `SINGLE`, `NONE` sighash types is set, `hashSequence` is the double SHA256 of the serialization of `nSequence` of all inputs; +* Otherwise, `hashSequence` is a `uint256` of `0x0000......0000`. + +#### scriptCode + +In this section, we call `script` the script being currently executed. +This means `redeemScript` in the case of P2SH, or the `scriptPubKey` in the general case. + +* If the `script` does not contain any `OP_CODESEPARATOR`, the `scriptCode` is the `script` serialized as scripts inside `CTxOut`. +* If the `script` contains any `OP_CODESEPARATOR`, the `scriptCode` is the `script` with everything up to and including the last executed `OP_CODESEPARATOR` before the signature checking opcode being executed removed, serialized as scripts inside `CTxOut`. + +Notes: + +1. Contrary to the original algorithm, this one does not use `FindAndDelete` to remove the signature from the script. +2. Because of note 1, it is not possible to create a valid signature within `redeemScript` or `scriptPubKey` as the signature would be part of the digest. +This enforces that the signature is in `sigScript`. +3. In case an opcode that requires signature checking is present in `sigScript`, `script` is effectively `sigScript`. +However, for reasons similar to note 2, it is not possible to provide a valid signature in that case. + +#### value + +The 8-byte value of the amount of Bitcoin this input contains. 
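The `hashPrevouts` and `hashSequence` rules above reduce to two predicates on the sighash type. The sketch below is illustrative only: the function names are hypothetical, the `0x1f` mask follows the reference code in the Implementation section, and `SIGHASH_NONE = 2` is assumed from the standard Bitcoin sighash values because it is not listed in this document.

```cpp
#include <cstdint>

constexpr std::uint32_t SIGHASH_NONE = 2;  // assumed standard value, not shown in this spec
constexpr std::uint32_t SIGHASH_SINGLE = 3;
constexpr std::uint32_t SIGHASH_ANYONECANPAY = 0x80;

// Hypothetical predicate: true when hashPrevouts is the double SHA256 of all
// input outpoints, false when it is committed as all zeros.
constexpr bool CommitsToAllPrevouts(std::uint32_t nHashType) {
    return (nHashType & SIGHASH_ANYONECANPAY) == 0;
}

// Hypothetical predicate: true when hashSequence is the double SHA256 of all
// input nSequence values, false when it is committed as all zeros.
constexpr bool CommitsToAllSequences(std::uint32_t nHashType) {
    return (nHashType & SIGHASH_ANYONECANPAY) == 0 &&
           (nHashType & 0x1f) != SIGHASH_SINGLE &&
           (nHashType & 0x1f) != SIGHASH_NONE;
}
```

For example, a sighash type of `0x43` (`SINGLE` with `SIGHASH_FORKID`) still commits to every outpoint but zeroes `hashSequence`, while `0xc1` (`ALL` with `SIGHASH_FORKID` and `ANYONECANPAY`) zeroes both.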
+ +#### hashOutputs + +* If the sighash type is neither `SINGLE` nor `NONE`, `hashOutputs` is the double SHA256 of the serialization of all output amounts (8-byte little endian) paired up with their `scriptPubKey` (serialized as scripts inside CTxOuts); +* If the sighash type is `SINGLE` and the input index is smaller than the number of outputs, `hashOutputs` is the double SHA256 of the output amount with `scriptPubKey` of the same index as the input; +* Otherwise, `hashOutputs` is a `uint256` of `0x0000......0000`. + +Notes: + +1. In the original algorithm[3][3], a `uint256` of `0x0000......0001` is committed if the input index for a `SINGLE` signature is greater than or equal to the number of outputs. +In this BIP a `0x0000......0000` is committed, without changing the semantics. + +#### sighash type + +The sighash type is altered to include a 24-bit *fork id* in its most significant bits. + +````cpp + ss << ((GetForkID() << 8) | nHashType); +```` + +This ensures that the proposed digest algorithm will generate different results on forks using different *fork ids*. 
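To make the effect of the fork id concrete, here is a minimal C++ sketch of the committed 32-bit value. The function name `CommittedSigHashType` is hypothetical; it only mirrors the expression `(GetForkID() << 8) | nHashType` shown above and takes the fork id as a parameter instead of calling `GetForkID()`.

```cpp
#include <cstdint>

// Hypothetical helper: the value serialized as item 10 of the digest preimage.
// The fork id occupies the upper 24 bits, the sighash type the low byte.
constexpr std::uint32_t CommittedSigHashType(std::uint32_t forkId,
                                             std::uint32_t nHashType) {
    return (forkId << 8) | nHashType;
}
```

With a fork id of 0 the committed value equals the plain sighash type, for example `0x41` for `ALL` with `SIGHASH_FORKID`. With a made-up fork id of 1 the same sighash type commits `0x141`, so the two chains compute different digests for an otherwise identical transaction.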
+ +## Implementation + +Addition to `SignatureHash`: + +````cpp + if (nHashType & SIGHASH_FORKID) { + uint256 hashPrevouts; + uint256 hashSequence; + uint256 hashOutputs; + + if (!(nHashType & SIGHASH_ANYONECANPAY)) { + hashPrevouts = GetPrevoutHash(txTo); + } + + if (!(nHashType & SIGHASH_ANYONECANPAY) && + (nHashType & 0x1f) != SIGHASH_SINGLE && + (nHashType & 0x1f) != SIGHASH_NONE) { + hashSequence = GetSequenceHash(txTo); + } + + if ((nHashType & 0x1f) != SIGHASH_SINGLE && + (nHashType & 0x1f) != SIGHASH_NONE) { + hashOutputs = GetOutputsHash(txTo); + } else if ((nHashType & 0x1f) == SIGHASH_SINGLE && + nIn < txTo.vout.size()) { + CHashWriter ss(SER_GETHASH, 0); + ss << txTo.vout[nIn]; + hashOutputs = ss.GetHash(); + } + + CHashWriter ss(SER_GETHASH, 0); + // Version + ss << txTo.nVersion; + // Input prevouts/nSequence (none/all, depending on flags) + ss << hashPrevouts; + ss << hashSequence; + // The input being signed (replacing the scriptSig with scriptCode + + // amount). The prevout may already be contained in hashPrevouts, and the + // nSequence may already be contained in hashSequence. 
+ ss << txTo.vin[nIn].prevout; + ss << static_cast<const CScriptBase &>(scriptCode); + ss << amount; + ss << txTo.vin[nIn].nSequence; + // Outputs (none/one/all, depending on flags) + ss << hashOutputs; + // Locktime + ss << txTo.nLockTime; + // Sighash type + ss << ((GetForkID() << 8) | nHashType); + return ss.GetHash(); + } +```` + +Computation of midstates: + +````cpp +uint256 GetPrevoutHash(const CTransaction &txTo) { + CHashWriter ss(SER_GETHASH, 0); + for (unsigned int n = 0; n < txTo.vin.size(); n++) { + ss << txTo.vin[n].prevout; + } + + return ss.GetHash(); +} + +uint256 GetSequenceHash(const CTransaction &txTo) { + CHashWriter ss(SER_GETHASH, 0); + for (unsigned int n = 0; n < txTo.vin.size(); n++) { + ss << txTo.vin[n].nSequence; + } + + return ss.GetHash(); +} + +uint256 GetOutputsHash(const CTransaction &txTo) { + CHashWriter ss(SER_GETHASH, 0); + for (unsigned int n = 0; n < txTo.vout.size(); n++) { + ss << txTo.vout[n]; + } + + return ss.GetHash(); +} +```` + +Gating code: + +````cpp + uint32_t nHashType = GetHashType(vchSig); + if (nHashType & SIGHASH_FORKID) { + if (!(flags & SCRIPT_ENABLE_SIGHASH_FORKID)) + return set_error(serror, SCRIPT_ERR_ILLEGAL_FORKID); + } else { + // Drop the signature in scripts when SIGHASH_FORKID is not used. + scriptCode.FindAndDelete(CScript(vchSig)); + } +```` + +## Note + +In the UAHF, a `fork id` of 0 is used (see [4][4] REQ-6-2 NOTE 4), i.e. +the GetForkID() function returns zero. +In that case the code can be simplified to omit the function. + +## References + +[1]: https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki + +[2]: https://github.com/bitcoin/bips/blob/master/bip-0143.mediawiki#Motivation + +[3]: https://en.bitcoin.it/wiki/OP_CHECKSIG + +[4]: https://github.com/bitcoincashorg/bitcoincash.org/blob/master/spec/uahf-technical-spec.md