
Block filtering: How it makes bitcoin network more robust



(Image source: https://i.imgur.com/4F5HxAt.png)

Introduction

Light clients on the Bitcoin network face resource constraints as the blockchain continues to grow: roughly every 10 minutes, a new block is mined and propagated across the network.

In the Bitcoin whitepaper, Satoshi Nakamoto acknowledged the possibility of verifying payments without running a full network node.

Light clients can verify payments by storing only the block headers of the longest chain in the network. Using the Merkle root committed to in the relevant block header and the associated Merkle path, they can verify that a transaction is included in the blockchain.

Light clients have a specific requirement: they only want to receive blocks and transactions that are relevant to their wallets. Bloom filters (BIP 37) were introduced to address this, and they do enable light clients to verify payments without holding a full copy of the blockchain. However, bloom filters have proven problematic because they expose full nodes to attacks while assisting light clients. A light client can craft a malicious bloom filter and serve it to a full node, turning the feature into a potential denial-of-service (DoS) vector. Furthermore, dishonest nodes can intentionally withhold critical data when serving light clients. These vulnerabilities undermine the reliability and security of the system.

In this article, we will explore a better solution, block filtering, as outlined in BIP 157 and BIP 158. We will see how it works, discuss how this approach benefits both light clients and full nodes, and look at how the reference implementation, Bitcoin Core, has recently used the feature to improve node performance.

Block Filter

Full nodes that support block filtering generate filters from the genesis block up to the tip of the chain and persist them. Upon the creation of a new block, they generate another filter and share it with light clients.

The block filter is deterministic: the filter generated by one node for a given block is identical to the filter generated by any other node. A node therefore generates each filter once and persists it to disk, so it can be served at any time without being recomputed.

Light clients receive the filter for a block and match it against their scriptPubKeys. If there is a match, they download the block and check it for payments to their wallet.

They don’t have to request the block from the peer that served the filter; blocks can be fetched from other peers, or from anywhere else, which improves privacy.
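As a rough sketch of that flow, here is what a light client's per-block handling might look like in Python. The helper functions (filter_matches, fetch_block_from_any_peer, wallet.scan) are hypothetical, not real Bitcoin Core APIs:

```python
def on_new_block_filter(block_hash: bytes, block_filter: bytes, wallet):
    """Illustrative light-client flow; all helpers here are hypothetical."""
    # Query the filter (a Golomb-coded set, explained below) for any of
    # the wallet's scriptPubKeys.
    if filter_matches(block_filter, wallet.script_pubkeys):
        # A match means the block *probably* pays the wallet. Fetch the
        # block from any peer, not necessarily the one that served the
        # filter, so no single peer learns which blocks interest us.
        block = fetch_block_from_any_peer(block_hash)
        wallet.scan(block)
```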

How a Block Filter is created

Currently, BIP 157 supports one filter type, defined in BIP 158, which specifies the parameters for how filters are constructed and how light clients use them. BIP 157 allows new filter types to be added later, to improve performance or address issues with the current type, making the design extensible.

The BIP 158 basic filter type is identified by 0x00 and uses the integer parameters M = 784931 and P = 19, chosen to balance filter size against false-positive rate.

These parameters, along with the set of items associated with the block (the scriptPubKeys being paid to and those of the outputs being spent), are used to build the filter; OP_RETURN scriptPubKeys are excluded. This data is compressed into a probabilistic structure called a Golomb-coded set (GCS), which is the block filter.

The number of items in this set (scriptPubKeys paid to, scriptPubKeys spent, etc.) is referred to as N.

Every item used to construct the set matches with probability 1: querying the GCS for item i always returns a positive result, correctly indicating that the block contains it.
Any other item matches with a false-positive probability of 1/M (with M = 784931, roughly 1 in 785,000); a negative result guarantees the block does not contain the item.

How a block filter (Golomb-coded set) is created

  1. Hashing and mapping data objects:
  • Each of the N items is hashed using the SipHash algorithm (keyed with the block hash), producing a 64-bit hash value.
  • Mapping onto the range [0, F): each 64-bit hash is multiplied by F = N * M, giving a 128-bit product, and the top 64 bits of that product are kept. This maps the hashed items uniformly over the integers from 0 to F.
  • Sorting: the resulting 64-bit values for all N items are sorted in ascending order, preparing them for the next step.
  • Computing differences: the difference between each sorted value and the previous one is computed. These differences (deltas) are the output of this stage.

It is these deltas, rather than the raw values, that are added to the filter. Because the sorted values are roughly evenly spaced, the deltas are small numbers, and encoding small numbers achieves far greater compression than encoding the original 64-bit values.
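To make these steps concrete, here is a minimal Python sketch of the construction. One hedge: BIP 158 uses SipHash-2-4 keyed with the block hash, which is not in Python's standard library, so keyed blake2b stands in for it here; the shape of the computation is otherwise the same.

```python
import hashlib

def hash_to_range(item: bytes, key: bytes, f: int) -> int:
    """Hash an item to 64 bits, then map it uniformly onto [0, F) by
    taking the top 64 bits of the 128-bit product hash * F.
    NOTE: blake2b is a stand-in for BIP 158's SipHash-2-4."""
    h = int.from_bytes(hashlib.blake2b(item, key=key, digest_size=8).digest(), "big")
    return (h * f) >> 64

def hashed_set(items, key, m=784931):
    """Hash, map, sort, and difference the block's items (step 1)."""
    n = len(items)
    f = n * m
    values = sorted(hash_to_range(it, key, f) for it in items)
    # Store successive differences (deltas) instead of raw values;
    # small deltas compress well under Golomb-Rice coding.
    deltas, prev = [], 0
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas
```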

2. Golomb-Rice coding (compression)

Each delta from the previous step is split into two values by dividing by 2^P (with P = 19 for the basic filter type): the quotient q = n >> P and the remainder r = n mod 2^P.

  • The quotient is encoded in unary, and the remainder as a P-bit big-endian integer.

A unary code consists of a string of q 1’s followed by a single 0: the number of 1’s equals the quotient, and the final 0 bit acts as a delimiter. With a well-chosen P, most quotients are small, so this encoding is efficient.

For example, suppose the values to encode are {0, 1, 2, 3, 4, 5} and P = 2, so 2^P = 4 and each value splits into q = n / 4 (integer division) and r = n % 4:

n = 0: q = 0, r = 0, c = 0 00

n = 1: q = 0, r = 1, c = 0 01

n = 2: q = 0, r = 2, c = 0 10

n = 3: q = 0, r = 3, c = 0 11

n = 4: q = 1, r = 0, c = 10 00

n = 5: q = 1, r = 1, c = 10 01

The concatenated c values form the Golomb-coded set (GCS), which is the block filter.
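A minimal sketch of the encoder in Python follows. It takes the deltas produced by hashed_set above (or any list of values, as in the worked example); the bitstream is returned as a string of '0'/'1' characters for readability, whereas a real filter packs the bits into bytes.

```python
def golomb_rice_encode(values, p=19):
    """Golomb-Rice code each value: quotient q in unary (q ones then a
    zero), remainder r as a P-bit big-endian value."""
    bits = []
    for n in values:
        q, r = n >> p, n & ((1 << p) - 1)
        bits.append("1" * q + "0")           # unary-coded quotient
        bits.append(format(r, f"0{p}b"))     # P-bit remainder
    return "".join(bits)

# Reproduces the worked example above:
print(golomb_rice_encode([0, 1, 2, 3, 4, 5], p=2))
# -> 000 001 010 011 1000 1001 (spaces added here for clarity)
```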

How to check/query whether an item is in the block filter (GCS)

To test membership in a compressed GCS, the compression is reversed: the encoded deltas are decoded one by one and added cumulatively, regenerating the original hashed values of the set. Each running sum is one hashed value from the original set.

When querying an item, the item is hashed and mapped exactly as during construction: it is hashed to a 64-bit value, that value is multiplied by F, and the top 64 bits of the product are taken. If the resulting 64-bit integer matches any of the reconstructed values, the queried item is (probably) present in the filter.

When many items are queried at once, the reconstructed values and the hashed query values can both be sorted and walked in a single pass, so the set intersection is checked efficiently rather than testing each pair individually.

Light clients commonly use this process to check whether their scriptPubKeys (derived from their addresses or descriptors) appear in a new block’s filter.
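Continuing the Python sketch (and reusing hash_to_range from earlier), decoding reverses the encoder and accumulates the deltas; a query then hashes the item the same way and looks for it among the reconstructed values. This assumes the filter encodes deltas, as in the construction sketched above.

```python
def golomb_rice_decode(bitstream, p=19):
    """Decode (unary quotient, P-bit remainder) pairs and accumulate
    the deltas to recover the original sorted hashed values."""
    values, total, i = [], 0, 0
    while i < len(bitstream):
        q = 0
        while bitstream[i] == "1":        # count 1s: the unary quotient
            q += 1
            i += 1
        i += 1                            # skip the terminating 0
        r = int(bitstream[i:i + p], 2)    # P-bit remainder
        i += p
        total += (q << p) | r             # delta = q * 2^P + r
        values.append(total)
    return values

def gcs_match(item, key, bitstream, n, m=784931, p=19):
    """Hash the queried item exactly as during construction and test
    membership against the reconstructed set."""
    target = hash_to_range(item, key, n * m)
    return target in golomb_rice_decode(bitstream, p)
```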

Filter Header

The filter header is computed by combining the filter hash (double SHA256 of the serialized block filter GCS) with the previous block’s filter header using a double-hashing process. For the genesis block, the previous filter header is a 32-byte array of zeros.
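In Python, the header chain described above looks like this (dsha256 is just a helper name for Bitcoin’s usual double SHA256):

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA256, as used throughout Bitcoin."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def filter_header(serialized_filter: bytes,
                  prev_header: bytes = b"\x00" * 32) -> bytes:
    """Filter header = double-SHA256(filter hash || previous filter
    header); the genesis block uses 32 zero bytes as prev_header."""
    return dsha256(dsha256(serialized_filter) + prev_header)
```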

Similar to block headers, filter headers commit to the block filters and chain back to the previous filter header. Before downloading the block filters themselves, a light client can fetch the filter headers for the current chain and use them to verify the authenticity of the filters it receives. If peers serve conflicting filter header chains, the client can identify the point of divergence, download the corresponding block, and compute the correct filter itself. This mechanism exposes any malicious nodes that serve invalid block filters.

By leveraging filter headers, light clients can effectively detect any dishonest nodes that provide incorrect block filters. As long as they are connected to at least one honest node, they can verify and identify the correct filter.

Light clients benefit from receiving only the new block headers and block filters. They can check these filters against their scriptPubKeys to determine whether any are included. If a scriptPubKey is not present in the filter, the client doesn’t need to download the corresponding block; if it is, the client downloads the block and verifies the newly received payment. This approach significantly reduces bandwidth and resource usage for light clients.

On the other hand, full nodes gain no direct benefit from merely serving block filters. While it doesn’t harm them, it does increase the resources they need, since a filter must be computed and stored for every block in the Bitcoin blockchain.
Bitcoin Core v25 changes this: the new version adds features that let full nodes use the filters they already store to improve their own services and performance.

The two new features are:

  1. scanblocks RPC: full nodes can use the scanblocks RPC, which takes a set of descriptors and returns the hashes of blocks whose filters match them. Using block filters, services can locate past transactions and their unspent transaction outputs (UTXOs) for specific descriptors without importing an entire wallet or performing a wallet rescan (see the sketch after this list).
  2. Fast wallet rescan: with block filtering enabled, wallet rescans become significantly faster. After importing a new descriptor, the node no longer parses every transaction in the blockchain; instead, the descriptor is matched against the stored block filters and only the matching blocks are scanned, making the rescan nearly 10 times faster than before.
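As an illustration, here is a hedged Python sketch calling scanblocks over bitcoind’s JSON-RPC interface. The URL, credentials, and descriptor are placeholders, and the exact arguments and result fields may differ by version; consult `bitcoin-cli help scanblocks` on your node.

```python
import base64
import json
import urllib.request

def rpc(method, params, url="http://127.0.0.1:8332",
        user="rpcuser", password="rpcpass"):
    """Minimal JSON-RPC call to a local bitcoind. Credentials and URL
    are placeholders; use your own rpcauth or cookie-based auth."""
    body = json.dumps({"jsonrpc": "1.0", "id": "scan",
                       "method": method, "params": params}).encode()
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]

# Ask the node which blocks have filters matching a descriptor
# (the descriptor below is a placeholder):
result = rpc("scanblocks", ["start", ["addr(bc1q...)"]])
print(result)  # e.g. contains a "relevant_blocks" list of block hashes
```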

Conclusion

In this article, we embarked on an exciting journey to revolutionize the way light clients interact with the Bitcoin network. We discovered the power of block filtering as a solution to address the resource constraints faced by these clients.

Gone are the days of relying on vulnerable bloom filters. Block filtering takes center stage, offering enhanced privacy and efficiency.

We dove into the fascinating process of creating block filters using the Golomb-coded set (GCS) technique: through hashing, mapping, sorting, and compression, these filters are formed.

We also explored the filter headers, providing trust and authenticity to the block filter chain. By cross-checking these headers, light clients can expose any fraudulent nodes attempting to serve up invalid filters.

In conclusion, block filtering has ushered in a new era for light clients on the Bitcoin network: reduced resource usage compared to full nodes, enhanced privacy, and fast verification. Let’s not forget the added benefits for full nodes, which gain the scanblocks RPC and faster wallet rescans. It is also worth noting that block filtering increases bandwidth requirements for light clients relative to bloom filters: when a block filter matches, they must download the full block, which can be up to 4 MB of data.

Thanks for reading

References

BIP 157: https://github.com/bitcoin/bips/blob/master/bip-0157.mediawiki
BIP 158: https://github.com/bitcoin/bips/blob/master/bip-0158.mediawiki
Bitcoin Core 25.0 release notes: https://github.com/bitcoin/bitcoin/blob/master/doc/release-notes/release-notes-25.0.md
