Overcoming Data Availability Attacks in Blockchain Systems: LDPC Code Design for Coded Merkle Tree
preprint posted on 01.09.2021, 06:42 by Debarnab Mitra, Lev Tauz, Lara Dolecek
In blockchain systems, full nodes store the entire blockchain ledger and validate all transactions in the system by operating on the entire ledger. However, for better scalability and decentralization of the system, blockchains also run light nodes that only store a small portion of the ledger. In blockchain systems with a majority of malicious full nodes, light nodes are vulnerable to a data availability (DA) attack. In this attack, a malicious node makes the light nodes accept an invalid block by hiding the invalid portion of the block from the nodes in the system. Recently, a technique based on LDPC codes called Coded Merkle Tree (CMT) was proposed by Yu et al. that enables light nodes to detect a DA attack by randomly requesting/sampling portions of the block from the malicious node. However, with high probability, light nodes fail to detect a DA attack if a malicious node hides a small stopping set of the LDPC code. To mitigate this problem, Yu et al. used well-studied techniques to design random LDPC codes with high minimum stopping set size. Although effective, these codes are not necessarily optimal for this application. In this paper, we demonstrate that a suitable co-design of specialized LDPC codes and the light node sampling strategy can improve the probability of detection of DA attacks. We consider different adversary models, distinguished by their computational capability to find stopping sets in LDPC codes. For a weak adversary model, we devise a new LDPC code construction termed the entropy-constrained PEG (EC-PEG) algorithm, which concentrates stopping sets on a small group of variable nodes. We demonstrate that the EC-PEG algorithm coupled with a greedy sampling strategy improves the probability of detection of DA attacks. For stronger adversary models, we provide a co-design of a sampling strategy called linear-programming-sampling (LP-sampling) and an LDPC code construction called the linear-programming-constrained PEG (LC-PEG) algorithm. The new co-design demonstrates a higher probability of detection of DA attacks compared to approaches proposed in earlier literature.
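To illustrate why a small hidden stopping set is hard to catch by random sampling, the following is a minimal sketch, not the paper's method. It assumes a light node samples coded symbols uniformly at random without replacement (the baseline strategy, not the greedy or LP-sampling strategies named above) and that the adversary hides exactly the symbols of one stopping set. The function names `is_stopping_set` and `detection_probability`, and the toy parity-check matrix `H`, are hypothetical and chosen only for illustration.

```python
from math import comb

def is_stopping_set(H, subset):
    """Return True if `subset` (variable-node indices) is a non-empty
    stopping set of the parity-check matrix H (list of 0/1 rows).

    A stopping set is a set of variable nodes such that no check node
    is connected to exactly one of them, so a peeling decoder that
    misses these symbols cannot recover any of them.
    """
    subset = set(subset)
    if not subset:
        return False  # assumption: ignore the trivial empty set
    return all(sum(row[v] for v in subset) != 1 for row in H)

def detection_probability(n, hidden, samples):
    """Probability that `samples` uniform draws without replacement
    from n coded symbols hit at least one of `hidden` hidden symbols."""
    if samples > n - hidden:
        return 1.0  # not enough available symbols to answer every request
    return 1.0 - comb(n - hidden, samples) / comb(n, samples)

# Toy parity-check matrix (hypothetical): 3 checks, 6 variable nodes.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

print(is_stopping_set(H, {0, 1, 2}))  # every check sees two hidden symbols
print(is_stopping_set(H, {0, 1}))     # second check sees only one
print(detection_probability(n=256, hidden=8, samples=30))
print(detection_probability(n=256, hidden=32, samples=30))
```

Under these assumptions, the detection probability grows with the number of hidden symbols, which is why codes with a larger minimum stopping set size, or a sampling strategy biased toward symbols that belong to small stopping sets, make the attack easier to detect.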