On Models and Capacity Bounds for DNA-based Storage Channels
In recent years, DNA-based systems have emerged as a promising medium for long-term data storage. Errors in DNA-based storage systems occur at two layers. The first is the dropout of entire DNA strands, which has been characterized by the shuffling-sampling channel. The second is insertions, deletions, and substitutions of nucleotides within individual DNA molecules. In this paper, we introduce a DNA noisy synchronization error channel to characterize the errors within individual DNA molecules. Using information-theoretic arguments, we derive non-trivial lower and upper bounds on the capacity of this channel. By cascading the two channels, we obtain theoretical capacity limits for the overall DNA storage system. These results reaffirm that DNA is a reliable storage medium with the potential for high storage density.
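To make the two error layers concrete, the following is a minimal simulation sketch, not the model analyzed in the paper. It assumes that strands are read by uniform sampling with replacement (so unsampled strands drop out and ordering is lost), that insertions, deletions, and substitutions occur independently per nucleotide, and that sampling is applied before the per-strand errors. The function names and the parameters `p_ins`, `p_del`, `p_sub`, and `num_reads` are hypothetical and chosen for illustration only.

```python
import random

ALPHABET = "ACGT"


def ids_channel(strand, p_ins, p_del, p_sub, rng):
    """Apply independent per-nucleotide insertions, deletions, and
    substitutions to one strand (an assumed i.i.d. error model)."""
    out = []
    for base in strand:
        # Insertion: emit a random nucleotide before the current base.
        if rng.random() < p_ins:
            out.append(rng.choice(ALPHABET))
        # Deletion: drop the current base entirely.
        if rng.random() < p_del:
            continue
        # Substitution: replace the base with a different nucleotide.
        if rng.random() < p_sub:
            base = rng.choice([b for b in ALPHABET if b != base])
        out.append(base)
    return "".join(out)


def shuffling_sampling_channel(strands, num_reads, rng):
    """Draw num_reads strands uniformly with replacement. Strands that are
    never drawn are dropouts, and the original order is not preserved."""
    return [rng.choice(strands) for _ in range(num_reads)]


def dna_storage_channel(strands, num_reads, p_ins, p_del, p_sub, seed=0):
    """Cascade the two layers: sample strands, then corrupt each read."""
    rng = random.Random(seed)
    reads = shuffling_sampling_channel(strands, num_reads, rng)
    return [ids_channel(r, p_ins, p_del, p_sub, rng) for r in reads]


if __name__ == "__main__":
    stored = ["ACGTACGT", "TTGACCAG", "GGCATCAA"]
    for read in dna_storage_channel(stored, num_reads=4,
                                    p_ins=0.05, p_del=0.05, p_sub=0.05):
        print(read)
```

Setting all three error probabilities to zero reduces the cascade to pure sampling, which is a convenient way to isolate the dropout layer from the nucleotide-level errors.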