TechRxiv
IteratedNB77.pdf (515.18 kB)

Data Amplification Using the Iterated Newcomb-Benford Distribution

preprint
posted on 25.08.2021, 00:34 by Subhash Kak

The Newcomb-Benford (NB) distribution of first digits has been applied widely in areas ranging from engineering to the natural and biological sciences for the investigation of self-similarity and randomness. In this article, we consider systems for which the data are insufficient to obtain proper first-digit statistics, and we propose an iterated version of the distribution in which the statistics are aggregated over different scales. The justification is twofold: the first-digit distribution is approximately scale invariant across a wide range of phenomena, and scaling followed by recomputation of first digits is not a linear process, so it generates new data. We provide examples of the iterated test in two biological applications, namely the secretome and the genetic code, in both of which the raw data do not include all nine first digits. The paper includes proposals for further research on data amplification using scaling transformations.
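
The aggregation over scales described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state which scale factors are used or how the aggregated counts are tested, so the `scales` argument and the function names below are assumptions chosen for illustration. The reference probabilities are the standard NB values, P(d) = log10(1 + 1/d).

```python
import math
from collections import Counter


def first_digit(x):
    """Return the leading nonzero decimal digit of a nonzero number."""
    if x == 0:
        raise ValueError("zero has no first digit")
    # Scientific notation puts the leading digit in the first character.
    return int(f"{abs(x):.15e}"[0])


def benford_expected():
    """Standard NB probabilities P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit_counts(data):
    """Count first digits 1..9 in the data, skipping zeros."""
    counts = Counter(first_digit(x) for x in data if x != 0)
    return {d: counts.get(d, 0) for d in range(1, 10)}


def iterated_first_digit_counts(data, scales):
    """Aggregate first-digit counts over several rescaled copies of the data.

    `scales` is a hypothetical choice of multipliers; the abstract does not
    specify which scales are used.
    """
    total = {d: 0 for d in range(1, 10)}
    for s in scales:
        scaled = [s * x for x in data]
        for d, c in first_digit_counts(scaled).items():
            total[d] += c
    return total


if __name__ == "__main__":
    raw = [1, 2, 3]                      # toy data with only three first digits
    print(first_digit_counts(raw))
    print(iterated_first_digit_counts(raw, scales=[1, 2, 3]))
```

In the toy example the raw data contain only the first digits 1, 2 and 3, while the aggregated counts also include 4, 6 and 9. This is the sense in which rescaling and recomputing first digits yields additional data points.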

Email Address of Submitting Author

subhash.kak@okstate.edu

ORCID of Submitting Author

https://orcid.org/0000-0001-5426-9759

Submitting Author's Institution

Oklahoma State University

Submitting Author's Country

United States of America
