Outperforming Sequential Full-Word Long Addition with Parallelization and Vectorization

Posted on 2022-01-13, 17:09; authored by Andrey Chusov
The paper presents algorithms for parallel and vectorized full-word addition of big unsigned integers with carry propagation. Because carry propagation creates data dependencies between digits of the operands, software parallelization and vectorization of non-polynomial addition of big integers have long been considered impractical. The presented algorithms are based on parallel and vectorized detection of carry origins within elements of vector operands, masking of the bits that correspond to those elements, and subsequent scalar addition of the resulting integers. The bits obtained this way can then be used to adjust the sum using the Kogge-Stone method.
Essentially, the paper formalizes and experimentally verifies a parallel and vectorized implementation of carry-lookahead adders applied at arbitrary granularity of data. This approach is noticeably beneficial for manycore, CUDA, and vectorized implementations using AVX-512 with masked instructions.
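
The idea summarized in the abstract can be illustrated with a small scalar sketch. This is one reading of the abstract, not the paper's own code. The 64-bit limb width, the little-endian limb order, and the names `MASK64` and `add_limbs` are assumptions made for illustration. Each limb is added independently, the limbs that originate a carry and the limbs that would pass one on are recorded as bit masks, and a single scalar addition on those masks resolves all carry chains at once. The sketch uses that mask-addition trick in place of an explicit Kogge-Stone prefix network.

```python
MASK64 = (1 << 64) - 1  # assumed limb width: 64 bits


def add_limbs(a, b):
    """Add two equal-length little-endian lists of 64-bit limbs.

    Returns (sum_limbs, carry_out). Illustrative sketch only.
    """
    assert len(a) == len(b)
    n = len(a)

    # Step 1: per-limb sums with no carry-in. Every limb is independent,
    # so this is the part that can run in parallel or in vector lanes.
    s = [(x + y) & MASK64 for x, y in zip(a, b)]

    # Step 2: pack one flag per limb into two integer masks.
    #   generate:  the limb overflowed on its own (a carry origin)
    #   propagate: the limb is all ones, so an incoming carry passes through
    generate = 0
    propagate = 0
    for i in range(n):
        if s[i] < a[i]:
            generate |= 1 << i
        if s[i] == MASK64:
            propagate |= 1 << i

    # Step 3: one scalar addition on the masks ripples each generated
    # carry through the adjacent run of propagating limbs. Bit i of
    # carry_in is the carry entering limb i; bit n is the final carry out.
    carry_in = (propagate + (generate << 1)) ^ propagate

    # Step 4: adjust each limb by its incoming carry (again independent).
    result = [(s[i] + ((carry_in >> i) & 1)) & MASK64 for i in range(n)]
    return result, (carry_in >> n) & 1
```

For example, `add_limbs([MASK64, MASK64], [1, 0])` returns `([0, 0], 1)`: the low limb generates a carry, the high limb propagates it, and it leaves as the carry out. On AVX-512, steps 1, 2 and 4 would plausibly map to vector adds, mask-producing compares and masked adds, while step 3 stays a scalar operation on the mask registers.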


Submitting Author's Institution

Far-Eastern Federal University

Submitting Author's Country

  • Russian Federation