Comparing Effectiveness of Image Perturbation and Test-retest Imaging
Towards Establishment of Reliable Radiomic Models
Abstract
Image perturbation is a promising technique to assess radiomic feature
repeatability without test-retest imaging. However, whether it can
achieve the same effect on model reliability enhancement as test-retest
imaging is unknown. This study aimed to compare radiomic model
reliability based on repeatable features determined by image
perturbation and test-retest imaging. A public 191-patient breast cancer
dataset with 71 test-retest scans was used, with a pre-determined split
of 117 training and 74 testing samples. We collected apparent diffusion
coefficient images and manually segmented tumor structures for
radiomic feature extraction, and pathological complete response records
for model prediction. Random translations, rotations, and contour
randomizations were performed on the training images, and the intra-class
correlation coefficient (ICC) was used to quantify feature
repeatability. After removing volume-correlated features, multiple ICC
thresholds were applied for repeatable feature filtering, and separate
logistic regression models were developed using the 5 most relevant and
independent features. We evaluated model reliability in terms of both
generalizability and robustness, quantified respectively by the training
and testing area under the receiver operating characteristic curve (AUC)
and by the prediction ICC under perturbation and test-retest. Higher testing
performance was found at higher ICC thresholds, but it dropped
significantly at an ICC threshold of 0.95 for the test-retest model.
Similar optimal reliability can be achieved by both methods, with testing
AUC = 0.76-0.77 and prediction ICC > 0.9 at the ICC threshold of 0.9. It is recommended to
include feature repeatability analysis using image perturbation in any
radiomic study when test-retest imaging is not feasible, but care should be
taken when deciding the optimal feature repeatability criteria.
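
The repeatability filtering step described above can be illustrated with a minimal sketch. It assumes a one-way random-effects ICC, ICC(1,1), computed per feature over an array of shape (patients, repeated measurements), where the repeats are either perturbed copies or test-retest scans. The abstract does not state which ICC form was used, so this choice, along with the function names icc_one_way and repeatable_features, is hypothetical and for illustration only.

```python
import numpy as np


def icc_one_way(values):
    """One-way random-effects ICC(1,1) for an (n_subjects, k_repeats) array.

    Assumption: the ICC form is not specified in the abstract; ICC(1,1)
    is used here purely as an example.
    """
    values = np.asarray(values, dtype=float)
    n, k = values.shape
    subject_means = values.mean(axis=1)
    grand_mean = values.mean()
    # Between-subject and within-subject mean squares (one-way ANOVA).
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((values - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def repeatable_features(feature_values, threshold=0.9):
    """Keep the names of features whose ICC meets the threshold.

    feature_values maps a feature name to an (n_subjects, k_repeats) array.
    """
    return [
        name
        for name, vals in feature_values.items()
        if icc_one_way(vals) >= threshold
    ]


if __name__ == "__main__":
    # Synthetic data only: one feature stable across repeats, one pure noise.
    rng = np.random.default_rng(0)
    base = rng.normal(size=(50, 1))
    features = {
        "stable": base + 0.01 * rng.normal(size=(50, 5)),
        "noisy": rng.normal(size=(50, 5)),
    }
    print(repeatable_features(features, threshold=0.9))
```

Raising the threshold argument (for example from 0.9 to 0.95) shrinks the pool of candidate features passed on to model development, which is the trade-off the recommendation above cautions about.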