Beyond Medical Imaging - A Review of Multimodal Deep Learning in
Radiology
- Lars Heiliger
- Anjany Sekuboyina
- Bjoern Menze
- Jan Egger
- Jens Kleesiek
Abstract
Healthcare data are inherently multimodal. Almost all data generated and
acquired during a patient's life can be hypothesized to contain
information relevant to providing optimal personalized healthcare. Data
sources such as ECGs, doctors' notes, and histopathological and radiological
images all help inform a physician's treatment decision.
However, most machine learning methods in healthcare focus on
single-modality data. This becomes particularly apparent within the
field of radiology, which, due to its information density,
accessibility, and computational interpretability, constitutes a central
pillar in the healthcare data landscape and has traditionally been one
of the key target areas of medically focused machine learning.
Computer-assisted diagnostic systems of the future should be capable of
simultaneously processing multimodal data, thereby mimicking physicians,
who also consider a multitude of resources when treating patients.
Against this background, this review offers a comprehensive assessment of
multimodal machine learning methods that combine data from radiology and
other medical disciplines. It establishes a modality-based taxonomy and
discusses common architectures and design principles, evaluation
approaches, challenges, and future directions. This work will enable
researchers and clinicians to understand the topography of the domain,
describe the state of the art, and identify gaps for future
research in multimodal medical machine learning.