Lung Grounded-SAM (LuGSAM): A Novel Framework for Integrating Text prompts to Segment Anything Model (SAM) for Segmentation Tasks of ICU Chest X-Rays

Dhanush Babu Ramesh; Rishika Iytha Sridhar; Pulakesh Upadhyaya; Rishikesan Kamaleswaran

doi:10.36227/techrxiv.24224761.v1

Abstract

Chest radiography is widely used to acquire Chest X-Ray (CXR) images because it is cost-effective and plays a central role in diagnosing lung-related disorders. Nevertheless, interpreting CXR images can be challenging, and separating the lung field from the rest of the image can be a valuable aid in assessing and diagnosing lung diseases. While various segmentation methods exist, this study focuses on Meta’s latest Segment Anything Model (SAM), an Artificial Intelligence (AI) model designed to segment objects within an image. This research aims to harness SAM’s capabilities for segmenting CXR images. Additionally, we explore the potential of another novel model, Grounding DINO, a zero-shot object detection model that uses a Swin (Shifted Windows) Transformer to extract image features and BERT (Bidirectional Encoder Representations from Transformers) to extract textual features. It detects objects in an image based on a provided text prompt, producing bounding boxes around objects whose scores meet specified text and box thresholds. These bounding boxes are then used as prompts for SAM to generate segmentation masks. The proposed framework has been assessed on CXRs obtained from patients at Emory Hospital in Atlanta, Georgia, USA, and further evaluated using the NIH Clinical Center’s CXR image dataset.
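
The flow described in the abstract (text prompt, thresholded bounding boxes, box-prompted masks) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the detector returns boxes as normalized (cx, cy, w, h) together with one score per prompt token, and every name here (`filter_detections`, `to_pixel_xyxy`, `rectangle_segmenter`, `segment_with_boxes`) is hypothetical. The threshold values are illustrative only, and `rectangle_segmenter` is a stand-in that simply fills the box where a real SAM call would produce a mask.

```python
import numpy as np


def filter_detections(boxes_cxcywh, token_scores, tokens,
                      box_threshold=0.35, text_threshold=0.25):
    """Keep boxes whose best token score exceeds box_threshold.

    Each kept box is labelled with the prompt tokens whose score exceeds
    text_threshold. Threshold values are illustrative assumptions.
    """
    boxes_cxcywh = np.asarray(boxes_cxcywh, dtype=float)
    token_scores = np.asarray(token_scores, dtype=float)
    best = token_scores.max(axis=1)
    keep = best > box_threshold
    phrases = [
        " ".join(t for t, s in zip(tokens, row) if s > text_threshold)
        for row in token_scores[keep]
    ]
    return boxes_cxcywh[keep], best[keep], phrases


def to_pixel_xyxy(boxes_cxcywh, width, height):
    """Convert normalized (cx, cy, w, h) boxes to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = np.asarray(boxes_cxcywh, dtype=float).T
    xyxy = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    return xyxy * np.array([width, height, width, height], dtype=float)


def rectangle_segmenter(image, box_xyxy):
    """Hypothetical stand-in for a box-prompted segmenter such as SAM.

    It only fills the box; a real model would return the object's mask.
    """
    x0, y0, x1, y1 = np.round(box_xyxy).astype(int)
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = True
    return mask


def segment_with_boxes(image, boxes_xyxy, segmenter):
    """Use each detected box as a prompt and collect one mask per box."""
    return [segmenter(image, box) for box in boxes_xyxy]


# Toy example: three candidate boxes scored against the prompt "lung field".
image = np.zeros((200, 100), dtype=np.uint8)  # height 200, width 100
tokens = ["lung", "field"]
candidate_boxes = [[0.25, 0.5, 0.3, 0.6],
                   [0.75, 0.5, 0.3, 0.6],
                   [0.50, 0.9, 0.2, 0.1]]
candidate_scores = [[0.6, 0.1],
                    [0.5, 0.3],
                    [0.2, 0.1]]

kept, scores, phrases = filter_detections(candidate_boxes, candidate_scores, tokens)
boxes_px = to_pixel_xyxy(kept, width=image.shape[1], height=image.shape[0])
masks = segment_with_boxes(image, boxes_px, rectangle_segmenter)
```

In this toy run the third candidate is discarded because its best token score falls below the box threshold, and the two remaining boxes are passed on as prompts. Swapping `rectangle_segmenter` for a wrapper around an actual SAM predictor would leave the rest of the sketch unchanged.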