Paper Page - CRISP-SAM2: SAM2 With Cross-Modal Interaction And Semantic Prompting For Multi-Organ Segmentation

Multi-organ medical segmentation is a crucial component of medical image
processing, essential for doctors to make accurate diagnoses and develop
effective treatment plans. Despite significant progress in this field, current
multi-organ segmentation models often suffer from inaccurate details,
dependence on geometric prompts and loss of spatial information. Addressing
these challenges, we introduce a novel model named CRISP-SAM2 with CRoss-modal
Interaction and Semantic Prompting based on SAM2. This model represents a
promising approach to multi-organ medical segmentation guided by textual
descriptions of organs. Our method begins by converting visual and textual
inputs into cross-modal contextualized semantics using a progressive
cross-attention interaction mechanism. These semantics are then injected into
the image encoder to enhance the detailed understanding of visual information.
To eliminate reliance on geometric prompts, we use a semantic prompting
strategy, replacing the original prompt encoder to sharpen the perception of
challenging targets. In addition, a similarity-sorting self-updating strategy
for memory and a mask-refining process is applied to further adapt to medical
imaging and enhance localized details. Comparative experiments conducted on
seven public datasets indicate that CRISP-SAM2 outperforms existing models.
Extensive analysis also demonstrates the effectiveness of our method, thereby
confirming its superior performance, especially in addressing the limitations
mentioned earlier. Our code is available at:
https://github.com/YU-deep/CRISP\_SAM2.git.

Source link

What's Hot

HRM vs Claude OPUS 4: How a Small AI Model Outperformed a Giant

FieldAI raises $405M to build universal robot brains

IBM and NASA Release Groundbreaking Open-Source AI Model on Hugging Face to Predict Solar Weather and Help Protect Critical Technology

Paper page – CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos – Takara TLDR

Prompt Orchestration Markup Language – Takara TLDR

Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer – Takara TLDR

Tanya Bonakdar Gallery to Close Los Angeles Space

Dallas Museum of Art Names Brian Ferriso as Its Next Director

Rapa Nui’s Moai Statues Threatened by Rising Sea Levels, Flooding

Mickalene Thomas Accused of Harassment by Racquel Chevremont

HRM vs Claude OPUS 4: How a Small AI Model Outperformed a Giant

FieldAI raises $405M to build universal robot brains

IBM and NASA Release Groundbreaking Open-Source AI Model on Hugging Face to Predict Solar Weather and Help Protect Critical Technology

What's Hot

Paper page – CRISP-SAM2: SAM2 with Cross-Modal Interaction and Semantic Prompting for Multi-Organ Segmentation

Related Posts

Subscribe to Updates