Paper Page - Enhancing Step-by-Step And Verifiable Medical Reasoning In MLLMs

MICS, a novel reasoning-path searching scheme, generates comprehensive chain-of-thought data that gives medical MLLMs such as Chiron-o1 robust, generalizable reasoning and visual question-answering capabilities.

Multimodal large language models (MLLMs) have begun to demonstrate robust
reasoning capabilities on general tasks, yet their application in the medical
domain remains in its early stages. Constructing chain-of-thought (CoT)
training data is essential for bolstering the reasoning abilities of medical
MLLMs. However, existing approaches lack a comprehensive framework for
searching and evaluating effective reasoning paths toward a critical
diagnosis. To address this challenge, we propose Mentor-Intern
Collaborative Search (MICS), a novel reasoning-path searching scheme to
generate rigorous and effective medical CoT data. MICS first leverages mentor
models to initialize the reasoning, one step at a time, then prompts each
intern model to continue the thinking along those initiated paths, and finally
selects the optimal reasoning path according to the overall reasoning
performance of multiple intern models. The reasoning performance is determined
by an MICS-Score, which assesses the quality of generated reasoning paths.
Finally, we construct MMRP, a multi-task medical reasoning dataset with
ranked difficulty, and Chiron-o1, a new medical MLLM devised via a curriculum
learning strategy, with robust visual question-answering and generalizable
reasoning capabilities. Extensive experiments demonstrate that Chiron-o1,
trained on our CoT dataset constructed using MICS, achieves state-of-the-art
performance across a range of medical visual question answering and reasoning
benchmarks. Code is available on GitHub at manglu097/Chiron-o1.
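
The search loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the MICS-Score, so the sketch assumes it is the fraction of intern models that reach the reference answer when continuing from a candidate path. The names `mics_score`, `mics_search`, `max_steps`, and the toy mentor and intern stubs are all hypothetical; in practice each callable would wrap a call to an MLLM.

```python
from typing import Callable, List, Sequence

# Hypothetical interfaces (assumptions, not from the paper):
# a mentor proposes the next reasoning step given the question and the path so far;
# an intern continues reasoning from a path prefix and returns a final answer.
Mentor = Callable[[str, List[str]], str]
Intern = Callable[[str, List[str]], str]


def mics_score(question: str, prefix: List[str],
               interns: Sequence[Intern], answer: str) -> float:
    """Assumed score: share of interns that reach the reference answer from this prefix."""
    hits = sum(1 for intern in interns if intern(question, prefix) == answer)
    return hits / len(interns)


def mics_search(question: str, answer: str, mentors: Sequence[Mentor],
                interns: Sequence[Intern], max_steps: int = 3) -> List[str]:
    """Grow a reasoning path one step at a time, keeping the best-scoring mentor step."""
    path: List[str] = []
    for _ in range(max_steps):
        # Each mentor proposes one candidate next step for the current path.
        candidates = [path + [mentor(question, path)] for mentor in mentors]
        # Score every candidate path by how well the interns finish from it.
        scored = [(mics_score(question, c, interns, answer), c) for c in candidates]
        best_score, path = max(scored, key=lambda pair: pair[0])
        if best_score == 1.0:  # every intern already reaches the answer
            break
    return path


# Toy stand-ins for real models, for illustration only.
def good_mentor(question: str, path: List[str]) -> str:
    return "Right lower lobe opacity on chest X-ray."

def bad_mentor(question: str, path: List[str]) -> str:
    return "No relevant finding noted."

def intern_a(question: str, prefix: List[str]) -> str:
    return "pneumonia" if any("opacity" in step for step in prefix) else "normal"

def intern_b(question: str, prefix: List[str]) -> str:
    return "pneumonia" if any("lobe" in step for step in prefix) else "normal"


selected = mics_search("What is the diagnosis?", "pneumonia",
                       [bad_mentor, good_mentor], [intern_a, intern_b])
print(selected)
```

With these stubs, the step proposed by `good_mentor` is selected because both interns reach the reference answer from it, while the step from `bad_mentor` scores zero. Scoring a path by what weaker models can do with it is one plausible reading of "overall reasoning performance of multiple intern models"; the paper may weight or aggregate differently.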
