ST-Raptor: LLM-Powered Semi-Structured Table Question Answering - Takara TLDR

Semi-structured tables, widely used in real-world applications (e.g.,
financial reports, medical records, transactional orders), often involve
flexible and complex layouts (e.g., hierarchical headers and merged cells).
Interpreting these layouts and answering natural language questions about the
tables generally falls to human analysts, which is costly and inefficient.
Existing methods for automating this process face significant challenges. First,
methods like NL2SQL require converting semi-structured tables into structured
ones, which often causes substantial information loss. Second, methods like
NL2Code and multi-modal LLM QA struggle to understand the complex layouts of
semi-structured tables and cannot accurately answer corresponding questions. To
this end, we propose ST-Raptor, a tree-based framework for semi-structured
table question answering using large language models. First, we introduce the
Hierarchical Orthogonal Tree (HO-Tree), a structural model that captures
complex semi-structured table layouts, along with an effective algorithm for
constructing the tree. Second, we define a set of basic tree operations to
guide LLMs in executing common QA tasks. Given a user question, ST-Raptor
decomposes it into simpler sub-questions, generates corresponding tree
operation pipelines, and conducts operation-table alignment for accurate
pipeline execution. Third, we incorporate a two-stage verification mechanism:
forward validation checks the correctness of execution steps, while backward
validation evaluates answer reliability by reconstructing queries from
predicted answers. To benchmark performance, we present SSTQA, a dataset of
764 questions over 102 real-world semi-structured tables. Experiments show that
ST-Raptor outperforms nine baselines by up to 20% in answer accuracy. The code
is available at https://github.com/weAIDB/ST-Raptor.
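
To make the tree-based idea concrete, here is a minimal sketch of how a table with hierarchical headers might be held as a tree and queried by following a header path. It is not the HO-Tree or the operation set from the ST-Raptor code: the `Node` class, `build_tree`, `retrieve`, and the sample rows are all hypothetical names and made-up data chosen for illustration, and the `None` return is only a loose stand-in for a forward check that an execution step matches the table.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One header or value cell; children are the cells nested under it."""
    label: str
    value: Optional[str] = None
    children: list["Node"] = field(default_factory=list)

    def child(self, label: str) -> Optional["Node"]:
        # Return the direct child with this label, or None if absent.
        for c in self.children:
            if c.label == label:
                return c
        return None


def build_tree(rows: list[tuple[list[str], str]]) -> Node:
    """Build a tree from (header path, value) pairs (assumed input shape)."""
    root = Node("root")
    for path, value in rows:
        node = root
        for label in path:
            nxt = node.child(label)
            if nxt is None:
                nxt = Node(label)
                node.children.append(nxt)
            node = nxt
        node.value = value
    return root


def retrieve(root: Node, path: list[str]) -> Optional[str]:
    """Hypothetical 'retrieve' operation: follow a header path to a value."""
    node = root
    for label in path:
        node = node.child(label)
        if node is None:
            return None  # the requested step does not exist in the table
    return node.value


# Made-up sample data: a two-level header (category, then quarter).
rows = [
    (["Revenue", "Q1"], "120"),
    (["Revenue", "Q2"], "135"),
    (["Costs", "Q1"], "80"),
]
tree = build_tree(rows)
print(retrieve(tree, ["Revenue", "Q2"]))  # prints 135
print(retrieve(tree, ["Revenue", "Q3"]))  # prints None
```

Keeping the header hierarchy as parent-child links, rather than flattening it into columns, is what lets a path such as `["Revenue", "Q2"]` address a cell without losing the nesting.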
