ST-Raptor: Answering Complex Questions About Semi-Structured Tables Without Training
In our data-driven world, tables are everywhere—from financial reports and academic papers to human resources forms and sales records. But what happens when these tables have complex, irregular layouts with merged cells, multi-level headers, and nested information? Traditional tools struggle with these semi-structured tables, leaving researchers and professionals to manually dig through spreadsheets for answers.
Meet ST-Raptor: an innovative tool that understands complex tables and answers your natural language questions about them with remarkable accuracy. Unlike many AI systems that require extensive training, ST-Raptor works right out of the box with any Excel table you provide.
What Makes ST-Raptor Special?
ST-Raptor represents a breakthrough in table question answering. It combines a vision-language model (VLM) with a sophisticated tree-construction algorithm called HO-Tree, seamlessly integrating with various large language models (LLMs). The system employs a two-stage validation mechanism that ensures reliable, accurate answers to your questions about tabular data.
The most impressive aspect? ST-Raptor requires no additional fine-tuning. You simply provide an Excel table and ask your question in plain English—the system handles the rest.

What Types of Tables Can ST-Raptor Handle?
ST-Raptor specializes in semi-structured tables—those with irregular layouts that challenge conventional data processing tools. These include:
-
Personal information forms and application documents -
Academic research tables with complex formatting -
Financial statements and budget reports -
Marketing analytics dashboards -
Warehouse inventory management systems -
Schedule and timetable documents -
Sales management records
These tables can originate from various sources including Excel files, HTML web pages, Markdown documents, and CSV files. The system expertly handles nested cells, multi-row/column headers, and other complex formatting arrangements that typically confuse standard table processing tools.

The SSTQA Benchmark: Rigorous Testing for Real-World Scenarios
To properly evaluate ST-Raptor’s capabilities, researchers created a specialized benchmark called SSTQA (Semi-Structured Table Question Answering). This wasn’t just any test set—it was carefully curated from over 2,031 real-world tables, selecting 102 representative tables and creating 764 meaningful questions that probe the system’s understanding.
The SSTQA benchmark covers 19 practical scenarios across diverse domains:
-
Human Resources management -
Corporate governance structures -
Financial Management systems -
Marketing analytics -
Warehouse inventory tracking -
Academic research data -
Schedule management -
Application forms processing -
Education-related records -
Sales Management systems
This comprehensive testing ensures that ST-Raptor performs well not just in laboratory conditions but in the messy reality of everyday data challenges.
Performance That Speaks for Itself
When evaluated against leading alternatives, ST-Raptor demonstrates superior performance across multiple benchmarks:
Method | WikiTQ-ST Accuracy (%) | TempTabQA-ST Accuracy (%) | SSTQA Accuracy (%) | SSTQA ROUGE-L (%) |
---|---|---|---|---|
OpenSearch-SQL | 38.89 | 4.76 | 24.00 | 23.87 |
TableLLaMA | 35.01 | 32.70 | 40.39 | 26.71 |
TableLLM | 62.40 | 9.13 | 7.84 | 2.93 |
ReAcTable | 68.00 | 35.88 | 37.24 | 7.49 |
TAT-LLM | 23.32 | 61.86 | 39.78 | 19.26 |
TableLLaVA | 20.41 | 6.91 | 9.52 | 5.92 |
mPLUG-DocOwl1.5 | 39.80 | 39.80 | 29.56 | 28.43 |
GPT-4o | 60.71 | 74.83 | 62.12 | 43.86 |
DeepSeekV3 | 69.64 | 63.81 | 62.16 | 46.17 |
ST-Raptor | 71.17 | 77.59 | 72.39 | 52.19 |
As the results clearly show, ST-Raptor outperforms all other methods across all benchmarks, achieving particularly impressive results on the challenging SSTQA dataset with 72.39% accuracy—a significant margin over the next best approach.

The performance advantage becomes even more pronounced when considering table complexity. ST-Raptor maintains its accuracy even as tables become more structurally complex, while other methods show notable performance degradation.
Getting Started with ST-Raptor
Step 1: Clone the Repository
Begin by obtaining the ST-Raptor source code:
git clone git@github.com:weAIDB/ST-Raptor.git
cd ST-Raptor
Step 2: Set Up Your Environment
Create a Virtual Environment
# Create a Python virtual environment
conda create -n straptor python=3.10
conda activate straptor
# Install required packages
pip install -r requirements.txt
Install HTML Rendering Plugin
ST-Raptor requires wkhtmltox for proper HTML rendering:
wget https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6.1-2/wkhtmltox_0.12.6.1-2.jammy_amd64.deb
sudo apt install -f ./wkhtmltox_0.12.6.1-2.jammy_amd64.deb
Step 3: Prepare Your Data
Download the SSTQA benchmark dataset and save it in the ./data directory. Then modify the paths in ./main.py to match your setup:
# Update these paths to match your system
input_jsonl = 'PATH_TO_YOUR_INPUT_JSONL'
table_dir = 'PATH_TO_YOUR_TABLE_DIR'
output_jsonl = 'PATH_TO_YOUR_OUTPUT_JSONL'
log_dir = 'PATH_TO_YOUR_LOG_DIR'
The question-answer data should be in JSONL format, with each record structured as follows:
{
"id": "QUESTION_ID",
"table_id": "TABLE_ID",
"query": "NATURAL_LANGUAGE_QUESTION",
"label": "CORRECT_ANSWER"
}
Step 4: Configure Your Models
ST-Raptor supports multiple model configurations. The reference implementation uses:
-
Deepseek-V3 (LLM API) -
InternVL2.5 26B (Vision Language Model) -
Multilingual-E5-Large-Instruct (Embedding Model)
This configuration requires approximately 160GB of GPU memory. You can adjust the models based on your available hardware or switch to API-based options to reduce local resource requirements.
Update the model settings in /utils/constants.py:
# Configure your LLM API settings
API_URL = "YOUR_LLM_API_URL"
API_KEY = "YOUR_LLM_API_KEY"
# Settings for locally deployed LLM
LLM_PORT = YOUR_LLM_PORT
LLM_MODEL_TYPE = YOUR_LLM_MODEL_NAME
# Settings for locally deployed VLM (using vllm by default)
VLM_PORT = YOUR_VLM_PORT
VLM_MODEL_TYPE = YOUR_VLM_MODEL_NAME
# Path to your Multilingual-E5 model
MULTILINGUAL_E5_MODEL_PATH = "YOUR_MODEL_PATH"
For custom API formats, modify the code in ./utils/api_utils.py accordingly.
Step 5: Deploy InternVL2.5 Model
To deploy InternVL2.5 using vLLM:
Install vLLM Package
pip install vllm
Deploy the VLM Service
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m vllm.entrypoints.openai.api_server \
--model=PATH_TO_INTERNVL \
--port 8138 \
--trust-remote-code \
--max-num-batched-tokens 8192 \
--seed 42 \
--tensor-parallel-size 4
Step 6: Start Asking Questions!
With everything set up, you can now run ST-Raptor:
python ./main.py
Real-World Performance Examples
To illustrate ST-Raptor’s capabilities, here’s how it compares to other methods on practical table questions:
Question | Correct Answer | TableLLaMA | TableLLM | ReAcTable | TAT-LLM | TableLLaVA | mPLUG-DocOwl1.5 | DeepseekV3 | GPT-4o | ST-Raptor |
---|---|---|---|---|---|---|---|---|---|---|
What is the value of the employment service satisfaction indicator in the overall budget performance target table for municipal departments in 2024? | ≧90% | 75.0 | 737 | ≧95% | ≧90% | 80% | ≧90% | ≧90% | ≧90% | ≧90% |
How many items are there in the drawing specifications? | 15 | 2 | To change the template, you can follow these steps: … | 7 | 108 | 17 | 4 | 15 | 23 | 15 |
How many status codes are there in the status code table? | 3 | 3 | To change the template, you can follow these steps: … | 7 | 5 | 33 | 3 | 3 | 4 | 3 |
Which month had the lowest expenditure in 2020? | February | Travel expenses | To find the total expenditure amount in June 2019 … | June 5th | “” | June 5th | Long Boat Festival welfare | February | January | February |
How many sales records did the brand “Tengyuan Mingju” have in June? | 7 | 3 | “” | 7 | “” | 13 | 5 | 7 | 8 | 7 |
What was the business hospitality expense of the Comprehensive Management Office in February? | 5106.36 | 5106.36 | “” | “” | SELECT SUM(Amount incurred ) FROM DF WHERE Project Content = ‘Business entertainment expenses’ … |
3500 | 130,168 | 5106.36 | 5106.36 | 5106.36 |
What is the proposed funding for the social insurance gap and living allowance for college graduates under the “Three Supports and One Assistance” program? | 587.81 million yuan | 587.81 | To find the number of financially supported personnel … | To find the proposed investment amount for the social insurance gap and living allowance … | 587.81 | 1.2 billion | 1140 | 587.81 | 587.81 | 587.81 |
What is the target value for the number of new urban employment in the 2024 Municipal Department Overall Budget Performance Target Table? | 50000 people | 50000 | To find the number of financially supported personnel in… | The question asks for the indicator value for the number of new urban employment … | 50000 | 1484 | 50000 | 50000 | 50000 | 50000 |
How many first-level indicators are there in the performance metrics? | 3 | 10 | 10 | 10 | 10 | 100 | 2 | 3 | 4 | 3 |
How many third-level indicators are there in the quantity indicators of the performance metrics? | 4 | 2 | To change the template, you can follow these steps: … | To determine how many information items in the information item comparison… | 12#13#14#15#16#17#18#19#20#21#22#23#24#25#26#27#28#29#30… | 108 | 4 | 8 | 3 | 8 |
Note: Empty cells (“”) indicate where the baseline method failed to generate any answer
These examples demonstrate ST-Raptor’s consistent accuracy across diverse question types and table structures. While other methods occasionally produce correct answers, ST-Raptor delivers reliable results across the board.
Future Development Roadmap
The ST-Raptor development team continues to enhance the system’s capabilities:
Extended Format Support
-
✅ Excel table input (currently supported) -
⌛ HTML, CSV, JSON, and Markdown table input (coming soon) -
⌛ Web demo interface and API access (in development)
Framework Enhancements
-
⌛ Expanded table extraction module supporting additional table types -
⌛ Improved handling of extremely complex nested structures -
⌛ Enhanced multilingual support for global applications
Frequently Asked Questions
Does ST-Raptor require training on my specific tables?
No. ST-Raptor uses a zero-shot learning approach, meaning it can understand and answer questions about your tables without any additional training.
What types of table formats does ST-Raptor support?
Currently, ST-Raptor supports Excel format tables. Support for HTML, CSV, JSON, and Markdown formats is under development.
How much computational resources does ST-Raptor require?
The reference implementation requires approximately 160GB of GPU memory. However, you can adjust the model configurations based on your available hardware or use API-based options to reduce local resource requirements.
Can ST-Raptor handle tables in languages other than English?
Yes. The system uses multilingual models that can process tables in various languages, though performance may vary depending on the specific language and writing system.
What makes ST-Raptor different from using a general-purpose LLM like GPT-4?
ST-Raptor specializes in understanding table structure and content relationships. While general-purpose LLMs can sometimes answer simple table questions, ST-Raptor’s specialized architecture provides significantly better performance on complex semi-structured tables, particularly those with irregular layouts, merged cells, and hierarchical headers.
How does ST-Raptor handle extremely large tables?
ST-Raptor employs efficient processing techniques that can handle tables of substantial size. For exceptionally large tables, the system can process sections incrementally while maintaining context awareness.
Is my data secure when using ST-Raptor?
When using the local deployment option, your data remains on your own infrastructure. If using API-based model options, data would be subject to the privacy policies of the API providers.
Conclusion
ST-Raptor represents a significant advancement in table understanding and question answering. By combining visual understanding with structural analysis and language processing, it achieves remarkable accuracy on even the most challenging semi-structured tables.
For researchers, analysts, and professionals working with complex tabular data, ST-Raptor offers a powerful tool that saves time and reduces errors. Its ability to understand intricate table structures without requiring custom training makes it accessible to organizations of all sizes.
As development continues, ST-Raptor is poised to become an indispensable tool for anyone working with the complex tables that are so common in real-world data analysis.
Join the Community
Interested in semi-structured table analysis? Join the ST-Raptor community to connect with other researchers and practitioners: