ATLAS: Benchmarking And Adapting LLMs For Global Trade Via Harmonized Tariff Code Classification - Takara TLDR

Accurate classification of products under the Harmonized Tariff Schedule
(HTS) is a critical bottleneck in global trade, yet it has received little
attention from the machine learning community. Misclassification can halt
shipments entirely, with major postal operators suspending deliveries to the
U.S. due to incomplete customs documentation. We introduce the first benchmark
for HTS code classification, derived from the U.S. Customs Rulings Online
Search System (CROSS). Evaluating leading LLMs, we find that our fine-tuned
Atlas model (LLaMA-3.3-70B) achieves 40 percent fully correct 10-digit
classifications and 57.5 percent correct 6-digit classifications, improvements
of 15 points over GPT-5-Thinking and 27.5 points over Gemini-2.5-Pro-Thinking.
Beyond accuracy, Atlas is roughly five times cheaper than GPT-5-Thinking and
eight times cheaper than Gemini-2.5-Pro-Thinking, and can be self-hosted to
guarantee data privacy in high-stakes trade and compliance workflows. While
Atlas sets a strong baseline, the benchmark remains highly challenging: Atlas
itself reaches only 40 percent 10-digit accuracy. By releasing both dataset and model, we aim
to position HTS classification as a new community benchmark task and invite
future work in retrieval, reasoning, and alignment.
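
The two headline metrics, fully correct 10-digit classification and correct 6-digit classification, can be illustrated with a small scoring sketch. The abstract does not describe the exact scoring procedure, so this assumes that a 6-digit prediction counts as correct when its first six digits match the reference and that punctuation in the codes is ignored. The function names and the sample codes are hypothetical placeholders, not data from the benchmark.

```python
import re


def normalize_hts(code: str) -> str:
    """Drop dots, spaces, and any other non-digit characters from a code."""
    return re.sub(r"\D", "", code)


def prefix_accuracy(predictions, references, digits):
    """Fraction of predictions whose first `digits` digits match the reference.

    Assumption: a prediction shorter than `digits` counts as incorrect.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = 0
    for pred, ref in zip(predictions, references):
        p, r = normalize_hts(pred), normalize_hts(ref)
        if len(p) >= digits and len(r) >= digits and p[:digits] == r[:digits]:
            hits += 1
    return hits / len(references)


# Placeholder code strings for illustration only (not benchmark data).
predictions = ["0901.21.0020", "8471.30.0100", "6109.10.0012", "9403.60.8081"]
references = ["0901.21.0020", "8471.30.0150", "6110.20.2079", "9403.60.8081"]

print("10-digit accuracy:", prefix_accuracy(predictions, references, 10))
print("6-digit accuracy:", prefix_accuracy(predictions, references, 6))
```

In this toy sample the second prediction is wrong at the 10-digit level but still counts at the 6-digit level, which is why the 6-digit score can exceed the 10-digit score, as it does in the reported results.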
