French AI startup Mistral AI on Thursday launched a new enterprise-grade Document AI platform, claiming to set a new benchmark in speed and accuracy for OCR-based document processing.
The offering, capable of parsing everything from low-resolution scans to handwritten forms, is being positioned as a full-stack solution for businesses dealing with large volumes of paperwork.
The company says the platform is powered by a state-of-the-art OCR engine with a reported accuracy of more than 99% across more than 11 languages.
Unlike traditional systems that struggle with mixed layouts, Mistral’s AI can interpret complex documents, including tables, forms, contracts, and invoices, and convert them into structured JSON with custom extraction templates.
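To make the idea of a custom extraction template concrete, here is a minimal Python sketch. It does not use Mistral's API. The template format, the `extract_fields` helper, and the sample invoice text are all hypothetical, and simple regular expressions stand in for whatever extraction logic the platform actually applies to OCR output.

```python
import json
import re

# Hypothetical template: field name -> regex with exactly one capture group.
INVOICE_TEMPLATE = {
    "invoice_number": r"Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)",
    "date": r"Date\s*:\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total\s*:\s*\$?([\d,]+\.\d{2})",
}


def extract_fields(ocr_text, template):
    """Apply each template pattern to the OCR text.

    Fields whose pattern does not match are reported as None, so the
    output always has the same keys as the template.
    """
    result = {}
    for field, pattern in template.items():
        match = re.search(pattern, ocr_text, flags=re.IGNORECASE)
        result[field] = match.group(1) if match else None
    return result


# Made-up OCR output for illustration only.
sample_text = """ACME Corp
Invoice No: INV-1042
Date: 2025-05-22
Total: $1,250.00
"""

record = extract_fields(sample_text, INVOICE_TEMPLATE)
print(json.dumps(record, indent=2))
```

Keeping the template as plain data means a new document type only needs a new dictionary, not new code, which is the general appeal of template-driven extraction.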
Processing speeds reportedly reach up to 2,000 pages per minute on a single GPU, making it one of the fastest tools in its category.
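For a sense of scale, the reported peak rate works out to 120,000 pages per hour, or roughly 8.3 hours for a one-million-page archive on one GPU. The short Python sketch below reproduces that arithmetic. It assumes the peak rate is sustained throughout and that throughput scales linearly with the number of GPUs; neither assumption comes from the announcement.

```python
PAGES_PER_MINUTE = 2_000  # reported peak figure for a single GPU


def hours_to_process(total_pages, gpus=1, pages_per_minute=PAGES_PER_MINUTE):
    """Estimate wall-clock hours for a backlog of pages.

    Assumes sustained peak throughput and linear scaling across GPUs,
    both of which are simplifying assumptions.
    """
    minutes = total_pages / (pages_per_minute * gpus)
    return minutes / 60


pages_per_hour = PAGES_PER_MINUTE * 60
print(f"Pages per hour on one GPU: {pages_per_hour:,}")
print(f"Hours for 1,000,000 pages on one GPU: {hours_to_process(1_000_000):.1f}")
```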
A demonstration using a decades-old legal contract from the Washington Public Power Supply System showed the platform parsing dense paragraphs, legacy formatting, and embedded clauses into clearly structured outputs. Even handwritten notes, audit disclaimers, and historical equipment delivery records were extracted more accurately than with legacy systems.
Document AI also includes tooling for automating full document lifecycles, from digitisation and classification to compliance monitoring. It supports on-premises and private cloud deployment, catering to sectors with strict data sovereignty rules.
Mistral’s push into document intelligence follows broader enterprise trends toward digitisation of archives and automation of compliance workflows. For research institutions and multinational firms juggling multilingual paperwork, the platform's multilingual OCR and structured extraction could prove useful.
The launch comes right after Mistral released Devstral, an open-source AI model for real-world coding tasks, which outperformed peers with a 46.8% score on SWE-Bench Verified. Devstral runs on consumer hardware and is available on platforms like Hugging Face. The company also recently unveiled Mistral Small 3.1, its state-of-the-art multimodal, multilingual, open-source model available under an Apache 2.0 license.
For enterprises still drowning in paperwork, Mistral’s latest bet suggests that OCR might finally be ready for critical workloads.