In the mortgage servicing industry, efficient document processing can mean the difference between business growth and missed opportunities. This post explores how Onity Group, a financial services company specializing in mortgage servicing and origination, used Amazon Bedrock and other AWS services to transform their document processing capabilities.
Onity Group, founded in 1988, is headquartered in West Palm Beach, Florida. Through its primary operating subsidiary, PHH Mortgage Corporation, and Liberty Reverse Mortgage brand, the company provides mortgage servicing and origination solutions to homeowners, business clients, investors, and others.
Onity processes millions of pages across hundreds of document types annually, including legal documents such as deeds of trust where critical information is often contained within dense text. The company also had to manage inconsistent handwritten entries and the need to verify notarization and legal seals—tasks that traditional optical character recognition (OCR) and AI and machine learning (AI/ML) solutions struggled to handle effectively. By using foundation models (FMs) provided by Amazon Bedrock, Onity achieved a 50% reduction in document extraction costs while improving overall accuracy by 20% compared to their previous OCR and AI/ML solution.
Onity’s intelligent document processing (IDP) solution dynamically routes extraction tasks based on content complexity, using the strengths of both its custom AI models and generative AI capabilities provided by Amazon Web Services (AWS) through Amazon Bedrock. This dual-model approach enabled Onity to address the scale and diversity of its mortgage servicing documents more efficiently, driving significant improvements in both cost and accuracy.
“We needed a solution that could evolve as quickly as our document processing needs,” says Raghavendra (Raghu) Chinhalli, VP of Digital Transformation at Onity Group.
“By combining AWS AI/ML and generative AI services, we achieved the perfect balance of cost, performance, accuracy, and speed to market,” adds Priyatham Minnamareddy, Director of Digital Transformation & Intelligent Automation.
Why traditional OCR and ML models fall short
Traditional document processing presented several fundamental challenges that drove Onity’s search for a more sophisticated solution. The following are key examples:
Verbose documents with data elements not clearly identified
Issue – Key documents in mortgage servicing contain verbose text with critical data elements embedded without clear identifiers or structure
Example – Identifying the exact legal description from a deed of trust, which might be buried within paragraphs of legalese
Inconsistent handwritten text
Issue – Documents contain handwritten elements that vary significantly in quality, style, and legibility
Example – Simple variations in writing formats—such as state names (GA and Georgia) or monetary values (200K or 200,000)—create significant extraction challenges
Notarization and legal seal detection
Issue – Identifying whether a document is notarized, detecting legal court stamps, verifying if a notary’s commission has expired, or extracting data from legal seals, which come in multiple shapes, requires a deeper understanding of visual and textual cues that traditional methods might miss
Limited contextual understanding
Issue – Traditional OCR models, although adept at digitizing text, often lack the capacity to interpret the semantic context within a document, hindering a true understanding of the information contained
These complexities in mortgage servicing documents—ranging from verbose text to inconsistent handwriting and the need for specialized seal detection—proved to be significant limitations for traditional OCR and ML models. This drove Onity to seek a more sophisticated solution to address these fundamental challenges.
Solution overview
To address these document processing challenges, Onity built an intelligent solution combining AWS AI/ML and generative AI services.
Amazon Textract is a ML service that automates the extraction of text, data, and insights from documents and images. By using Amazon Textract, organizations can streamline document processing workflows and unlock valuable data to power intelligent applications.
Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies. Through a single API, Amazon Bedrock provides access to models from providers such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon, along with a broad set of capabilities to build secure, private, and responsible generative AI applications.
Amazon Bedrock gives you the flexibility to choose the FM that best suits your needs. For IDP, common solutions use text and vision models such as Amazon Nova Pro or Anthropic’s Claude Sonnet. Beyond model access, Amazon Bedrock provides enterprise-grade security with data processing within your Amazon virtual private cloud (VPC), built-in guardrails for responsible AI use, and comprehensive data protection capabilities that are essential for handling sensitive financial documents. You can select the model that strikes the right balance of accuracy, performance, and cost efficiency for your specific application.
The following figure shows how the solution works.
Document ingestion – Documents are uploaded to Amazon Simple Storage Service (Amazon S3). Uploading triggers automated processing workflows.
Preprocessing – Before analysis, documents undergo optimization through image enhancement, noise reduction, and layout analysis. These preprocessing steps help facilitate maximum accuracy for subsequent OCR processing.
Classification – Classification occurs through a three-step intelligent workflow orchestrated by Onity’s document classification application. The process outputs each page’s document type and page number in JSON format:
The application uses Amazon Textract to extract document contents.
Extracted content is processed by Onity’s custom AI model. If the model’s confidence score meets the predetermined threshold, classification is complete.
If the document isn’t recognized because the model isn’t trained with that document type, the application automatically routes the document to Anthropic’s Claude Sonnet in Amazon Bedrock. This foundation model, along with other text and vision models such as Anthropic’s Claude and Amazon Nova, can classify documents without additional training, analyzing both text and images. This dual-model approach, using both Onity’s custom model and the generative AI capabilities of Amazon, helps to optimally balance cost efficiency with speed to market.
Extraction – Onity’s document extraction application employs an algorithm-driven approach that queries an internal database to retrieve specific extraction rules for each document type and data element. It then dynamically routes extraction tasks between Amazon Textract and Amazon Bedrock FMs based on the complexity of the content.
For example, verifying notarization requires complex visual and textual analysis. In these cases, the application uses the capabilities of Amazon Bedrock advanced text and vision models. The solution is built on the Amazon Bedrock API, which allows Onity to use different FMs that provide the optimal balance of cost and accuracy for each document type. This dynamic routing of extraction tasks allows Onity to optimize the balance between cost, performance, and accuracy.
Persistence – The extracted information is stored in a structured format in Onity’s operational databases and in a semi-structured format in Amazon S3 for further downstream processing.
Security overview
When processing sensitive financial documents, Onity implements robust data protection measures. Data is encrypted at rest using AWS Key Management Service (AWS KMS) and in transit using TLS protocols. Access to data is strictly controlled using AWS Identity and Access Management (IAM) policies. For architectural best practices building financial services Industry (FSI) applications in AWS, refer to AWS Financial Services Industry Lens. This solution is implemented using AWS Security best practice guidance using Security Pillar – AWS Well-Architected Framework. For AWS security and compliance best practices, refer to Best Practices for Security, Identity, & Compliance.
Transforming document processing with Amazon Bedrock: Sample use cases
This section demonstrates how Onity uses Amazon Bedrock to automate the extraction of critical information from complex mortgage servicing documents.
Deed of trust data extraction
A deed of trust is a critical legal document that creates a security interest in real property. These documents are typically verbose, containing multiple pages of legal text with critical information including notarization details, legal stamps, property descriptions, and rider attachments. The intelligent extraction solution has reduced data extraction costs by 50% while improving overall accuracy by 20% compared to the previous OCR and AI/ML solution.
Notarization information extraction
The following is a sample of a notarized document that combines printed and handwritten text and a notary seal. The document image is passed to the application with a prompt to extract the following information: state, county, notary date, notary expiry date, presence of notary seal, person signed before notary, and notary public name. The prompt also instructs that if a field is manually crossed out or modified, the manually written or modified text should be used for that field in the output.
Example output:
Extract rider information
The following image is of a rider that includes text and a series of check boxes (selected and unselected). The document image is passed to the application with a prompt to extract both checked riders and other riders listed on the document in a provided JSON format.
Example output:
Automation of the checklist review of home appraisal documents
Home appraisal reports contain detailed property comparisons and valuations that require careful review of multiple data points, including room counts, square footage, and property features. Traditionally, this review process required manual verification and cross-referencing, making it time-consuming and prone to errors. The automated solution now validates property comparisons and identifies potential discrepancies, significantly reducing review times while improving accuracy by 65% over the manual process.
The following example shows a document in a grid layout with rows and columns of information. The document image is passed to the application with a prompt to verify if the room counts are identical across the subject and comparables in the appraisal report and if square footages are within a specified percentage of the subject property’s square footage. The prompt also requests an explanation of the analysis results. The application then extracts the required information and provides detailed justification for its findings.
Example output:
Automated credit report analysis
Credit reports are essential documents in mortgage servicing that contain critical borrower information from multiple credit bureaus. These reports arrive in diverse formats with scattered information, making manual data extraction time-consuming and error-prone. The solution automatically extracts and standardizes credit scores and scoring models across different report formats, achieving approximately 85% accuracy.
The following image shows a credit report that combines rows and columns with number and text values. The document image is passed to the application using a prompt instructing it to extract the required information.
Example output:
Conclusion
Onity’s implementation of intelligent document processing, powered by AWS generative AI services, demonstrates how organizations can transform complex document handling challenges into strategic advantages. By using the generative AI capabilities of Amazon Bedrock, Onity achieved a remarkable 50% reduction in document extraction costs while improving overall accuracy by 20% compared to their previous OCR and AI/ML solution. The impact was even more dramatic in specific use cases—their credit report processing achieved accuracy rates of up to 85%—demonstrating the solution’s exceptional capability in handling complex, multiformat documents.
The flexible FM selection provided by Amazon Bedrock enables organizations to choose and evolve their AI capabilities over time, helping to strike the optimal balance between performance, accuracy, and cost for each specific use case. The solution’s ability to handle complex documents, including verbose legal documents, handwritten text, and notarized materials, showcases the transformative potential of modern AI technologies in financial services. Beyond the immediate benefits of cost savings and improved accuracy, this implementation provides a blueprint for organizations seeking to modernize their document processing operations while maintaining the agility to adapt to evolving business needs. The success of this solution proves that thoughtful application of AWS AI/ML and generative AI services can deliver tangible business results while positioning organizations for continued innovation in document processing capabilities.
If you have similar document processing challenges, we recommend starting with Amazon Textract to evaluate if its core OCR and data extraction capabilities meet your needs. For more complex use cases requiring advanced contextual understanding and visual analysis, use Amazon Bedrock text and vision foundation models, such as Amazon Nova Lite, Nova Pro, Anthropic’s Claude Sonnet, and Anthropic’s Claude. Using an Amazon Bedrock model playground, you can quickly experiment with these multimodal models and then compare the best foundation models across different metrics such as accuracy, robustness, and cost using Amazon Bedrock model evaluation. Through this process, you can make informed decisions about which model provides the best balance of performance and cost-effectiveness for your specific use case.
About the author
Ramesh Eega is a Global Accounts Solutions Architect based out of Atlanta, GA. He is passionate about helping customers throughout their cloud journey.