Loan origination has a document problem and a consistency problem. Borrowers submit pay stubs (often photographed), W-2s, bank statements, and tax returns in every imaginable format. Loan officers across branches apply different standards to the same data.
Databricks ai_parse_document processes scanned, photographed, and handwritten documents natively. No separate OCR pipeline. It handles pay stubs with variable layouts, photographed W-2 forms, bank statements from different institutions, and tax returns with handwritten amendments. It captures tables, figures, and document structure.
The origination pipeline uses three agents:
ai_parse_document parses all submitted documents, then ai_extract pulls income, employer, account balances, and tax data.
ai_query cross-references extracted data across documents (income on pay stub vs. W-2 vs. tax return) and flags discrepancies.
Agent Bricks Multi-Agent Supervisor coordinates the pipeline, handling re-extraction on low-confidence results and routing edge cases to loan officers.
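To make the cross-referencing and routing steps concrete, here is a minimal sketch in plain Python. It does not call any Databricks API. The ExtractedIncome record, its confidence field, the 5% tolerance, the 0.8 confidence threshold, and the routing labels are all assumptions chosen for illustration, not values taken from the pipeline described above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ExtractedIncome:
    """One income figure pulled from one document (hypothetical record shape)."""
    source: str           # e.g. "pay_stub", "w2", "tax_return"
    annual_income: float  # annualized income in dollars
    confidence: float     # extraction confidence in [0, 1] (assumed field)


def find_discrepancies(items, tolerance=0.05):
    """Return (source_a, source_b, gap) for each pair of documents whose
    incomes differ by more than `tolerance`, relative to the larger value."""
    flagged = []
    for a, b in combinations(items, 2):
        larger = max(a.annual_income, b.annual_income)
        if larger == 0:
            continue  # both incomes are zero, nothing to compare
        gap = abs(a.annual_income - b.annual_income) / larger
        if gap > tolerance:
            flagged.append((a.source, b.source, round(gap, 3)))
    return flagged


def route(items, min_confidence=0.8, tolerance=0.05):
    """Decide the next step: re-extract low-confidence documents first,
    then send unresolved discrepancies to a loan officer."""
    low = [i.source for i in items if i.confidence < min_confidence]
    if low:
        return ("re_extract", low)
    flagged = find_discrepancies(items, tolerance)
    if flagged:
        return ("loan_officer_review", flagged)
    return ("auto_proceed", [])


application = [
    ExtractedIncome("pay_stub", 84_000.0, 0.95),
    ExtractedIncome("w2", 84_500.0, 0.97),
    ExtractedIncome("tax_return", 72_000.0, 0.90),
]
decision = route(application)
print(decision)
```

In this example the pay stub and W-2 agree within tolerance, but the tax return is about 14 to 15 percent lower than both, so the application is routed for review. Checking confidence before comparing values is a design choice: a discrepancy caused by a bad extraction is cheaper to fix by re-extracting than by escalating to a person.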
Feature Store serves the same credit scoring features to every branch: debt-to-income ratios, payment history patterns, collateral valuations. Point-in-time correctness means each application is scored only on feature values that were available at decision time, which supports fair lending compliance (ECOA, HMDA).
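The idea behind a point-in-time lookup can be sketched in a few lines of standard-library Python. This is an illustration of the concept only, not the Feature Store API: the FeatureHistory class, its method names, and the sample figures are hypothetical.

```python
from bisect import bisect_right
from datetime import date


def dti_ratio(monthly_debt, gross_monthly_income):
    """Debt-to-income ratio: monthly debt payments over gross monthly income."""
    if gross_monthly_income <= 0:
        raise ValueError("gross monthly income must be positive")
    return monthly_debt / gross_monthly_income


class FeatureHistory:
    """Time-ordered values of one feature for one borrower (hypothetical helper)."""

    def __init__(self):
        self._dates = []   # effective dates, kept sorted
        self._values = []  # feature values aligned with _dates

    def record(self, effective, value):
        """Store a value that became known on `effective`."""
        idx = bisect_right(self._dates, effective)
        self._dates.insert(idx, effective)
        self._values.insert(idx, value)

    def as_of(self, when):
        """Return the latest value known on or before `when`, or None.
        Values recorded after `when` are never returned, so later data
        cannot leak into an earlier decision."""
        idx = bisect_right(self._dates, when)
        if idx == 0:
            return None
        return self._values[idx - 1]


history = FeatureHistory()
history.record(date(2024, 1, 15), dti_ratio(1_800, 6_000))
history.record(date(2024, 3, 1), dti_ratio(2_400, 6_000))

# A decision dated February sees only the January value.
feb_value = history.as_of(date(2024, 2, 10))
print(feb_value)
```

Because every branch would query the same history with the same decision date, two loan officers scoring the same application get the same feature value, and an auditor can later reproduce exactly what the model saw.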
Unity Catalog governs everything: data lineage for regulatory audit, function registry for validation rules, model governance, and serving endpoints. All models are accessed through AI Gateway on Databricks Model Serving.
Time-to-close drops by 40%. Manual document review decreases by 30%. Underwriting standards become consistent across all branches because the same models and validation rules apply everywhere.