Underwriters spend their days buried in submissions: 50-page engineering reports, scanned loss runs, handwritten field notes, financial statements with embedded tables. 64% of insurers cite document processing as their top AI priority, and for good reason.
Insurance submissions are not clean PDFs. They include scanned pages, photographs, handwritten annotations, tables that span multiple pages, and inconsistent formatting across brokers.
Databricks ai_parse_document is a built-in AI Function that extracts structured content from unstructured documents. It handles PDFs, images (JPG, PNG), DOCX, and PPTX files. It captures layout information, parsed tables, bounding boxes, figures, and comprehensive document structure, all directly in SQL or PySpark. No separate OCR pipeline needed.
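As a rough illustration of what "directly in SQL or PySpark" can look like, the sketch below builds a SQL statement that reads raw files as binary and passes each one to ai_parse_document. The volume path and the helper name build_parse_query are hypothetical, and the statement only runs on a Databricks workspace where AI Functions are available.

```python
def build_parse_query(source_path: str) -> str:
    """Return a SQL statement that parses every file under source_path.

    build_parse_query is a hypothetical helper, not part of any Databricks API.
    """
    # Guard against breaking out of the single-quoted SQL string literal.
    if "'" in source_path:
        raise ValueError("source_path must not contain single quotes")
    return (
        "SELECT path, ai_parse_document(content) AS parsed\n"
        f"FROM read_files('{source_path}', format => 'binaryFile')"
    )


# Hypothetical Unity Catalog volume holding broker submissions.
SUBMISSIONS_PATH = "/Volumes/insurance/underwriting/submissions/"

query = build_parse_query(SUBMISSIONS_PATH)
print(query)

# On Databricks, where a SparkSession named `spark` already exists:
# parsed_df = spark.sql(query)
```

Keeping the query as a plain string means the same statement can be pasted into a SQL editor or run from PySpark through spark.sql.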
The underwriting copilot uses three agents:
- ai_parse_document to extract text and structure, then ai_extract to pull COPE data (Construction, Occupancy, Protection, Exposure) and risk factors
- ai_gen to suggest policy terms and pricing, flagging anomalies for underwriter attention

Submission review time drops by 70%. Underwriter capacity doubles because they focus on judgment calls, not data extraction. Risk scoring becomes consistent across all underwriters because the same models and features apply to every submission.
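The chain of AI Functions used by the copilot (parse, then extract, then generate) could be sketched as a single SQL statement assembled in Python. This is a sketch under several assumptions, not the copilot's actual implementation: the table name underwriting.parsed_submissions, its path and parsed columns, the label names, and the prompt wording are all hypothetical. Passing the whole parsed output to ai_extract as JSON is a simplification; a real pipeline would more likely pass only the extracted text.

```python
# Hypothetical label names for the COPE fields named in the text.
COPE_LABELS = ["construction", "occupancy", "protection", "exposure"]


def sql_array(labels):
    """Render a list of label strings as a SQL array(...) literal."""
    if any("'" in label for label in labels):
        raise ValueError("labels must not contain single quotes")
    return "array(" + ", ".join(f"'{label}'" for label in labels) + ")"


def build_copilot_query(parsed_table, labels=COPE_LABELS):
    """Chain ai_extract and ai_gen over already-parsed submissions.

    Assumes parsed_table has a `path` column and a `parsed` column that
    holds the output of ai_parse_document.
    """
    return (
        "WITH extracted AS (\n"
        f"  SELECT path, ai_extract(to_json(parsed), {sql_array(labels)}) AS cope\n"
        f"  FROM {parsed_table}\n"
        ")\n"
        "SELECT path, cope,\n"
        "  ai_gen(concat(\n"
        "    'Suggest policy terms and pricing, and flag anomalies, for this risk: ',\n"
        "    to_json(cope))) AS recommendation\n"
        "FROM extracted"
    )


copilot_query = build_copilot_query("underwriting.parsed_submissions")
print(copilot_query)

# On Databricks, where a SparkSession named `spark` already exists:
# recommendations_df = spark.sql(copilot_query)
```

Because every submission goes through the same labels and the same prompt, the output has the same shape for every underwriter, which is the consistency described above.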
The key insight: ai_parse_document eliminates the need to choose a specific external model for multimodal parsing. Document intelligence is built into the platform.