Troubleshooting Document Extraction
Common issues and fixes for document extraction accuracy, ERP matching confidence, and performance.
Last updated: April 8, 2026
Low extraction accuracy
Run the Improve step to invoke the GEPA learning loop, which rewrites the per-field prompts using the ground-truth corrections you have saved. On consistent document layouts, two or three iterations usually push accuracy above 95%.
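To decide whether another Improve iteration is worthwhile, it can help to measure field-level accuracy against your saved corrections. The sketch below is only an illustration: the function name, the list-of-dicts data shape, and the exact-match comparison are assumptions, not part of the product's API.

```python
def field_accuracy(extracted, ground_truth):
    """Return the fraction of ground-truth fields that were extracted exactly.

    Both arguments are lists of dicts (one dict per document, field name ->
    value). This shape is a hypothetical stand-in for an export of your
    extraction results and saved corrections.
    """
    total = 0
    correct = 0
    for predicted, truth in zip(extracted, ground_truth):
        for field, expected in truth.items():
            total += 1
            if predicted.get(field) == expected:
                correct += 1
    # With no fields to compare, report 0.0 rather than dividing by zero.
    return correct / total if total else 0.0


# Illustrative data only.
extracted = [
    {"invoice_no": "A1", "total": "10.00"},
    {"invoice_no": "A2", "total": "12.00"},
]
ground_truth = [
    {"invoice_no": "A1", "total": "10.00"},
    {"invoice_no": "A2", "total": "12.50"},
]
accuracy = field_accuracy(extracted, ground_truth)
needs_another_iteration = accuracy < 0.95
```

Exact string comparison is the strictest possible check. If your fields contain dates or amounts, normalizing their formatting before comparing would give a fairer number.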
Matching confidence too low
Check that your ERP catalog is synced. Low match confidence usually means the canonical SKU in the ERP differs from how the item appears on the document. Add aliases to the matching config to bridge the gap.
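As a rough illustration of how aliases bridge a document spelling and a canonical SKU, here is a minimal sketch. The alias table format, the sample SKUs, and the helper names (normalize, build_alias_index, resolve_sku) are all hypothetical; the actual matching config may be structured differently.

```python
def normalize(text):
    """Lowercase the text and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


# Hypothetical alias table: spelling seen on documents -> canonical ERP SKU.
ALIASES = {
    "WIDGET-10 BLUE": "WDG-0010-BL",
    "Wdg 10 (blue)": "WDG-0010-BL",
}


def build_alias_index(aliases):
    """Index aliases by their normalized form for formatting-insensitive lookup."""
    return {normalize(alias): sku for alias, sku in aliases.items()}


def resolve_sku(document_value, catalog_skus, alias_index):
    """Return the canonical SKU for a document value, or None if nothing matches.

    A direct match against the catalog is tried first, then the alias index.
    """
    key = normalize(document_value)
    for sku in catalog_skus:
        if normalize(sku) == key:
            return sku
    return alias_index.get(key)


catalog = {"WDG-0010-BL"}
index = build_alias_index(ALIASES)
matched = resolve_sku("widget 10 blue", catalog, index)
```

Normalizing both sides before the lookup means one alias covers variations in case, spacing, and punctuation, so the alias list stays short.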
Slow extraction
Extraction time scales with the number of fields and the size of the document. Trim unused fields from the schema, and avoid extracting line items if you do not need them.
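The trimming advice above can be sketched as a small helper. The schema layout used here (a "fields" list of dicts with a "name" key, plus an optional "line_items" entry) is an assumption made for illustration and may not match the real schema format.

```python
import copy


def trim_schema(schema, keep_fields, include_line_items=False):
    """Return a copy of the schema limited to the named fields.

    The input schema is left untouched. Line items are dropped unless
    include_line_items is True.
    """
    trimmed = copy.deepcopy(schema)
    trimmed["fields"] = [
        field for field in trimmed.get("fields", [])
        if field["name"] in keep_fields
    ]
    if not include_line_items:
        trimmed.pop("line_items", None)
    return trimmed


# Hypothetical schema, for illustration only.
schema = {
    "fields": [
        {"name": "invoice_no"},
        {"name": "total"},
        {"name": "po_number"},
        {"name": "notes"},
    ],
    "line_items": {"fields": [{"name": "sku"}, {"name": "quantity"}]},
}
lean_schema = trim_schema(schema, keep_fields={"invoice_no", "total"})
```

Working on a copy keeps the full schema available, so you can restore fields or line items later without rebuilding them.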