Document Extraction API Reference
REST API reference for the dataextractor.io document extraction API. Covers authentication, datasets, extraction, and webhooks.
Last updated: April 8, 2026
Authentication
Every API request must send an API key as a Bearer token in the Authorization header. Generate a key from the Developer page in your account. Keys are scoped to your account and can be revoked at any time.
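As a rough illustration of the header described above, the following Python sketch builds an authenticated request using only the standard library. The base URL and the build_request helper are assumptions made for this example, since this reference does not state the API host. The request is built but not sent, so the sketch runs offline.

```python
import urllib.request

# Assumption: the API host is not stated in this reference; replace with your real base URL.
BASE_URL = "https://dataextractor.io"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Return a GET request that carries the API key as a Bearer token.

    build_request is a hypothetical helper, not part of any official client.
    """
    return urllib.request.Request(
        BASE_URL + path,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


req = build_request("/api/v1/datasets", "YOUR_API_KEY")
# To actually send it: urllib.request.urlopen(req)
```

Keeping the key out of source control (for example, reading it from an environment variable) is a sensible default, since a key can be revoked but not recalled once leaked.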
List datasets
GET /api/v1/datasets: lists all datasets. Filter by customer_id, dataset_type, is_verified, or a search query. Results are paginated.
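A minimal sketch of how the filters might be passed as query parameters. The names customer_id, dataset_type, and is_verified come from this reference; the parameter name search, the "true"/"false" encoding for is_verified, the example value "invoice", and the base URL are all assumptions. Pagination parameters are not documented here, so they are left out.

```python
from urllib.parse import urlencode

# Assumption: the API host is not stated in this reference.
BASE_URL = "https://dataextractor.io"


def list_datasets_url(customer_id=None, dataset_type=None,
                      is_verified=None, search=None) -> str:
    """Build the list-datasets URL, including only the filters that were given."""
    params = {}
    if customer_id is not None:
        params["customer_id"] = customer_id
    if dataset_type is not None:
        params["dataset_type"] = dataset_type
    if is_verified is not None:
        # Assumption: booleans are sent as lowercase strings.
        params["is_verified"] = "true" if is_verified else "false"
    if search is not None:
        # Assumption: the search query parameter is named "search".
        params["search"] = search
    query = urlencode(params)
    return f"{BASE_URL}/api/v1/datasets" + (f"?{query}" if query else "")


# "invoice" is a hypothetical dataset_type value used for illustration.
url = list_datasets_url(dataset_type="invoice", is_verified=True)
```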
Get a dataset
GET /api/v1/datasets/{id}: returns a single dataset with its extracted fields, line items, and ground truth.
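The path for a single dataset can be assembled as in the sketch below. The base URL, the dataset_url helper, and the example ID "ds_123" are assumptions for illustration; the ID format is not specified in this reference, so the ID is percent-encoded defensively.

```python
from urllib.parse import quote

# Assumption: the API host is not stated in this reference.
BASE_URL = "https://dataextractor.io"


def dataset_url(dataset_id) -> str:
    """Build the URL for one dataset, percent-encoding the ID so it stays a single path segment."""
    return f"{BASE_URL}/api/v1/datasets/{quote(str(dataset_id), safe='')}"


# "ds_123" is a hypothetical ID.
single = dataset_url("ds_123")
```

The response field names are not listed in this reference, so parsing the extracted fields, line items, and ground truth is left to the caller.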
Webhooks
Subscribe to extraction.completed and matching.completed events. Configure a webhook URL from the Integrations page. Payloads include the dataset ID and a signed URL to fetch the extracted data.
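A receiving endpoint might handle these payloads roughly as sketched below. The event names come from this reference, but the JSON field names (event, dataset_id, data_url) and the sample values are assumptions, since the payload schema is not shown here. Check a real delivery before relying on them.

```python
import json

# Event names as listed in this reference.
HANDLED_EVENTS = {"extraction.completed", "matching.completed"}


def parse_webhook(body: bytes):
    """Return (event, dataset_id, data_url) for handled events, else None.

    Assumption: the payload is JSON with "event", "dataset_id", and
    "data_url" keys; the real field names may differ.
    """
    payload = json.loads(body)
    event = payload.get("event")
    if event not in HANDLED_EVENTS:
        return None
    return event, payload["dataset_id"], payload["data_url"]


# Hypothetical payload used only to exercise the parser.
sample = json.dumps({
    "event": "extraction.completed",
    "dataset_id": "ds_123",
    "data_url": "https://example.com/signed",
}).encode()
result = parse_webhook(sample)
```

Returning None for unrecognized events lets the endpoint acknowledge them without failing, which tends to be safer if new event types are added later.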