Document Verification API: Developer Integration Guide
Integrate document verification via REST API with OAuth 2.0, webhooks and SDKs. Endpoints, code samples

A document verification API is a programmatic interface that lets developers submit identity documents, invoices, certificates or proof-of-address files and receive structured verification results -- authenticity checks, data extraction, fraud signals -- without building the underlying AI models themselves. CheckFile's REST API processes a single document in 4.2 seconds on average, returns 98.7% OCR accuracy across 24 languages, and handles 3,200+ document types across 32 jurisdictions including all 50 US states.
For US financial institutions, the FFIEC IT Examination Handbook establishes standards for technology risk management that apply to AI-powered document processing systems. Any document verification API integrated into a BSA/AML compliance program must produce auditable decision traces and maintain complete processing records -- requirements the CheckFile API satisfies through its deterministic rule engine layer.
This guide covers authentication, core endpoints, webhook configuration, error handling, SDK options, and pricing. It is written for backend engineers, DevOps teams and technical leads evaluating document verification APIs for production integration in the US market.
This article is for informational purposes only and does not constitute legal, financial, or regulatory advice.
Authentication and Security
The CheckFile API uses OAuth 2.0 client credentials for machine-to-machine authentication, following RFC 6749, Section 4.4. You exchange your client_id and client_secret for a short-lived bearer token (60-minute expiry), then include that token in the Authorization header of every subsequent request.
All API traffic is encrypted with TLS 1.3. Document payloads are encrypted at rest using AES-256, and PII is automatically redacted from logs, satisfying the FTC Safeguards Rule requirements for appropriate technical measures to protect customer information under the Gramm-Leach-Bliley Act. For organizations operating in California, the CCPA (Cal. Civ. Code 1798.100) imposes additional data handling obligations.
# Obtain access token
curl -X POST https://api.checkfile.ai/oauth/token \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials&client_id=YOUR_ID&client_secret=YOUR_SECRET"
# Response
{
  "access_token": "eyJhbGciOi...",
  "token_type": "Bearer",
  "expires_in": 3600
}
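Because the token expires after 60 minutes, a long-running service usually caches it and refreshes shortly before expiry instead of requesting a new one per call. The following is a minimal sketch of that idea, not part of the CheckFile SDK: `TokenCache`, its `fetch_token` callable and the 60-second safety margin are illustrative assumptions. `fetch_token` is expected to perform the POST shown above and return the parsed JSON response.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches a bearer token and refreshes it shortly before it expires.

    Hypothetical helper: `fetch_token` must return a dict shaped like the
    token response above ("access_token", "expires_in").
    """

    def __init__(
        self,
        fetch_token: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
        margin_seconds: float = 60.0,  # assumed safety margin, tune as needed
    ) -> None:
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when no token is cached or the cached one is about to expire
        if self._token is None or now >= self._expires_at - self._margin:
            response = self._fetch_token()
            self._token = response["access_token"]
            self._expires_at = now + float(response["expires_in"])
        return self._token
```

Injecting the clock keeps the refresh logic testable without waiting an hour; in production the default monotonic clock is sufficient.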
Key security features:
- IP allowlisting -- restrict API access to known server IPs
- Rate limiting -- configurable per plan (see pricing section)
- Webhook signatures -- HMAC-SHA256 verification on every callback
- Audit log -- every API call is logged with timestamp, client ID, document type, and result
Scopes and Permissions
| Scope | Permission | Use case |
|---|---|---|
| documents:write | Upload and submit documents | Standard verification flow |
| documents:read | Retrieve results and status | Polling-based integrations |
| webhooks:manage | Create and configure webhooks | Event-driven architectures |
| analytics:read | Access usage metrics | Monitoring dashboards |
| admin:manage | Manage API keys and team access | DevOps and administration |
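RFC 6749, Section 4.4.2 defines an optional, space-delimited `scope` parameter for client credentials requests. Assuming the CheckFile token endpoint honors that parameter (this guide does not confirm it), a least-privilege token request body could be built as sketched below. The function name `token_request_body` is illustrative.

```python
from typing import Sequence
from urllib.parse import urlencode


def token_request_body(
    client_id: str, client_secret: str, scopes: Sequence[str] = ()
) -> str:
    """Build the form-encoded body for a client credentials token request."""
    params = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scopes:
        # Assumption: the endpoint accepts the standard OAuth 2.0 `scope`
        # parameter, with scope names separated by single spaces.
        params["scope"] = " ".join(scopes)
    return urlencode(params)
```

A polling-only worker would then request just `documents:read`, while an upload service would request `documents:write`.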
Core API Endpoints
The API follows RESTful conventions with JSON payloads. Base URL: https://api.checkfile.ai/v1.
Document Submission
POST /v1/documents/verify
Content-Type: multipart/form-data
Authorization: Bearer {token}
# Fields:
# file (required) -- document image or PDF (max 20 MB)
# document_type (optional) -- "passport", "drivers_license", "invoice", "proof_of_address"
# country (optional) -- ISO 3166-1 alpha-2 code
# webhook_url (optional) -- callback URL for async results
# reference_id (optional) -- your internal reference for correlation
Response (HTTP 202 Accepted):
{
  "document_id": "doc_8f3a2b1c",
  "status": "processing",
  "estimated_completion_seconds": 4,
  "created_at": "2026-03-19T10:15:00Z"
}
When document_type is omitted, the API uses its AI classification engine -- which achieves 96.1% classification accuracy on our benchmark of 3,200+ document types -- to detect the type automatically.
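For teams calling the endpoint without an SDK, the submission can be assembled with the Python standard library alone. This is a sketch under stated assumptions: `build_verify_request` is a hypothetical helper, the file part is sent as `application/octet-stream`, and the request is only constructed here, not sent. Field names and the base URL are taken from this guide.

```python
import urllib.request
import uuid
from typing import Mapping, Optional

BASE_URL = "https://api.checkfile.ai/v1"  # base URL stated in this guide


def build_verify_request(
    token: str,
    filename: str,
    file_bytes: bytes,
    fields: Optional[Mapping[str, str]] = None,
) -> urllib.request.Request:
    """Assemble a multipart/form-data POST for /documents/verify."""
    boundary = uuid.uuid4().hex
    chunks = []
    # Optional text fields: document_type, country, webhook_url, reference_id
    for name, value in (fields or {}).items():
        chunks.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # Required file part
    chunks.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    chunks.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{BASE_URL}/documents/verify",
        data=b"".join(chunks),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen`; a 202 response then carries the `document_id` used for polling or webhook correlation.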
Retrieve Results
GET /v1/documents/{document_id}
Authorization: Bearer {token}
Response (HTTP 200):
{
  "document_id": "doc_8f3a2b1c",
  "status": "completed",
  "document_type": "drivers_license",
  "country": "US",
  "verification": {
    "authentic": true,
    "confidence": 0.97,
    "fraud_signals": [],
    "checks": {
      "barcode_valid": true,
      "photo_tamper": false,
      "expiry_valid": true,
      "data_consistency": true
    }
  },
  "extracted_data": {
    "full_name": "Jane Smith",
    "date_of_birth": "1990-05-12",
    "document_number": "D123-4567-8901",
    "expiry_date": "2031-05-11",
    "state": "NY"
  },
  "processing_time_ms": 3840,
  "created_at": "2026-03-19T10:15:00Z",
  "completed_at": "2026-03-19T10:15:03.840Z"
}
Fields in the extracted_data object are extracted with 94.3% accuracy on our internal benchmark, which covers structured fields across all supported document types.
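How an application acts on this result is a policy decision on the integrator's side. The sketch below shows one possible triage over the `verification` object. The function name `triage`, the three outcome labels and the 0.90 confidence threshold are assumptions for illustration, not values defined by the API.

```python
def triage(result: dict, min_confidence: float = 0.90) -> str:
    """Map a result payload to 'pending', 'accept', 'review' or 'reject'.

    Hypothetical policy: the threshold and outcome labels are local choices.
    """
    if result.get("status") != "completed":
        return "pending"
    verification = result["verification"]
    if not verification["authentic"]:
        return "reject"
    # Any fraud signal or a low confidence score goes to a human reviewer
    if verification["fraud_signals"] or verification["confidence"] < min_confidence:
        return "review"
    return "accept"
```

Applied to the sample response above (authentic, confidence 0.97, no fraud signals), this policy returns `accept`.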
Batch Verification
For high-volume integrations, the batch endpoint accepts up to 50 documents per request:
POST /v1/documents/verify/batch
Content-Type: multipart/form-data
Authorization: Bearer {token}
# files[] -- array of document files
# options -- JSON object with shared settings
Batch requests return a batch_id and deliver results via webhook as each document completes.
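Since a batch request accepts at most 50 documents, larger workloads have to be split client-side before submission. A minimal sketch of that split follows; `chunked` is an illustrative helper, and only the limit of 50 comes from this guide.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 50  # per-request limit of the batch endpoint


def chunked(items: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each yielded slice maps to one `POST /v1/documents/verify/batch` call, so 120 files would produce three requests of 50, 50 and 20 documents.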
Webhook Configuration
Event-driven architectures avoid polling overhead. Register a webhook endpoint to receive real-time notifications when verifications complete.
POST /v1/webhooks
Authorization: Bearer {token}
Content-Type: application/json
{
  "url": "https://your-app.com/webhooks/checkfile",
  "events": ["document.completed", "document.failed", "document.review_required"],
  "secret": "whsec_your_secret_key"
}
Every webhook delivery includes an X-CheckFile-Signature header containing an HMAC-SHA256 hash of the payload. Verify it before processing:
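import hashlib
import hmac

def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    # Recompute the HMAC-SHA256 of the raw request body with the shared secret
    expected = hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking information through timing
    return hmac.compare_digest(f"sha256={expected}", signature)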
Webhook retry policy: 3 attempts with exponential backoff (5s, 30s, 300s). After 3 failures, the webhook is disabled and your team receives an email alert.
| Event | Trigger | Payload includes |
|---|---|---|
| document.completed | Verification finished successfully | Full result object |
| document.failed | Processing error (corrupt file, unsupported format) | Error code and message |
| document.review_required | Low-confidence result flagged for human review | Partial result + confidence score |
| batch.completed | All documents in a batch are processed | Summary with per-document statuses |
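A webhook endpoint typically verifies the signature first and then routes the payload by event name. The sketch below combines both steps. It assumes the payload carries the event name in a top-level `"event"` field, which this guide does not specify; `handle_delivery` and the returned status codes are likewise illustrative choices.

```python
import hashlib
import hmac
import json
from typing import Callable, Mapping


def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    """Check the X-CheckFile-Signature value against the raw request body."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(f"sha256={expected}", signature)


def handle_delivery(
    payload: bytes,
    signature: str,
    secret: str,
    handlers: Mapping[str, Callable[[dict], None]],
) -> int:
    """Return the HTTP status code the endpoint should respond with."""
    if not verify_webhook(payload, signature, secret):
        return 401  # reject unsigned or tampered deliveries
    body = json.loads(payload)
    # Assumption: the event name is delivered in a top-level "event" field
    handler = handlers.get(body.get("event"))
    if handler is not None:
        handler(body)
    # Unknown events are acknowledged so they are not retried needlessly
    return 200
```

Keeping the handlers in a mapping makes it easy to add `batch.completed` or `document.review_required` processing without touching the verification step.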
SDK and Integration Options
While the REST API works from any language, official SDKs reduce integration time from days to hours.
Available SDKs
| Language | Package | Install |
|---|---|---|
| Python | checkfile-sdk | pip install checkfile-sdk |
| Node.js | @checkfile/sdk | npm install @checkfile/sdk |
| Java | com.checkfile:sdk | Maven Central |
| Go | github.com/checkfile/sdk-go | go get |
Python Integration Example
from checkfile import CheckFileClient

client = CheckFileClient(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET"
)

# Synchronous verification; the context manager closes the file afterwards
with open("drivers_license.pdf", "rb") as document:
    result = client.documents.verify(
        file=document,
        document_type="drivers_license",
        country="US"
    )

print(f"Authentic: {result.verification.authentic}")
print(f"Name: {result.extracted_data.full_name}")
print(f"Processing time: {result.processing_time_ms}ms")
Node.js Integration Example
import { CheckFileClient } from '@checkfile/sdk';
import { readFileSync } from 'fs';

const client = new CheckFileClient({
  clientId: process.env.CHECKFILE_CLIENT_ID,
  clientSecret: process.env.CHECKFILE_CLIENT_SECRET,
});

const result = await client.documents.verify({
  file: readFileSync('drivers_license.pdf'),
  documentType: 'drivers_license',
  country: 'US',
});

console.log(`Authentic: ${result.verification.authentic}`);
console.log(`Confidence: ${result.verification.confidence}`);
SDKs handle token refresh, retries with exponential backoff, and webhook signature verification automatically. Our analysis shows that SDK-based integrations reduce median time-to-production from 8 days (raw REST) to 2 days.
Error Handling and Rate Limits
The API uses standard HTTP status codes with structured error bodies:
{
  "error": {
    "code": "DOCUMENT_UNREADABLE",
    "message": "The uploaded file could not be parsed. Ensure DPI >= 300.",
    "details": { "min_dpi": 300, "detected_dpi": 72 },
    "request_id": "req_9f2c4d1e"
  }
}
Common Error Codes
| HTTP Status | Error Code | Resolution |
|---|---|---|
| 400 | INVALID_FILE_FORMAT | Use PDF, JPEG, PNG or TIFF |
| 400 | DOCUMENT_UNREADABLE | Increase scan resolution to 300+ DPI |
| 401 | TOKEN_EXPIRED | Refresh your OAuth token |
| 413 | FILE_TOO_LARGE | Reduce file below 20 MB limit |
| 429 | RATE_LIMIT_EXCEEDED | Wait for Retry-After header duration |
| 503 | SERVICE_DEGRADED | Retry with exponential backoff |
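On the client side, the structured error body can be turned into an exception, and the table above can be reduced to three reactions: retry, refresh the token, or fix the request. The following sketch illustrates this. `ApiError`, `parse_error` and `next_action` are hypothetical names, and the grouping of codes is one reading of the resolutions listed in the table.

```python
import json
from typing import Optional

# Codes whose listed resolution is to wait and try again
RETRY_CODES = {"RATE_LIMIT_EXCEEDED", "SERVICE_DEGRADED"}


class ApiError(Exception):
    """Carries the fields of the structured error body."""

    def __init__(
        self, status: int, code: str, message: str, request_id: Optional[str] = None
    ) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.request_id = request_id  # worth logging for support requests


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from an HTTP status and a JSON error body."""
    error = json.loads(body)["error"]
    return ApiError(status, error["code"], error["message"], error.get("request_id"))


def next_action(error: ApiError) -> str:
    """Return 'refresh_token', 'retry' or 'fix_request'."""
    if error.code == "TOKEN_EXPIRED":
        return "refresh_token"
    if error.code in RETRY_CODES:
        return "retry"
    # Remaining codes describe a problem with the uploaded file or request
    return "fix_request"
```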
Rate Limits by Plan
| Plan | Requests/minute | Burst | Concurrent uploads |
|---|---|---|---|
| Starter | 60 | 10 | 5 |
| Business | 500 | 50 | 25 |
| Enterprise | 2,000+ | 200 | 100 |
Rate limit headers (X-RateLimit-Remaining, X-RateLimit-Reset) are included in every response. Build your retry logic around these rather than hardcoding delays.
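One way to avoid hardcoded delays is to derive the wait time from the response itself. The sketch below honors `Retry-After` on a 429 and falls back to capped exponential backoff for 429 and 503. It is an illustration under assumptions: `Retry-After` is treated as a number of seconds, the base and cap values are arbitrary, and `X-RateLimit-Reset` is left out because this guide does not state its format.

```python
from typing import Mapping, Optional


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,   # assumed first backoff step, in seconds
    cap: float = 60.0,   # assumed upper bound for a single wait
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry is pointless.

    `attempt` counts retries already made, starting at 0. Header lookup is
    case-sensitive here; normalize header names first if your HTTP client
    does not.
    """
    if status == 429:
        retry_after = headers.get("Retry-After")
        # Assumption: the header holds delta-seconds, not an HTTP date
        if retry_after is not None and retry_after.isdigit():
            return float(retry_after)
    if status in (429, 503):
        return min(cap, base * (2 ** attempt))
    return None
```

In production, adding random jitter to the backoff value helps prevent many clients from retrying at the same instant.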
Compliance and Data Handling
Document verification touches PII across multiple jurisdictions. The API is designed with compliance as a first-class concern for US organizations.
The Bank Secrecy Act (BSA) and the Anti-Money Laundering Act of 2020 (AMLA) require financial institutions to maintain robust customer identification programs (CIP) and customer due diligence (CDD) procedures. Document verification APIs integrated into BSA/AML compliance programs must produce auditable records that can withstand regulatory examination by FinCEN and federal banking regulators.
Data handling guarantees:
- Retention: Documents are deleted after processing unless you explicitly request storage (configurable from 0 to 365 days)
- Residency: US-based processing available; EU and APAC regions available on Enterprise plans
- Audit trail: Every API call generates an immutable audit record with document hash, timestamp, result, and client ID
- SOC 2 Type II certification covers the API infrastructure
- PCI DSS compliant document handling for financial documents
For integrations subject to OCC guidance on third-party risk management (OCC Bulletin 2023-17), CheckFile provides the required third-party assurance documentation, business continuity testing results, and exit strategy terms. The NIST Cybersecurity Framework (CSF 2.0) provides additional guidance on vendor risk management.
Pricing Structure
CheckFile uses a per-document pricing model with volume discounts. All plans include full API access, webhooks, and audit logs.
| Plan | Monthly price | Included verifications | Extra verification | Support |
|---|---|---|---|---|
| Starter | Free | 100 | -- | Community |
| Business | From $299/mo | 2,000 | $0.12 | Priority email (< 4h) |
| Enterprise | Custom | Custom volume | Negotiated | Dedicated CSM + SLA |
See the full pricing page for details on volume tiers, annual billing discounts, and Enterprise SLA terms.
Our platform analysis shows that organizations switching from manual document checks to API-based verification reduce cost per dossier by 67% and processing time by 83%. The average payback period for Business plan customers is under 3 months when processing 500+ documents per month.
| Manual process | API-automated | Saving |
|---|---|---|
| 12 min/document | 4.2 seconds | 99.4% time reduction |
| $5.20/document (labor) | $0.12-0.15/document | 67-97% cost reduction |
| 89% accuracy (human error) | 98.7% OCR accuracy | Fewer re-checks |
| Business hours only | 99.94% uptime, 24/7 | No scheduling constraints |
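The figures in the table can be reproduced from the plan parameters. The sketch below is a worked calculation, assuming the Business plan's entry price of $299 with 2,000 included verifications and $0.12 per extra verification. Since the plan is listed as "from $299/mo", actual invoices may differ.

```python
BASE_PRICE = 299.00        # Business plan entry price, USD per month
INCLUDED = 2_000           # verifications included in the plan
OVERAGE = 0.12             # USD per extra verification
MANUAL_SECONDS = 12 * 60   # 12 minutes per document, manual process
API_SECONDS = 4.2          # average API processing time
MANUAL_COST = 5.20         # USD labor cost per document


def business_monthly_cost(documents: int) -> float:
    """Monthly Business plan cost for a given document volume."""
    extra = max(0, documents - INCLUDED)
    return BASE_PRICE + extra * OVERAGE


def reduction(before: float, after: float) -> float:
    """Relative reduction, e.g. 0.25 for a 25% saving."""
    return (before - after) / before


# Time saving: (720 - 4.2) / 720, roughly 99.4%
time_saving = reduction(MANUAL_SECONDS, API_SECONDS)

# Effective price per document at exactly 2,000 documents: 299 / 2000 = $0.1495
effective_price = business_monthly_cost(INCLUDED) / INCLUDED

# Fee-only saving against manual labor cost at that volume, roughly 97%
fee_saving = reduction(MANUAL_COST, effective_price)
```

The effective price approaches $0.12 as volume grows beyond the included 2,000 verifications, which explains the $0.12-0.15 range. The fee-only comparison yields the upper end of the cost range; the 67% figure quoted above is a measured cost per dossier and cannot be derived from list prices alone.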
Integration Architecture Patterns
Pattern 1: Synchronous (Simple)
For low-volume integrations (< 60 req/min), submit and poll:
Client -> POST /v1/documents/verify -> 202 Accepted
Client -> GET /v1/documents/{id} (poll every 2s) -> 200 with results
Suitable for onboarding flows where the user waits for verification, such as CIP verification at account opening.
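The submit-and-poll pattern can be sketched as a loop with a fixed interval and an overall timeout. In the example below, `poll_until_done` and its `fetch` callable (which should perform `GET /v1/documents/{id}` and return the parsed JSON) are illustrative. The statuses `processing`, `completed` and `review_required` appear in this guide, while `failed` as a status value is an assumption inferred from the `document.failed` event.

```python
import time
from typing import Callable

# "failed" is assumed from the document.failed event name
TERMINAL_STATUSES = {"completed", "failed", "review_required"}


def poll_until_done(
    fetch: Callable[[str], dict],
    document_id: str,
    interval: float = 2.0,    # poll every 2 seconds, as in the pattern above
    timeout: float = 60.0,    # assumed overall limit
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll a document until it reaches a terminal status or the timeout hits."""
    deadline = clock() + timeout
    while True:
        result = fetch(document_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        # Stop before sleeping past the deadline
        if clock() + interval > deadline:
            raise TimeoutError(f"{document_id} still processing after {timeout}s")
        sleep(interval)
```

With an average processing time of 4.2 seconds, most documents would resolve on the second or third poll; the timeout guards against documents that stay in `processing` far longer than expected.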
Pattern 2: Async with Webhooks (Recommended)
For production workloads, submit and receive results via webhook:
Client -> POST /v1/documents/verify (with webhook_url) -> 202 Accepted
CheckFile -> POST webhook_url (signed payload) -> Your handler processes result
Decouples submission from processing. Scales linearly with volume.
Pattern 3: Batch Pipeline
For back-office processing (nightly BSA reviews, bulk compliance checks, periodic CDD refresh):
Client -> POST /v1/documents/verify/batch (up to 50 files) -> batch_id
CheckFile -> POST webhook_url per document as each completes
CheckFile -> POST webhook_url with batch.completed summary
Our platform processes over 180,000 documents per month using these patterns. The async webhook pattern handles 94% of production integrations.
Getting Started
Integration follows four steps:
- Create an account at checkfile.ai and generate API credentials from the dashboard
- Test in sandbox -- the sandbox environment mirrors production with synthetic documents (no billing)
- Integrate using the SDK or direct REST calls, starting with the synchronous pattern
- Go live -- switch to production credentials and configure webhooks
The API documentation includes an interactive playground for testing endpoints, and the pricing page details plan options for your expected volume.
For teams building automated document verification workflows, the API integrates directly with the patterns described in our workflow setup guide. If you are evaluating verification solutions more broadly, our automation verification guide covers the full landscape of document verification approaches.
Frequently Asked Questions
What document types does the API support?
The CheckFile API supports 3,200+ document types across 32 jurisdictions, including US passports, US driver's licenses (all 50 states), Social Security cards, W-2 forms, 1099 forms, invoices, bank statements, proof of address, tax notices, payslips, and corporate registration documents such as Certificates of Good Standing and Articles of Incorporation. The AI classification engine identifies document types automatically with 96.1% accuracy when the document_type parameter is omitted.
How long does verification take?
Average processing time is 4.2 seconds per document. P95 latency is under 12 seconds for standard document types. Batch submissions process documents in parallel, so a 50-document batch typically completes within 30-60 seconds depending on document complexity.
Is the API compliant with US data privacy regulations?
Yes. CheckFile maintains SOC 2 Type II and ISO 27001 certifications covering the API infrastructure. US-based processing is available for data residency requirements. The platform satisfies the FTC Safeguards Rule under GLBA for encryption and access controls, and supports CCPA data subject rights including deletion requests. For financial institutions, the API produces the auditable processing records required for BSA/AML examinations by FinCEN and federal banking regulators.
Can I test the API before committing to a paid plan?
The Starter plan includes 100 free verifications per month, and the sandbox environment allows unlimited testing with synthetic documents at no cost. No credit card is required to start.
What happens if the API cannot verify a document?
Documents that fall below the confidence threshold are flagged with review_required status and routed to your human review queue via webhook. The response includes the partial result with the confidence score, extracted data, and specific fraud signals that triggered the flag. This way, every low-confidence document reaches a reviewer instead of being dropped silently.