Document AI
Document AI uses Anthropic Claude to extract structured DPP field data from your uploaded compliance documents. Instead of manually transcribing values from a test report or datasheet into DPP fields, you can let Document AI read the document and propose the extracted values — which you then review and selectively accept or reject.
Document AI is a productivity tool, not an autonomous data entry system. Every extracted field requires your explicit acceptance before it is saved to a DPP. The platform is designed this way because AI extraction, while highly accurate on well-structured documents, can misread units, confuse similar model numbers, or misinterpret table layouts. Your review is the quality gate.
AI Consent Requirement
Before using Document AI for the first time, you must provide consent to AI processing. This is a one-time step per user account.
When you first click Process with AI on any document, a consent modal appears:
"Traceable uses Anthropic Claude to analyse the content of documents you submit for AI processing. Document content is transmitted securely to Anthropic's API for extraction. Anthropic's data processing terms apply. Do you consent to AI processing of your documents?"
You must click I Consent to proceed. If you click Decline, Document AI features are not activated for your account. You can activate them later from Settings > Account Preferences > Document AI.
Consent applies only to documents you explicitly submit for AI processing. Documents you upload but do not process with AI are never sent to any AI service.
Single Document Processing
Single document processing suits documents whose extraction results you want to review field by field before accepting them into a DPP. Typical examples are Declarations of Conformity and test reports for new product categories.
How to Process a Single Document
- Navigate to Compliance > Documents.
- Open the document you want to process, or upload it if it has not been uploaded yet.
- Click Process with AI.
- In the processing dialog, select the target product — the product whose DPP fields you want to populate with data extracted from this document.
- Optionally, select the DPP section you expect the document to relate to (e.g., "Technical Specifications", "Carbon Footprint"). This helps the AI focus on relevant fields. If you are not sure, leave this set to "Auto-detect".
- Click Run Extraction. Processing typically takes 10–30 seconds for a standard PDF.
Reviewing Extracted Fields
After processing, the AI extraction results panel appears. It shows:
- A list of DPP fields the AI identified values for.
- For each field: the proposed value, the confidence level (High / Medium / Low), and the source excerpt — the exact text in the document that the AI used to derive the value.
For each extracted field, you can:
- Accept — click the green tick to accept the value. The field is immediately updated in the target product's DPP.
- Reject — click the red cross to dismiss the extracted value. The existing DPP field value (if any) is unchanged.
- Edit then Accept — click the pencil icon to modify the extracted value before accepting. Use this when the AI extracted the right information but the format or unit differs from what the DPP field expects.
Review all extracted fields, even those with High confidence. Pay particular attention to:
- Numeric values with units — the AI may extract the correct number but apply the wrong unit (e.g., extracting Wh when the field expects kWh).
- Model numbers — test reports often cover multiple models; confirm the extracted model number applies to the specific product you selected.
- Dates — the AI may confuse document issue date, test date, and certificate expiry date.
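
The unit check described above comes down to simple arithmetic. The following is a minimal sketch, assuming the extracted value and the field's expected unit are both known; the `convert_energy` helper and the `TO_WH` table are hypothetical names used for illustration and are not part of Traceable.

```python
from decimal import Decimal

# Conversion factors to watt-hours. Illustrative only, not an exhaustive unit list.
TO_WH = {
    "Wh": Decimal("1"),
    "kWh": Decimal("1000"),
    "MWh": Decimal("1000000"),
}


def convert_energy(value: str, from_unit: str, to_unit: str) -> Decimal:
    """Convert an energy value between units by going through watt-hours."""
    return Decimal(value) * TO_WH[from_unit] / TO_WH[to_unit]


# Hypothetical case: the document states 5120 Wh, but the DPP field expects kWh.
value_for_field = convert_energy("5120", "Wh", "kWh")
print(value_for_field)
```

If the converted figure differs from the proposed value, use Edit then Accept to enter the value in the unit the field expects.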
After reviewing all fields, click Done. Any fields you Accepted are saved. Any fields you Rejected or left unreviewed are not saved.
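The outcome of a review can be pictured as a filter over the extracted fields. This is only a sketch of the rule stated above, not Traceable's implementation: the `ExtractedField` class, the `Decision` enum, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Decision(Enum):
    UNREVIEWED = "unreviewed"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class ExtractedField:
    name: str
    proposed_value: str
    confidence: str                     # "High", "Medium" or "Low"
    source_excerpt: str
    decision: Decision = Decision.UNREVIEWED
    edited_value: Optional[str] = None  # set when using Edit then Accept


def values_to_save(fields: List[ExtractedField]) -> Dict[str, str]:
    """Keep accepted fields only, preferring the edited value when one exists."""
    saved = {}
    for field in fields:
        if field.decision is Decision.ACCEPTED:
            if field.edited_value is not None:
                saved[field.name] = field.edited_value
            else:
                saved[field.name] = field.proposed_value
    return saved


# Made-up review session covering all four outcomes.
fields = [
    ExtractedField("rated_capacity", "100 Ah", "High", "Rated capacity: 100 Ah",
                   Decision.ACCEPTED),
    ExtractedField("rated_energy", "5120 Wh", "High", "Rated energy: 5120 Wh",
                   Decision.ACCEPTED, edited_value="5.12 kWh"),
    ExtractedField("issue_date", "2024-03-01", "Medium", "Test date: 2024-03-01",
                   Decision.REJECTED),
    ExtractedField("model_number", "BX-200", "Low", "Models: BX-100, BX-200"),
]
print(values_to_save(fields))
```

Rejected and unreviewed fields drop out, which is why a field you skip never reaches the DPP.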
Bulk Processing
Bulk processing is designed for operators who have a large library of existing documents to process — for example, when migrating to Traceable with an existing document archive.
How to Run Bulk Processing
- Go to Compliance > Documents.
- Select multiple documents using the checkboxes.
- Click Process Selected with AI in the bulk action toolbar.
- In the bulk processing dialog:
  - Select the target product. All selected documents are processed against the same product unless you use the per-document override option.
  - Optionally, select a DPP section for the extraction to focus on.
- Click Start Batch. Batch processing runs in the background. You can navigate away and return to review results when processing completes.
Reviewing Batch Results
When the batch is complete, a notification appears in the platform header. Click the notification to go to the Batch Extraction Review page.
The review page shows all documents processed in the batch, grouped by extraction status:
- Ready to review — extraction completed successfully.
- Extraction failed — the document could not be processed (see failure reasons below).
- No fields extracted — the document was processed but no DPP-relevant data was found.
Click any document in the "Ready to review" group to open its extraction results. Review and accept/reject fields as described in the single document workflow above.
You must review and confirm each document individually — there is no "accept all" button for batch results. This is intentional, to ensure each extracted value is explicitly reviewed before it enters a DPP.
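The three status groups follow from two questions: did extraction raise an error, and did it return any fields? Below is a minimal sketch of that grouping, assuming each batch result is a dictionary with hypothetical `document`, `fields`, and `error` keys. The platform's actual data model is not documented here.

```python
from typing import Dict, List

STATUSES = ("Ready to review", "Extraction failed", "No fields extracted")


def classify(result: dict) -> str:
    """Map one batch result to the status group it would appear under."""
    if result.get("error"):
        return "Extraction failed"
    if not result.get("fields"):
        return "No fields extracted"
    return "Ready to review"


def group_by_status(results: List[dict]) -> Dict[str, List[str]]:
    """Group document names by status, keeping empty groups visible."""
    groups: Dict[str, List[str]] = {status: [] for status in STATUSES}
    for result in results:
        groups[classify(result)].append(result["document"])
    return groups


# Made-up batch of three documents.
results = [
    {"document": "datasheet.pdf", "fields": {"nominal_voltage": "51.2 V"}},
    {"document": "scan.pdf", "error": "No extractable text found"},
    {"document": "cover_letter.pdf", "fields": {}},
]
groups = group_by_status(results)
print(groups)
```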
Document Types That Work Well
Document AI performs best on documents that have a clear, structured layout with typed text. Ideal document types include:
- Technical datasheets from battery manufacturers — typically well-structured with clearly labelled specification tables.
- IEC and UN test reports from accredited laboratories — standardised formats with named test parameters and tabulated results.
- Specification sheets for cells and packs — usually include capacity, voltage, energy, mass, dimensions, and chemistry information in a consistent layout.
- Carbon footprint declarations following ISO 14067 or EU regulatory templates.
- CE certificates — typically short documents with clear identification fields.
Document Types That Do Not Work Well
Document AI will attempt to process any PDF or image, but results are unreliable for:
- Handwritten forms — AI optical character recognition (OCR) performs poorly on handwriting, especially technical values and model numbers.
- Scanned documents with low OCR quality — documents scanned at low resolution (below 150 DPI) or with significant skew, staining, or fading produce inaccurate text recognition.
- Multi-language documents without clear section delineation, where the AI may mix field values from different language versions of the same document.
- Spreadsheets exported as images, which lose the structural metadata that makes tabular data interpretable.
- Certificates with decorative backgrounds or security watermarks — visual noise can interfere with text recognition.
If a document falls into one of these categories, enter the values manually in the DPP editor.
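To judge whether a scan falls below the 150 DPI mark mentioned above, you can estimate its resolution from the image's pixel width and the physical page width. This is a rough sketch under stated assumptions: the function names are hypothetical, the page size must be known (A4 is 210 mm wide), and the result says nothing about skew, staining, or fading.

```python
MM_PER_INCH = 25.4


def effective_dpi(pixel_width: int, page_width_mm: float) -> float:
    """Estimate scan resolution from pixel width and physical page width."""
    return pixel_width / (page_width_mm / MM_PER_INCH)


def below_scan_threshold(pixel_width: int, page_width_mm: float,
                         min_dpi: float = 150.0) -> bool:
    """Return True when the estimated resolution is under the threshold."""
    return effective_dpi(pixel_width, page_width_mm) < min_dpi


# An A4 page scanned 827 pixels wide is roughly 100 DPI, so it is flagged.
print(below_scan_threshold(827, 210.0))
# The same page scanned 2480 pixels wide is roughly 300 DPI, so it is not.
print(below_scan_threshold(2480, 210.0))
```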
Always Review Before Accepting
This point is worth repeating explicitly:
Document AI output should always be reviewed before accepting, regardless of the confidence level shown.
High confidence means the AI identified a clear, unambiguous source for the value. It does not mean the value is correct in the context of your DPP — the document may contain values for a different product variant, an older product revision, or a different market specification. You are the subject matter expert; the AI is a reading assistant.
A review workflow that takes two minutes per document is faster than correcting a published DPP after a regulator or verifier identifies a data discrepancy.
Extraction Failures
If a document fails extraction, the batch review page or the single document processing dialog will show an error reason:
| Error | Cause | Resolution |
|---|---|---|
| File could not be parsed | The PDF is encrypted, corrupt, or uses an unsupported encoding | Re-export the PDF from the source application without encryption |
| No extractable text found | The document is a pure image scan with no machine-readable text layer | Use OCR software to add a text layer before uploading, or enter data manually |
| Document too long (exceeds 100 pages) | The document exceeds the processing limit | Split the document into shorter sections |
| AI processing timed out | Network or service timeout during extraction | Retry by clicking Reprocess; most timeout failures succeed on retry |
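
For the "Document too long" case, the page ranges to split at can be worked out before you open a PDF tool. A minimal sketch, assuming the 100-page limit from the table; `split_page_ranges` is a hypothetical helper that only computes the ranges and does not split the file itself.

```python
from typing import List, Tuple


def split_page_ranges(total_pages: int, max_pages: int = 100) -> List[Tuple[int, int]]:
    """Return inclusive 1-based (start, end) ranges of at most max_pages each."""
    if total_pages < 1 or max_pages < 1:
        raise ValueError("total_pages and max_pages must be positive")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


# A 250-page test report becomes three sections.
print(split_page_ranges(250))  # [(1, 100), (101, 200), (201, 250)]
```

Where you can, split at chapter or section boundaries instead of at the exact limit, so that a table is not cut across two files.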