Document AI — hardening, bulk processing, and routing improvements
A major Document AI update covering bulk batch processing, ten functional bug fixes, fifteen security hardening items, and improved document type routing — all validated by an expanded integration and unit test suite against real compliance document formats.
Added
Bulk document processing
Documents processed in bulk are now mirrored into the full document management system as first-class records. Each document processed via a bulk AI extraction job creates a proper document record and an associated verification task, with the AI extraction noted in the audit trail. The operation is idempotent — re-running a bulk job on previously processed documents does not create duplicates.
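The idempotency described above can be sketched as a keyed lookup before insert. This is a minimal illustration, not the shipped implementation: the type names, the in-memory stores, and the choice of job ID plus source file ID as the idempotency key are all assumptions.

```typescript
// Hypothetical shapes; a real system would use database tables with a
// unique constraint on the idempotency key instead of in-memory stores.
interface ExtractedDocument {
  bulkJobId: string;
  sourceFileId: string;
  documentType: string;
}

interface DocumentRecord {
  id: string;
  key: string;
  documentType: string;
  auditTrail: string[];
}

interface VerificationTask {
  documentId: string;
  status: "pending";
}

const documentsByKey = new Map<string, DocumentRecord>();
const verificationTasks: VerificationTask[] = [];

// Mirror one bulk-extracted document into the document store.
// If the same key was mirrored before, return the existing record
// and create neither a second record nor a second verification task.
function mirrorDocument(doc: ExtractedDocument): DocumentRecord {
  const key = `${doc.bulkJobId}:${doc.sourceFileId}`; // assumed idempotency key
  const existing = documentsByKey.get(key);
  if (existing) return existing;

  const record: DocumentRecord = {
    id: `doc-${documentsByKey.size + 1}`,
    key,
    documentType: doc.documentType,
    auditTrail: ["Created from bulk AI extraction"],
  };
  documentsByKey.set(key, record);
  verificationTasks.push({ documentId: record.id, status: "pending" });
  return record;
}
```

Calling `mirrorDocument` twice with the same input returns the same record and leaves exactly one verification task in place.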
Real-document integration tests
The integration test suite now covers the five compliance document formats most commonly uploaded by operators: UN 38.3 transport test reports, CE Declarations of Conformity, PEF/LCA lifecycle assessment studies, CMRT conflict minerals reports, and Safety Data Sheets. These tests run against the live Anthropic Claude API (not mocked) and are excluded from the default test run so that CI makes no live API calls.
New platform routes
- Authentication flow improvements including dedicated sign-out handling and login redirects
- Supplier data request management view in the Company Portal
- Dynamic robots.txt: staging disallows all crawlers; production allows full indexing with sitemap
- Brand logo assets available in SVG format (standard and white variants)
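The environment-dependent robots.txt behaviour could look roughly like the sketch below. The function name, the `DeployEnv` type, and the `sitemapUrl` parameter are placeholders for illustration; the actual route handler is not shown in these notes.

```typescript
type DeployEnv = "staging" | "production";

// Build the robots.txt body for a given environment.
// `sitemapUrl` is a placeholder parameter, not a URL taken from the release notes.
function buildRobotsTxt(env: DeployEnv, sitemapUrl: string): string {
  if (env === "production") {
    // Production: allow every crawler everywhere and advertise the sitemap.
    return ["User-agent: *", "Allow: /", `Sitemap: ${sitemapUrl}`].join("\n") + "\n";
  }
  // Staging: block every crawler from every path.
  return ["User-agent: *", "Disallow: /"].join("\n") + "\n";
}
```

Keeping the staging branch as the fallthrough means any environment that is not explicitly production stays closed to crawlers.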
Fixed
Document AI — ten functional bug fixes
- Collision detection — documents of the same type are no longer overwritten when multiple are uploaded in the same session
- Two-pass auto-detection — document type detection now runs a second pass when the first pass confidence is below threshold
- Suffix-match confidence — partial filename matches are now scored proportionally, reducing false-positive type assignments
- Low-confidence pre-selection guard — the AI no longer pre-selects a document type when confidence is below a safe threshold, prompting the operator to confirm
- Field deduplication — duplicate field extractions from multi-page documents are now collapsed to a single canonical value
- Completeness recalculation — DPP completeness scores are recalculated immediately after AI extraction rather than on next page load
- Country parsing — null safety added to country field parsing; malformed values no longer cause extraction to abort
- Category fallback — the document-to-category mapping now falls back gracefully when the detected category has no direct match
- Confidence boundary handling — word-overlap confidence scores no longer produce values outside the 0–1 range
- Field pre-selection — the UI no longer pre-selects extracted fields that fall below the minimum confidence threshold
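Three of the fixes above (confidence boundary handling, the low-confidence pre-selection guard, and field deduplication) can be illustrated with a short sketch. The threshold value, the type names, and the rule of keeping the highest-confidence value as the canonical one are assumptions made for the example, not details confirmed by these notes.

```typescript
// Hypothetical threshold; the release notes do not state the real value.
const MIN_PRESELECT_CONFIDENCE = 0.7;

// Clamp any score into the closed range [0, 1]; NaN is treated as 0.
function clampConfidence(score: number): number {
  if (Number.isNaN(score)) return 0;
  return Math.min(1, Math.max(0, score));
}

// Word-overlap score: share of expected words that appear in the candidate text.
function wordOverlapConfidence(expected: string[], candidate: string): number {
  if (expected.length === 0) return 0;
  const words = new Set(candidate.toLowerCase().split(/\W+/).filter(Boolean));
  const hits = expected.filter((w) => words.has(w.toLowerCase())).length;
  return clampConfidence(hits / expected.length);
}

interface Detection {
  type: string;
  confidence: number;
}

// Return the type to pre-select, or null so the operator has to confirm manually.
function preselectedType(d: Detection): string | null {
  return clampConfidence(d.confidence) >= MIN_PRESELECT_CONFIDENCE ? d.type : null;
}

interface ExtractedField {
  name: string;
  value: string;
  confidence: number;
}

// Collapse repeated extractions of the same field (for example, one per page)
// to a single entry, keeping the highest-confidence value (assumed rule).
function dedupeFields(fields: ExtractedField[]): ExtractedField[] {
  const best = new Map<string, ExtractedField>();
  for (const f of fields) {
    const current = best.get(f.name);
    if (!current || f.confidence > current.confidence) best.set(f.name, f);
  }
  return Array.from(best.values());
}
```

Clamping inside `preselectedType` as well as in the scorer means an out-of-range score from any other source still cannot bypass the guard.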
Document type routing — significantly expanded
Document type detection now uses 30+ word-bounded pattern matchers covering all major compliance document formats in English, French, and German. This replaces the previous substring-based detection, which was prone to false positives.
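To show why word-bounded matching reduces false positives compared with substring checks, here is a small sketch. The three matchers and the type identifiers are illustrative only; the actual 30+ patterns are not listed in these notes.

```typescript
interface TypeMatcher {
  type: string;
  pattern: RegExp;
}

// Illustrative matchers; the names and patterns are assumptions.
// Note: \b in JavaScript is ASCII-based, so a pattern that starts or ends with
// an accented letter would need a different boundary check.
const matchers: TypeMatcher[] = [
  { type: "safety_data_sheet", pattern: /\b(sds|safety data sheet|sicherheitsdatenblatt)\b/i },
  { type: "declaration_of_conformity", pattern: /\b(declaration of conformity|konformitätserklärung)\b/i },
  { type: "cmrt", pattern: /\bcmrt\b/i },
];

function detectType(filename: string): string | null {
  // Underscores count as word characters for \b, so turn separators into spaces first.
  const text = filename.replace(/[_\-.]+/g, " ");
  for (const m of matchers) {
    if (m.pattern.test(text)) return m.type;
  }
  return null;
}
```

With these matchers, `SDS_acme_2024.pdf` is detected as a safety data sheet, while `misdsent_invoice.pdf` is not, even though it contains the substring "sds".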
Security
Fifteen hardening items were applied to Document AI server actions, including:
- Authentication and active-company checks on every server action — unauthenticated calls are rejected before any processing begins
- 10 MB payload cap on base64-encoded document submissions
- Rate limiter now fails closed — if the rate limiting layer is unavailable, the request is rejected rather than passed through
- Template field allowlist — AI-extracted values can only populate a defined set of expected DPP fields, blocking mass-assignment
- Explicit error capture sent to the error monitoring service on every catch path
- Path traversal check on all file references
- Batch state machine gating — bulk jobs can only transition through valid states in sequence
- Memory-aware chunked download helper for large documents
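Three of the items above (the payload cap, the template field allowlist, and batch state gating) are sketched below. The field names, batch states, and the reading of "10 MB" as 10 MiB of decoded data are assumptions for illustration, not details taken from the codebase.

```typescript
// Assumes the cap applies to decoded bytes and that "10 MB" means 10 MiB.
const MAX_PAYLOAD_BYTES = 10 * 1024 * 1024;

// Decoded size of a padded base64 string, computed without decoding it.
function decodedSize(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return Math.floor((base64.length * 3) / 4) - padding;
}

function withinPayloadCap(base64: string): boolean {
  return decodedSize(base64) <= MAX_PAYLOAD_BYTES;
}

// Hypothetical field names; the real DPP template defines the actual set.
const ALLOWED_FIELDS = new Set(["manufacturer", "model", "countryOfOrigin"]);

// Copy only allowlisted keys, so extracted output cannot set arbitrary properties.
function pickAllowedFields(extracted: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(extracted)) {
    if (ALLOWED_FIELDS.has(name)) safe[name] = value;
  }
  return safe;
}

type BatchState = "queued" | "processing" | "completed" | "failed";

// Hypothetical states and transitions; the release notes do not enumerate them.
const NEXT_STATES: Record<BatchState, BatchState[]> = {
  queued: ["processing"],
  processing: ["completed", "failed"],
  completed: [],
  failed: [],
};

// A bulk job may only move to a state listed as a successor of its current state.
function canTransition(from: BatchState, to: BatchState): boolean {
  return NEXT_STATES[from].includes(to);
}
```

Checking the size from the string length avoids allocating a buffer for a payload that is about to be rejected.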
Cookie consent
Cookie consent state is now deferred until the component is mounted on the client, resolving a server-client hydration mismatch that caused the consent banner to flicker on navigation between pages.