Why We Open Source
Our philosophy on open source and the story behind open-xtract.
We believe the best infrastructure software should be open. That's why we built open-xtract and released it for anyone to use.
The Problem
Unstructured documents are everywhere in the enterprise. PDFs, invoices, contracts, and reports all contain critical data locked inside formats that resist extraction.
Existing solutions fall into two camps:
- Template-based extractors that break the moment a document layout changes
- Black-box APIs that offer no transparency into how decisions are made
Neither is acceptable when you need reliable, auditable data extraction at scale.
Our Approach
open-xtract takes a different path. It uses structured extraction powered by language models, giving you:
- Schema-driven extraction — define what you want, get exactly that
- Transparent processing — every extraction decision can be traced and verified
- Local execution — your documents never leave your infrastructure
Add it to a project with uv:
uv add open-xtract
Getting started is intentionally simple. Define a schema, point it at a document, and get structured data back.
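To make the schema-driven idea concrete, here is a minimal sketch in plain Python. It is not open-xtract's API: `Invoice`, `coerce_to_schema`, and `raw_output` are hypothetical names invented for illustration, and the dictionary stands in for whatever loosely typed output a language model might return. It only shows the principle of declaring the fields you want and receiving exactly those fields, typed, with everything else dropped.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class Invoice:
    """Hypothetical schema: the only fields we want back from a document."""
    invoice_number: str
    issued_on: date
    total: float


# Maps each declared field type to a function that parses a raw value.
CONVERTERS = {
    str: str,
    float: float,
    date: date.fromisoformat,
}


def coerce_to_schema(raw: dict, schema=Invoice):
    """Keep only the schema's fields and convert each to its declared type.

    `raw` stands in for loosely typed model output (an assumption of this
    sketch, not something open-xtract is known to produce in this shape).
    """
    values = {}
    for field in fields(schema):
        if field.name not in raw:
            raise ValueError(f"missing field: {field.name}")
        values[field.name] = CONVERTERS[field.type](raw[field.name])
    return schema(**values)


# Illustrative input: every value is a string and one key is not in the schema.
raw_output = {
    "invoice_number": "INV-0042",
    "issued_on": "2024-03-01",
    "total": "1250.50",
    "notes": "not part of the schema, so it is dropped",
}

invoice = coerce_to_schema(raw_output)
print(invoice)
```

Failing loudly on a missing field, rather than filling in a default, is a deliberate choice in this sketch: it keeps the result auditable, since every value in the output can be traced to something present in the input.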
Open Source as a Foundation
For us, open source isn't charity — it's strategy. When developers can inspect, modify, and extend the tools they depend on, everyone builds better software.
We use open-xtract ourselves in client projects. What we learn from that real-world usage feeds straight back into open development, and the tool improves for every user.