Why We Open Source

Our philosophy on open source and the story behind open-xtract.

We believe the best infrastructure software should be open. That's why we built open-xtract and released it for anyone to use.

The Problem

Unstructured documents are everywhere in the enterprise. PDFs, invoices, contracts, and reports all contain critical data locked inside formats that resist extraction.

Existing solutions fall into two camps:

  • Template-based extractors that break the moment a document layout changes
  • Black-box APIs that offer no transparency into how decisions are made

Neither is acceptable when you need reliable, auditable data extraction at scale.

Our Approach

open-xtract takes a different path. It uses structured extraction powered by language models, giving you:

  • Schema-driven extraction — define what you want, get exactly that
  • Transparent processing — every extraction decision can be traced and verified
  • Local execution — your documents never leave your infrastructure

Install it with uv:

  uv add open-xtract

Getting started is intentionally simple. Define a schema, point it at a document, and get structured data back.
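
To make the "define a schema, get exactly that" idea concrete, here is a minimal sketch of the pattern in plain Python. It deliberately does not use open-xtract's actual API: the `Invoice` schema, the `extract` helper, and the `fake_model` stand-in are all hypothetical names invented for illustration, and a real setup would call a language model where `fake_model` parses "key: value" lines.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, get_type_hints


@dataclass
class Invoice:
    """Hypothetical schema: the fields we want back from a document."""
    invoice_number: str
    vendor: str
    total: float


def extract(
    schema: type,
    document_text: str,
    model: Callable[[str, list[str]], dict[str, Any]],
) -> Any:
    """Ask `model` for the schema's fields, then validate and coerce the result.

    Illustrative helper only; this is not open-xtract's interface.
    """
    hints = get_type_hints(schema)
    names = [f.name for f in fields(schema)]
    raw = model(document_text, names)

    # Fail loudly if the model did not return every field the schema asks for.
    missing = [name for name in names if name not in raw]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    # Keep only the declared fields and coerce each value to its declared type.
    values = {name: hints[name](raw[name]) for name in names}
    return schema(**values)


def fake_model(text: str, field_names: list[str]) -> dict[str, Any]:
    """Stand-in for a language model call: reads simple 'Key: value' lines."""
    found: dict[str, Any] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        name = key.strip().lower().replace(" ", "_")
        if sep and name in field_names:
            found[name] = value.strip()
    return found


document = (
    "Invoice Number: INV-001\n"
    "Vendor: Acme Corp\n"
    "Total: 1250.50\n"
    "Notes: net 30"
)
invoice = extract(Invoice, document, fake_model)
print(invoice)
```

The schema is the contract in this sketch: anything it does not declare (the "Notes" line here) is dropped, and anything it declares but the model fails to return raises an error instead of passing through silently.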

Open Source as a Foundation

For us, open source isn't charity — it's strategy. When developers can inspect, modify, and extend the tools they depend on, everyone builds better software.

We use open-xtract ourselves in client projects. The feedback loop between real-world usage and open development makes the tool better for everyone.