OUR SERVICES
Convert and transform information into structured formats: XML, S1000D, DITA, SPL, proprietary schemas, and others.
Independent review of previously converted content. Provide third-party validation and peace of mind.
Enrich content with new or inferred metadata to improve the utility, discovery, and interoperability of content.
Analyze large document collections to identify content reuse across multiple documents and source formats.
Extract free-form text from textual and form-based documents, PDFs, and other formats, then generate XML in the target schema.
Submit structured content to platforms such as PubMed, Silverchair, HighWire, and more.
Website harvesting and AI transformations that scan HTML, PDF, Excel, etc. and deliver structured data to your systems.
DCL can automate the creation of training sets and structure data sets to support your AI and machine learning projects.

THE LATEST FROM DCL
How Structured Content Makes AI Work Better
Generative AI systems work best when the information they consume is organized, explicit, and precise. Structured content formats like XML and JSON provide exactly that: they are machine‑readable, semantically rich, and consistently organized. Unstructured Documents Are Ambiguous for AI: Document processing is not simply one problem; rather, it comprises three components that must be considered: text extraction, table extraction, and graph/figure/image interpretation. Each of these components introduces ambiguity...
eBooks Are Older Than You Think: Vannevar Bush and the Story of the Memex
Fall of 2007 is the time frame I personally think of as the "birth of ebooks." I know that's not exactly accurate, but it is around the time when the first version of the Kindle was released and I became a regular ebook reader. The truth is, the idea behind ebooks stretches back far earlier than the Kindle era, long before digital screens, file formats, or wireless downloads. In fact, decades before anyone could imagine a portable reading device, one particular visionary thinker already...
Attention Engine Optimization: It’s All About the R in RAG
As AI becomes increasingly integrated into all industries, organizations are striving to ensure that the insights generated by AI systems are reliable, accurate, and grounded in truth. Achieving trustworthy AI requires not only robust algorithms but also careful attention to the quality and structure of the information fed into them.
Trustworthy AI: Optimizing Content for Large Language Models
Whether you’re developing AI-driven knowledge tools or simply want to make your organization’s content AI ready, learn how content structure can focus an AI’s attention, improve response quality, and ensure your most valuable information doesn’t get lost in the noise.
The Legal Fine Lines of Fair Use and Generative AI
Training sets for large language models (LLMs) are changing how we think about copyright, but the law hasn’t quite caught up yet. In this sharp, timely webinar, legal experts unpack the evolving landscape of fair use as it applies to generative AI: What counts as truly “transformative”?