DCL Insights
DCL in the News, Published Research, White Papers

DCL in the News

Vice Magazine

Librarians Are Finding Thousands Of Books No Longer Protected By Copyright Law

“It is the history of American creativity. I think that the data should be usable not just by us, by the libraries, but by everyone. I think it belongs to the people and is the people’s data.” 

- Greg Cram, New York Public Library

Industry Today

Fluid & Powerful: The American Water Works Association Transforms its Publishing Strategy

The creation and application of standards related to the management and treatment of drinking and wastewater are among the most crucial to the well-being of modern human populations.

Information Today

Hallucinate, Confabulate, Obfuscate: The State of Artificial Intelligence Today

If there were an award for webinar titles, Data Conversion Laboratory (DCL) and its Oct. 12 program, “Hallucinate, Confabulate, Obfuscate: The Perils of Generative AI Going Rogue,” would surely be in the running.

The New York Times

Ending the Pain of Data Transfers

“The most expensive part of switching to a new generation, more than the hardware and software, is in converting the documents and data that they have accumulated over the years,” said Mark Gross, president of Data Conversion Laboratories. That company was one of the first to identify the need for data conversion, which is now a multibillion-dollar industry.

Industry Publications

SSP Annual Meeting 2024

Fluid Article Production: Using AI to Automate XML Creation

Consistently good XML after peer review delivers advantages to publishers and authors alike. Good XML at the beginning of the publishing workflow streamlines downstream editorial and production tasks, and tagging a manuscript immediately after peer review provides the mechanisms to support research integrity and AI initiatives.

JATS Con 2021

Leaping the Field: Jumping From PDF to NISO STS

In 2017, AWWA embarked on an initiative to transform its publishing workflow by taking the corpus of current and historical Standards and Manuals and converting the content stored in PDF documents into standardized XML formats: NISO STS and BITS.

Balisage 2023

Pulling All Production Processes Together with an XML-First System

Creating a seamless centralized workflow that starts with XML has long been the siren song of scholarly journal production workflows. Yet the definition of “start” is the critical piece in this publishing puzzle. Innovating article production truly means starting with XML as soon as a manuscript is accepted after peer review.

JATS Con 2022

Identifying XML Issues That Impact Content Interchange

Publishers’ content collections are complex, often spanning decades, during which time standards have evolved. Version 1.0 of the Journal Publishing Tag Set (aka NLM DTD) was released in February 2003. Thus, content that used the NLM DTD is significantly different from the current specification—NISO JATS Version 1.3.

White Papers

content structure, semantic enrichment, digital transformation, New York Public Library, American Water Works Association


NLM, NLM conversion, agile development, quality assurance, Optica Publishing Group, XML conversion services


data harvesting, web crawling, DCL Data Harvester, XML feeds, automation


image-based PDFs, automated transformation, United States Patent and Trademark Office, content structure, XML feeds

DCL Newsletters

June 2024

AI in automating XML creation, AMPP's vast collection, how cows touched millions of lives, and more

March 2024

Anatomy of an audio file, research integrity and citation cartels, generative AI in a nutshell, and more

December 2023

AI in scholarly publishing, Merriam-Webster's word of the year, ChatGPT's first birthday, and more.

September 2023

Using AI to generate complex mathematical equations, the Data-Centric Manifesto, and more.

June 2023

AI Hallucinations, the Pet Shop Boys, and LLMs; ten questions for life sciences writers, and more.

March 2023

Infants outperform AI, books no longer protected by copyright law, data migration—analog style, and more.

May 2024

Structured content in RAG, AI in scholarly publishing, S1000D, a tale of technical debt, and more

February 2024

Major project for USPTO, a PDF that's bigger than Germany, the problem with digital preservation, and more.

November 2023

AI Going Rogue, Six Reasons to Train Your LLMs With Structured Content, Text on Ancient Scrolls, and more.

August 2023

A true XML-early publishing system, multi-lingual eye test chart, eLearning content digitization, and more.

May 2023

Reality check: considerations beyond the CCMS, PubMed and PubMed Central—what's it all about?, and more.

February 2023

Tough tables, AI and supply-chain management, Washington Post's first accessibility engineer, and more.

April 2024

Content reuse strategy podcast episode, the mystery of LLMs, ancient artificial intelligence, and more

January 2024

AACR's entire journal collection, making the greatest medical library, the scale of all things, and more.

October 2023

Copyright infringement in the brave new AI world, the future of documents is structured content, and more.

July 2023

Not your mother's migration, print never goes out of style, top eDiscovery litigation support, and more.

April 2023

Transforming legacy aircraft documentation, a use of ChatGPT we can all get behind, and more.

January 2023

The New York Public Library, data extraction, document formats Leo hates, and more.
