Unlocking Document Intelligence with Open-Source AI
2026-05-05, Ballroom

Unlocking the full potential of AI starts with your data, but real-world documents come in countless formats and levels of complexity. This session introduces Docling, an open-source Python library designed to convert complex documents into AI-ready formats. Learn how Docling simplifies document processing, enabling you to efficiently harness all your data for downstream AI and analytics applications.


Most organizational knowledge is still locked inside complex documents, making it difficult to extract and use the information effectively. Traditional tools often fail when working with real-world document formats, particularly PDFs.

In this talk, I'll introduce Docling, an open-source project that takes a different approach: it uses deep learning models to parse documents the way humans read them. It preserves document hierarchy, extracts structured data through a consistent API, and supports over ten common file formats out of the box. We will walk through Docling's architecture and its unique DoclingDocument data structure, then demonstrate how Docling's features help build document processing pipelines for a variety of applications. Lastly, I'll share how teams have used Docling to ingest previously inaccessible data while reducing document processing costs.
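
As a rough illustration of the consistent API mentioned above, here is a minimal sketch of a single-document conversion. It assumes Docling is installed (for example via `pip install docling`) and uses its `DocumentConverter` entry point; the helper name `convert_to_markdown` and its optional `converter` parameter are hypothetical, introduced here only to keep the example easy to test.

```python
from typing import Any, Optional


def convert_to_markdown(source: str, converter: Optional[Any] = None) -> str:
    """Convert one document (local path or URL) to Markdown.

    `converter` is a hypothetical injection point: any object exposing
    `convert(source)` whose result has a `.document` with
    `export_to_markdown()`. When omitted, a Docling converter is used.
    """
    if converter is None:
        # Imported lazily so this module still loads where Docling
        # is not installed.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    result = converter.convert(source)
    # result.document is the DoclingDocument, the unified representation
    # that can then be exported to formats such as Markdown.
    return result.document.export_to_markdown()


if __name__ == "__main__":
    import sys

    # Usage: python convert.py <path-or-url>
    if len(sys.argv) > 1:
        print(convert_to_markdown(sys.argv[1]))
```

The same call shape applies regardless of the input format, which is what makes it practical to place a step like this at the front of a larger processing pipeline.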

Most importantly, all of Docling is open source under the MIT license, and it can run fully locally, which keeps latency low and reduces data privacy exposure.

Ming Zhao is an open-source developer and Developer Advocate at IBM Research, where he helps IBM leverage open technologies while building impactful tools and growing vibrant open-source communities. He’s passionate about making open tech accessible to all and ensuring developers have the tools they need to succeed in the rapidly developing AI space. Ming now leads community efforts around Docling, IBM’s fastest-growing open-source project, recently welcomed into the LF AI & Data Foundation.

Abby Tse is a Software Engineer in IBM CIO driving the development of internal AI applications. Her work focuses on using generative AI to streamline business operations and boost employee productivity.