
AI Workflows Get New Open Source Tools To Advance Document Intelligence, Data Quality, And Decentralized AI With IBM's Contribution Of 3 Projects To Linux Foundation AI And Data
By PR Newswire

New projects strengthen the open source AI and data ecosystem and expand the Foundation's technical portfolio
SAN FRANCISCO, April 29, 2025 /PRNewswire/ -- LF AI & Data Foundation, an umbrella foundation of the Linux Foundation supporting open source innovation in artificial intelligence and data, today announced the induction of three new open source projects contributed by IBM: Docling, Data Prep Kit, and BeeAI. All three projects have officially been inducted by the LF AI & Data Technical Advisory Committee.
These contributions significantly enhance LF AI & Data's technical landscape in three rapidly growing domains—semantic document understanding, enterprise-grade data preparation, and privacy-preserving federated learning—reinforcing the foundation's mission to build a sustainable and open AI ecosystem.
The New Projects:
- BeeAI is the first open-source agent-to-agent platform for developers to build, discover, run, and compose agents and create multi-agent workflows. Powered by the open Agent Communication Protocol (ACP), BeeAI makes it easy to discover and connect AI agents from any framework or tech stack.
- Docling is an open-source, state-of-the-art ecosystem of tools (Python packages) for document conversion, generation, and manipulation. It enables users to easily build pipelines that extract structured information from complex documents. With over 27K stars on GitHub, Docling is already well on its way to becoming the de facto standard.
- Data Prep Kit is a modular suite of tools designed to clean, transform, and trace unstructured data for LLMs with a focus on quality, transparency, and scalability. It supports both batch and streaming data scenarios and integrates easily with modern AI workflows.
"We are excited to welcome Docling, Data Prep Kit, and BeeAI into the LF AI & Data family," said Todd Moore, SVP, Community Operations at the Linux Foundation and interim Executive Director, LF AI & Data. "These contributions from IBM reflect a strong commitment to open collaboration and responsible AI. I love BeeAI's commitment to both JavaScript and Python for aggregated learning."
"Docling, Data Prep Kit, and BeeAI were born from a need to fill critical gaps in AI development tooling and accelerate innovation in the Generative AI space. We're proud to see them as a catalyst enabling the broader open-source community to build AI applications and agentic workflows," said Brad Topol, Distinguished Engineer and Director of Open Source IBM. "We're excited to collaborate with the open-source community to evolve these technologies and solve real-world challenges together."
Governance & Community Collaboration
The projects will benefit from the governance, technical support, and ecosystem engagement that LF AI & Data provides to its hosted projects. Following their induction by the LF AI & Data Technical Advisory Committee (TAC), all three projects will establish neutral, community-driven technical steering committees.
The projects are now publicly available for exploration and contribution. Developers, data scientists, and researchers are encouraged to get involved and shape the future of these impactful technologies.
For more information and to get involved, visit: https://lfaidata.foundation
About the Linux Foundation
The Linux Foundation is the world's leading home for collaboration on open source software, hardware, standards, and data. Linux Foundation projects are critical to the world's infrastructure, including Linux, Kubernetes, LF Decentralized Trust, Node.js, ONAP, OpenChain, OpenSSF, PyTorch, RISC-V, SPDX, Zephyr, and more. The Linux Foundation focuses on leveraging best practices and addressing the needs of contributors, users, and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org.
The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see its trademark usage page: www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.
Media Contact
Jill Lovato
The Linux Foundation
jlovato@linuxfoundation.org
View original content to download multimedia: https://www.prnewswire.com/news-releases/ai-workflows-get-new-open-source-tools-to-advance-document-intelligence-data-quality-and-decentralized-ai-with-ibms-contribution-of-3-projects-to-linux-foundation-ai-and-data-302441483.html
SOURCE LF AI & Data Foundation
Frequently Asked Questions
What are the three new open-source AI projects?
Three projects from IBM—BeeAI (agent-to-agent platform), Docling (document processing tools), and Data Prep Kit (data preparation for LLMs)—have joined the LF AI & Data Foundation.
What problem do these projects solve?
These tools address critical gaps in AI development, particularly in document understanding, data quality, and building privacy-preserving AI applications.
How can I get involved?
Visit the LF AI & Data Foundation website to explore and contribute to these projects. They're publicly available for developers and researchers.
First published on Wed, Apr 30, 2025