IBM’s Think 2025 showcases watsonx.data as the cornerstone of its generative AI strategies.
A central theme of IBM Think 2025, this week’s flagship event for customers, partners and analysts, is how AI and automation are being put to work in the real world. One of the major product updates is to the watsonx.data platform, which continues to evolve to address common roadblocks in scaling generative and agentic AI. At the event, IBM has emphasized the platform’s usefulness to customers dealing with fragmented or hard-to-use enterprise data, including unstructured formats.
By simplifying the data-for-AI stack with an open, hybrid architecture, IBM positions watsonx.data as a platform for enterprises looking to deliver faster, more accurate and scalable generative AI outcomes. In this article, I’ll look at the challenges of enterprise AI adoption and how IBM is seeking to use watsonx.data’s capabilities to address these challenges and create value in day-to-day operations.
(Note: IBM is an advisory client of my firm, Moor Insights & Strategy.)
The Challenges Of Generative AI Adoption
Before getting into the specifics announced at Think, it’s important to understand the prevailing problems that watsonx.data is trying to address. Enterprise adoption of generative AI is accelerating, but many organizations are discovering that their legacy data environments are not equipped for the demands of AI. According to IBM, less than 1% of enterprise data is being used for generative AI initiatives today, while approximately 90% of data is unstructured — and scattered across diverse locations, formats and platforms. Despite significant investments in AI models and applications, the real barrier to generative AI success for most enterprises is not inference costs or model optimization. It is the data itself.
Many enterprises are misaligned in their generative AI strategies, focusing on application development without first addressing the foundational data challenges that limit model performance. To overcome this, organizations require trustworthy, company-specific data to produce high-performing and accurate AI outcomes. However, in many enterprises, large volumes of unstructured data remain locked within e-mails, documents, presentations, videos and the like, making it inaccessible to large language models and generative AI tools.
Unstructured data presents a unique challenge because it is dynamic, fragmented across systems and often unlabeled, and it frequently requires additional context for meaningful interpretation. Retrieval-augmented generation, while useful for structured knowledge retrieval, can be ineffective at extracting and harmonizing unstructured information at enterprise scale. Meanwhile, enterprises are saddled with disjointed stacks of data lakes, warehouses, governance tools and integration platforms that add complexity rather than reducing it.
All of this creates a huge missed opportunity for companies, because there is enormous value to be gained if they can find a way to use their in-house enterprise data in their AI efforts to address their specific challenges. In a recent discussion I had with IBM’s Edward Calvesbert, vice president for watsonx product management, he said, “If everyone’s using the same AI models trained on the same data, how do you stand out? The real edge comes from using your enterprise data — plugging it into your apps and systems to actually get work done and move the needle.”
Watsonx.data’s Role In AI Adoption
IBM’s strategic response to the enterprise unstructured data problem is watsonx.data. I wrote about the initial launch of this platform back in 2023. At IBM Think this week, IBM previewed a new generation of watsonx.data that transforms the platform into a hybrid, open data lakehouse with data fabric capabilities. Innovations include “watsonx.data integration” (note IBM’s lowercase nomenclature), which can make it easier to access and manage data in different formats, and “watsonx.data intelligence,” which uses AI to automate data curation, management and governance. If IBM is allowed to complete its intended acquisition of DataStax, the company also hopes to incorporate DataStax’s NoSQL and vector database capabilities to further enhance unstructured data management in watsonx.data.
The watsonx.data architecture emphasizes separation of storage and compute, support for open table formats such as Apache Iceberg and open query engines such as Presto, hybrid deployment across cloud and on-premises environments and deep integration with governance and security tools. With it, IBM wants to give enterprises the ability to ingest, govern and retrieve both structured and unstructured data at scale. According to the company, this could enable the creation of generative AI applications and agentic AI models that are 40% more accurate and performant, and much faster than before. As Calvesbert put it, “Today’s generative AI tools mostly help employees find and summarize information. What’s next is unlocking real impact — by strengthening the data layer so AI can deliver accurate, trusted results at scale.”
IBM Integrates Watsonx With Db2
IBM is continuing to modernize Db2 by embedding watsonx capabilities directly into Db2 12.1 (as I wrote about late in 2024) to enhance the platform with AI-powered automation. At IBM Think, the company introduced new features such as the Database Assistant — a natural-language tool that acts as a real-time advisor for DBAs, helping to monitor performance, diagnose issues and optimize system operations.
These operational updates reflect a broader evolution underway within Db2. With the announcement of Db2 version 12.1.2, the platform now plays a broader role in IBM’s hybrid, AI-ready data strategy. The new version includes native support for vector embedding and similarity search to enable faster development of AI applications that blend curated, structured data with unstructured sources like documents and logs. Through watsonx.data, Db2 workloads can now participate in AI pipelines with shared governance, unified metadata and federated access. Enhancements also include support for open table formats (this is where Apache Iceberg comes in) and integration with vector databases, allowing Db2 to bridge structured and unstructured data. In doing so, Db2 is evolving from a traditional relational database into a foundational component of the enterprise AI stack — one that by design supports automation, observability and scalability across hybrid environments.
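For readers less familiar with the term, similarity search can be illustrated with a minimal sketch in plain Python. To be clear, this is not Db2’s API or IBM’s implementation. The record names, the three-dimensional vectors and the helper functions cosine_similarity and top_k are all hypothetical, chosen only to show the general idea: each document or row is stored with a numeric embedding, and a query is answered by returning the records whose embeddings point in the most similar direction.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the ids of the k rows whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query, emb), row_id) for row_id, emb in rows.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [row_id for _, row_id in scored[:k]]


# Hypothetical embeddings; real ones come from an embedding model and
# typically have hundreds or thousands of dimensions.
ROWS = {
    "invoice_policy": [0.9, 0.1, 0.0],
    "server_log": [0.0, 0.2, 0.9],
    "tax_memo": [0.8, 0.3, 0.1],
}
QUERY = [1.0, 0.2, 0.0]

print(top_k(QUERY, ROWS, k=2))
```

In a database with native vector support, the scoring and ranking would presumably run inside the engine next to the structured columns, rather than in application code as it does here. That is the practical appeal of blending curated relational data with embedded documents and logs.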
How Watsonx.data Is Delivering Business Results
At a time when many companies — and their shareholders — are skeptical about the real-world effects of enterprise AI, IBM is ready with examples of how the business impact is already measurable across different industries. For example, BanFast, one of the largest construction firms in Sweden, used watsonx.data to reduce manual data input by 75% and then leveraged that data to enhance worker health and safety. A U.S.-based financial services firm saved $5.7 million by creating a unified view of its operational IT data using watsonx.data, enabling self-service access, consistent governance and automated processing.
Meanwhile, a global manufacturing client partnered with IBM and EY to automate the ingestion and consolidation of indirect tax data across 34 source systems in 73 countries, improving compliance efficiency. IBM and EY also recently launched EY.ai for tax, a product that integrates EY’s tax expertise with IBM’s AI technology, including watsonx.data. In sports and media, IBM is a key partner for the US Open and the Masters, where millions of data points are processed in real time to generate AI-driven player commentary and fan insights. These deployments highlight how watsonx.data is helping modernize data infrastructure to enable faster insights, greater operational efficiency and competitive differentiation for enterprises that are under pressure to scale AI initiatives quickly and responsibly.
Challenges For Watsonx.data
IBM’s watsonx.data offers a promising approach to managing enterprise data, especially by making both structured and unstructured data more accessible and usable for generative AI. However, managing data comes with challenges, especially for organizations that are still early in their data modernization efforts. Integrating unstructured and structured data across cloud and on-prem environments remains complex. Many customers will face issues with data sprawl, inconsistent governance policies and internal silos that slow adoption. Even with a unified platform, getting data ready for AI — clean, labeled and trustworthy — is a major effort.
Another challenge is organizational readiness. Teams may not have the skills or processes in place to take full advantage of watsonx.data’s capabilities, particularly when it comes to aligning the teams that manage data with those responsible for AI application development. There’s also the question of cost and operational complexity. While watsonx.data is designed for flexibility, deploying it in a hybrid environment with multiple components and tools — such as data processing engines, storage systems and governance frameworks — can stretch already limited IT resources. These components often require careful integration and ongoing coordination across teams.
That said, as the examples offered by IBM show, the business impact could be significant for companies that can work through these issues. Watsonx.data provides a way to connect siloed systems, reduce reliance on brittle point-solutions and make better use of internal data — especially the 90% that’s unstructured and often ignored. But the path forward will require more than just technology. It will take coordination across teams, clear ownership of data quality and a realistic view of what’s needed to move from pilot to production.
IBM’s position is straightforward: solving the data problem is the first step toward making AI useful at scale. For customers, the choice isn’t just about tools — it’s about whether they’re ready to do the work needed to get their data in shape.
Moor Insights & Strategy provides or has provided paid services to technology companies, like all tech industry research and analyst firms. These services include research, analysis, advising, consulting, benchmarking, acquisition matchmaking and video and speaking sponsorships. Of the companies mentioned in this article, Moor Insights & Strategy currently has (or has had) a paid business relationship with IBM.