Monte Carlo turns its gaze to unstructured data quality issues

Data observability firm Monte Carlo Data Inc. is turning its attention to unstructured information, introducing a new capability that will allow enterprises to monitor the enormous volumes of text, images, video and audio files that are so vital for artificial intelligence workloads.

The company’s new unstructured data monitoring engine represents an effort to fix one of the biggest blind spots in most enterprises’ data estates. According to a report by International Data Corp., up to 90% of the information stored on the average enterprise’s servers is unstructured – a jumble of chat logs, Word and PDF documents, PowerPoints and so on. This presents a big problem in terms of reliability, as there’s no easy way to monitor it for data quality issues.

Monte Carlo is a leading player in the data observability market, providing tools that can help businesses to ensure the “quality” of their datasets meets the highest standards. Its platform works similarly to application monitoring tools such as Datadog and AppDynamics, only it’s applied to data pipelines rather than telemetry and other app metrics. It uses machine learning algorithms to understand the normal baseline of any given data stream, so it can alert users to any abnormal behavior. Much of what it does can now be automated by AI agents.
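To make the baseline idea concrete, here is a minimal sketch of how a monitor might learn what "normal" looks like for a data stream and flag departures from it. It is not Monte Carlo's algorithm, which the company does not detail here. It substitutes a simple trailing-window z-score for the machine learning models the platform is said to use, and the function name, window size, threshold and sample row counts are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, window=7, threshold=3.0):
    """Return indices whose value deviates sharply from the trailing baseline.

    Hypothetical helper: a z-score stand-in for a learned baseline.
    """
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]  # the previous `window` observations
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is unusual.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Illustrative daily row counts for one table; the drop on day 8 is the incident.
daily_row_counts = [1000, 1010, 990, 1005, 995, 1002, 998, 1001, 120, 1003]
print(find_anomalies(daily_row_counts))
```

A production system would also have to account for seasonality and trend, which a fixed window like this one ignores.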

The problem is that, until now, Monte Carlo’s data monitoring tools were always aimed at structured data – the kind of information that’s stored neatly in rows and columns in databases such as Oracle. Unstructured information is an entirely different ballgame. Nevertheless, it’s important for organizations to be able to trust this information, as it’s the main fodder for most generative artificial intelligence applications and agents, which are becoming increasingly prevalent in enterprise computing environments.

Monte Carlo says it’s the first data observability company to focus on unstructured data, enabling companies to apply customizable, AI-powered checks for any quality attribute they consider relevant to their business-critical workloads. Use cases put forward by Monte Carlo include flagging customer reviews with negative sentiment before they reach dashboards and detecting personally identifiable information or other sensitive data in contract text fields. Crucially, its checks can also be used to validate AI model outputs for factual accuracy, consistency, tone and structure, the company said.
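As a rough illustration of what a check on a text field involves, the sketch below scans a free-text column for two kinds of sensitive data. It is an assumption-laden stand-in: the checks Monte Carlo describes are AI-powered, whereas this uses plain regular expressions, and the field name `contract_text`, the two patterns and the helper names are invented for the example.

```python
import re

# Deliberately simple, hypothetical patterns; real PII detection is far broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # U.S. Social Security number shape


def pii_findings(text):
    """Return the kinds of sensitive data found in one piece of text."""
    findings = []
    if EMAIL_RE.search(text):
        findings.append("email")
    if SSN_RE.search(text):
        findings.append("ssn")
    return findings


def flag_rows(rows, field):
    """Return (row index, findings) for every row whose field contains PII."""
    flagged = []
    for index, row in enumerate(rows):
        findings = pii_findings(row[field])
        if findings:
            flagged.append((index, findings))
    return flagged


rows = [
    {"contract_text": "Standard renewal terms apply for 12 months."},
    {"contract_text": "Send notices to jane.doe@example.com."},
    {"contract_text": "Signatory SSN 123-45-6789, contact bob@example.org."},
]
print(flag_rows(rows, "contract_text"))
```

Pattern matching of this kind only catches data with a predictable shape, which is one reason a vendor might turn to language models for fuzzier qualities such as sentiment or tone.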

Monte Carlo co-founder and Chief Technology Officer Lior Gavish said it’s vital for businesses to be able to proactively detect data issues in order to build AI systems they can trust.

“High-quality unstructured data, like customer feedback, support tickets, or internal documentation, isn’t just important; it’s foundational to building powerful, reliable AI,” he said. “It can be the difference between a model that performs and one that fails.”

The focus on unstructured data quality points to the start of a trend toward consolidation across the AI and data observability markets, which have traditionally been separate, said analyst Michael Ni of Constellation Research Inc. He believes chief data and analytics officers will welcome this consolidation because they’re not only drowning in data but also blind to the 90% of it that’s thought to be unstructured.

Because AI workloads are powered mostly by unstructured data, companies need visibility into their vector database stores and the data behind each prompt, the analyst said. Simply monitoring data pipelines and tables is no longer enough.

“Monte Carlo’s move finally puts documents, chat logs and transcripts under observability, and it’s a move that represents where trust in AI finally begins,” Ni said. “This marks the beginning of the end for siloed data observability, and the next platform battle will be around ‘decision observability,’ where AI signals come together in one trusted view.”

The company said its new unstructured data monitoring tool is compatible with platforms such as Snowflake, Databricks and Google BigQuery, and can natively integrate with those platforms’ AI function libraries and large language models. For instance, it’s fully compatible with Snowflake Inc.’s Cortex Agents, which are intelligent bots that aim to orchestrate structured and unstructured information together to guide more reliable AI decision-making. It can also provide observability for Databricks Inc.’s AI/BI tool, which is a hybrid AI system that helps to generate rich insights relating to data lineage, data pipelines and more.

“By enabling native support for Snowflake Cortex Agents and Databricks AI/BI, Monte Carlo helps data teams ensure their foundational data is reliable and trustworthy enough to support real-time business insights driven by AI,” said Monte Carlo’s head of AI Shane Murray.

Monte Carlo said its move into unstructured data monitoring represents a key milestone in its broader mission to provide visibility across the full data and AI application lifecycle.

Image: SiliconANGLE/Meta AI
