Advanced AI News
VentureBeat AI

Why most enterprise AI agents never reach production and how Databricks plans to fix it

By Advanced AI Editor | June 12, 2025 | 8 Mins Read

Many enterprise AI agent development efforts never make it to production, and it’s not because the technology isn’t ready. The problem, according to Databricks, is that companies still rely on manual evaluation, a process that is slow, inconsistent and difficult to scale.

Today at the Data + AI Summit, Databricks launched Mosaic Agent Bricks as a solution to that challenge. The technology builds on and extends the Mosaic AI Agent Framework the company announced in 2024. Simply put, being able to build AI agents is no longer enough to have real-world impact.

The Mosaic Agent Bricks platform automates agent optimization using a series of research-backed innovations. Among the key innovations is the integration of TAO (Test-time Adaptive Optimization), which provides a novel approach to AI tuning without the need for labeled data. Mosaic Agent Bricks also generates domain-specific synthetic data, creates task-aware benchmarks and optimizes quality-to-cost balance without manual intervention.

Fundamentally, the goal of the new platform is to solve a problem that Databricks users had with their existing AI agent development efforts.

“They were flying blind, they had no way to evaluate these agents,” Hanlin Tang, Databricks’ Chief Technology Officer of Neural Networks, told VentureBeat. “Most of them were relying on a kind of manual, manual vibe tracking to see if the agent sounds good enough, but this doesn’t give them the confidence to go into production.”

From research innovation to enterprise AI production scale

Tang was previously the co-founder and CTO of Mosaic, which was acquired by Databricks in 2023 for $1.3 billion.

At Mosaic, much of the research innovation didn’t necessarily have an immediate enterprise impact. That all changed after the acquisition.

“The big light bulb moment for me was when we first launched our product on Databricks, and instantly, overnight, we had, like thousands of enterprise customers using it,” Tang said.

In contrast, before the acquisition Mosaic would spend months trying to get just a handful of enterprises to try its products. Integrating Mosaic into Databricks has given Mosaic’s research team direct access to enterprise problems at scale, which in turn has revealed new research areas to explore.

“It’s only when you have contact with enterprise customers, you work with them deeply, that you actually uncover kind of interesting research problems to go after,” Tang explained. “Agent Bricks ... is, in some ways, kind of an evolution of everything that we’ve been working on at Mosaic now that we’re all fully, fully bricksters.”

Solving the agentic AI evaluation crisis

Enterprise teams face a costly trial-and-error optimization process. Without task-aware benchmarks or domain-specific test data, every agent adjustment becomes an expensive guessing game. Quality drift, cost overruns and missed deadlines follow.

Agent Bricks automates the entire optimization pipeline. The platform takes a high-level task description and enterprise data. It handles the rest automatically.

First, it generates task-specific evaluations and LLM judges. Next, it creates synthetic data that mirrors customer data. Finally, it searches across optimization techniques to find the best configuration.

“The customer describes the problem at a high level and they don’t go into the low level details, because we take care of those,” Tang said. “The system generates synthetic data and builds custom LLM judges specific to each task.”
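
The search step can be pictured with a small sketch. Everything in it is hypothetical and for illustration only: the `Candidate` class, the `judge` function, the hard-coded "synthetic" examples and the selection rule are assumptions, not the Agent Bricks API or Databricks' actual method. It shows only the general shape of the idea: score each candidate configuration with a task-specific judge, then pick the cheapest one that clears a quality bar.

```python
from dataclasses import dataclass
from typing import Callable

# All names below are hypothetical stand-ins, not Databricks APIs.


@dataclass(frozen=True)
class Candidate:
    name: str                  # label for one agent configuration
    run: Callable[[str], str]  # the agent under this configuration
    cost_per_call: float       # assumed relative cost of one call


def judge(expected: str, actual: str) -> float:
    """Toy task-specific judge: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def evaluate(candidate: Candidate, examples: list[tuple[str, str]]) -> float:
    """Average judge score of one candidate over (prompt, expected) pairs."""
    scores = [judge(expected, candidate.run(prompt)) for prompt, expected in examples]
    return sum(scores) / len(scores)


def select_best(candidates, examples, min_quality):
    """Cheapest candidate meeting the quality bar; else the highest-quality one."""
    scored = [(c, evaluate(c, examples)) for c in candidates]
    passing = [(c, q) for c, q in scored if q >= min_quality]
    if passing:
        return min(passing, key=lambda pair: pair[0].cost_per_call)
    return max(scored, key=lambda pair: pair[1])


# Hard-coded stand-ins for generated synthetic data.
SYNTHETIC_EXAMPLES = [
    ("Invoice 17: total due $40", "40"),
    ("Invoice 18: total due $15", "15"),
    ("Invoice 19: total due $7", "7"),
]

CANDIDATES = [
    Candidate("baseline", lambda p: p.split()[-1], cost_per_call=1.0),   # keeps the "$"
    Candidate("tuned", lambda p: p.split("$")[-1], cost_per_call=2.0),
    Candidate("premium", lambda p: p.split("$")[-1], cost_per_call=5.0),
]

best, quality = select_best(CANDIDATES, SYNTHETIC_EXAMPLES, min_quality=0.9)
print(best.name, quality)  # prints: tuned 1.0
```

In this toy setup the "tuned" and "premium" candidates both reach full quality, so the cheaper one wins. A real system would replace the exact-match judge with something far richer, such as the LLM judges the article mentions.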

The platform offers four agent configurations:

Information Extraction: Converts documents such as PDFs and emails into structured data. A retail organization, for example, could use it to pull product details from supplier PDFs, even those with complex formatting.

Knowledge Assistant: Provides accurate, cited answers from enterprise data. For example, manufacturing technicians can get instant answers from maintenance manuals without digging through binders.

Custom LLM: Handles text transformation tasks such as summarization and classification. For example, healthcare organizations can customize models that summarize patient notes for clinical workflows.

Multi-Agent Supervisor: Orchestrates multiple agents for complex workflows. A financial services firm, for example, could coordinate agents for intent detection, document retrieval and compliance checks.
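
To make the supervisor pattern concrete, here is a minimal sketch of the financial services example. It is an illustration under stated assumptions, not how Agent Bricks implements it: the three "agents" are plain Python functions with hypothetical names, the document store is a hard-coded dictionary, and the compliance rule is a simple blocked-phrase check.

```python
# Hypothetical agents: plain functions stand in for real model-backed agents.

DOCUMENTS = {
    "account_lookup": "Statements are available under Accounts > Documents.",
}

BLOCKED_TERMS = ("guaranteed return",)  # assumed compliance rule, for illustration


def detect_intent(message: str) -> str:
    """Toy intent detector based on keywords."""
    text = message.lower()
    if "statement" in text or "balance" in text:
        return "account_lookup"
    return "general"


def retrieve_document(intent: str) -> str:
    """Toy retrieval agent: look up a canned passage for the intent."""
    return DOCUMENTS.get(intent, "No matching document.")


def passes_compliance(answer: str) -> bool:
    """Toy compliance agent: reject answers containing blocked phrases."""
    return not any(term in answer.lower() for term in BLOCKED_TERMS)


def supervisor(message: str) -> dict:
    """Run the three agents in sequence and decide what to return."""
    intent = detect_intent(message)
    draft = retrieve_document(intent)
    approved = passes_compliance(draft)
    answer = draft if approved else "Escalated to a human reviewer."
    return {"intent": intent, "answer": answer, "approved": approved}


result = supervisor("Where can I find my latest statement?")
print(result["intent"], result["approved"])
```

The point of the pattern is that the supervisor owns the control flow, so each agent can be evaluated or swapped out on its own.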

Agents are great, but don’t forget about data

Building and evaluating agents is a core part of making AI enterprise ready, but it’s not the only part that’s needed.

Databricks positions Mosaic Agent Bricks as the AI consumption layer sitting atop its unified data stack. At the Data + AI Summit, Databricks also announced the general availability of its Lakeflow data engineering platform, which was first previewed in 2024.

Lakeflow solves the data preparation challenge. It unifies three critical data engineering journeys that previously required separate tools. Ingestion handles getting both structured and unstructured data into Databricks. Transformation provides efficient data cleaning, reshaping and preparation. Orchestration manages production workflows and scheduling.

The workflow connection is direct: Lakeflow prepares enterprise data through unified ingestion and transformation, then Agent Bricks builds optimized AI agents on that prepared data. 

“We help get the data into the platform, and then you can do ML, BI and AI analytics,” Bilal Aslam, Senior Director of Product Management at Databricks, told VentureBeat.

Going beyond data ingestion, Mosaic Agent Bricks also benefits from the governance features of Databricks’ Unity Catalog, including access controls and data lineage tracking. This integration ensures that agent behavior respects enterprise data governance without additional configuration.

Agent Learning from Human Feedback eliminates prompt stuffing

One of the common approaches to guiding AI agents today is to use a system prompt. Tang described the practice of ‘prompt stuffing,’ in which users shove all kinds of guidance into a prompt in the hope that the agent will follow it.

Agent Bricks introduces a new concept called Agent Learning from Human Feedback, which is meant to solve what Tang calls the prompt stuffing problem. According to Tang, prompt stuffing often fails because agent systems have multiple components that need adjustment. Instead, the feature automatically interprets natural language guidance and adjusts the appropriate system components.

The approach mirrors reinforcement learning from human feedback (RLHF) but operates at the level of the agent system rather than on individual model weights.

The system handles two core challenges. First, natural language guidance can be vague. For example, what does ‘respect your brand’s voice’ actually mean? Second, agent systems contain numerous configuration points. Teams struggle to identify which components need adjustment.

The system eliminates the guesswork about which agent components need adjustment for specific behavioral changes.

“This we believe will help agents become more steerable,” Tang said.
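
The routing problem described above, deciding which component a piece of guidance should change, can be illustrated with a deliberately simple sketch. The component names, the `ROUTING_KEYWORDS` table and the keyword matching are all assumptions made for illustration. The article does not say how Agent Bricks interprets feedback, and a real system would need something far more capable than keyword lookup to handle vague guidance.

```python
from dataclasses import dataclass, field

# Hypothetical component names and routing rules; not from Databricks.
ROUTING_KEYWORDS = {
    "retriever": ("source", "document", "search", "recent"),
    "formatter": ("bullet", "table", "length", "format"),
}


@dataclass
class AgentSystem:
    # Maps each component to the guidance notes routed to it so far.
    components: dict = field(
        default_factory=lambda: {"retriever": [], "generator": [], "formatter": []}
    )


def route_feedback(feedback: str) -> str:
    """Pick the component a piece of natural language guidance most likely targets."""
    text = feedback.lower()
    for component, keywords in ROUTING_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return component
    return "generator"  # fallback for tone or style guidance


def apply_feedback(system: AgentSystem, feedback: str) -> str:
    """Attach the guidance to the routed component and report where it went."""
    target = route_feedback(feedback)
    system.components[target].append(feedback)
    return target


system = AgentSystem()
for note in ("Prefer recent documents", "Use bullet points", "Respect our brand's voice"):
    print(note, "->", apply_feedback(system, note))
```

Contrast this with prompt stuffing, where all three notes would land in a single system prompt regardless of which component actually governs the behavior.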

Technical advantages over existing frameworks

There is no shortage of agentic AI development frameworks and tools in the market today. Among the growing list of vendor options are tools from LangChain, Microsoft and Google.

Tang argued that what makes Mosaic Agent Bricks different is the optimization. Rather than requiring manual configuration and tuning, Agent Bricks incorporates multiple research techniques automatically: TAO, in-context learning, prompt optimization and fine-tuning.

When it comes to agent-to-agent communication, there are a few options in the market today, including Google’s Agent2Agent protocol. According to Tang, Databricks is currently exploring various agent protocols and hasn’t committed to a single standard.

Currently, Agent Bricks handles agent-to-agent communication through two primary methods:

Exposing agents as endpoints that can be wrapped in different protocols.

Using a multi-agent supervisor that is MCP (Model Context Protocol) aware.
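
The first method, keeping the agent separate from the protocol that wraps it, can be sketched as follows. This is an illustration only: `echo_agent` and `as_json_endpoint` are hypothetical names, the JSON request shape is an assumption, and the code does not implement MCP, Agent2Agent or any Databricks interface.

```python
import json
from typing import Callable

Agent = Callable[[str], str]


def echo_agent(question: str) -> str:
    """Hypothetical agent; a real one would call a model."""
    return f"You asked: {question}"


def as_json_endpoint(agent: Agent) -> Callable[[str], str]:
    """Wrap an agent so it accepts and returns JSON strings."""

    def handle(body: str) -> str:
        try:
            question = json.loads(body)["question"]
        except (json.JSONDecodeError, KeyError, TypeError):
            # Malformed JSON, a missing field, or a non-object payload.
            return json.dumps({"error": "expected a JSON object with a 'question' field"})
        return json.dumps({"answer": agent(str(question))})

    return handle


handle = as_json_endpoint(echo_agent)
print(handle('{"question": "What is our refund policy?"}'))
```

Because the agent itself is just a callable, a second wrapper for a different protocol could reuse it unchanged, which is the flexibility the approach is meant to preserve.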

Strategic implications for enterprise decision-makers

For enterprises looking to lead the way in AI, it’s critical to have the right technologies in place to evaluate quality and effectiveness.

Deploying agents without evaluation isn’t going to lead to an optimal outcome and neither will having agents without a solid data foundation. When considering agent development technologies, it’s critical to have proper mechanisms to evaluate the best options.

The Agent Learning from Human Feedback approach is also noteworthy for enterprise decision-makers, as it is designed to help steer agentic AI toward the intended outcome.

For enterprises looking to lead in AI agent deployment, this development means evaluation infrastructure is no longer a blocking factor. Organizations can focus resources on use case identification and data preparation rather than building optimization frameworks.
