Advanced AI News
VentureBeat AI

Phonely’s new AI agents hit 99% accuracy—and customers can’t tell they’re not human

By Advanced AI Bot | June 3, 2025 | 9 min read

A three-way partnership between AI phone support company Phonely, inference optimization platform Maitai, and chip maker Groq has achieved a breakthrough that addresses one of conversational artificial intelligence’s most persistent problems: the awkward delays that immediately signal to callers they’re talking to a machine.

The collaboration has enabled Phonely to reduce response times by more than 70% while simultaneously boosting accuracy from 81.5% to 99.2% across four model iterations, surpassing GPT-4o’s 94.7% benchmark by 4.5 percentage points. The improvements stem from Groq’s new capability to instantly switch between multiple specialized AI models without added latency, orchestrated through Maitai’s optimization platform.

The achievement solves what industry experts call the “uncanny valley” of voice AI — the subtle cues that make automated conversations feel distinctly non-human. For call centers and customer service operations, the implications could be transformative: one of Phonely’s customers is replacing 350 human agents this month alone.

Why AI phone calls still sound robotic: the four-second problem

Traditional large language models like OpenAI’s GPT-4o have long struggled with what appears to be a simple challenge: responding quickly enough to maintain natural conversation flow. While a few seconds of delay barely registers in text-based interactions, the same pause feels interminable during live phone conversations.

“One of the things that most people don’t realize is that major LLM providers, such as OpenAI, Claude, and others have a very high degree of latency variance,” said Will Bodewes, Phonely’s founder and CEO, in an exclusive interview with VentureBeat. “4 seconds feels like an eternity if you’re talking to a voice AI on the phone – this delay is what makes most voice AI today feel non-human.”

The problem occurs roughly once every ten requests, meaning standard conversations inevitably include at least one or two awkward pauses that immediately reveal the artificial nature of the interaction. For businesses considering AI phone agents, these delays have created a significant barrier to adoption.

“This kind of latency is unacceptable for real-time phone support,” Bodewes explained. “Aside from latency, conversational accuracy and humanlike responses is something that legacy LLM providers just haven’t cracked in the voice realm.”

How three startups solved AI’s biggest conversational challenge

The solution emerged from Groq’s development of what the company calls “zero-latency LoRA hotswapping” — the ability to instantly switch between multiple specialized AI model variants without any performance penalty. LoRA, or Low-Rank Adaptation, allows developers to create lightweight, task-specific modifications to existing models rather than training entirely new ones from scratch.
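The low-rank idea can be sketched in a few lines. This is a toy, pure-Python illustration of why a LoRA adapter is cheap to store and swap, with tiny demo dimensions; it is not Groq's or Phonely's implementation, and real models use dimensions in the thousands.

```python
import random

random.seed(0)

d, r = 6, 2   # base weight is d x d; adapter rank r << d

def matmul(X, Y):
    """Plain-Python matrix multiply for small demo sizes."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]   # frozen base weights
A = [[random.gauss(0, 0.01) for _ in range(r)] for _ in range(d)]
B = [[0.0] * d for _ in range(r)]   # B starts at zero: the adapter is a no-op until trained

def forward(x, adapter=None):
    """Base projection plus an optional low-rank delta (the LoRA)."""
    y = matmul(x, W)
    if adapter is not None:
        A_, B_ = adapter
        delta = matmul(matmul(x, A_), B_)   # delta weight A @ B, never materialized at full size
        y = [[yi + di for yi, di in zip(row_y, row_d)]
             for row_y, row_d in zip(y, delta)]
    return y

x = [[random.gauss(0, 1) for _ in range(d)]]
base, adapted = forward(x), forward(x, adapter=(A, B))
# With B = 0 the adapted output equals the base output exactly.
assert all(abs(b - a) < 1e-9 for b, a in zip(base[0], adapted[0]))

# Storage: a full delta needs d*d numbers; the LoRA pair needs r*(2d).
full_params, lora_params = d * d, r * (d + d)
print(full_params, lora_params)
```

Because an adapter is just the small A and B pair, many of them can sit in fast memory next to one shared base model, which is what makes hotswapping between task-specific variants cheap.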

“Groq’s combination of fine-grained software controlled architecture, high-speed on-chip memory, streaming architecture, and deterministic execution means that it is possible to access multiple hot-swapped LoRAs with no latency penalty,” explained Chelsey Kantor, Groq’s chief marketing officer, in an interview with VentureBeat. “The LoRAs are stored and managed in SRAM alongside the original model weights.”

This infrastructure advancement enabled Maitai to create what founder Christian DalSanto describes as a “proxy-layer orchestration” system that continuously optimizes model performance. “Maitai acts as a thin proxy layer between customers and their model providers,” DalSanto said. “This allows us to dynamically select and optimize the best model for every request, automatically applying evaluation, optimizations, and resiliency strategies such as fallbacks.”

The system works by collecting performance data from every interaction, identifying weak points, and iteratively improving the models without customer intervention. “Since Maitai sits in the middle of the inference flow, we collect strong signals identifying where models underperform,” DalSanto explained. “These ‘soft spots’ are clustered, labeled, and incrementally fine-tuned to address specific weaknesses without causing regressions.”

From 81% to 99% accuracy: the numbers behind AI’s human-like breakthrough

The results demonstrate significant improvements across multiple performance dimensions. Time to first token — how quickly an AI begins responding — dropped 73.4% from 661 milliseconds to 176 milliseconds at the 90th percentile. Overall completion times fell 74.6% from 1,446 milliseconds to 339 milliseconds.

Perhaps more significantly, accuracy improvements followed a clear upward trajectory across four model iterations, starting at 81.5% and reaching 99.2% — a level that exceeds human performance in many customer service scenarios.
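The headline figures can be sanity-checked with a couple of lines; the numbers below are the reported p90 time-to-first-token measurements and accuracy benchmarks, plugged into a straightforward percentage calculation.

```python
def pct_reduction(before_ms, after_ms):
    """Percentage drop from a before measurement to an after measurement."""
    return (before_ms - after_ms) / before_ms * 100

ttft_drop = pct_reduction(661, 176)     # p90 time to first token, in ms
accuracy_gain = round(99.2 - 94.7, 1)   # vs. the GPT-4o benchmark, in points

print(f"TTFT reduction: {ttft_drop:.1f}%")
print(f"Accuracy lead over GPT-4o: {accuracy_gain} points")
```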

“We’ve been seeing about 70%+ of people who call into our AI not being able to distinguish the difference between a person,” Bodewes told VentureBeat. “Latency is, or was, the dead giveaway that it was an AI. With a custom fine tuned model that talks like a person, and super low-latency hardware, there isn’t much stopping us from crossing the uncanny valley of sounding completely human.”

The performance gains translate directly to business outcomes. “One of our biggest customers saw a 32% increase in qualified leads as compared to a previous version using previous state-of-the-art models,” Bodewes noted.

350 human agents replaced in one month: call centers go all-in on AI

The improvements arrive as call centers face mounting pressure to reduce costs while maintaining service quality. Traditional human agents require training, scheduling coordination, and significant overhead costs that AI agents can eliminate.

“Call centers are really seeing huge benefits from using Phonely to replace human agents,” Bodewes said. “One of the call centers we work with is actually replacing 350 human agents completely with Phonely just this month. From a call center perspective this is a game changer, because they don’t have to manage human support agent schedules, train agents, and match supply and demand.”

The technology shows particular strength in specific use cases. “Phonely really excels in a few areas, including industry-leading performance in appointment scheduling and lead qualification specifically, beyond what legacy providers are capable of,” Bodewes explained. The company has partnered with major firms handling insurance, legal, and automotive customer interactions.

The hardware edge: why Groq’s chips make sub-second AI possible

Groq’s specialized AI inference chips, called Language Processing Units (LPUs), provide the hardware foundation that makes the multi-model approach viable. Unlike the general-purpose graphics processors typically used for AI inference, LPUs are optimized specifically for the sequential nature of language processing.

“The LPU architecture is optimized for precisely controlling data movement and computation at a fine-grained level with high speed and predictability, allowing the efficient management of multiple small ‘delta’ weights sets (the LoRAs) on a common base model with no additional latency,” Kantor said.

The cloud-based infrastructure also addresses scalability concerns that have historically limited AI deployment. “The beauty of using a cloud-based solution like GroqCloud, is that Groq handles orchestration and dynamic scaling for our customers for any AI model we offer, including fine-tuned LoRA models,” Kantor explained.

For enterprises, the economic advantages appear substantial. “The simplicity and efficiency of our system design, low power consumption, and high performance of our hardware, allows Groq to provide customers with the lowest cost per token without sacrificing performance as they scale,” Kantor said.

Same-day AI deployment: how enterprises skip months of integration

One of the partnership’s most compelling aspects is implementation speed. Unlike traditional AI deployments that can require months of integration work, Maitai’s approach enables same-day transitions for companies already using general-purpose models.

“For companies already in production using general-purpose models, we typically transition them to Maitai on the same day, with zero disruption,” DalSanto said. “We begin immediate data collection, and within days to a week, we can deliver a fine-tuned model that’s faster and more reliable than their original setup.”

This rapid deployment capability addresses a common enterprise concern about AI projects: lengthy implementation timelines that delay return on investment. The proxy-layer approach means companies can maintain their existing API integrations while gaining access to continuously improving performance.

The future of enterprise AI: specialized models replace one-size-fits-all

The collaboration signals a broader shift in enterprise AI architecture, moving away from monolithic, general-purpose models toward specialized, task-specific systems. “We’re observing growing demand from teams breaking their applications into smaller, highly specialized workloads, each benefiting from individual adapters,” DalSanto said.

This trend reflects maturing understanding of AI deployment challenges. Rather than expecting single models to excel across all tasks, enterprises increasingly recognize the value of purpose-built solutions that can be continuously refined based on real-world performance data.

“Multi-LoRA hotswapping lets companies deploy faster, more accurate models customized precisely for their applications, removing traditional cost and complexity barriers,” DalSanto explained. “This fundamentally shifts how enterprise AI gets built and deployed.”

The technical foundation also enables more sophisticated applications as the technology matures. Groq’s infrastructure can support dozens of specialized models on a single instance, potentially allowing enterprises to create highly customized AI experiences across different customer segments or use cases.

“Multi-LoRA hotswapping enables low-latency, high-accuracy inference tailored to specific tasks,” DalSanto said. “Our roadmap prioritizes further investments in infrastructure, tools, and optimization to establish fine-grained, application-specific inference as the new standard.”

For the broader conversational AI market, the partnership demonstrates that technical limitations once considered insurmountable can be addressed through specialized infrastructure and careful system design. As more enterprises deploy AI phone agents, the competitive advantages demonstrated by Phonely may establish new baseline expectations for performance and responsiveness in automated customer interactions.

The success also validates the emerging model of AI infrastructure companies working together to solve complex deployment challenges. This collaborative approach may accelerate innovation across the enterprise AI sector as specialized capabilities combine to deliver solutions that exceed what any single provider could achieve independently. If this partnership is any indication, the era of obviously artificial phone conversations may be coming to an end faster than anyone expected.
