Anthropic is launching a voice mode for its Claude AI, enabling spoken conversations on its iOS and Android mobile apps. This beta feature, powered by the new Claude Sonnet 4 model, is rolling out in English over the next few weeks. This is a major step for Anthropic, positioning it directly against established AI voice assistants from OpenAI, Google, and Meta.
Claude’s voice mode offers users a new, potentially more intuitive way to interact with artificial intelligence, especially in hands-free situations.
The new voice mode aims for a more natural interaction style. Key features include on-screen display of important points during conversation and the ability to discuss documents and images.
While the core voice interaction is becoming more accessible, advanced capabilities like Google Workspace integration for accessing calendar and email data are tied to its paid subscription plans.
Free users will face usage limits; Anthropic’s Help Center indicates most can expect around 20-30 voice conversations. The company also emphasized that safety was a top priority during development, and its support documentation offers troubleshooting tips for users.
How Claude’s Voice Mode Works
Anthropic’s new voice mode allows users to initiate a voice session by tapping a microphone icon within the Claude mobile app. Users can select from five distinct voice options, which can be changed later in settings.
A key difference from simple dictation is its full conversational capability, where Claude both listens and speaks. Anthropic’s documentation explains that chat transcripts and summarized voice notes are saved in the user’s chat history, similar to text-based interactions.
We’re rolling out voice mode in beta on mobile.
Try starting a voice conversation and asking Claude to summarize your calendar or search your docs.
— Anthropic (@AnthropicAI) May 27, 2025
The feature is designed for various scenarios, including hands-free operation, brainstorming, learning, and enhancing accessibility. For optimal performance, Anthropic advises using the voice mode in a quiet environment and speaking clearly. Specific controls like pause/resume, mute/unmute, and end conversation are available, according to the Anthropic Help Center.
The Competitive Voice AI Landscape
Anthropic’s entry into the voice assistant market comes as competitors are rapidly advancing their offerings. OpenAI has been progressively expanding its ChatGPT Advanced Voice Mode, which in March was extended to the web with improved conversational flow.
While OpenAI initially reserved its best voice features for subscribers, it made a version powered by its smaller GPT-4o-mini model available to free users in February 2025, though with some limitations. Microsoft has taken a more aggressive stance by making its Copilot voice interactions, including advanced reasoning features, completely free.
Google’s Gemini Live has also been enhancing its capabilities, including features to respond based on screen content. Meta recently launched a standalone Meta AI app, powered by its new Llama 4 models and featuring voice interaction, including an experimental “full-duplex” mode for more natural conversation flow.
Amazon is also improving its Alexa assistant after announcing Alexa+ in February, a premium AI-driven version of its popular voice helper. Notably, this involves a $4 billion investment and partnership with Anthropic itself to integrate Claude AI, highlighting Anthropic’s growing influence.
Amid this competitive landscape, Anthropic is playing catch-up in voice. However, its focus on enterprise-friendly features could give it an edge with professional user segments.
Broader Trends and Considerations
The push for more natural AI voices is an industry-wide trend that is evolving fast. Specialized firms like Sesame AI are already developing hyper-realistic voices that mimic human imperfections such as hesitations.
This drive for realism is balanced by ongoing challenges. For instance, OpenAI acknowledged that its AI can still experience hallucinations when interpreting live video input, a feature added to ChatGPT’s Advanced Voice Mode in December 2024.
As these AI voice technologies become more integrated into daily life, the focus remains on balancing innovation with user experience, safety, and the ethical implications of increasingly human-like AI interactions.