Advanced AI News
TechRepublic

Anthropic Explores How Claude ‘Thinks’

By Advanced AI Editor | March 29, 2025 | 4 Mins Read


It can be difficult to determine how generative AI arrives at its output.

On March 27, Anthropic published a blog post introducing a tool for looking inside a large language model to follow its behavior, seeking to answer questions such as what language its model Claude “thinks” in, whether the model plans ahead or predicts one word at a time, and whether the AI’s own explanations of its reasoning actually reflect what’s happening under the hood.

In many cases, the explanation does not match the actual processing. Claude generates its own explanations for its reasoning, so those explanations can feature hallucinations, too.

A ‘microscope’ for ‘AI biology’

Anthropic published a paper on “mapping” Claude’s internal structures in May 2024. Its new paper, which describes the “features” a model uses to link concepts together, follows that work. Anthropic calls its research part of the development of a “microscope” into “AI biology.”

In the first paper, Anthropic researchers identified “features” connected by “circuits,” which are paths from Claude’s input to output. The second paper focused on Claude 3.5 Haiku, examining 10 behaviors to diagram how the AI arrives at its result. Anthropic found:

  • Claude definitely plans ahead, particularly on tasks such as writing rhyming poetry.
  • Within the model, there is “a conceptual space that is shared between languages.”
  • Claude can “make up fake reasoning” when presenting its thought process to the user.

The researchers discovered how Claude translates concepts between languages by examining the overlap in how the AI processes questions in multiple languages. For example, the prompt “the opposite of small is” in different languages gets routed through the same features for “the concepts of smallness and oppositeness.”

The finding that Claude can make up fake reasoning dovetails with Apollo Research’s studies into Claude 3.7 Sonnet’s ability to detect an ethics test. When asked to explain its reasoning, Claude “will give a plausible-sounding argument designed to agree with the user rather than to follow logical steps,” Anthropic found.


Generative AI isn’t magic. It is sophisticated computing that follows rules, but its black-box nature makes it difficult to determine what those rules are and under what conditions they arise. For example, Claude showed a general hesitation to provide speculative answers, but it might process its end goal faster than it provides output: “In a response to an example jailbreak, we found that the model recognized it had been asked for dangerous information well before it was able to gracefully bring the conversation back around,” the researchers wrote.

How does an AI trained on words solve math problems?

I mostly use ChatGPT for math problems, and the model tends to come up with the right answer despite some hallucinations in the middle of the reasoning. So, I’ve wondered about one of Anthropic’s points: Does the model think of numbers as a sort of letter? Anthropic might have pinpointed exactly why models behave like this: Claude follows multiple computational paths at the same time to solve math problems.

“One path computes a rough approximation of the answer and the other focuses on precisely determining the last digit of the sum,” Anthropic wrote.

So, it makes sense if the output is right but the step-by-step explanation isn’t.
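The two-path description quoted above can be pictured with a toy sketch. This is only an analogy written for illustration, not Anthropic’s method and not a description of Claude’s actual circuits. The function names `rough_path`, `last_digit_path`, and `combine` are hypothetical, and the sketch assumes non-negative integers. One function produces a coarse ten-wide window for the sum, another computes only the final digit, and the answer is the single number in the window that ends in that digit.

```python
def rough_path(a: int, b: int) -> int:
    """Coarse path (hypothetical): ignore b's last digit.

    The result is only a lower bound; the true sum lies
    somewhere in the window [result, result + 9].
    """
    return a + (b // 10) * 10


def last_digit_path(a: int, b: int) -> int:
    """Precise path (hypothetical): work out only the final digit of the sum."""
    return (a % 10 + b % 10) % 10


def combine(a: int, b: int) -> int:
    """Pick the one number in the rough window that ends in the exact digit."""
    low = rough_path(a, b)
    digit = last_digit_path(a, b)
    # Step forward from the lower bound until the last digit matches.
    return low + (digit - low) % 10


if __name__ == "__main__":
    # Operands chosen purely for illustration.
    print(combine(36, 59))
```

Neither function alone yields the sum, yet together they pin it down. That is one way to see how a correct result could emerge from a process that looks nothing like the carry-the-one explanation a person would write out.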


Claude’s first step is to “parse out the structure of the numbers,” finding patterns similarly to how it would find patterns in letters and words. Claude can’t externally explain this process, just as a human can’t tell which of their neurons are firing; instead, Claude will produce an explanation of the way a human would solve the problem. The Anthropic researchers speculated this is because the AI is trained on explanations of math written by humans.

What’s next for Anthropic’s LLM research?

Interpreting the “circuits” can be very difficult because of how dense the generative AI’s internal processing is. It took a human a few hours to interpret circuits produced by prompts with “tens of words,” Anthropic said. The company speculates it might take AI assistance to interpret how generative AI works.

Anthropic said its LLM research is intended to ensure AI aligns with human ethics. To that end, the company is looking into real-time monitoring, model character improvements, and model alignment.


