Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

AI fuels false claims after Charlie Kirk’s death, CBS News analysis reveals

Google is a ‘bad actor’ says People CEO, accusing the company of stealing content

ASML Partners With Mistral AI in Strategic €1.3 Billion Deal

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • OpenAI (GPT-4 / GPT-4o)
    • Anthropic (Claude 3)
    • Google DeepMind (Gemini)
    • Meta (LLaMA)
    • Cohere (Command R)
    • Amazon (Titan)
    • IBM (Watsonx)
    • Inflection AI (Pi)
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • AI Experts
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • The TechLead
    • Matt Wolfe AI
    • Andrew Ng
    • OpenAI
    • Expert Blogs
      • François Chollet
      • Gary Marcus
      • IBM
      • Jack Clark
      • Jeremy Howard
      • Melanie Mitchell
      • Andrew Ng
      • Andrej Karpathy
      • Sebastian Ruder
      • Rachel Thomas
      • IBM
  • AI Tools
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
  • AI Policy
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
  • Business AI
    • Advanced AI News Features
    • Finance AI
    • Healthcare AI
    • Education AI
    • Energy AI
    • Legal AI
LinkedIn Instagram YouTube Threads X (Twitter)
Advanced AI News
VentureBeat AI

New open source AI company Deep Cogito releases first models and they’re already topping the charts

By Advanced AI EditorApril 9, 2025No Comments5 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

Deep Cogito, a new AI research startup based in San Francisco, officially emerged from stealth today with Cogito v1, a new line of open source large language models (LLMs) fine-tuned from Meta’s Llama 3.2 and equipped with hybrid reasoning capabilities — the ability to answer quickly and immediately, or “self-reflect” like OpenAI’s “o” series and DeepSeek R1.

The company aims to push the boundaries of AI beyond current human-overseer limitations by enabling models to iteratively refine and internalize their own improved reasoning strategies. It’s ultimately on a quest toward developing superintelligence — AI smarter than all humans in all domains — yet the company says that “All models we create will be open sourced.”

Deep Cogito’s CEO and co-founder Drishan Arora — a former Senior Software Engineer at Google who says he led the large language model (LLM) modeling for Google’s generative search product —also said in a post on X they are “the strongest open models at their scale – including those from LLaMA, DeepSeek, and Qwen.”

The initial model lineup includes five base sizes: 3 billion, 8 billion, 14 billion, 32 billion, and 70 billion parameters, available now on AI code sharing community Hugging Face, Ollama and through application programming interfaces (API) on Fireworks and Together AI.

They’re available under the Llama licensing terms which allows for commercial usage — so third-party enterprises could put them to work in paid products — up to 700 million monthly users, at which point they need to obtain a paid license from Meta.

The company plans to release even larger models — up to 671 billion parameters — in the coming months.

Arora describes the company’s training approach, iterated distillation and amplification (IDA), as a novel alternative to traditional reinforcement learning from human feedback (RLHF) or teacher-model distillation.

The core idea behind IDA is to allocate more compute for a model to generate improved solutions, then distill the improved reasoning process into the model’s own parameters — effectively creating a feedback loop for capability growth. Arora likens this approach to Google AlphaGo’s self-play strategy, applied to natural language.

Benchmarks and evaluations

The company shared a broad set of evaluation results comparing Cogito models to open-source peers across general knowledge, mathematical reasoning, and multilingual tasks. Highlights include:

Cogito 3B (Standard) outperforms LLaMA 3.2 3B on MMLU by 6.7 percentage points (65.4% vs. 58.7%), and on Hellaswag by 18.8 points (81.1% vs. 62.3%).

In reasoning mode, Cogito 3B scores 72.6% on MMLU and 84.2% on ARC, exceeding its own standard-mode performance and showing the effect of IDA-based self-reflection.

Cogito 8B (Standard) scores 80.5% on MMLU, outperforming LLaMA 3.1 8B by 12.8 points. It also leads by over 11 points on MMLU-Pro and achieves 88.7% on ARC.

In reasoning mode, Cogito 8B achieves 83.1% on MMLU and 92.0% on ARC. It surpasses DeepSeek R1 Distill 8B in nearly every category except the MATH benchmark, where Cogito scores significantly lower (60.2% vs. 80.6%).

Cogito 14B and 32B models outperform Qwen2.5 counterparts by around 2–3 percentage points on aggregate benchmarks, with Cogito 32B (Reasoning) reaching 90.2% on MMLU and 91.8% on the MATH benchmark.

Cogito 70B (Standard) outperforms LLaMA 3.3 70B on MMLU by 6.4 points (91.7% vs. 85.3%) and exceeds LLaMA 4 Scout 109B on aggregate benchmark scores (54.5% vs. 53.3%).

Against DeepSeek R1 Distill 70B, Cogito 70B (Reasoning) posts stronger results in general and multilingual benchmarks, with a notable 91.0% on MMLU and 92.7% on MGSM.

Cogito models generally show their highest performance in reasoning mode, though some trade-offs emerge — particularly in mathematics.

For instance, while Cogito 70B (Standard) matches or slightly exceeds peers in MATH and GSM8K, Cogito 70B (Reasoning) trails DeepSeek R1 in MATH by over five percentage points (83.3% vs. 89.0%).

In addition to general benchmarks, Deep Cogito evaluated its models on native tool-calling performance — a growing priority for agents and API-integrated systems.

Cogito 3B supports four tool-calling tasks natively (simple, parallel, multiple, and parallel-multiple), whereas LLaMA 3.2 3B does not support tool calling.

Cogito 3B scores 92.8% on simple tool calls and over 91% on multiple tool calls.

Cogito 8B scores over 89% across all tool call types, significantly outperforming LLaMA 3.1 8B, which ranges between 35% and 54%.

These improvements are attributed not only to model architecture and training data, but also to task-specific post-training, which many baseline models currently lack.

Looking ahead

Deep Cogito plans to release larger-scale models in upcoming months, including mixture-of-expert variants at 109B, 400B, and 671B parameter scales. The company will also continue updating its current model checkpoints with extended training.

The company positions its IDA methodology as a long-term path toward scalable self-improvement, removing dependence on human or static teacher models.

Arora emphasizes that while performance benchmarks are important, real-world utility and adaptability are the true tests for these models — and that the company is just at the beginning of what it believes is a steep scaling curve.

Deep Cogito’s research and infrastructure partnerships include teams from Hugging Face, RunPod, Fireworks AI, Together AI, and Ollama. All released models are open source and available now.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleTrump to endorse coal for data center power in the face of grim market realities
Next Article First election forum draws packed house
Advanced AI Editor
  • Website

Related Posts

Software is 40% of security budgets as CISOs shift to AI defense

August 30, 2025

How Intuit killed the chatbot crutch – and built an agentic AI playbook you can copy

August 29, 2025

Forget data labeling: Tencent’s R-Zero shows how LLMs can train themselves

August 29, 2025
Leave A Reply

Latest Posts

Ohio Auction of Two Paintings Looted By Nazis Halted By Foundation

Lee Ufan Painting at Center of Bribery Investigation in Korea

Nicholas Galanin Pulls Out of Smithsonian Event, Claiming Censorship

Two More Staffers Fired from Kennedy Center after Trump Takeover

Latest Posts

AI fuels false claims after Charlie Kirk’s death, CBS News analysis reveals

September 13, 2025

Google is a ‘bad actor’ says People CEO, accusing the company of stealing content

September 12, 2025

ASML Partners With Mistral AI in Strategic €1.3 Billion Deal

September 12, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • AI fuels false claims after Charlie Kirk’s death, CBS News analysis reveals
  • Google is a ‘bad actor’ says People CEO, accusing the company of stealing content
  • ASML Partners With Mistral AI in Strategic €1.3 Billion Deal
  • Law Offices of Frank R. Cruz Encourages C3.ai, Inc. (AI) Investors To Inquire About Securities Fraud Class Action
  • Unlock model insights with log probability support for Amazon Bedrock Custom Model Import

Recent Comments

  1. CarltonSog on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  2. HarryMoF on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  3. Ghép thận trái phép on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  4. Bernard on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  5. Chrisdak on C3 AI and Arcfield Announce Partnership to Accelerate AI Capabilities to Serve U.S. Defense and Intelligence Communities

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

LinkedIn Instagram YouTube Threads X (Twitter)
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.