Advanced AI News

Deep Cogito releases open-source language models that outperform Llama

By Advanced AI Editor, April 19, 2025


Startup Deep Cogito Inc. launched today with a series of language models that it claims can outperform comparably sized open-source alternatives. 

According to TechCrunch, the company was founded last June by former Google LLC staffers Drishan Arora and Dhruv Malhotra. Arora worked as a senior software engineer at the search giant. Malhotra, in turn, was a product manager at the Google DeepMind machine learning lab. The duo have raised an undisclosed amount of funding from South Park Commons.

Deep Cogito’s lineup of open-source language models is known as the Cogito v1 series. The algorithms are available in five sizes ranging from 3 billion to 70 billion parameters. They’re based on the open-source Llama and Qwen language model families, which are developed by Meta Platforms Inc. and Alibaba Group Holding Ltd., respectively.

Deep Cogito’s models use a hybrid architecture. They combine elements of standard large language models, which answer simple prompts near-instantaneously, and reasoning models. Algorithms in the latter category spend more time generating an answer, which increases their output quality. Deep Cogito’s models can either respond to prompts instantly or perform more extensive reasoning, depending on user preferences.

The company customized its models using a new training method it calls IDA. The technique shares some similarities with distillation, a widely used method of developing hardware-efficient language models.

With distillation, developers send a collection of prompts to a hardware-intensive LLM and save the answers. They then train a more efficient model on those answers. This latter model thereby absorbs some of the larger LLM’s knowledge, which means it can answer the same questions using less hardware.
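The two distillation steps described above can be illustrated with a toy sketch. Nothing here comes from Deep Cogito or any real LLM stack: the `teacher` function is a hypothetical stand-in for the hardware-intensive model, and the "student" is just a two-parameter linear model fitted to the teacher's saved answers. All names and hyperparameters are assumptions chosen for illustration.

```python
# Toy distillation sketch (illustrative only, not a real LLM pipeline).

def teacher(x: float) -> float:
    """Hypothetical stand-in for a hardware-intensive model."""
    return 3.0 * x + 1.0

def collect_answers(prompts):
    """Step 1: send prompts to the teacher and save its answers."""
    return [(x, teacher(x)) for x in prompts]

def train_student(pairs, lr=0.05, epochs=2000):
    """Step 2: fit a small model (w, b) to the saved answers
    with plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

prompts = [0.0, 1.0, 2.0, 3.0, 4.0]
w, b = train_student(collect_answers(prompts))

# The student now answers a new "prompt" without calling the teacher.
student_answer = w * 2.5 + b
```

In this sketch the student never sees the teacher's internals, only its outputs, which is the property the paragraph above describes.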

Deep Cogito’s IDA method likewise uses an LLM’s prompt answers for training purposes. The difference is that those answers aren’t used to improve a different, more hardware-efficient model but rather the LLM that generated the answers. 

Deep Cogito researchers explained in a blog post today that the IDA workflow involves two steps.

First, an LLM generates an answer to a prompt using methods “similar” to the ones that reasoning models rely on to process data. Those methods increase the amount of time the LLM requires to produce output. Once the prompt response is ready, the LLM distills “the higher intelligence back to the model’s parameters to internalize the amplified capability,” the researchers explained.

“By repeating these two steps, each cycle builds upon the progress of the previous iteration,” they elaborated in the blog post. “This iterative framework creates a positive feedback loop.”
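The amplify-then-distill loop can be sketched with a deliberately tiny analogy. This is not Deep Cogito's implementation, which the article does not detail. It assumes a "model" that consists of a single number `theta`, its instant answer to the question "what is the square root of `target`?". Amplification spends extra compute by refining that answer with Newton iterations, and distillation moves `theta` toward the refined answer. The function names, the update rate, and the choice of task are all hypothetical.

```python
import math

def amplify(theta: float, target: float, steps: int = 2) -> float:
    """Spend extra compute: refine the model's instant answer (theta)
    with a few Newton iterations for sqrt(target)."""
    answer = theta
    for _ in range(steps):
        answer = 0.5 * (answer + target / answer)
    return answer

def distill(theta: float, amplified: float, rate: float = 0.5) -> float:
    """Move the model's own parameter toward the amplified answer."""
    return theta + rate * (amplified - theta)

def run_cycles(theta: float, target: float, cycles: int = 10):
    """Repeat amplify and distill; each cycle starts from the
    parameter produced by the previous one."""
    history = [theta]
    for _ in range(cycles):
        theta = distill(theta, amplify(theta, target))
        history.append(theta)
    return history

history = run_cycles(theta=1.0, target=2.0)
errors = [abs(h - math.sqrt(2.0)) for h in history]
```

In this toy, the error of the instant answer shrinks with every cycle, mirroring the feedback loop the researchers describe: the same model that produced the amplified answer is the one that gets updated.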

In an internal test, Deep Cogito compared its most advanced model with Meta’s Llama 3.3. Both algorithms feature 70 billion parameters. Deep Cogito says that its model outperformed Llama 3.3 across all seven of the benchmarks that were used in the evaluation. 

The startup claims that its smaller models likewise outperform comparably sized open-source alternatives. The algorithms feature 3 billion, 8 billion, 14 billion and 32 billion parameters, respectively. Deep Cogito plans to release new models over the next few weeks that will feature 109 billion to 671 billion parameters.

