😺 IBM Just Beat Models 12x Its Size

Your browser does not support the audio element.

OpenAI is now officially the most valuable private company in the world at $500B (SpaceX is #2 at $400B) after it completed a recent employee stock sale, where the company authorized employees to sell $10B in shares, but only $6.6B was ultimately sold.

Of course, this momentary blip at the top will only last until Elon Musk goes out and raises another round at SpaceX… out of spite.

Here’s what happened in AI today:

IBM released Granite 4.0 open-source model running on 70% less memory.

China mandated AI education for 200+ million children.

MIT launched America’s most powerful university AI supercomputer.

AI attracted $192B+ in venture capital, capturing 64% of global deal value.

P.S: Corey and Grant got access to Sora 2 yesterday, so we hopped on the YouTube channel to record a hands on-demo showing you everything we’ve learned about how it works. We then turned that demo in a blog; it even comes with a Sora 2 meta prompt creator prompt you copy + paste to test out!

If you aren’t subscribed to our YouTube channel yet, and you’d like an invite code, Grant has 3 left to share, so go subscribe right now and we’ll pick a random new sub to give it to! Then you give 4 invites to your friends, and they each get 4 invites, etc etc.

We’re hearing rumors and whispers (and seeing potential glimpses?) that next week is going to be a big week for Google. So it’s probably a good time to check in on the latest in open source:

This week, IBM released Granite 4.0, a powerful open language model you can run on significantly cheaper GPUs (NVIDIA’s AI chips) w/ 70% less memory needed.

Here’s the deal: Most AI models get slower and more expensive as you throw more work at them. IBM’s solution? It built a hybrid architecture mixing two completely different AI designs: transformers (the tech behind ChatGPT) and Mamba (a newer, more efficient approach). For more of the technical details, read this.

Think of transformers like reading an entire book at once to understand context, while Mamba reads page by page. Combining both gives you the best of speed and smarts.

Here’s how that worked out for them:

Tiny but mighty: Even the smallest Granite 4.0 model (3B parameters, which = AI’s brain size) beats IBM’s previous 8B model. The efficiency gains are insane.

Linear scaling: While normal AI models slow down and get expensive with longer inputs, Granite 4.0 actually speeds up as you throw more at it.

First ISO 42001 certified open model: IBM’s the only company with internationally certified AI governance standards, plus they’re offering a $100K bug bounty for security researchers.

Oh, and on Stanford’s instruction-following benchmark, Granite 4.0 beats nearly every open-source model except Meta’s Llama 4 Maverick, which is 12x larger. That’s the efficiency advantage in action.

VentureBeat says engineers are already calling it “Western Qwen” after the very popular Alibaba Qwen models. IBM’s stepping into the open source vacuum left by Meta with models that are open-source (Apache 2.0 license), enterprise-ready, and certified for security, making them appealing to Western companies wary of Chinese AI.

Here’s your options for how to use it:

Small (32B/9B active) handles complex multi-agent workflows (10-20 GB VRAM minimum, you’ll wanna use these “quantized” versions)

Tiny (7B/1B active) works for faster edge applications and function calling (this is the one you want for your computer, you need 5GB VRAM memory minimum, but more = better).

H-Micro (3B) is the smallest hybrid model for local/edge use, and regular Micro (3B) is the fallback if your platform doesn’t support the new hybrid architecture yet.

Basically, pick one of the above based on how much computing power you have and whether you need heavy reasoning vs. speed. Here it is on HuggingFace, Docker, Replicate and LM Studio, which is our pick for ease of use.

A startup was on the rise—steady growth, happy customers, investors circling. Then a big prospect asked for SOC 2.

They signed up with one of the big platforms. Suddenly: endless tasks, confusing checklists, engineers pulled off product. Deals stalled. Competitors moved in.

15 hours later, they were audit-ready and closed the deal.

The first startup never recovered. One by one, their deals slipped away. Keys handed over. Game over.

Delve automates compliance—SOC 2, ISO 27001, HIPAA, GDPR, PCI-DSS and more—so you’re ready in days, not months.

Today, compliance is done in Delve.

Build a master doc with everything: Dwarkesh keeps one 20K-word Google Doc with all his work context—problem logs, meeting notes, email templates, common prompts. He pastes it at the start of relevant AI sessions. LLMs can instantly digest hundreds of pages of your company knowledge before answering. Humans can’t do that.

Use AI as a Socratic tutor, not a lecturer: Instead of “explain X,” try: “Act as a Socratic tutor. Ask me questions that help me understand this concept. Don’t move on until I’ve proven I get it.” Research by Benjamin Bloom found that one-on-one tutoring beats classroom learning by two standard deviations. We can finally access expert tutors across every field at a moment’s notice—but only if we prompt them right (or use the “study and learn” mode).

Don’t wait for your org’s AI tools: They’re slow and often outdated. Instead, experiment yourself. Most AI tools are free or $20/month (though the best tiers cost ~$200, and the level of quality you get is often much higher, and if you use the API, you can pay per use via a tool like OpenRouter; here’s how).

Our favorite insight: LLMs can’t learn on the job over months like humans, but they can instantly absorb your entire institutional knowledge before every single response… a superpower even humans don’t have.

Friends of the newsletter (and excellent AI podcast) AI for Humans just released a hilarious launch video for an actually real product called “And Then” that offers voice-controlled games where you talk to AI characters convince a dockmaster to let you dock, debate someone who thinks Earth is flat, or solve cozy mysteries. At a minimum, you gotta watch the video lmao…

Julius analyzes your spreadsheets and databases when you ask questions in plain English (like “predict customer churn from purchase history”), creating instant charts and automated reports sent to your Slack or email.

Graphite speeds up your GitHub workflow by letting you stack pull requests (work on new code before old PRs merge), automatically reviews your code for bugs, and manages all your PRs in one inbox.

Linear organizes your product development by tracking issues, planning sprints, and managing roadmaps—file a bug, assign it to Sprint 12, and watch your roadmap progress from idea to shipped code.

turbopuffer searches billions of documents at 10x cheaper cost using S3 storage instead of expensive vector databases.

How To Solve It With Code teaches you to build real projects with AI by breaking problems into small pieces you actually understand, so you can modify and extend your code instead of hitting walls when AI-generated code needs changes.

Perplexity acquired Visual Electric’s team to expand beyond search into creative AI, and released its Comet AI browser free for all users (previously $200/month) featuring autonomous task capabilities across 800+ apps.

Google’s Jules, Google’s coding agent, now comes with tools and works directly in your terminal, so you can assign tasks like “write unit tests” or “fix bugs” directly from your terminal and script it into your existing workflows.

AI dominated venture capital investing in 2025, attracting $192B+ globally with 64% of global VC deal value going to AI in Q3.

Also, AI startup valuations reached unprecedented levels with companies like OpenAI valued at nearly $500B, fueling AI bubble concerns.

MIT Lincoln Laboratory launched TX-GAIN, America’s most powerful university AI supercomputer, delivering two AI exaflops using 600+ NVIDIA GPUs while reducing energy by 80%.

China’s mandatory AI education for 200+ million children starting in 2025 is systematically creating “AI natives” from age 6 onward, potentially giving China an insurmountable advantage in future global AI development.

Check out this week’s Intelligent Insights roundup: we cover Karpathy’s Artificial “Ghost” Intelligence take, new AI security disasters, the bull and bear case for the AI bubble why your job isn’t going anywhere (but it’s definitely changing).

You don’t have to stay stuck.

Stop doing it all. Start leading again. Access the next phase of growth with BELAY’s free resource, Delegate to Elevate.

Source link

What's Hot

MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs – Takara TLDR

Huawei Ascend Roadmap Could Challenge Nvidia AI Leadership

No More Pikachu Oppenheimer? OpenAI Promises Rightsholders More Control Over Sora Creations

😺 IBM just beat models 12x its size

Stocks to Gain From Quantum Computing in 2025: MSFT, IBM, QBTS, IONQ – October 2, 2025

Is IBM’s Quantum Leap Driving Shares Too High in 2025?

IBM’s Granite 4.0 family of hybrid models uses much less memory during inference

Record Exec and Art Collector Gets Over 4 Years

Chicago’s Art Scene Offers a Beacon of Hope for Artists and Dealers

Pace to Close Hong Kong Gallery at H Queen’s This Month

Taylor Swift’s ‘Fate of Ophelia’ Has a Lot in Common with This Artwork