Advanced AI News
Alibaba Cloud (Qwen)

Down and out with Cerebras Code

By Advanced AI Editor | September 17, 2025 | 3 Mins Read


Out of Fireworks and into the fire

However, my start with Cerebras’s hosted Qwen was nothing like what I had experienced (for a lot more money) on Fireworks, another provider. Initially, Cerebras’s Qwen didn’t work in my CLI at all. It also didn’t seem to work in Roo Code or any other tool I knew how to use. After taking a bug report, Cerebras told me the problem was my code. The same CLI that worked on Fireworks, with Claude, with GPT-4.1 and GPT-5, with o3, and with Qwen hosted by Qwen/Alibaba was at fault, said Cerebras. To be fair, my log did include misleading artifacts from when Cerebras fragmented the stream and emitted stream parts as separate messages (which Cerebras still does on occasion). Still, this has generally been their approach: don’t fix their so-called OpenAI compatibility; blame the client, or make the client adapt. I took the challenge and adapted my CLI, but it took a lot of workarounds.

This was a massive contrast with Fireworks. I had issues with Fireworks when it started and showed them my debug output; they immediately acknowledged the problem (occasionally it would spit out corrupt, native tool calls instead of OpenAI-style output) and fixed it overnight. Cerebras repeatedly claimed their infrastructure was working perfectly and that all requests were successful, in direct contradiction to most commentary on their Discord.

Feeling like I had finally cracked the nut after three weeks of on-and-off testing and adapting, I grabbed a second Cerebras Code Max account when the window opened again. This was after discovering that, for part of the time, Cerebras had charged me for a Max account but given me a Pro account. They fixed it but offered no compensation for the days my service was set to Pro instead of Max. The shortfall is difficult to prove because their analytics console is broken, in part because it reports measurements in local time while the limits are in UTC.
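To see why mixing local time and UTC makes usage hard to check, here is a minimal Python sketch that re-buckets usage records by UTC calendar day. The record format, the token counts, and the function name `tokens_per_utc_day` are all assumptions made for illustration; nothing here reflects Cerebras's actual console or API.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta, timezone


def tokens_per_utc_day(records):
    """Sum token counts per UTC calendar day.

    `records` is an iterable of (timestamp, tokens) pairs, where each
    timestamp is a timezone-aware datetime in any local zone.
    """
    totals = defaultdict(int)
    for stamp, tokens in records:
        if stamp.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
        # Convert to UTC before taking the date, so the bucket matches
        # a limit that resets on UTC day boundaries.
        utc_day = stamp.astimezone(timezone.utc).date()
        totals[utc_day] += tokens
    return dict(totals)


# Hypothetical usage log in a UTC-5 local zone (made-up numbers).
local = timezone(timedelta(hours=-5))
records = [
    (datetime(2025, 9, 16, 18, 0, tzinfo=local), 1_000_000),
    (datetime(2025, 9, 16, 20, 30, tzinfo=local), 2_000_000),
]
totals = tokens_per_utc_day(records)
for day, used in sorted(totals.items()):
    print(day, used)
```

Both records fall on the same local day, yet the second one lands on the next UTC day, so a console that charts by local time will not line up with a limit counted in UTC.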

Then I did the math. One Cerebras Code Max account is limited to 120 million tokens per day and costs four times as much as a Cerebras Code Pro account. The Pro account is limited to 24 million tokens per day; multiply that by four and you get 96 million tokens. However, the Pro account is limited to 300k tokens per minute (TPM), compared to 400k for the Max. Using Cerebras is a bit frustrating. For 10 to 20 seconds, it really flies; then you hit the cap on tokens per minute, and it throws 429 errors (too many requests) until the minute is up. If your coding tool is smart, it will just retry with an exponential back-off. If not, it will break the stream. So, had I bought four Pro accounts, I could in theory have had 4 × 300k = 1,200,000 TPM, three times the Max account’s 400k, for the same price. Even with the lower daily total (96 million versus 120 million tokens), that is a much better value than the Max account.
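For readers unfamiliar with the retry pattern mentioned above, here is a minimal Python sketch of exponential back-off with a little jitter. It is not tied to any real client library: `RateLimited`, `call_with_backoff`, and the `send` callable are hypothetical names, and the sketch assumes the caller's request function raises `RateLimited` whenever the server answers with HTTP 429.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's request function on HTTP 429."""


def call_with_backoff(send, max_attempts=6, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call `send()` and retry on RateLimited with exponential back-off.

    The wait doubles after each failure (1s, 2s, 4s, ...) and is capped at
    `max_delay`, since a per-minute limit should clear within a minute.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

A tool that wraps each request this way rides out the rest of the minute instead of breaking the stream on the first 429. The `sleep` parameter exists only so the waiting can be swapped out in tests.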


