Cerebras Systems’ dinner-plate-sized chips currently power the latest AI inference offerings from Meta and, soon, those of IBM, but US trade policy weighs heavy on its prospects worldwide.
“The [AI] diffusion rule is bad policy,” CEO Andrew Feldman said during a press conference ahead of IBM’s annual Think conference, which kicked off on Tuesday.
These and other AI-related rules, set to go into effect later this month unless the Trump administration intervenes, were put forward in the final hours of former President Joe Biden's term and seek to limit the sale of American GPUs and AI accelerators outside the US and a select few allies.
One of the goals of the policy is to prevent China and other nations barred from buying American accelerators directly from getting around those bans by routing purchases through countries where the chips remain legal.
Yet while some AI startups have called for even tougher controls on AI exports, US chip companies aren't fans. Last week, Nvidia CEO Jensen Huang, whose GPU empire stands to lose the most from the rules, called on the Trump administration to revise them.
Essentially, the diffusion rule would impose strict export controls and offer only a small annual allocation of chips to every country except 18 favored nations considered allies of the US. Chipmakers would prefer the government not dictate exactly how many items they can sell to whom, as that caps their total addressable market. In the past, the government has handled security concerns by adding specific organizations to the US Entity List, which requires vendors to obtain special licenses before doing business with them.
Feldman is no fan of Nvidia, and previously accused the company of “arming” China by continuing to build and sell sanctions-compliant accelerators for the Middle Kingdom. But here, he agrees the diffusion rule is too strict. And, he points out, so do a lot of other big players in tech.
“You know how hard it is to get me and Nvidia and Oracle and Google and Amazon and Microsoft to agree on something. It’s really hard to get an entire industry of competitors to agree on something and that policy was not good policy,” he said.
“What we want is US exports, including technology exports in the hands of our allies, and we want to support US companies while they do that. We want reasonable safeguards to make sure that the equipment doesn’t end up in China or being used by the Chinese. We want real penalties if companies turn a blind eye to their equipment being used by Chinese or other adversaries,” Feldman added. “I don’t think those goals were achieved by the diffusion rules, and I think we can do better.”
Feldman acknowledged the difficulty faced by the US Commerce Department in crafting trade policy that protects American interests without infringing on a US company’s ability to do business worldwide.
“I’m hoping we roll back to more thoughtful policy, not to no policy,” he said.
Feldman is less concerned about the impact of tariffs on supply chains. This is partly because Cerebras' dinner-plate-sized chips are fabbed at TSMC and, critically, Feldman says, don't rely on Chinese components.
“I will say that uncertainty and surprises are challenging, and those are hard on everybody’s supply chain,” he said. “You buy sub-assemblies, you buy things from other people, and someone’s got to figure out whether this bit was made in Mexico or Malaysia or in Austin, right? And that takes an enormous amount of time and effort.
“[G]ive us some heads up, and at least we’re not running around like chickens without our heads, trying to figure out what to charge,” he said, adding that the added cost ultimately gets passed along to the end customer.
Higher component costs are potentially problematic for Cerebras going forward, as the company's business model, at least for inference, already relies on customers paying a premium for its generation-speed advantage. The folks at Artificial Analysis have clocked Cerebras running Llama 4 Scout at over 2,600 tokens a second, well ahead of GPU-based API providers like Fireworks or Together.ai, which top out at around 130 tokens per second. But while miles faster than the competition, Cerebras is also among the most expensive.
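To put those throughput figures in perspective, here's a quick back-of-the-envelope sketch. The 10,000-token workload is a made-up example; only the tokens-per-second numbers come from the Artificial Analysis measurements quoted above:

```python
# Back-of-the-envelope comparison using the throughput figures quoted above.
# The 10,000-token workload is an illustrative assumption, not a benchmark.

CEREBRAS_TPS = 2_600   # Artificial Analysis figure for Llama 4 Scout on Cerebras
GPU_API_TPS = 130      # approximate ceiling quoted for GPU-based API providers

output_tokens = 10_000  # hypothetical long-running or agentic workload

for name, tps in [("Cerebras", CEREBRAS_TPS), ("GPU-based API", GPU_API_TPS)]:
    print(f"{name}: {output_tokens / tps:.1f} s to generate {output_tokens:,} tokens")

# Cerebras:      3.8 s
# GPU-based API: 76.9 s
```

At a roughly 20x throughput gap, a response that takes well over a minute elsewhere lands in a few seconds, which is the premium customers are paying for.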
However, that’s a price many are willing to pay, as we saw with IBM’s tie-up with Cerebras on Tuesday. Big Blue’s watsonx AI Gateway is set to run, at least in part, on Cerebras’ wafer-scale accelerators.
But before you get too carried away, IBM isn’t deploying clusters of Cerebras CS-3s. The agreement closely mirrors Big Blue’s collaboration with Hugging Face: IBM provides a common API interface and billing platform, while the actual inference workloads run in Cerebras datacenters.
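IBM hasn't detailed the gateway's interface here, but the pattern described, one API surface fronting multiple inference backends, typically looks something like the sketch below from the client side. The endpoint URL and model name are placeholders, not real watsonx or Cerebras identifiers:

```python
# Illustrative only: calling a gateway that exposes one OpenAI-compatible API
# while routing inference to different backends. The base_url and model name
# are hypothetical placeholders, not IBM's actual watsonx AI Gateway values.
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/v1",  # hypothetical gateway endpoint
    api_key="YOUR_KEY",
)

resp = client.chat.completions.create(
    model="llama-4-scout",  # the gateway maps this to a backend, e.g. Cerebras
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```

The appeal of the arrangement is that the client never needs to know whose silicon served the request; the gateway operator handles routing and metering.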
Today’s announcement comes a week after Cerebras notched another win, when Meta confirmed at least a portion of its new Llama API service would run on Cerebras hardware as well.
These wins serve to diversify the company’s customer base. Until recently, it was extremely dependent on United Arab Emirates-based AI cloud provider G42, which accounted for 87 percent of its revenues in the first half of 2024.
The UAE is one of several nations in the Middle East where shipments of American-designed AI accelerators remain heavily restricted. G42 has been able to sidestep many of these challenges by financing the construction of several Cerebras AI supercomputers in the US.
And despite rumors that the Trump administration may loosen restrictions on AI accelerator shipments to the UAE, the country would still be subject to the diffusion rule's compute limits unless the administration moves to scrap them entirely. That's one likely reason why G42 has reportedly