IBM’s vision for its large-scale fault-tolerant Starling quantum computer
IBM
IBM has just made a major announcement about its plans to achieve large-scale quantum fault tolerance before the end of this decade. Based on the company’s new quantum roadmap, by 2029 IBM expects to be able to run accurate quantum circuits with hundreds of logical qubits and hundreds of millions of gate operations. If all goes according to plan, this stands to be an accomplishment with sweeping effects across the quantum market — and potentially for computing as a whole.
In advance of this announcement, I received a private briefing from IBM and engaged in detailed correspondence with some of its quantum researchers for more context. (Note: IBM is an advisory client of my firm, Moor Insights & Strategy.) The release of the new roadmap offers a good opportunity to review what IBM has already accomplished in quantum, how it has adapted its technical approach to achieve large-scale fault tolerance and how it intends to implement the milestones of its revised roadmap across the next several years.
Let’s dig in.
Quantum Error Correction And Fault Tolerance
First, we need some background on why fault tolerance is so important. Today’s quantum computers have the potential, but not yet the broader capability, to solve complex problems beyond the reach of our most powerful classical supercomputers. The current generation of quantum computers is fundamentally limited by high error rates that are difficult to correct and that prevent complex quantum algorithms from running at scale. While quantum researchers around the world are tackling numerous challenges, there is broad agreement that these error rates are a major hurdle to be cleared.
In this context, it is important to understand the difference between fault tolerance and quantum error correction. QEC uses specialized measurements to detect errors in encoded qubits. And although it is also a core mechanism used in fault tolerance, QEC alone can only go so far. Without fault-tolerant circuit designs in place, errors that occur during operations or even in the correction process can spread and accumulate, making it exponentially more difficult for QEC on its own to maintain logical qubit integrity.
Reaching well beyond QEC, fault-tolerant quantum computing is a very large and complex engineering challenge that applies a broad approach to errors. FTQC not only protects individual computational qubits from errors, but also systemically prevents errors from spreading. It achieves this by employing clever fault-tolerant circuit designs, and by making use of a system’s noise threshold — that is, the maximum level of errors the system can handle and still function correctly. Achieving the reliability of FTQC also requires more qubits.
FTQC can potentially lower error rates much more efficiently than QEC alone. Once physical error rates fall below the system’s noise threshold, each further drop in the logical error rate requires only a small polynomial increase in the number of qubits and gates to achieve the desired level of accuracy for the overall computation. Despite its complexity, this makes fault tolerance an appealing and important method for improving quantum error rates.
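To make that scaling concrete, here is a minimal, illustrative sketch in Python. This is not IBM’s model: the threshold value, the prefactor and the ~2d² overhead formula are assumptions typical of surface-style codes, used only to show how logical error rates fall exponentially with code distance while qubit overhead grows only polynomially.

```python
# Illustrative sketch of threshold-theorem scaling (assumed parameters,
# not IBM figures): logical error rate falls exponentially with code
# distance d, while qubit overhead grows only polynomially (~2*d^2).

def logical_error_rate(p_phys: float, d: int, p_th: float = 1e-2) -> float:
    """Approximate logical error rate at code distance d (toy model)."""
    return 0.1 * (p_phys / p_th) ** ((d + 1) // 2)

def physical_qubits_per_logical(d: int) -> int:
    """Rough surface-style overhead: ~2*d^2 physical qubits per logical qubit."""
    return 2 * d * d

for d in (3, 5, 7, 9):
    print(d, physical_qubits_per_logical(d), logical_error_rate(1e-3, d))
```

The point of the sketch: doubling the distance roughly quadruples the qubit cost but suppresses the logical error rate by orders of magnitude, which is why sub-threshold operation makes fault tolerance economical.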
IBM’s Strategic Shift: Embracing Modularity And Novel Error Correction Methods
IBM’s first quantum roadmap, released in 2020
IBM
Research on fault tolerance goes back several decades. IBM began a serious effort to build a quantum computer in the late 1990s when it collaborated with several leading universities to build a two-qubit quantum computer capable of running a small quantum algorithm. Continuing fundamental research eventually led to the 2016 launch of the IBM Quantum Experience, featuring a five-qubit superconducting quantum computer accessible via the cloud.
IBM’s first quantum roadmap, released in 2020 (see the image above), detailed the availability of the company’s 27-qubit Falcon processor in 2019 and outlined plans for processors with a growing number of qubits in each of the subsequent years. The roadmap concluded with the projected development in 2023 of a research-focused processor, the 1,121-qubit Condor, that was never made available to the public.
However, as IBM continued to scale its qubit counts and explore error correction and error mitigation, it became clear to its researchers that monolithic processors were insufficient to achieve the long-term goal of fault-tolerant quantum computing. To achieve fault tolerance in the context of quantum low-density parity-check (much more on qLDPC below), IBM knew it had to overcome three major issues:
It had to scale qubits beyond existing surface code limitations.
It needed to develop its own fault-tolerant framework.
The entire quantum system had to be reengineered for fault tolerance — not just qubits, but everything, including gates, qubit connectivity, operations, measurements, control electronics and quantum memory.
This helps explain why fault tolerance is such a large and complex endeavor, and why monolithic processors were not enough. Achieving all of this would require that modularity be designed into the system.
Shift To The New Architecture
IBM’s shift to a modular architecture first appeared in its 2022 roadmap, which introduced two multi-chip processors planned for 2024: Crossbill and Flamingo. Crossbill was a 408-qubit processor that demonstrated the first application of short-range coupling, while Flamingo was a 1,386-qubit quantum processor that was the first to use long-range coupling.
For more background on couplers, I previously wrote a detailed Forbes.com article explaining why IBM needed modular processors and tunable couplers. Couplers play an important role in IBM’s current and future fault-tolerant quantum computers. They allow qubits to be logically scaled but without the difficulty, expense and additional time required to fabricate larger chips. Couplers also provide architectural and design flexibility. Short-range couplers provide chip-to-chip parallelization by extending IBM’s heavy-hex lattice across multiple chips, while long-range couplers use cables to connect modules so that quantum information can be shared between processors.
A year later, in 2023, IBM scientists made an important breakthrough by developing a more reliable way to store quantum information using qLDPC codes. These are also called bivariate bicycle codes, and you’ll also hear them referred to as the gross code because they encode 12 logical qubits into a gross (144) of physical code qubits plus 144 ancilla qubits, for a total of 288 physical qubits devoted to error correction.
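The accounting behind those numbers is simple enough to verify. A quick sketch using only the figures quoted above:

```python
# Back-of-the-envelope accounting for the gross code, using the figures
# quoted in the text: 12 logical qubits per code block of 144 data
# ("a gross") plus 144 ancilla physical qubits.
logical = 12
data_qubits = 144        # one "gross"
ancilla_qubits = 144
physical = data_qubits + ancilla_qubits

print(physical)            # total physical qubits per code block
print(physical / logical)  # physical qubits per logical qubit
```

That works out to 288 physical qubits per block, or 24 physical qubits per logical qubit, the efficiency that the surface-code comparison below turns on.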
Previously, the surface code was the go-to error-correction code for superconducting qubits because it tolerates high physical error rates, scales predictably, requires only nearest-neighbor connectivity and protects qubits against both bit-flip and phase-flip errors. IBM has verified that qLDPC performs error correction just as effectively and efficiently as the surface code, but with one significant advantage: it needs only about one-tenth as many qubits. (More details on that below.)
Quantum Today And Tomorrow
This brings us to today’s state of the art for IBM quantum. Currently, IBM has a fleet of quantum computers available over the cloud and at client sites, many of which are equipped with 156-qubit Heron processors. According to IBM, Heron has the highest performance of any IBM quantum processor. Heron is currently being used in the IBM Quantum System Two and it is available in other systems as well.
IBM 2025 quantum innovation roadmap, showing developments from 2016 to 2033 and beyond
IBM
IBM’s new quantum roadmap shows several major developments on the horizon. (You can read the IBM team’s explanation and see a full version of the roadmap in this IBM blog post.) After so many years of research and experimentation across the industry, IBM expects in 2029 to be the first organization to deliver what has long been the elusive goal of quantum computing: a fault-tolerant quantum computer. By 2033, IBM also believes it will be able to build a quantum-centric supercomputer that runs thousands of logical qubits and a billion or so gates.
Before we go further into specifics about the milestones that IBM projects for this new roadmap, let’s dig a little deeper into the technical breakthroughs enabling this work.
Qubit Count: Surface Code Versus Gross Code
As mentioned earlier, one key breakthrough IBM has made comes in its use of gross code (qLDPC) for error correction, which is much more efficient than surface code.
Comparison of surface code versus qLDPC error rates
S. Bravyi et al., Nature 627 (2024); arXiv:2308.07915
The above chart shows the qLDPC physical and logical error rates (diamonds) compared to two different surface code error rates (stars). The qLDPC code uses a total of 288 physical qubits (144 physical code qubits and 144 check qubits) to create 12 logical qubits (red diamond). As illustrated in the chart, one instance of surface code requires 2,892 physical qubits to create 12 logical qubits (green star) and the other version of surface code requires 4,044 physical qubits to create 12 logical qubits (blue star). It can be easily seen that qLDPC code uses far fewer qubits than surface code yet produces a comparable error rate.
Running Gates With The Gross Code And LPUs
Connectivity between the gross code and the LPU
IBM
Producing a large number of logical and physical qubits with low error rates is impressive; indeed, as explained earlier, large numbers of physical qubits with low error rates are necessary to encode and scale logical qubits. But what really matters is the ability to successfully run gates. Gates are necessary to manipulate qubits and create superpositions, entanglement and operational sequences for quantum algorithms. So, let’s take a closer look at that technology.
Running gates with qLDPC codes requires an additional set of physical qubits known as a logical processing unit. The LPU has approximately 100 physical qubits and adds roughly 35% ancilla overhead per logical qubit to the overall code. (If you’re curious, a similar low-to-moderate qubit overhead would also be required for the surface code to run gates.) LPUs are physically attached to qLDPC quantum memory (the gross code) so that the encoded information can be monitored. LPUs can also be used to stabilize logical operations such as Clifford gates (explained below), state preparations and measurements. It is worth mentioning that the LPU itself is fault-tolerant, so it can continue to operate reliably even with component failures or errors.
IBM already understands the detailed connectivity required between the LPU and gross code. For simplification, the drawing of the gross code on the left above has been transformed into a symbolic torus (doughnut) in the drawing on the right; that torus has 12 logical qubits consisting of approximately 288 physical qubits, accompanied by the LPU. (As you look at the drawings, remember that “gross code” and “bivariate bicycle code” are two terms for the same thing.) The drawing on the right appears repeatedly in the diagrams below, and it will likely appear in future IBM documents and discussions about fault tolerance.
The narrow rectangle at the top of the right-hand configuration is called a “bridge” in IBM research papers. Its function is to couple one unit to a neighboring unit with “L-couplers.” It makes the circuits inside the LPU fault-tolerant, and it acts as a natural connecting point between modules. These long-distance couplers, about a meter in length, are used for Bell pair generation, a method that allows logical qubits to be entangled.
So what happens when several of these units are coupled together?
Fault-Tolerant Architecture, Clifford And Non-Clifford Gates And Magic State Factories
IBM fault-tolerant quantum architecture
IBM
Above is a generalized configuration of IBM’s future fault-tolerant architecture. As mentioned earlier, each torus contains 12 logical qubits created by the gross code from approximately 288 physical qubits. So, for instance, if a quantum computer were designed to run 96 logical qubits, it would be equipped with eight torus code blocks (8 x 12 = 96), which would require a total of approximately 2,304 physical qubits (8 x 288) plus eight LPUs.
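That block-level arithmetic generalizes readily. Here is a short sketch; the ~100-qubit LPU figure comes from the earlier section, and treating it as a flat per-block cost is my simplification, not an IBM specification:

```python
# Sketch of the block arithmetic described above: logical qubits come in
# units of 12 (one gross-code block of ~288 physical qubits), and each
# block is paired with an LPU of roughly 100 physical qubits.
import math

def fault_tolerant_budget(logical_qubits: int) -> dict:
    """Estimate code blocks and physical qubits for a target logical count."""
    blocks = math.ceil(logical_qubits / 12)
    return {
        "code_blocks": blocks,
        "code_qubits": blocks * 288,
        "lpu_qubits_approx": blocks * 100,
    }

print(fault_tolerant_budget(96))
# For 96 logical qubits: 8 blocks, 2,304 code qubits, plus ~800 LPU qubits
```

Plugging in roughly 200 logical qubits reproduces the ~17 gross-code blocks cited for Starling later in the article.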
Two special categories of quantum operations are needed for quantum computers to run all the necessary algorithms and perform error correction: Clifford gates and non-Clifford gates. Clifford gates — named after the 19th-century British mathematician William Clifford — propagate errors in a predictable, trackable way, which is what allows error-correction codes to fix mistakes. Because they limit the spread of errors, Clifford gates are well-suited for FTQC; reliability is critical for practical fault-tolerant quantum systems, so running Clifford gates helps ensure accurate computations. The other necessary category is non-Clifford gates (particularly T-gates).
A quantum computer needs both categories of gates so it can perform universal tasks such as chemistry simulations, factoring large numbers and other complex algorithms. However, there is a trick to using both of these operations together. Even though T-gates are important, they also break the symmetry that Clifford gates need for error correction. That’s where the “magic state factory” comes in. It implements non-Clifford operations (T-gates) by consuming a stream of so-called magic states alongside Clifford gates. In that way, the quantum computer can maintain both its computational power and its fault tolerance.
IBM’s research has proven it can run fault-tolerant logic within the stabilizer (Clifford) framework. However, without the extra non‑Clifford gates, a quantum computer would not be able to execute the full spectrum of quantum algorithms.
IBM fault-tolerant quantum roadmap
IBM
Now let’s take a closer look at the specific milestones in IBM’s new roadmap that will take advantage of the breakthroughs explained above, and how the company plans to create a large-scale fault-tolerant quantum computer within this decade.
2025: Loon Processor
IBM expects to begin fabricating and testing the Loon processor sometime this year. The Loon will use two logical qubits and approximately 100 physical qubits. Although the Loon will not use the gross code, it will be using a smaller code with similar hardware requirements.
IBM has drawn on its past four-way coupler research to develop and test a six-way coupler using a central qubit connected through tunable couplers to six neighboring qubits, a setup that demonstrates low crosstalk and high fidelity between connections. IBM also intends to demonstrate the use of “c-couplers” to connect Loon qubits to non-local qubits. Couplers up to 16 mm in length have been tested, with a goal of increasing that length to 20 mm. Longer couplers allow connections to be made across more areas of the chip. So far, the longer couplers have also maintained low error rates and acceptable coherence times — in the range of several hundred microseconds.
2026: Kookaburra Processor
In this phase of the roadmap, IBM plans to test one full unit of the gross code, long c-couplers and real-time decoding of the gross code. IBM also plans a demonstration of quantum advantage in 2026 via the Heron (a.k.a. Nighthawk) platform with HPC.
2027: Cockatoo Processor
The Cockatoo design employs two blocks of gross code connected to LPUs to create 24 logical qubits using approximately 288 physical qubits. In this year, IBM aims to test L-couplers and module-to-module communications capability. IBM also plans to test Clifford gates between the two code blocks, giving it the ability to perform computations, but not yet universal computations.
2028: Starling Processor
A year later, the Starling processor should be equipped with approximately 200 logical qubits. Required components, including magic state distillation, will be tested. Although only two blocks of gross code are shown in the illustrative diagram above, the Starling will in fact require about 17 blocks of gross code, with each block connected to an LPU.
The estimated size of IBM’s 2029 large-scale fault-tolerant Starling quantum computer in a …
IBM
2029: Large-Scale Fault-Tolerant Starling Processor
This is the year IBM plans to deliver the industry’s first large‑scale, fault‑tolerant quantum computer — equipped with approximately 200 logical qubits and able to execute 100 million gate operations. A processor of this size will have approximately 17 gross code blocks equipped with LPUs and magic state distillation.
2033 And Beyond: Blue Jay Processor
IBM expects that quantum computers during this period will run billions of gates on several thousand circuits to demonstrate the full power and potential of quantum computing.
The Path To Large-Scale Fault Tolerance Is Difficult But Manageable
IBM milestones in its roadmap for large-scale, fault-tolerant quantum computers
Table: Moor Insights & Strategy; Data: IBM
Although there have been a number of significant quantum computing advancements in recent years, building practical, fault-tolerant quantum systems has been — and still remains — a significant challenge. Up until now, this has largely been due to a lack of a suitable method for error correction. Traditional methods such as surface code have important benefits, but limitations, too. Surface code, for instance, is still not a practical solution because of the large numbers of qubits required to scale it.
IBM has overcome surface code’s scaling limitation through the development of its qLDPC codes, which require only a tenth of the physical qubits needed by surface code. The qLDPC approach has allowed IBM to develop a workable architecture for a near-term, fully fault-tolerant quantum computer. IBM has also achieved other important milestones such as creating additional layers in existing chips to allow qubit connections to be made on different chip planes. Tests have shown that gates using the new layers are able to maintain high quality and low error rates in the range of existing devices.
Still, there are a few areas in need of improvement. Existing error rates are around 3×10^-3, which must improve to accommodate advanced applications. IBM is also working on extending coherence times. Using isolated test devices, IBM has measured coherence times of between one and two milliseconds, and up to four milliseconds in some cases. Since it appears to me that future utility-scale algorithms and magic state factories will need between 50,000 and 100,000 gates between resets, further improvement in coherence may be required.
As stated earlier, IBM’s core strategy relies on modularity and scalability. The incremental improvement of its processors through the years has allowed IBM to progressively develop and test its designs, steadily increasing the number of logical qubits and quantum operations — and, ultimately, expanding quantum computing’s practical utility. Without IBM’s extensive prior research and its development of qLDPC for error correction, estimating IBM’s chance of success would largely be guesswork. With it, IBM’s plan to release a large-scale fault-tolerant quantum computer in 2029 looks aggressive but achievable.