
(Source: VesnaArt/Shutterstock)
By now, you’ve probably heard of robotic surgery, where a surgeon sits at a console and guides long-armed instruments that enter the body through tiny pre-made incisions. The surgeon’s hand movements are converted into steady, miniature actions while a 3D camera shows the scene in real time. In this scenario the robot never acts on its own; the surgeon remains in full control at all times.
But what if the robot took the lead? At Johns Hopkins University, researchers say a robot trained on videos of surgeries has performed a lengthy phase of a gallbladder removal on a realistic patient model without human assistance. The machine listened to voice prompts from the team and adjusted its technique in real time, learning on the fly much like a resident taking cues from a senior surgeon.
“The robot performed unflappably across trials and with the expertise of a skilled human surgeon, even during unexpected scenarios typical in real life medical emergencies,” a Johns Hopkins Hub report said. The study, “SRT-H: A hierarchical framework for autonomous surgery via language-conditioned imitation learning,” was published today in the journal Science Robotics.
The researchers say the federally funded work moves surgical robotics beyond rigid automation toward systems where robots can perform with both mechanical precision and human-like adaptability and understanding.
“This advancement moves us from robots that can execute specific surgical tasks to robots that truly understand surgical procedures,” said medical roboticist and study author Axel Krieger. “This is a critical distinction that brings us significantly closer to clinically viable autonomous surgical systems that can work in the messy, unpredictable reality of actual patient care.”

SRT-H’s two-tier brain: a language planner flips between task and corrective commands with a “correction flag,” while a low-level controller turns the chosen instruction into precise instrument paths. (Source: SRT-H GitHub)
The Johns Hopkins team tackled a problem that keeps most “self-driving” robots out of the operating room: long operations where small errors accumulate. Their hierarchical system, called Surgical Robot Transformer-Hierarchy (SRT-H), splits the job in two. A high-level planner thinks in plain-language commands (“trim here,” “move the left gripper closer”) while a low-level controller converts those words into precise instrument paths. A correction flag flips the robot into recovery mode when things drift off course, letting it fix minor slips instead of stalling.
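The team’s code isn’t reproduced here, but a minimal sketch of that two-tier loop, with hypothetical names such as high_level_plan and low_level_control standing in for the actual SRT-H modules, might look like this:

```python
# Illustrative sketch of a planner/controller loop in the spirit of SRT-H.
# All names, data structures, and stub logic are assumptions, not the team's code.
from dataclasses import dataclass

@dataclass
class Step:
    instruction: str   # plain-language command, e.g. "clip the cystic duct"
    correction: bool   # True when the planner flags a recovery action

def high_level_plan(observation) -> Step:
    """Vision-language planner: picks the next task step, or a corrective
    command when the scene has drifted off course (hypothetical stub)."""
    if observation.get("drift"):
        return Step("move the left gripper closer", correction=True)
    return Step("clip the cystic duct", correction=False)

def low_level_control(observation, step: Step):
    """Language-conditioned controller: maps (image, instruction) to a short
    trajectory of instrument poses (hypothetical stub)."""
    return [f"pose_{i} for '{step.instruction}'" for i in range(3)]

def run_phase(observations):
    for obs in observations:
        step = high_level_plan(obs)
        for action in low_level_control(obs, step):
            print(("RECOVER" if step.correction else "TASK"), action)

run_phase([{"drift": False}, {"drift": True}])
```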
To teach those skills, the researchers recorded about 18,000 demonstrations on more than 30 pig gallbladders, including sequences in which they deliberately introduced mistakes so the robot could learn how to recover. In tests on eight ex vivo gallbladders, the system completed the critical clipping-and-cutting phase of cholecystectomy without a single human intervention. Surgeons could still steer it with simple voice cues, and each correction became new training data.
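The paper frames this as language-conditioned imitation learning. A toy behavior-cloning step along those lines, using assumed tensor shapes and a small stand-in PyTorch policy rather than SRT-H’s actual transformer models, could look like the following:

```python
# Toy behavior-cloning step for a language-conditioned policy (illustrative only;
# model sizes, tensors, and dimensions are assumptions, not SRT-H's actual setup).
import torch
import torch.nn as nn

class LanguageConditionedPolicy(nn.Module):
    def __init__(self, obs_dim=32, text_dim=16, act_dim=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + text_dim, 64), nn.ReLU(), nn.Linear(64, act_dim)
        )

    def forward(self, obs, text_emb):
        return self.net(torch.cat([obs, text_emb], dim=-1))

policy = LanguageConditionedPolicy()
optim = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Each demonstration pairs an observation and an instruction embedding with the
# expert action; recovery demos (the deliberately introduced mistakes) are mixed
# in so the policy also learns corrective motions.
obs = torch.randn(8, 32)           # stand-in for image features
text = torch.randn(8, 16)          # stand-in for instruction embeddings
expert_action = torch.randn(8, 7)  # stand-in for recorded instrument motion

loss = nn.functional.mse_loss(policy(obs, text), expert_action)
optim.zero_grad()
loss.backward()
optim.step()
```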
“This work represents a major leap from prior efforts because it tackles some of the fundamental barriers to deploying autonomous surgical robots in the real world,” said lead author Ji Woong “Brian” Kim, a former postdoctoral researcher at Johns Hopkins who is now with Stanford University. “Our work shows that AI models can be made reliable enough for surgical autonomy—something that once felt far-off but is now demonstrably viable.”
This work builds on prior experiments, including a 2022 study where Krieger’s Smart Tissue Autonomous Robot, STAR, performed the first autonomous robotic surgery on a live animal: laparoscopic surgery on a pig. The feat was notable but tightly managed: the tissue was specially marked, the setting was highly controlled, and STAR followed a fixed plan much like a self-driving car confined to a pre-mapped test track.
The new SRT-H system is different, Krieger says, because it “is like teaching a robot to navigate any road, in any condition, responding intelligently to whatever it encounters.”

The Surgical Robot Transformer-Hierarchy performing a gallbladder surgery. (Source: Juo-Tung Chen/Johns Hopkins University)
SRT-H performs surgery in real time, adjusting to unique anatomy, choosing its next move as conditions change, and correcting small errors on its own. Built on the same transformer architecture that powers ChatGPT, the system takes spoken guidance such as “grab the gallbladder head” or “move the left arm a bit to the left,” logging that feedback as fresh training data.
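One way such feedback could be captured, purely as an illustration with assumed field names and file format rather than the team’s actual logging pipeline, is to append each voice correction and the motion it produced to a running dataset:

```python
# Sketch of logging a spoken correction as a new training example
# (field names and the JSONL format are assumptions for illustration).
import json, time

def log_correction(dataset_path, observation_id, utterance, executed_action):
    record = {
        "time": time.time(),
        "observation": observation_id,         # reference to the saved camera frame
        "instruction": utterance,              # e.g. "move the left arm a bit to the left"
        "action": executed_action,             # the motion the robot performed in response
        "source": "surgeon_voice_correction",  # marks this as feedback, not a scripted demo
    }
    with open(dataset_path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_correction("corrections.jsonl", "frame_000123",
               "grab the gallbladder head", [0.01, -0.02, 0.0])
```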
Last year, the team showed the approach could teach a robot three core surgical skills: manipulating a needle, lifting tissue, and suturing. Each of those tasks lasted only a few seconds, laying the groundwork for today’s longer, fully autonomous procedure. The gallbladder removal is a several-minute-long string of 17 different tasks in which the robot must identify certain ducts and arteries, grab them precisely, place clips strategically, and sever parts with scissors, according to the report.
The researchers say the robot performed with 100% accuracy across non-uniform anatomy and unexpected detours. It took longer to complete the surgery than a human surgeon would, but its results were comparable to those of an expert surgeon.
“Just as surgical residents often master different parts of an operation at different rates, this work illustrates the promise of developing autonomous robotic systems in a similarly modular and progressive manner,” said Johns Hopkins surgeon Jeff Jopling, a co-author.
In the future, the team anticipates training and testing SRT-H on different types of surgery to expand its capabilities.
“To me, it really shows that it’s possible to perform complex surgical procedures autonomously,” Krieger said. “This is a proof of concept that it’s possible, and this imitation learning framework can automate such a complex procedure with such a high degree of robustness.”