Now, let's rewind two years, almost to the day, to when Bruce and I uncovered a vulnerability. While preparing a case study for a workshop on AI and biosecurity, we discovered that open-source AI protein design tools could be used to redesign toxic proteins in ways that could bypass biosecurity screening systems, the systems set up to identify incoming orders of concern.
Now in that work, we created an AI pipeline from open-source tools that could essentially “paraphrase” the amino acid sequences—reformulating them while working to preserve their structure and potentially their function.
These paraphrased sequences could evade the screening systems used by major DNA synthesis companies, and these are the systems that scientists rely on to safely produce AI-designed proteins.
Now, experts in the field described this finding as the first “zero day” for AI and biosecurity. And this marked the beginning of a deep, two-year collaborative effort to investigate and address this challenge.
With the help of a strong cross-sector team—including James, Tessa, Bruce, and many others—we worked behind the scenes to build AI biosecurity red-teaming approaches, probe for vulnerabilities, and to design practical fixes. These “patches,” akin to those in cybersecurity, have now been shared with organizations globally to strengthen biosecurity screening.
This has been one of the most fascinating projects I’ve had the privilege to work on, for its technical complexity, its ethical and policy dimensions, and the remarkable collaboration across industry, government, and nonprofit sectors.
The project highlights that the same AI tools capable of incredible good can also be misused, requiring us to be vigilant, thoughtful, and creative so we continue to get the most benefit out of AI tools while working to ensure that we avoid costly misuses.
With that, let me officially welcome our guests.
Bruce, James, Tessa, welcome to the podcast.
BRUCE WITTMANN: Thanks, Eric.
JAMES DIGGANS: Thanks for having us.
HORVITZ: It’s been such a pleasure working closely with each of you, not only for your expertise but also for your deep commitment and passion about public health and global safety.
Before we dive into the technical side of things, I’d like to ask each of you, how did you get into this field? What inspired you to become biologists and then pursue the implications of advances in AI for biosecurity? Bruce?
WITTMANN: Well, I've always liked building things. That's where I would say I come from. You know, my hobbies when I'm not working on biology or AI things—as you know, Eric—are, like, building things around the house, right. Doing construction. That kind of stuff.
But my broader interests have always been biology, chemistry. So I originally got into organic chemistry. I found that was fascinating. From there, went to synthetic biology, particularly metabolic engineering, because that’s kind of like organic chemistry, but you’re wiring together different parts of an organism’s metabolism rather than different chemical reactions. And while I was working in that space, I, kind of, had the thought of there’s got to be an easier way to do this [LAUGHS] because it is really difficult to do any type of metabolic engineering. And that’s how I got into the AI space, trying to solve these very complicated biological problems, trying to build things that we don’t necessarily even understand using our understanding from data or deriving understanding from data.
So, you know, that’s the roundabout way of how I got to where I am—the abstract way of how I got to where I am.
HORVITZ: And, Tessa, what motivated you to jump into this area and zoom into biology and biosciences and helping us to avoid catastrophic outcomes?
ALEXANIAN: Yeah, I mean, probably the origin of me being really excited about biology is actually a book called [The] Lives of [a] Cell by Lewis Thomas, which is an extremely beautiful book of essays that made me be like, Oh, wow, life is just incredible. I think I read it when I was, you know, 12 or 13, and I was like, Life is incredible. I want to work on this. This is the most beautiful science, right. And then I, in university, I was studying engineering, and I heard there was this engineering team for engineering biology—this iGEM team—and I joined it, and I thought, Oh, this is so cool. I really got to go work in this field of synthetic biology.
And then I also tried doing the wet lab biology, and I was like, Oh, but I don’t like this part. I don’t actually, like, like babysitting microbes. [LAUGHTER] I think there’s a way … some people who are great wet lab biologists are made of really stern stuff. And they really enjoy figuring out how to redesign their negative controls so they can figure out whether it was contamination or whether it was, you know, temperature fluctuation. I’m not that, apparently.
And so I ended up becoming a lab automation engineer because I could help the science happen, but I … but my responsibilities were the robots and the computers rather than the microbes, which I find a little bit intransigent.
HORVITZ: Right. I was thinking of those tough souls; they also used their mouths to do pipetting and so on of these contaminated fluids …
WITTMANN: Not anymore.
ALEXANIAN: It's true. [LAUGHTER]
DIGGANS: Not anymore. [LAUGHS]
ALEXANIAN: They used to be tougher. They used to be tougher.
HORVITZ: James.
DIGGANS: So I did my undergrad in computer science and microbiology, mostly because at the time, I couldn’t pick which of the two I liked more. I liked them both. And by the time I graduated, I was lucky enough that I realized that the intersection of the two could be a thing. And so I did a PhD in computational biology, and then I worked for five years at the MITRE Corporation. It’s a nonprofit. I got the chance to work with the US biodefense community and just found an incredible group of people working to protect forces and the population at large from biological threats and just learned a ton about both biology and also dual-use risk. And then so when Twist called me and asked if I wanted to join Twist and set up their biosecurity program, I leapt at the chance and have done that for the past 10 years.
HORVITZ: Well, thanks everyone.
I believe that AI-powered protein design in particular is one of the most exciting frontiers of modern science. It holds promise for breakthroughs in medicine, public health, even material science. We’re already seeing it lead to new vaccines, novel therapeutics, and—on the scientific front—powerful insights into the machinery of life.
So there’s much more ahead, especially in how AI can help us promote wellness, longevity, and the prevention of disease. But before we get too far ahead, while some of our listeners work in bioscience, many may not have a good understanding of some of the foundations.
So, Bruce, can you just give us a high-level overview of proteins? What are they? Why are they important? How do they figure into human-designed applications?
WITTMANN: Sure. Yeah. Fortunately, I used to TA a class on AI for protein design, so it’s right in my wheelhouse. [LAUGHS]
HORVITZ: Perfect, perfect background. [LAUGHS]
WITTMANN: It’s perfect. Yeah. I got to go back to all of that. Yeah, so from the very basic level, proteins are the workhorses of life.
Every chemical reaction that happens in our body—well, nearly every chemical reaction that happens in our body—most of the structure of our cells, you name it. Any life process, proteins are central to it.
Now proteins are encoded by what are known as … well, I shouldn’t say encoded. They are constructed from what are called amino acids—there are 20 of them—and depending on the combination and order in which you string these amino acids together, you get a different protein sequence. So that’s what we mean when we say protein sequence.
The sequence of a protein then determines what shape that protein folds into in a cell, and that shape determines what the protein does. So we will often say sequence determines structure, which determines function.
Now the challenge that we face in engineering proteins is just how many possibilities there are. For all practical purposes, it's infinite. So we have 20 building blocks. There are on average around 300 amino acids in a protein. So that's 20 to the power of 300 possible combinations. And a common reference point is that it's estimated there are around 10 to the power of 80 particles in the observable universe. So we have a beyond-astronomical number of possible combinations, and the job of a protein engineer is to find the one or the few proteins within that space that do what we want them to do.
So when a human has an idea of, OK, here's what I want a protein to do, we have various techniques for finding that desired protein, one of which is using artificial intelligence, trying either to sift through that milieu of potential proteins or, as we'll talk about more in this podcast, to generate them directly. So creating them, in a way, by sampling them out of some distribution of reasonable proteins.
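As a quick back-of-the-envelope check on the numbers Bruce cites, here is a minimal Python sketch. The 300-residue average and the roughly 10^80 particle estimate are the figures quoted above; everything else is simple arithmetic.

```python
import math

amino_acids = 20    # the standard amino acid alphabet
avg_length = 300    # rough average protein length mentioned above

# log10(20^300) = 300 * log10(20) ~ 390, versus ~80 for particles in the universe
log10_sequence_space = avg_length * math.log10(amino_acids)

print(f"20^300 is roughly 10^{log10_sequence_space:.0f} possible sequences")
print("particles in the observable universe: roughly 10^80")
print(f"so the sequence space exceeds that count by a factor of about 10^{log10_sequence_space - 80:.0f}")
```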
HORVITZ: Great. So I wanted to throw it to James now to talk about how protein design goes from computer to reality—from in silico to test tubes. What role does Twist Bioscience play in transforming digital protein designs into synthesized proteins? And maybe we can talk also about what safeguards are in place at your company and why we need them.
DIGGANS: So all of these proteins that Bruce has described are encoded in DNA. So the language that our cells use to kind of store the information about how to make these proteins is all encoded in DNA. And so if you as an engineer have designed a protein and you want to test it to see if it does what you think it does, the first step is to have the DNA that encodes that protein manufactured, and companies like Twist carry out that role.
So we are cognizant also, however, that these are what are called dual-use technologies. So you can use DNA and proteins for an incredible variety of amazing applications. So drug development, agricultural improvements, bioindustrial manufacturing, all manner of incredible applications. But you could also potentially use them to cause harm: toxins or other, you know, sorts of biological misuse.
And so the industry has, since at least 2010, recognized that it has a responsibility to make sure that when we're asked to make some sequence of DNA, we understand what that thing is encoding and who we're making it for. So we're screening both the customer that's coming to us and we're screening the sequence that they're requesting.
And so Twist has long invested in a very, sort of, complicated system for essentially reverse engineering the constructs that we’re asked to make so that we understand what they are. And then a system where we engage with our customers and make sure that they’re going to use those for legitimate purpose and responsibly.
HORVITZ: And how does the emergence of these new generative AI tools influence how you think about risk?
DIGGANS: A lot of the power of these AI tools is they allow us to make proteins or design proteins that have never existed before in nature to carry out functions that don’t exist in the natural world. That’s an extremely powerful capability.
But the existing defensive tools that we use at DNA synthesis companies generally rely on what’s called homology, similarity to known naturally occurring sequences, to determine whether something might pose risk. And so AI tools kind of break the link between those two things.
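For listeners unfamiliar with homology-based screening, here is a deliberately simplified sketch of the idea James describes: compare an incoming sequence against a list of known sequences of concern and flag anything sufficiently similar. Everything in it (the toy sequences, the threshold, the use of difflib) is illustrative only; real screening pipelines rely on curated databases and proper alignment tools.

```python
from difflib import SequenceMatcher

# Toy "database" of sequences of concern; the string below is made up.
SEQUENCES_OF_CONCERN = {
    "toy_toxin_1": "MAKQLTDWVRGENKPLSFYHDGATECQR",
}
FLAG_THRESHOLD = 0.8   # hypothetical similarity cutoff

def screen_order(query: str) -> list[str]:
    """Return the names of known sequences of concern the query closely resembles."""
    hits = []
    for name, reference in SEQUENCES_OF_CONCERN.items():
        similarity = SequenceMatcher(None, query, reference).ratio()
        if similarity >= FLAG_THRESHOLD:
            hits.append(name)
    return hits

# An exact copy of a listed sequence is flagged; an AI "paraphrase" that keeps the
# fold but shares little sequence identity can drop below a purely homology-based cutoff.
print(screen_order("MAKQLTDWVRGENKPLSFYHDGATECQR"))   # ['toy_toxin_1']
print(screen_order("MSRELNEWIKAQNKELTFWHEGASDVQK"))   # likely [] under this naive check
```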
HORVITZ: Now you also serve as chair of the International Gene Synthesis Consortium. Can you tell us a little bit more about the IGSC, its mission, how it supports global biosecurity?
DIGGANS: Certainly. So the IGSC was founded in 2010[1] and right now has grown to more than 40 companies and organizations across 10 countries. And the IGSC is essentially a place where companies who might be diehard competitors in the market around nucleic acid synthesis come together and design and develop best practices around biosecurity screening to, kind of, support the shared interest we all have in making sure that these technologies are not subject to misuse.
HORVITZ: Thanks, James. Now, Tessa, your organization, IBBIS, is focused—and it's a beautiful mission—on advancing science while minimizing the likelihood of catastrophic risk. When we say catastrophic risk, what do we really mean, Tessa, in the context of biology and AI? And how do you view that risk landscape as evolving as AI capabilities grow?
ALEXANIAN: I think the … to be honest, as a person who’s been in biosecurity for a while, I’ve been surprised by how much of the conversation about the risks from advances in artificial intelligence has centered on the risk of engineered biological weapons and engineered pandemics.
Even recently, there was a new discussion on introducing redlines for AI that came up at the UN General Assembly. And the very first item in their list of risks, if I'm not mistaken, was engineered pandemics, which is exactly the sort of thing that people fear could be done with these biological AI tools.
Now, I think that when we talk about catastrophic risk, we talk about, you know, something that has an impact on a large percentage of humanity. And I think the reason we think biotechnologies pose a catastrophic risk is that we believe, as we've seen with many historical pandemics, there's a possibility for something to emerge or be created that is beyond our society's ability to control.
You know, there were a few countries during COVID that managed to more or less successfully run a zero-COVID policy, but that was not most countries. That was not any of the countries that I lived in. And, you know, we saw millions of people die. And I think we believe that with something like the 1918 influenza, which had a much higher case fatality rate, you could have far more people die.
Now, why we think about this in the context of AI, and where this connects to DNA synthesis, is that there are these risks of both, sort of, public health risks, pandemic risks, and misuse risks—people deliberately trying to do harm with biology, as we've seen from the long history of biological weapons programs—and we think those might be accelerated in a few different ways by AI technology. And I say potential here because, as everyone who has worked in a wet lab—which I think is everyone on this call—knows, engineering biology is really difficult. So there's maybe a potential for it to become easier to develop biological technology for the purposes of doing harm, and there's maybe also the potential to create novel threats.
And so I think people talk about both of those, and people have been looking hard for possible safeguards. And I think one safeguard that exists in this biosecurity world that, for example, doesn’t exist as cleanly in the cybersecurity world is that none of these biological threats can do harm until they are realized in physical reality, until you actually produce the protein or produce the virus or the microorganism that could do harm. And so I think at this point of production, both in DNA synthesis and elsewhere, we have a chance to introduce safeguards that can have a really large impact on the amount of risk that we’re facing—as long as we develop those safeguards in a way that keeps pace with AI.
HORVITZ: Well, thanks, Tessa. So, Bruce, our project began when I posed a challenge to you of the form: could current open-source AI tools be tasked with rewriting toxic protein sequences in a way that preserves their native structure, and might they evade today’s screening systems?
And I was preparing for a global workshop on AI and biosecurity that I’d been organizing with Frances Arnold, David Baker, and Lynda Stuart, and I wanted a concrete case study to challenge attendees. And what we found was interesting and deeply concerning.
So I wanted to dive in with you, Bruce, on the technical side. Can you describe a bit about the generative pipeline, how it works, and what you did to build what we might call an AI-and-biosecurity red-teaming pipeline for testing and securing biosecurity screening tools?
WITTMANN: Sure. Yeah. I think the best place to start with this is really by analogy.
An analogy I often use in this case is the type of image generation AI tools we're all familiar with now, where I can tell the AI model, "Hey, give me a cartoonish picture of a dog playing fetch." And it'll do that, and it'll give us back something that has likely never been seen before, right. That exact image is new, but the theme is still there. The theme is this dog.
And that’s kind of the same technology that we’re using in this red-teaming pipeline. Only rather than using plain language, English, we’re passing in what we would call conditioning information that is relevant to a protein.
So our AI models aren't at the point yet where I can say, "Give me a protein that does x." That would be the dream. We're a long way from that. But what we do instead is pass in things that match the theme we're interested in. So rather than saying, "Hey, give me back the theme of a dog," we pass in information that we know will cause, or at least push, this generative model to create a protein that has the characteristics that we want.
So in the case of that example you just mentioned, Eric, it would be the protein structure. Like I mentioned earlier, we usually say structure determines function. There's obviously a lot of nuance to that, but we can, at a first approximation, say structure determines function. So if I ask an AI model, "Hey, here's this structure; give me a protein sequence that folds to this structure," just like with that analogy with the dog, it's going to give me something that matches that structure but that has likely still never been seen before. It's going to be a new sequence.
So you can imagine taking this one step further. In the red-teaming pipeline, what we would do is take a protein that should normally be captured by DNA synthesis screening—that would be captured by DNA synthesis screening—find its structure, pass it through one of these models, and get variants on the theme of that structure. So we get these new sequences, these synthetic homologs that you mentioned: paraphrased, reformulated, whatever phrase we want to use to describe them.
And they have a chance or a greater chance than not of maintaining the structure and so maintaining the function while being sufficiently different that they’re not detected by these tools anymore.
So that's the nuts and bolts of how the red-teaming pipeline comes together. We use more tools than just structure. I think structure is the easiest one to understand. But we have a suite of tools in there, each passing different conditioning information that causes the model to generate sequences that are paraphrased versions of potential proteins of concern.
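At a purely conceptual level, and only at the level already described in this conversation, the loop Bruce outlines might be sketched as follows. Every function here is a hypothetical placeholder; no actual tools, models, or parameters from the study are shown.

```python
# Conceptual skeleton only. Each function is a stand-in; the real pipeline,
# tools, and settings are intentionally not described here.

def predict_structure(sequence: str):
    """Placeholder: predict a 3D structure for a protein sequence."""
    raise NotImplementedError

def generate_sequences_for_structure(structure, n: int) -> list[str]:
    """Placeholder: sample n new sequences conditioned on a target structure."""
    raise NotImplementedError

def is_flagged_by_screening(sequence: str) -> bool:
    """Placeholder: would a biosecurity screening tool flag this sequence?"""
    raise NotImplementedError

def red_team_template(template_sequence: str, n_variants: int = 100) -> list[str]:
    """Generate structure-conditioned 'paraphrases' of a template and return
    the ones a screening tool fails to flag, so defenders know what to patch."""
    structure = predict_structure(template_sequence)
    variants = generate_sequences_for_structure(structure, n_variants)
    return [v for v in variants if not is_flagged_by_screening(v)]
```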
HORVITZ: But to get down to brass tacks, what Bruce did for the framing study was … we took the well-known toxic protein ricin, as we described in a framing paper that's actually now part of the appendix to the Science publication, and we generated through this pipeline, composed of open-source tools, thousands of AI-rewritten versions of ricin.
And this brings us to the next step of our project, way back in the early days of this effort, when Twist Bioscience was one of the companies we approached with what must have seemed like an unusual question to your CEO, in fact, James: would you be open to testing whether current screening systems could detect thousands of AI-rewritten versions of ricin, a well-known toxic protein?
And your CEO quickly connected me with you, James. So, James, what were your first thoughts on hearing about this project, and how did you respond to our initial framing study?
DIGGANS: I think my first response was gratitude and excitement. So it was fantastic that Microsoft had really leaned forward on this set of ideas and had produced this dataset. But to have it, you know, show up on our doorstep in a very concrete way with a partner that was ready to, sort of, help us try and address that, I think was a really … a valuable opportunity. And so we really leapt at that.
HORVITZ: And the results were that both for you and for another major producer, IDT [Integrated DNA Technologies], those thousands of variants flew under the radar of the biosecurity screening software, as we covered in that framing paper.
Now, after our initial findings on this, we quietly shared the paper with a few trusted contacts, including some in government. Through my work with the White House Office of Science and Technology Policy, or OSTP, we connected with the biosecurity leads there, and it was an OSTP biosecurity lead who described our results as the first zero day in AI and biosecurity. In cybersecurity, a zero day is a vulnerability unknown to defenders, meaning there has been no time to prepare a response before it can be exploited.
In that vein, we took a cybersecurity approach. We stood up a CERT—C-E-R-T—a computer emergency response team, the approach used in responding to cybersecurity vulnerabilities, and we implemented this process to address what we saw as a vulnerability arising from AI-enabled challenges to biosecurity.
At one point down the line, it was so rewarding to hear you say, James, "I'm really glad Microsoft got here first." I'm curious how you think about this kind of AI-enabled vulnerability compared to other biosecurity threats you've encountered, and I'd love to hear your perspective on how we handled the situation, from the early discovery to the coordination and outreach.
DIGGANS: Yeah, I think in terms of comparison to known threats, the challenge here is really that there is no good basis on which we can just, sort of, say, Oh, I'll build a new tool to detect this concrete universe of things, right. This was more a pattern of: I'm going to use tools—and I love the name "Paraphrase"; it's a fantastic name—I can paraphrase anything that I would normally think of as posing biological risk, and now that thing is harder to detect for existing tools. And so that really was a very eye-opening experience, and I think the practice of forming this CERT response, putting together a group of people who were well versed not just in the threat landscape but also in the defensive technologies, and then figuring out how to mitigate that risk and broaden that study, I think, was an incredibly valuable response for the entire synthesis industry.
HORVITZ: Yeah, and, Bruce, can you describe a little bit about the process by which we expanded the effort beyond our initial framing study to more toxins and then to a larger challenge set and then the results that we pursued and achieved?
WITTMANN: Yeah, of course. So, you know, using machine learning lingo, you don’t want to overfit to a single example. So early on with this, as part of the framing study, we were able to show or I should say James and coworkers across the screening field were able to show that this could be patched, right. We needed to just make some changes to the tools, and we could at the very least detect ricin or reformulated versions of ricin.
So the next step of course was then, OK, how generalizable are these patches? Can they detect other reformulated sequences, as well? So we had to expand the set of proteins that we had reformulated. We couldn't just do tens of thousands of ricins. We had to do tens of thousands of name-your-other-potentially-hazardous …
HORVITZ: I think we had 72, was it?
WITTMANN: It was 72 in the end that we ended up at. I believe, James, it was you and maybe Jake, another one of the authors on the list … on the paper, who primarily put that list together …
HORVITZ: This is Jacob Beal … Jacob Beal at Raytheon BBN.
WITTMANN: I think James actually might be the better one to answer how this list was expanded.
DIGGANS: Initially the focus was on ricin as a toxin, so that list expanded to 62, sort of, commonly controlled toxins that are subject to an export control restriction or other concern. And then on top of that, we added 10 viral proteins. So we didn't just want to look at toxins. We also wanted to look at viral proteins, largely because those proteins tend to have multiple functions and highly constrained structures. And so if we could work in a toxin context, could Paraphrase also do the same for viral proteins?
HORVITZ: And, Bruce, can you describe some about how we characterize the updates and the, we’ll say, the boost in capabilities of the patched screening tools?
WITTMANN: So we had, like you said, Eric, 72 base proteins, or template proteins. And for each of those, we had generated a few hundred to a couple thousand reformulated variants. The only way to really get any sense of the validity of those sequences was to predict their structures. So we predicted protein structures, I think it was 70-ish thousand structures in the end, and scored them using in silico metrics. So things like, how similar is this to that template, wild-type protein structure that we used as our conditioning information?
We put them on a big grid. So we have two axes. On the x-axis—and this is a figure in our paper—is the quality of the prediction. It's essentially a confidence metric: how realistic is this protein sequence? And on the other axis is, how similar is the predicted structure of this variant to the original? And ultimately, what we wanted to see was that the proteins that scored well on both of those metrics, the ones that showed up in the top right of that diagram, were caught, primarily because these are, again, the ones most likely, and I have to say most likely, to retain the function of the original.
So when you compare the original tools—Tool Series A, right, the unpatched tools—what you'll find is varying degrees of success in the top right. It varied by tool. But in some cases, barely anything was being flagged as potentially hazardous. And the improvement is then in the next series—Series B, the patched versions of the tools—where we have more flagged in that upper-right corner.
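A rough illustration of the two-axis analysis Bruce describes, with entirely hypothetical scores and thresholds (the paper's actual metrics and cutoffs may differ):

```python
from dataclasses import dataclass

# Illustrative only: each variant gets two in-silico scores, roughly "how
# confident/realistic is the predicted structure" and "how similar is that
# structure to the original template." The variants of most concern sit in the
# top-right quadrant, so a key question is what fraction of those a screening
# tool flags, before (Series A) and after (Series B) patching.

@dataclass
class Variant:
    confidence: float   # hypothetical prediction-quality score, 0..1
    similarity: float   # hypothetical structural similarity to template, 0..1
    flagged: bool       # did the screening tool flag it?

CONF_CUTOFF = 0.8       # hypothetical thresholds defining the "top right"
SIM_CUTOFF = 0.8

def top_right_catch_rate(variants: list[Variant]) -> float:
    """Fraction of high-confidence, high-similarity variants that were flagged."""
    top_right = [v for v in variants
                 if v.confidence >= CONF_CUTOFF and v.similarity >= SIM_CUTOFF]
    if not top_right:
        return float("nan")
    return sum(v.flagged for v in top_right) / len(top_right)

# e.g. compare top_right_catch_rate(series_a_results) vs top_right_catch_rate(series_b_results)
```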
HORVITZ: And we felt confident that we had a more AI-resilient screening solution across the companies, and, James, at this point, the whole team decided it was time to disclose the vulnerability as well as the patch details and pointers to where to go for the updated screening software and to communicate this to synthesis companies worldwide via the IGSC. This was probably July, I think, of 2024. What was that process like, and how did members respond?
DIGGANS: I think members were really grateful and excited. When we presented to that group, saying, hey, this activity (a) has gone on, (b) was successful, and (c) was kept close hold until we knew how to mitigate it, I think everyone was really gratified by that and comforted by the fact that now they had kind of off-the-shelf solutions that they could use to improve their resilience against any incoming heavily engineered protein designs.
HORVITZ: Thanks, James.
Now, I know that we all understand this particular effort to be important but only one piece of the biosecurity and AI problem. I'm just curious to … I'll ask all three of you to share some brief reflections.
I know, Bruce, you’ve been on … you’ve stayed on this, and we’ve—all of us on the original team—have other projects going on that are pushing on the frontiers ahead of where we were with this paper when we published it.
Let me start with Tessa in terms of, like, what new risks do you see emerging as AI accelerates, and maybe couple that with thoughts about how we proactively get ahead of them.
ALEXANIAN: Yeah, I think with the Paraphrase work, as Bruce explained so well, I sometimes use the metaphor of a previous response that the IGSC, the synthesis screening community, had to make. It used to be that you could look for similarities to DNA sequences. Then everyone started doing synthetic biology with codon optimization, so that proteins could express more efficiently in different host organisms, and all of a sudden you've scrambled your DNA sequence and it doesn't look very similar, even though your protein sequence actually still looks very similar, or is often the same, once it's been translated from DNA to protein. Many in the industry were already screening both DNA and protein, but everybody had to start screening protein sequences, even just to do the similarity testing, as these codon optimization tools became universal.
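To make the codon-optimization point concrete, here is a minimal, self-contained sketch with toy sequences and a tiny codon table: two DNA sequences that differ at the nucleotide level can translate to the identical protein, which is why screening at the protein level is more robust to codon optimization.

```python
# Toy example only; the codon table covers just the codons used below.
CODON_TABLE = {
    "ATG": "M", "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L", "GGC": "G", "GGT": "G",
    "TAA": "*", "TGA": "*",
}

def translate(dna: str) -> str:
    """Translate a DNA coding sequence into a protein string, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Same protein (MKLG), different DNA: nucleotide identity drops even though the
# translated protein is unchanged.
original = "ATGAAACTGGGCTAA"
optimized = "ATGAAGTTAGGTTGA"

nt_identity = sum(a == b for a, b in zip(original, optimized)) / len(original)
print(translate(original), translate(optimized))   # MKLG MKLG
print(f"nucleotide identity: {nt_identity:.0%}")   # well below 100%
```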
I feel like we're, kind of, in a similar transition phase with protein-design, protein-rephrasing tools, where, you know, these tools are still in many cases drawing from the natural distribution of proteins. You know, I think some of the work we saw in, you know, designing novel CRISPR enzymes, you go, OK, yeah, it is novel; it's very unlike any one CRISPR enzyme. But if you do a massive multiple sequence alignment of every CRISPR enzyme that we know about, you're like, OK, this fits in the distribution of those enzymes. And so, you know, I think we're having to do a more flexible form of screening, where we look for things that are kind of within the distribution of natural proteins.
But I feel like, broadly, all of the screening tools were able to respond by doing something like that. And I think … I still feel like the clock is ticking down on that, and that as the AI tools get better at predicting function and designing, sort of, novel sequences to pursue a particular function—you know, you have tools now that can go from Gene Ontology terms to a potential structure or potential sequence that may again be much farther out of the distribution of natural proteins—I think all of us on the screening side are going to have to be responding to that, as well.
So I think I see this as a necessary ongoing engagement between people at the frontier of designing novel biology and people at the frontier of producing all of the materials that allow that novel biology to be tested in the lab. You know, I think this feels like the first, you know, detailed, comprehensive zero day disclosure and response. But I think that’s … I think we’re going to see more of those. And I think what I’m excited about doing at IBBIS is trying to encourage and set up more infrastructure so that you can, as an AI developer, disclose these new discoveries to the people who need to respond before the publication comes out.
HORVITZ: Thank you, Tessa.
HORVITZ: The … Bruce, I mean, you and I are working on all sorts of dimensions. You're leading up some efforts at Microsoft, for example, on the foundation model front, among other directions. We've talked about new kinds of embedding models that might go beyond sequence and structure. Can you talk a little bit about just a few of the directions that paint the larger constellation of the kinds of things we talk about when we put our worry hats on?
WITTMANN: I feel like that could have its own dedicated podcast, as well. There’s a lot … [LAUGHTER] there’s a lot to talk about.
HORVITZ: Yeah. We want to make sure that we don’t tell the world that the whole problem is solved here.
WITTMANN: Right, right, right. I think Tessa said it really, really well in that most of what we’re doing right now, it’s a variant on a known theme. I have to know the structure that does something bad to be able to pass it in as context. I have to know some existing sequence that does something bad to pass it in.
And obviously the goal is to move away from that in benign applications, where when I’m designing something, I often want to design it because nothing exists [LAUGHS] that already does it. So we are going to be heading to this space where we don’t know what this protein does. It’s kind of a circular problem, right, where we’re going to need to be able to predict what some obscure protein sequence does in order to be able to still do our screening.
Now, the way that I think about this, I often think about it beyond just DNA synthesis screening. It’s one line of defense, and there needs to be many lines of defense that come into play here that go beyond just relying on this one roadblock. It’s a very powerful roadblock. It’s a very powerful barrier. But we need to be proactively thinking about how we broaden the scope of defenses. And there are lots of conversations that are ongoing. I won’t go into the details of them. Again, that would be its own podcast.
But primarily my big push—and I think this is emerging consensus in the field, though I don’t want to speak for everybody—is it needs to … any interventions we have need to come more at the systems level and less at the model level, primarily because this is such dual-use technology. If it can be used for good biological design, it can be used for bad biological design. Biology has no sense of morality. There is no bad protein. It’s just a protein.
So we need to think about this differently than how we would maybe think about looking at the outputs of that image generator model that I spoke about earlier, where I can physically look at an image and say, don’t want my model producing that, do want my model producing that. I don’t have that luxury in this space. So it’s a totally different problem. It’s an evolving problem. Conversations are happening about it, but the work is very much not done.
HORVITZ: And, James, I want to give you the same open question, but I'd like to apply what Bruce just said about the systems level and, in the spirit of the kinds of things that you're very much involved with internationally, also get some comments on programs and policies that move beyond technical solutions toward governance mechanisms—logging, auditing of nucleic acid orders, transparency of various kinds—that might complement technical approaches like Paraphrase, and on their status today.
DIGGANS: Yeah, I’m very gratified that Bruce said that we, the synthesis industry, should not be the sole bulwark against misuse. That is very comforting and correct.
Yeah, so the US government published a guidance document in 2023 that essentially said you, the entire biotech supply chain, have a responsibility to make sure that you're evaluating your customers. You should know your customer and know that they're legitimate. I think that's an important practice.
Export controls are designed to minimize the movement of equipment and materials that can be used in support of these kinds of misuse activities. And then governments have really been quite active in trying to incentivize, you know, sort of what we would think of as positive behavior, so screening, for example, at DNA synthesis companies. The US government created a framework in 2024, and it's under a rewrite now, that basically says US research dollars will only go to nucleic acid providers who do these good things. And so that is using, kind of, the government-funding carrot to, kind of, continue to build these layers of defense against potential misuse.
HORVITZ: Thanks. Now, discussing risk, especially when it involves AI and biosecurity, isn’t always easy. As we’ve all been suggesting, some worry about alarming the public or arming bad actors. Others advocate for openness as a principle of doing science with integrity.
A phase of our work as we prepared our paper was giving serious thought to both the benefits and the risks of transparency about what it was that we were doing. Some experts encouraged full disclosure as important for enhancing the science of biosecurity. Other experts, all experts, cautioned against what are called information hazards: the risk that sharing the details could enable malevolent actions based on our findings or our approach.
So we faced a real question: how can we support open science while minimizing the risk of misuse? And we took all the input we got, even if it was contradictory, very seriously. We carefully deliberated about a good balance, and even then, once we chose our balance and submitted our manuscript to Science, the peer reviewers came back and said they wanted some of the more sensitive details that we had withheld, with explanations as to why.
So this provoked some thinking outside the box about a novel approach, and we came up with a perpetual gatekeeping strategy, where requests for access to sensitive methods, data, and even the software, across different risk categories, would be carefully reviewed by a committee, with a process for access that would continue in perpetuity.
Now, we brought the proposal to Tessa and her team at IBBIS—this is a great nonprofit group; look at their mission—and we worked with Tessa and her colleagues to refine a workable solution that was accepted by Science magazine as a new approach to handling information hazards as first demonstrated by our paper.
So, Tessa, thank you again for helping us to navigate such a complex challenge. Can you share your perspective on information hazards? And then walk us through how our proposed system ensures responsible data and software sharing.
ALEXANIAN: Yeah. And thanks, Eric.
In all of the long discussions we had among the group of people on this podcast, the other authors on the paper, and many people we engaged, you know, technical experts and people in various governments, we heard a lot of contradictory advice.
And I think it showed us that there isn't a consensus right now on how to handle information hazards in biotechnology. You know, I think … I don't want to overstate how much of a consensus there is in cybersecurity either. If you go to DEF CON, you'll hear people talk about how they've been mistreated in their attempts to do responsible disclosure for pacemakers and whatnot. But I think we have even less of a consensus when it comes to handling biological information.
You know, you have some people who say, oh, because the size of the consequences could be so catastrophic if someone, you know, releases an engineered flu or something, you know, we should just never share information about this. And then you have other people who say there’s no possibility of building defenses unless we share information about this. And we heard very strong voices with both of those perspectives in the process of conducting this study.
And I think what we landed on that I’m really excited about and really excited to get feedback on now that the paper is out, you know, if you go and compare our preprint, which came out in December of 2024, and this paper in October 2025, you’ll see a lot of information got added back in.
And I'm excited to see people's reaction to that because even back in January 2025, talking with people who were signatories to the responsible biodesign commitments, they were really excited that this was such an empirically concrete paper. They'd maybe read a number of papers talking about biosecurity risks from AI that didn't include a whole lot of data, you know, often, I think, because of concerns about information hazards. And they found the arguments in this paper much more convincing because we were able to share data.
So the process we underwent that I felt good about was trying to really clearly articulate, when we talk about an information hazard, what are we worried about being done with this data? And if we put this data in public, completely open source, does it shift the risk at all? You know, I think doing that kind of marginal contribution comparison is really important because it also let us make more things available publicly.
But there were a few tiers of data where, after a lot of discussion amongst the authors of the paper, we thought, OK, if someone who wanted to do harm got access to this data, it might make it easier for them. Again, not necessarily saying it opens the floodgates, but it might make it easier for them. And when we thought about that, we thought, OK, giving out all of those paraphrased protein sequences, compared to having to set up the whole pipeline with the open-source tools yourself, maybe that makes your life a bit easier if you're trying to do harm.
And then we thought, OK, giving you those protein sequences plus whether or not they were successfully flagged, maybe that makes your life quite a bit easier. And then finally, there's the code, which we want to share with some people who might try to reproduce these results or might try to build new screening systems that are more robust. But again, if you have that whole code pipeline prepared for you, it might really make your life easier if you're trying to do harm.
And so we, sort of, sorted the data into these three tiers and then went through a process, actually very inspired by the existing customer screening processes in nucleic acid synthesis, to determine who gets access. You know, we tried to take an approach not of what gets you in but of what rules you out. For the most part, we think it should be possible to access this data.
You know, if you have an affiliation with a recognizable institution or some good explanation of why you don’t have one right now, you know, if you have a reason for accessing this data, it shouldn’t be too hard to meet those requirements, but we wanted to have some in place. And we wanted it to be possible to rule out some people from getting access to this data. And so we’ve tried to be extremely transparent about what those are. If you go through our data access process and for some reason you get rejected, you’ll get a list of, “Here’s the reasons we rejected you. If you don’t think that’s right, get back to us.”
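As a small illustration of the default-open, rule-out-based review Tessa describes, here is a hypothetical sketch; the tier names, fields, and criteria are invented for illustration and are not the actual IBBIS process.

```python
from dataclasses import dataclass, field

# Hypothetical tier labels, loosely mirroring the three categories described above.
TIERS = {
    1: "paraphrased protein sequences",
    2: "sequences plus per-sequence screening outcomes",
    3: "the full generation/evaluation code",
}

@dataclass
class AccessRequest:
    requester: str
    tier: int
    affiliation: str | None          # recognizable institution, if any
    stated_purpose: str
    rejection_reasons: list[str] = field(default_factory=list)

def review(request: AccessRequest) -> bool:
    """Default-open review: grant unless an explicit rule-out applies, and record
    the reasons so a rejected requester can see them and respond."""
    if not request.affiliation and not request.stated_purpose:
        request.rejection_reasons.append("no affiliation and no explanation of purpose")
    # ... additional, transparent rule-out checks would go here ...
    return not request.rejection_reasons
```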
So I'm really excited to pilot this, in part because, you know, we're already in conversations with some other people handling potential bio-AI information hazards about doing a similar process for their data, you know, tiering it, determining which gates to put in which tiers. But I really hope a number of people do get access through the process, or, if they try and they fail, that they tell us why. Because I think as we move toward this world of biology that is potentially much easier to engineer, partly due to dual-use tools, you know, my dream is that it's still hard to engineer harm with biology, even if it's really easy to engineer biology. And I think these kinds of new processes for managing access to things, this sort of, you know, open but not completely public approach, can be a big part of that layered defense.
HORVITZ: Thanks, Tessa. So we're getting close to closing, and I just thought I would ask each of you to share some reflections on what we've learned: the process we've demonstrated, the tools, the policy work that we did, this idea of facing the dual-use dilemma, even at the information-hazard level, with sharing information versus withholding it. What do you think about how our whole end-to-end study, now reaching the two-year point, can help other fields facing dual-use dilemmas?
Tessa, Bruce, James … James, have you ever thought about that? And we’ll go to Bruce and then Tessa.
DIGGANS: Yeah, I think it was an excellent model. I would like to see a study like this repeated on a schedule, you know, every six months because from where I sit, you know, the tools that we used for this project are now two years old. And so capabilities have moved on. Is the picture the same in terms of defensive capability? And so using that model over time, I think, would be incredibly valuable. And then using the findings to chart, you know, how much should we be investing in alternative strategies for this kind of risk mitigation for AI tool … the products of AI tools?
HORVITZ: Bruce.
WITTMANN: Yeah, I think I would extend on what James said. The anecdote I like to point out about this project is, kind of, our schedule. We found the vulnerability and it was patched within a week, two weeks, on all major synthesis screening platforms. We wrote the paper within a month. We expanded on the paper within two months, and then we spent a year and a half to nearly two years [LAUGHS] trying to figure out what goes into the paper; how do we release this information; you know, how do we do this responsibly?
And my hope is similar to what James said. We've made it easier for others to do this type of work. Not this exact work; it doesn't necessarily have to do with proteins. But this type of work, where you are dealing with potential hazards but there is also value in sharing. And hopefully that year and a half we spent figuring out how to appropriately share, and what to share, will not be a year and a half for other teams, because these systems are in place, or at least there is an example to build from. So that's my takeaway.
HORVITZ: Tessa, bring us home—bring us home! [LAUGHS]
ALEXANIAN: Bring us home! Let’s do it faster next time. [LAUGHTER] Come talk to any of us if you’re dealing with this kind of stuff. You know, I think IBBIS, especially, we want to be a partner for building those layers of defense and, you know, having ripped out our hair as a collective over the past year and a half about the right process to follow here, I think we all really hope it’ll be faster next time.
And I think, you know, the other thing I would encourage is if you’re an AI developer, I would encourage you to think about how your tool can strengthen screening and strengthen recognition of threats.
I know James and I have talked before about how, you know, our Google search alerts each week send us dozens of cool AI bio papers, and it’s more like once a year or maybe once every six months, if we’re lucky, that we get something that’s like applying AI bio to biosecurity. So, you know, if you’re interested in these threats, I think we’d love to see more work that’s directly applied to facing these threats using the most modern technology.
HORVITZ: Well said.
Well, Bruce, James, Tessa, thank you so much for joining me today and for representing the many collaborators, both coauthors and beyond, who made this project possible.
It’s been a true pleasure to work with you. I’m so excited about what we’ve accomplished, the processes and the models that we’re now sharing with the world. And I’m deeply grateful for the collective intelligence and dedication that really powered the effort from the very beginning. So thanks again.
[MUSIC]
WITTMANN: Thanks, Eric.
DIGGANS: Thank you.
ALEXANIAN: Thank you.
[MUSIC FADES]