Microsoft Research

AI Testing and Evaluation: Learnings from genome editing

By Advanced AI Editor | June 30, 2025 | 27 min read


ALTA CHARO: It’s my pleasure. Thanks for having me.

SULLIVAN: Alta, I’d love to begin by stepping back in time a bit before you became a leading figure in bioethics and legal policy. You’ve shared that your interest in science was really inspired by your brothers’ interest in the topic and that your upbringing really helped shape your perseverance and resilience. Can you talk to us about what put you on the path to law and policy?

CHARO: Well, I think it’s true that many of us are strongly influenced by our families and certainly my family had, kind of, a science-y, techy orientation. My father was a refugee, you know, escaping the Nazis, and when he finally was able to start working in the United States, he took advantage of the G.I. Bill to learn how to repair televisions and radios, which were really just coming in in the 1950s. So he was, kind of, technically oriented.

My mother retrained from being a talented amateur artist to becoming a math teacher, and not surprisingly, both my brothers began to aim toward things like engineering and chemistry and physics. And our form of entertainment was to watch PBS or Star Trek. [LAUGHTER]

And so the interest comes from that background coupled with, in the 1960s, this enormous surge of interest in the so-called nature-versus-nurture debate about the degree to which we are destined by our biology or shaped by our environments. It was a heady debate, and one that perfectly combined the two interests in politics and science.

SULLIVAN: For listeners who are brand new to your field in genomic editing, can you give us what I’ll call a “90-second survey” of the space in perhaps plain language and why it’s important to have a framework for ensuring its responsible use.

CHARO: Well, you know, genome editing is both very old and very new. At base, what we’re talking about is a way to either delete sections of the genome, our collection of genes, or to add things or to alter what’s there. The goal is simply to be able to take what might not be healthy and make it healthy, whether it’s a plant, an animal, or a human.

Many people have compared it to a word processor, where you can edit text by swapping things in and out. You could change the letter g to the letter h in every word, and in our genomes, you can do similar kinds of things.

But because of this, we have a responsibility to make sure that whatever we change doesn’t become dangerous and that it doesn’t become socially disruptive. Now the earliest forms of genome editing were very inefficient, and so we didn’t worry that much. But with the advances that were spearheaded by people like Jennifer Doudna and Emmanuelle Charpentier, who won the Nobel Prize for their work in this area, genome editing has become much easier to do.

It’s become more efficient. It doesn’t require as much sophisticated laboratory equipment. It’s moved from being something that only a few people can do to something that we’re going to be seeing in our junior high school biology labs. And that means you have to pay attention to who’s doing it, why are they doing it, what are they releasing, if anything, into the environment, what are they trying to sell, and is it honest and is it safe?

SULLIVAN: How would you describe the risks, and are there, you know, sort of, specifically inherent risks in the technology itself, or do those risks really emerge only when it’s applied in certain contexts, like CRISPR in agriculture or CRISPR for human therapies?

CHARO: Well, to answer that, I’m going to do something that may seem a little picky, even pedantic. [LAUGHTER] But I’m going to distinguish between hazards and risks. So there are certain intrinsic hazards. That is, there are things that can go wrong.

You want to change one particular gene or one particular portion of a gene, and you might accidentally change something else, a so-called off-target effect. Or you might change something in a gene expecting a certain effect but not necessarily anticipating that there’s going to be an interaction between what you changed and what was there, a gene-gene interaction, that might have an unanticipated kind of result, a side effect essentially.

So there are some intrinsic hazards, but risk is a hazard coupled with the probability that it’s going to actually create something harmful. And that really depends upon the application.

If you are doing something that is making a change in a human being that is going to be a lifelong change, that enhances the significance of that hazard. It amplifies what I call the risk because if something goes wrong, then its consequences are greater.

It may also be that in other settings, what you’re doing is going to have a much lower risk because you’re working with a more familiar substance, your predictive power is much greater, and it’s not going into a human or an animal or into the environment. So I think that you have to say that the risk and the benefits, by the way, all are going to depend upon the particular application.

SULLIVAN: Yeah, I think on this point of application, there’s many players involved in that, right. Like, we often hear about this puzzle of who’s actually responsible for ensuring safety and a reasonable balance between risks and benefits or hazards and benefits, to quote you. Is it the scientists, the biotech companies, government agencies? And then if you could touch upon, as well, maybe how does the nature of genome editing risks … how do those responsibilities get divvied up?

CHARO: Well, in the 1980s, we had a very significant policy discussion about whether we should regulate the technology—no matter how it’s used or for whatever purpose—or if we should simply fold the technology in with all the other technologies that we currently have and regulate its applications the way we regulate applications generally. And we went for the second, the so-called coordinated framework.

So what we have in the United States is a system in which if you use genome editing in purely laboratory-based work, then you will be regulated the way we regulate laboratories.

There’s also, at most universities because of the way the government works with this, something called Institutional Biosafety Committees, IBCs. You want to do research that involves recombinant DNA and modern biotechnology, including genome editing but not limited to it, you have to go first to your IBC, and they look and see what you’re doing to decide if there’s a danger there that you have not anticipated that requires special attention.

If what you’re doing is going to get released into the environment or it’s going to be used to change an animal that’s going to be in the environment, then there are agencies that oversee the safety of our environment, predominantly the Environmental Protection Agency and the U.S. Department of Agriculture.

If you’re working with humans and you’re doing medical therapies, like you’re doing the gene therapies that just have been developed for things like sickle cell anemia, then you have to go through a very elaborate regulatory process that’s overseen by the Food and Drug Administration and also overseen locally, at the research stages, by institutional review boards that make sure the people who are being recruited into research understand what they’re getting into, that they’re the right people to be recruited, etc.

So we do have this kind of Jenga game …

SULLIVAN: [LAUGHS] Yeah, sounds like it.

CHARO: … of regulatory agencies. And on top of all that, most of this involves professionals who’ve had to be licensed in some way. There may be state laws specifically on licensing. If you are dealing with things that might cross national borders, there may be international treaties and agreements that cover this.

And, of course, the insurance industry plays a big part because they decide whether or not what you’re doing is safe enough to be insured. So all of these things come together in a way that is not at all easy to understand if you’re not, kind of, working in the field. But the bottom-line thing to remember, the way to really think about it is, we don’t regulate genome editing; we regulate the things that use genome editing.

SULLIVAN: Yeah, that makes a lot of sense. Actually, maybe just following up a little bit on this notion of a variety of different, particularly like government agencies being involved. You know, in this multi-stakeholder model, where do you see gaps today that need to be filled, some of the pros and cons to keep in mind, and, you know, just as we think about distributing these systems at a global level, like, what are some of the considerations you are keeping in mind on that front?

CHARO: Well, certainly there are times where the way the statutes were written that govern the regulation of drugs or the regulation of foods did not anticipate this tremendous capacity we now have in the area of biotechnology generally or genome editing in particular. And so you can find that there are times where it feels a little bit ambiguous, and the agencies have to figure out how to apply their existing rules.

So an example. If you’re going to make alterations in an animal, right, we have a system for regulating drugs, including veterinary drugs. But we didn’t have something that regulated genome editing of animals. But in a sense, genome editing of an animal is the same thing as using a veterinary drug. You’re trying to affect the animal’s physical constitution in some fashion.

And it took a long time within the FDA to, sort of, work out how the regulation of veterinary drugs would apply if you think about the genetic construct that’s being used to alter the animal as the same thing as injecting a chemically based drug. And on that basis, they now know here’s the regulatory path—here are the tests you have to do; here are the permissions you have to do; here’s the surveillance you have to do after it goes on the market.

Even there, sometimes, it was confusing. What happens when it’s not the kind of animal you’re thinking about when you think about animal drugs? Like, we think about pigs and dogs, but what about mosquitoes?

Because there, you’re really thinking more about pests, and if you’re editing the mosquito so that it can’t, for example, transmit dengue fever, right, it feels more like a public health thing than it is a drug for the mosquito itself, and it, kind of, fell in between the agencies that possibly had jurisdiction. And it took a while for the USDA, the Department of Agriculture, and the Food and Drug Administration to work out an agreement about how they would share this responsibility. So you do get those kinds of areas in which you have at least ambiguity.

We also have situations where frankly the fact that some things can move across national borders means you have to have a system for harmonizing or coordinating national rules. If you want to, for example, genetically engineer mosquitoes that can’t transmit dengue, mosquitoes have a tendency to fly. [LAUGHTER] And so … they can’t fly very far. That’s good. That actually makes it easier to control.

But if you’re doing work that’s right near a border, then you have to be sure that the country next to you has the same rules for whether it’s permitted to do this and how to surveil what you’ve done in order to be sure that you got the results you wanted to get and no other results. And that also is an area where we have a lot of work to be done in terms of coordinating across government borders and harmonizing our rules.

SULLIVAN: Yeah, I mean, you’ve touched on this a little bit, but there is such this striking balance between advancing technology, ensuring public safety, and sometimes, I think it feels just like you’re walking a tightrope where, you know, if we clamp down too hard, we’ll stifle innovation, and if we’re too lax, we risk some of these unintended consequences. And on a global scale like you just mentioned, as well. How has the field of genome editing found its balance?

CHARO: It’s still being worked out, frankly, but it’s finding its balance application by application. So in the United States, we have two very different approaches on regulation of things that are going to go into the market.

Some things can’t be marketed until they’ve gotten an approval from the government. So you come up with a new drug, you can’t sell that until it’s gone through FDA approval.

On the other hand, for most foods that are made up of familiar kinds of things, you can go on the market, and it’s only after they’re on the market that the FDA can act to withdraw it if a problem arises. So basically, we have either pre-market controls: you can’t go on without permission. Or post-market controls: we can take you off the market if a problem occurs.

How do we decide which one is appropriate for a particular application? It’s based on our experience. New drugs typically are both less familiar than existing things on the market and also have a higher potential for injury if they, in fact, are not effective or they are, in fact, dangerous and toxic.

If you have foods, even bioengineered foods, that are basically the same as foods that are already here, it can go on the market with notice but without a prior approval. But if you create something truly novel, then it has to go through a whole long process.

And so that is the way that we make this balance. We look at the application area. And we’re just now seeing in the Department of Agriculture a new approach on some of the animal editing, again, to try and distinguish between things that are simply a more efficient way to make a familiar kind of animal variant and those things that are genuinely novel and to have a regulatory process that is more rigid the more unfamiliar it is and the more that we see a risk associated with it.

SULLIVAN: I know we’re at the end of our time here and maybe just a quick kind of lightning-round of a question. For students, young scientists, lawyers, or maybe even entrepreneurs listening who are inspired by your work, what’s the single piece of advice you give them if they’re interested in policy, regulation, the ethical side of things in genomics or other fields?

CHARO: I’d say be a bio-optimist and read a lot of science fiction. Because it expands your imagination about what the world could be like. Is it going to be a world in which we’re now going to be growing our buildings instead of building them out of concrete?

Is it going to be a world in which our plants will glow in the evening so we don’t need to be using batteries or electrical power from other sources but instead our environment is adapting to our needs?

You know, expand your imagination with a sense of optimism about what could be and see ethics and regulation not as an obstacle but as a partner to bringing these things to fruition in a way that’s responsible and helpful to everyone.

[TRANSITION MUSIC]

SULLIVAN: Wonderful. Well, Alta, this has been just an absolute pleasure. So thank you.

CHARO: It was my pleasure. Thank you for having me.

SULLIVAN: Now, I’m happy to bring in Daniel Kluttz. As a partner general manager in Microsoft’s Office of Responsible AI, Daniel leads the group’s Sensitive Uses and Emerging Technologies program.

Daniel, it’s great to have you here. Thanks for coming in.

DANIEL KLUTTZ: It’s great to be here, Kathleen.

SULLIVAN: Yeah. So maybe before we unpack Alta Charo’s insights, I’d love to just understand the elevator pitch here. What exactly is [the] Sensitive Uses and Emerging Tech program, and what was the impetus for establishing it?

KLUTTZ: Yeah. So the Sensitive Uses and Emerging Technologies program sits within our Office of Responsible AI at Microsoft. And inherent in the name, there are two real core functions. There’s the sensitive uses and emerging technologies. What does that mean?

Sensitive uses, think of that as Microsoft’s internal consulting and oversight function for our higher-risk, most impactful AI system deployments. And so my team is a team of multidisciplinary experts who engages in sort of a white-glove-treatment sort of way with product teams at Microsoft that are designing, building, and deploying these higher-risk AI systems, and where that sort of consulting journey culminates is in a set of bespoke requirements tailored to the use case of that given system that really implement and apply our more standardized, generalized requirements that apply across the board.

Then the emerging technologies function of my team faces a little bit further out, trying to look around corners to see what new and novel and emerging risks are coming out of new AI technologies with the idea that we work with our researchers, our engineering partners, and, of course, product leaders across the company to understand where Microsoft is going with those emerging technologies, and we’re developing sort of rapid, quick-fire early-steer guidance that implements our policies ahead of that formal internal policymaking process, which can take a bit of time. So it’s designed to, sort of, both afford that innovation speed that we like to optimize for at Microsoft but also integrate our responsible AI commitments and our AI principles into emerging product development.

SULLIVAN: That segues really nicely, actually, as we met with Professor Charo and she was, you know, talking about the field of genome editing and the governing at the application level. I’d love to just understand how similar or not is that to managing the risks of AI in our world?

KLUTTZ: Yeah. I mean, Professor Charo’s comments were music to my ears because, you know, where we make our bread and butter, so to speak, in our team is in applying to use cases. AI systems, especially in this era of generative AI, are almost inherently multi-use, dual use. And so what really matters is how you’re going to apply that more general-purpose technology. Who’s going to use it? In what domain is it going to be deployed? And then tailor that oversight to those use cases. Try to be risk proportionate.

Professor Charo talked a little bit about this, but if it’s something that’s been done before and it’s just a new spin on an old thing, maybe we’re not so concerned about how closely we need to oversee and gate that application of that technology, whereas if it’s something new and novel or some new risk that might be posed by that technology, we take a little bit closer look and we are overseeing that in a more sort of high-touch way.

SULLIVAN: Maybe following up on that, I mean, how do you define sensitive use or maybe like high-impact application, and once that’s labeled, what happens? Like, what kind of steps kick in from there?

KLUTTZ: Yeah. So we have this Sensitive Uses program that’s been at Microsoft since 2019. I came to Microsoft in 2019 when we were starting this program in the Office of Responsible AI, and it had actually been incubated in Microsoft Research with our Aether community of colleagues who are experts in sociotechnical approaches to responsible AI, as well. Once we put it in the Office of Responsible AI, I came over. I came from academia. I was a researcher myself …

SULLIVAN: At Berkeley, right?

KLUTTZ: At Berkeley. That’s right. Yep. Sociologist by training and a lawyer in a past life. [LAUGHTER] But that has helped sort of bridge those fields for me.

But Sensitive Uses, we force all of our teams when they’re envisioning their system design to think about, could the reasonably foreseeable use or misuse of the system that they’re developing in practice result in three really major, sort of, risk types. One is, could that deployment result in a consequential impact on someone’s legal position or life opportunity? Another category we have is, could that foreseeable use or misuse result in significant psychological or physical injury or harm? And then the third really ties in with a longstanding commitment we’ve had to human rights at Microsoft. And so could that system in its reasonably foreseeable use or misuse result in human rights impacts and injurious consequences to folks along different dimensions of human rights?

Once you decide, we have a process for reporting that project into my office, and we will triage that project, working with the product team, for example, and our Responsible AI Champs community, which are folks who are dispersed throughout the ecosystem at Microsoft and educated in our responsible AI program, and then determine, OK, is it in scope for our program? If it is, say, OK, we’re going to go along for that ride with you, and then we get into that whole sort of consulting arrangement that then culminates in this set of bespoke use-case-based requirements applying our AI principles.
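
[Illustrative aside: the three screening questions Kluttz describes can be pictured as a simple intake check. The minimal Python sketch below is a hypothetical illustration only; the field names, structure, and example are assumptions and do not reflect Microsoft's actual tooling or process.]

from dataclasses import dataclass

# Hypothetical sketch of the three "sensitive use" screening questions
# described in the interview, expressed as flags a product team might
# answer at project intake. Illustrative only.
@dataclass
class SensitiveUseScreen:
    consequential_impact_on_legal_position_or_life_opportunity: bool
    significant_psychological_or_physical_injury: bool
    human_rights_impact: bool

    def in_scope(self) -> bool:
        """A project is routed to heightened review if any category applies."""
        return (
            self.consequential_impact_on_legal_position_or_life_opportunity
            or self.significant_psychological_or_physical_injury
            or self.human_rights_impact
        )

# Example: a hypothetical hiring-screening assistant would trip the first
# category and be routed into the review process for bespoke requirements.
screen = SensitiveUseScreen(
    consequential_impact_on_legal_position_or_life_opportunity=True,
    significant_psychological_or_physical_injury=False,
    human_rights_impact=False,
)
print(screen.in_scope())  # True -> in scope for Sensitive Uses review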

SULLIVAN: That’s super fascinating. What are some of the approaches from the governance of genome editing that you’re maybe seeing happening in AI governance, or maybe just, like, bubbling up in conversations around it?

KLUTTZ: Yeah, I mean, I think we’ve learned a lot from fields like genome editing that Professor Charo talked about and others. And again, it gets back to this, sort of, risk-proportionate-based approach. It’s a balancing test. It’s a tradeoff of trying to, sort of, foster innovation and really look for the beneficial uses of these technologies. I appreciated her speaking about that. What are the intended uses of the system, right? And then getting to, OK, how do we balance trying to, again, foster that innovation in a very fast-moving space, a pretty complex space, and a very unsettled space contrasting to other, sort of, professional fields or technological fields that have a long history and are relatively settled from an oversight and regulatory standpoint? This one is not, and for good reason. It is still developing.

And I think, you know, there are certain oversight and policy regimes that exist today that can be applied. Professor Charo talked about this, as well, where, you know, maybe you have certain policy and oversight regimes that, depending on how the application of that technology is applied, applies there versus some horizontal, overarching regulatory sort of framework. And I think that applies from an internal governance standpoint, as well.

SULLIVAN: Yeah. It’s a great point. So what isn’t being explored from genome editing that, you know, maybe we think could be useful to AI governance, or as we think about the evolving frameworks …

KLUTTZ: Yeah.

SULLIVAN: … what maybe we should be taking into account from what Professor Charo shared with us?

KLUTTZ: So one of the things I’ve thought about and took from Professor Charo’s discussion was she had just this amazing way of framing up how genome editing regulation is done. And she said, you know, we don’t regulate genome editing; we regulate the things that use genome editing. And while it’s not a one-to-one analogy with the AI space because we do have this sort of very general model level distinction versus application layer and even platform layer distinctions, I think it’s fair to say, you know, we don’t regulate AI applications writ large. We regulate the things that use AI in a very similar way. And that’s how we think of our internal policy and oversight process at Microsoft, as well.

And maybe there are things that we regulated and oversaw internally at the first instance and the first time we saw it come through, and it graduates into more of a programmatic framework for how we manage that. So one good example of that is some of our higher-risk AI systems that we offer out of Azure at the platform level. When I say that, I mean APIs that you call that developers can then build their own applications on top of. We were really deep in evaluating and assessing mitigations on those platform systems in the first instance, but we also graduated them into what we call our Limited Access AI services program.

And some of the things that Professor Charo discussed really resonated with me. You know, she had this moment where she was mentioning how, you know, you want to know who’s using your tools and how they’re being used. And it’s the same concepts. We want to have trust in our customers, we want to understand their use cases, and we want to apply technical controls that, sort of, force those use cases or give us signal post-deployment that use cases are being done in a way that may give us some level of concern, to reach out and understand what those use cases are.

SULLIVAN: Yeah, you’re hitting on a great point. And I love this kind of layered approach that we’re taking and that Alta highlighted, as well. Maybe to double-click a little bit just on that post-market control and what we’re tracking, kind of, once things are out and being used by our customers. How do we take some of that deployment data and bring it back in to maybe even better inform upfront governance or just how we think about some of the frameworks that we’re operating in?

KLUTTZ: It’s a great question. The number one thing is for us at Microsoft, we want to know the voice of our customer. We want our customers to talk to us. We don’t want to just understand telemetry and data. But it’s really getting out there and understanding from our customers and not just our customers. I would say our stakeholders is maybe a better term because that includes civil society organizations. It includes governments. It includes all of these non, sort of, customer actors that we care about and that we’re trying to sort of optimize for, as well. It includes end users of our enterprise customers. If we can gather data about how our products are being used and trying to understand maybe areas that we didn’t foresee how customers or users might be using those things, and then we can tune those systems to better align with what both customers and users want but also our own AI principles and policies and programs.

SULLIVAN: Daniel, before coming to Microsoft, you led social science research and sociotechnical applications of AI-driven tech at Berkeley. What do you think some of the biggest challenges are in defining and maybe even just, kind of, measuring at, like, a societal level some of the impacts of AI more broadly?

KLUTTZ: Measuring social phenomena is a difficult thing. And one of the things that, as social scientists, you’re very interested in is scientifically observing and measuring social phenomena. Well, that sounds great. It sounds also very high level and jargony. What do we mean by that? You know, it’s very easy to say that you’re collecting data and you’re measuring, I don’t know, trust in AI, right? That’s a very fuzzy concept.

SULLIVAN: Right. Definitely.

KLUTTZ: It is a concept that we want to get to, but we have to unpack that, and we have to develop what we call measurable constructs. What are the things that we might observe that could give us an indication toward what is a very fuzzy and general concept? And there are challenges with that everywhere. And I’m extremely fortunate to work at Microsoft with some of the world’s leading sociotechnical researchers and some of these folks who are thinking about—you know, very steeped in measurement theory, literally PhDs in these fields—how to both measure and allow for a scalable way to do that at a place the size of Microsoft. And that is trying to develop frameworks that are scalable and repeatable and put into our platform that then serves our product teams. Are we providing, as a platform, a service to those product teams that they can plug in and do their automated evaluations at scale as much as possible and then go back in over the top and do some of your more qualitative targeted testing and evaluations?
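
[Illustrative aside: the idea of unpacking a fuzzy concept into measurable constructs can be sketched in a few lines of Python. The indicator names, scores, and simple averaging below are invented for illustration and are not an actual evaluation framework.]

from statistics import mean

# Hypothetical indicators standing in for the fuzzy construct "trust in AI".
# Each is assumed to be scored on a 0-1 scale by some evaluation procedure
# (automated checks, annotator ratings, survey items, and so on).
indicators = {
    "groundedness_of_answers": 0.86,   # automated eval against source documents
    "user_reported_reliance": 0.74,    # survey item
    "output_acceptance_rate": 0.91,    # share of outputs users keep unedited
}

def construct_score(scores: dict[str, float]) -> float:
    """Aggregate indicator scores into one construct-level score.
    A real measurement model would validate and weight indicators;
    an unweighted mean keeps this sketch minimal."""
    return mean(scores.values())

print(f"Illustrative 'trust in AI' construct score: {construct_score(indicators):.2f}")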

SULLIVAN: Yeah, makes a lot of sense. Before we close out, if you’re game for it, maybe we do a quick lightning round. Just 30-second answers here. Favorite real-world sensitive use case you’ve ever reviewed.

KLUTTZ: Oh gosh. Wow, this is where I get to be the social scientist.

SULLIVAN: [LAUGHS] Yes.

KLUTTZ: It’s like, define favorite, Kathleen. [LAUGHS] Most memorable, most painful.

SULLIVAN: Let’s do most memorable.

KLUTTZ: We’ll do most memorable.

SULLIVAN: Yeah.

KLUTTZ: You know, I would say the most memorable project I worked on was when we rolled out the new Bing Chat, which is no longer called Bing Chat, because that was the first really big cross-company effort to deploy GPT-4, which was, you know, the next step up in AI innovation from our partners at OpenAI. And I really value working hand in hand with engineering teams and with researchers and that was us at our best and really sort of turbocharged the model that we have.

SULLIVAN: Wonderful. What’s one of the most overused phrases that you have in your AI governance meetings?

KLUTTZ: Gosh. [LAUGHS] If I hear “We need to get aligned; we need to align on this more” …

SULLIVAN: [LAUGHS] Right.

KLUTTZ: But, you know, it’s said for a reason. And I think it sort of speaks to that clever nature. That’s one that comes to mind.

SULLIVAN: That’s great. And then maybe, maybe last one. What are you most excited about in the next, I don’t know, let’s say three months? This world is moving so fast!

KLUTTZ: You know, the pace of innovation, as you just said, is just staggering. It is unbelievable. And sometimes it can feel overwhelming in my space. But what I am most excited about is how we are building up this Emerging … I mentioned this Emerging Technologies program in my team as a, sort of, formal program is relatively new. And I really enjoy being able to take a step back and think a little bit more about the future and a little bit more holistically. And I love working with engineering teams and sort of strategic visionaries who are thinking about what we’re doing a year from now or five years from now, or even 10 years from now, and I get to be a part of those conversations. And that really gives me energy and helps me … helps keep me grounded and not just dealing with the day to day, and, you know, various fire drills that you may run. It’s thinking strategically and having that foresight about what’s to come. And it’s exciting.

SULLIVAN: Great. Well, Daniel, just thanks so much for being here. I had such a wonderful discussion with you, and I think the thoughtfulness in our discussion today I hope resonates with our listeners. And again, thanks to Alta for setting the stage and sharing her really amazing, insightful thoughts here, as well. So thank you.

[MUSIC]

KLUTTZ: Thank you, Kathleen. I appreciate it. It’s been fun.

SULLIVAN: And to our listeners, thanks for tuning in. You can find resources related to this podcast in the show notes. And if you want to learn more about how Microsoft approaches AI governance, you can visit microsoft.com/RAI.

See you next time! 

[MUSIC FADES]


