Advanced AI News

I retested Microsoft Copilot’s AI coding skills in 2025 and now it’s got serious game

By Advanced AI Editor | April 25, 2025 | 7 Mins Read


[Image: Copilot. Credit: Microsoft]

There’s been a ton of buzz about how AIs can help with programming, but in the first year or two of generative AI, much of that was hype. Microsoft ran huge events celebrating how Copilot could help you code, but when I put it to the test in April 2024, it failed all four of my standardized tests. It completely struck out. Crashed and burned. Fell off the cliff. It performed the worst of any AI I tested.

Mixed metaphors aside, let’s stick with baseball. Copilot traded its cleats for a bus pass. It was not worthy.

Also: The best AI for coding in 2025 (and what not to use)

But time spent in the bullpen of life seems to have helped Copilot. This time, when it showed up for tryouts, it was warmed up and ready to step into the box. It was throwing heat in the bullpen. When it was time to play, it had its eye on the ball and its swing dialed in. Clearly, it was game-ready and looking for a pitch to drive.

But could it withstand my tests? With a squint in my eye, I stepped onto the pitcher’s mound and started off with an easy lob. Back in 2024, you could feel the wind as Copilot swung and missed. But now, in April 2025, Copilot connected squarely with the ball and hit it straight and true.

Also: How I test an AI chatbot’s coding ability – and you can, too

We had to send Copilot down, but it fought its way back to the show. Here’s the play-by-play.

1. Writing a WordPress plugin

Well, Copilot has certainly improved since its first run of this test in April 2024. The first time, it didn’t provide code to actually display the randomized lines. It did store them in a value, but it never retrieved and displayed them. In other words, it swung and missed. It didn’t produce any output.

This is the result of the latest run:

[Screenshot: line-shuffler, by David Gewirtz/ZDNET]

This time, the code worked. It did leave a random extra blank line at the end, but since it fulfilled the programming assignment, we’ll call it good.
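To make the assignment concrete, here is a minimal sketch of the core line-shuffling logic. It is written in Python rather than as a WordPress plugin, so it only illustrates the idea and is not the code Copilot produced. The function name `shuffle_lines` and the `seed` parameter are hypothetical. One common way a stray blank line can show up (an assumption, not a diagnosis of Copilot's output) is a trailing newline in the input being treated as one more empty line, so the sketch drops blank lines before shuffling.

```python
import random
from typing import Optional


def shuffle_lines(text: str, seed: Optional[int] = None) -> str:
    """Return the non-blank lines of `text` in random order, one per line.

    Hypothetical helper for illustration only; `seed` exists so the
    result can be made repeatable while testing.
    """
    # splitlines() does not yield a trailing empty string for a final "\n".
    # The extra strip() filter also drops blank lines inside the input
    # (a design assumption, chosen so no empty line reaches the output).
    lines = [line for line in text.splitlines() if line.strip()]
    rng = random.Random(seed)
    rng.shuffle(lines)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "alpha\nbeta\ngamma\n"
    print(shuffle_lines(sample, seed=1))
```

Joining with `"\n"` instead of appending a newline after every line keeps the output from ending in an empty line of its own.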

Also: How to use ChatGPT to write code – and my favorite trick to debug what it generates

Copilot’s unbroken streak of absolutely unmitigated programming failures has been broken. Let’s see how it does in the rest of the tests.

2. Rewriting a string function

This test is designed to check dollars-and-cents conversions. In my first test back in April 2024, the Copilot-generated code did properly flag an error if a value containing a letter or more than one decimal point was sent to it, but it didn’t perform a complete validation. It allowed results through that could have caused subsequent routines to fail.

Also: How I used ChatGPT to write a custom JavaScript bookmarklet

This run, however, did pretty well. It performs most of the tests properly. It returns false for numbers with more than two digits to the right of the decimal point, like 1.234 and 1.230. It also returns false for numbers with extra leading zeros. So 0.01 is allowed, but 00.01 is not.

Technically, these values could be converted to usable currency values, but it’s never bad for a validation routine to be strict in its tests. The main goal is that the validation routine doesn’t let a value through that could cause a subsequent routine to crash. Copilot did good here.
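The strict behavior described above can be sketched in a few lines. This is a Python illustration under stated assumptions, not the function from the test or the code Copilot generated. The name `is_valid_amount` is hypothetical, and the choice to reject inputs such as `.50` or an empty string is an assumption the article does not cover.

```python
import re

# Whole part: a single "0" or a number with no leading zeros.
# Optional fraction: a decimal point followed by one or two digits.
_AMOUNT = re.compile(r"(0|[1-9][0-9]*)(\.[0-9]{1,2})?")


def is_valid_amount(value: str) -> bool:
    """Return True only for strictly formatted dollars-and-cents strings.

    Hypothetical validator for illustration: it rejects letters, more
    than one decimal point, more than two decimal digits, and extra
    leading zeros.
    """
    # fullmatch() requires the entire string to fit the pattern, so
    # trailing junk such as "1.50abc" or "1.2.3" is rejected.
    return _AMOUNT.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["0.01", "00.01", "1.234", "1.230", "1.2.3", "12a", "15"]:
        print(candidate, is_valid_amount(candidate))
```

A validator like this errs on the side of rejecting input, which matches the goal stated above: nothing gets through that could crash a later routine.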

We’re now at two for two, a huge improvement over its results from its first run.

3. Finding an annoying bug

I gotta tell you how Copilot first answered this back in April 2024, because it’s just too good.

Also: Why I just added Gemini 2.5 Pro to the very short list of AI tools I pay for

This tests the AI’s ability to think a few chess moves ahead. The answer that seems obvious isn’t the right answer. I got caught by that when I was originally debugging the issue that eventually became this test.

On Copilot’s first run, it suggested I check the spelling of my function name and the WordPress hook name. The WordPress hook is a published thing, so Copilot should have been able to confirm spelling. And my function is my function, so I can spell it however I want. If I had misspelled it somewhere in the code, the IDE would have very visibly pointed it out.

And it got better. Back then, Copilot also quite happily repeated the problem statement to me, suggesting I solve the problem myself. Yeah, its entire recommendation was that I debug it. Well, duh. Then, it ended with “consider seeking support from the plugin developer or community forums. 😊” — and yeah, that emoji was part of the AI’s response.

It was a spectacular, enthusiastic, emojic failure. See what I mean? Early AI answers, no matter how useless, should be immortalized.

Especially when Copilot wasn’t nearly as much fun this time. It just solved it. Quickly, cleanly, clearly. Done and done. Solved.

[Screenshot by David Gewirtz/ZDNET]

That puts Copilot at three-for-three and decisively moves it out of the “don’t use this tool” category. Bases are loaded. Let’s see if Copilot can score a home run.

4. Writing a script

The idea with this test is that it asks about a fairly obscure Mac scripting tool called Keyboard Maestro, as well as Apple’s scripting language AppleScript, and Chrome scripting behavior. For the record, Keyboard Maestro is one of the single biggest reasons I use Macs over Windows for my daily productivity, because it allows the entire OS and the various applications to be reprogrammed to suit my needs. It’s that powerful.

In any case, to pass the test, the AI has to properly describe how to solve the problem using a mix of Keyboard Maestro code, AppleScript code, and Chrome API functionality. 

Also: AI has grown beyond human knowledge, says Google’s DeepMind unit

Back in the day, Copilot didn’t do it right. It completely ignored Keyboard Maestro (at the time, it probably wasn’t in its knowledge base). In the generated AppleScript, where I asked it to just scan the current window, Copilot repeated the process for all windows, returning results for the wrong window (the last one in the chain).

But not now. This time, Copilot did it right. It did exactly what was asked, got the right window and tab, properly talked to Keyboard Maestro and Chrome, and used actual AppleScript syntax for the AppleScript.
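The 2024 failure mode (looping over every window and ending up with the last one) is easy to see in a toy model. The sketch below is plain Python, not AppleScript, Keyboard Maestro, or any Chrome API. The `Window` class and both function names are hypothetical, and treating the first list entry as the front window is an assumption made only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Window:
    """Toy stand-in for a browser window (hypothetical, illustration only)."""
    tabs: List[str]
    active: int  # index of the active tab within `tabs`


def active_tab_all_windows(windows: List[Window]) -> Optional[str]:
    """Mimic the described bug: visit every window, keep the last result."""
    result = None
    for window in windows:
        # Each pass overwrites the previous answer, so the caller
        # receives the tab from the final window in the list.
        result = window.tabs[window.active]
    return result


def active_tab_front_window(windows: List[Window]) -> str:
    """Look only at the front window (assumed here to be windows[0])."""
    front = windows[0]
    return front.tabs[front.active]


if __name__ == "__main__":
    windows = [Window(["inbox", "article"], 1), Window(["docs"], 0)]
    print(active_tab_all_windows(windows))   # tab from the last window
    print(active_tab_front_window(windows))  # tab from the front window
```

With two windows open, the first function reports `docs` even though the front window is showing `article`, which is the kind of wrong-window result described above.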

Bases loaded. Home run.

Overall results

Last year, I said I wasn’t impressed. In fact, I found the results a little demoralizing. But I also said this:

“Ah well, Microsoft does improve its products over time. Maybe by next year.”

In the past year, Copilot went from strikeouts to scoreboard shaker. It went from batting cleanup in the basement to chasing a pennant under the lights.

What about you? Have you taken Copilot or another AI coding assistant out to the field lately? Do you think it’s finally ready for the big leagues, or is it still riding the bench? Have you had any strikeouts or home runs using AI for development? And what would it take for one of these tools to earn a spot in your starting lineup? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.




