Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

Can AI run a physical shop? Anthropic’s Claude tried and the results were gloriously, hilariously bad

3D Printing With Filigree Patterns | Two Minute Papers #89

Ryan Hall: Principles of Jiu Jitsu | Take It Uneasy Podcast

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • Amazon (Titan)
    • Anthropic (Claude 3)
    • Cohere (Command R)
    • Google DeepMind (Gemini)
    • IBM (Watsonx)
    • Inflection AI (Pi)
    • Meta (LLaMA)
    • OpenAI (GPT-4 / GPT-4o)
    • Reka AI
    • xAI (Grok)
    • Adobe Sensi
    • Aleph Alpha
    • Alibaba Cloud (Qwen)
    • Apple Core ML
    • Baidu (ERNIE)
    • ByteDance Doubao
    • C3 AI
    • DataRobot
    • DeepSeek
  • AI Research & Breakthroughs
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Education AI
    • Energy AI
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Media & Entertainment
    • Transportation AI
    • Manufacturing AI
    • Retail AI
    • Agriculture AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
Facebook X (Twitter) Instagram
Advanced AI News
VentureBeat AI

The Hidden Costs of AI: Securing Inference in an Age of Attacks

Advanced AI EditorBy Advanced AI EditorJune 29, 2025No Comments12 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


This article is part of VentureBeat’s special issue, “The Real Cost of AI: Performance, Efficiency and ROI at Scale.” Read more from this special issue.

AI’s promise is undeniable, but so are its blindsiding security costs at the inference layer. New attacks targeting AI’s operational side are quietly inflating budgets, jeopardizing regulatory compliance and eroding customer trust, all of which threaten the return on investment (ROI) and total cost of ownership of enterprise AI deployments.

AI has captivated the enterprise with its potential for game-changing insights and efficiency gains. Yet, as organizations rush to operationalize their models, a sobering reality is emerging: The inference stage, where AI translates investment into real-time business value, is under siege. This critical juncture is driving up the total cost of ownership (TCO) in ways that initial business cases failed to predict.

Security executives and CFOs who greenlit AI projects for their transformative upside are now grappling with the hidden expenses of defending these systems. Adversaries have discovered that inference is where AI “comes alive” for a business, and it’s precisely where they can inflict the most damage. The result is a cascade of cost inflation: Breach containment can exceed $5 million per incident in regulated sectors, compliance retrofits run into the hundreds of thousands and trust failures can trigger stock hits or contract cancellations that decimate projected AI ROI. Without cost containment at inference, AI becomes an ungovernable budget wildcard.

The unseen battlefield: AI inference and exploding TCO

AI inference is rapidly becoming the “next insider risk,” Cristian Rodriguez, field CTO for the Americas at CrowdStrike, told the audience at RSAC 2025.

Other technology leaders echo this perspective and see a common blind spot in enterprise strategy. Vineet Arora, CTO at WinWire, notes that many organizations “focus intensely on securing the infrastructure around AI while inadvertently sidelining inference.” This oversight, he explains, “leads to underestimated costs for continuous monitoring systems, real-time threat analysis and rapid patching mechanisms.”

Another critical blind spot, according to Steffen Schreier, SVP of product and portfolio at Telesign, is “the assumption that third-party models are thoroughly vetted and inherently safe to deploy.”

He warned that in reality, “these models often haven’t been evaluated against an organization’s specific threat landscape or compliance needs,” which can lead to harmful or non-compliant outputs that erode brand trust. Schreier told VentureBeat that “inference-time vulnerabilities — like prompt injection, output manipulation or context leakage — can be exploited by attackers to produce harmful, biased or non-compliant outputs. This poses serious risks, especially in regulated industries, and can quickly erode brand trust.”

When inference is compromised, the fallout hits multiple fronts of TCO. Cybersecurity budgets spiral, regulatory compliance is jeopardized and customer trust erodes. Executive sentiment reflects this growing concern. In CrowdStrike’s State of AI in Cybersecurity survey, only 39% of respondents felt generative AI’s rewards clearly outweigh the risks, while 40% judged them comparable. This ambivalence underscores a critical finding: Safety and privacy controls have become top requirements for new gen AI initiatives, with a striking 90% of organizations now implementing or developing policies to govern AI adoption. The top concerns are no longer abstract; 26% cite sensitive data exposure and 25% fear adversarial attacks as key risks.

Security leaders exhibit mixed sentiments regarding the overall safety of gen AI, with top concerns centered on the exposure of sensitive data to LLMs (26%) and adversarial attacks on AI tools (25%).

Anatomy of an inference attack

The unique attack surface exposed by running AI models is being aggressively probed by adversaries. To defend against this, Schreier advises, “it is critical to treat every input as a potential hostile attack.” Frameworks like the OWASP Top 10 for Large Language Model (LLM) Applications catalogue these threats, which are no longer theoretical but active attack vectors impacting the enterprise:

Prompt injection (LLM01) and insecure output handling (LLM02): Attackers manipulate models via inputs or outputs. Malicious inputs can cause the model to ignore instructions or divulge proprietary code. Insecure output handling occurs when an application blindly trusts AI responses, allowing attackers to inject malicious scripts into downstream systems.

Training data poisoning (LLM03) and model poisoning: Attackers corrupt training data by sneaking in tainted samples, planting hidden triggers. Later, an innocuous input can unleash malicious outputs.

Model denial of service (LLM04): Adversaries can overwhelm AI models with complex inputs, consuming excessive resources to slow or crash them, resulting in direct revenue loss.

Supply chain and plugin vulnerabilities (LLM05 and LLM07): The AI ecosystem is built on shared components. For instance, a vulnerability in the Flowise LLM tool exposed private AI dashboards and sensitive data, including GitHub tokens and OpenAI API keys, on 438 servers.

Sensitive information disclosure (LLM06): Clever querying can extract confidential information from an AI model if it was part of its training data or is present in the current context.

Excessive agency (LLM08) and Overreliance (LLM09): Granting an AI agent unchecked permissions to execute trades or modify databases is a recipe for disaster if manipulated.

Model theft (LLM10): An organization’s proprietary models can be stolen through sophisticated extraction techniques — a direct assault on its competitive advantage.

Underpinning these threats are foundational security failures. Adversaries often log in with leaked credentials. In early 2024, 35% of cloud intrusions involved valid user credentials, and new, unattributed cloud attack attempts spiked 26%, according to the CrowdStrike 2025 Global Threat Report. A deepfake campaign resulted in a fraudulent $25.6 million transfer, while AI-generated phishing emails have demonstrated a 54% click-through rate, more than four times higher than those written by humans.

The OWASP framework illustrates how various LLM attack vectors target different components of an AI application, from prompt injection at the user interface to data poisoning in the training models and sensitive information disclosure from the datastore.

Back to basics: Foundational security for a new era

Securing AI requires a disciplined return to security fundamentals — but applied through a modern lens. “I think that we need to take a step back and ensure that the foundation and the fundamentals of security are still applicable,” Rodriguez argued. “The same approach you would have to securing an OS is the same approach you would have to securing that AI model.”

This means enforcing unified protection across every attack path, with rigorous data governance, robust cloud security posture management (CSPM), and identity-first security through cloud infrastructure entitlement management (CIEM) to lock down the cloud environments where most AI workloads reside. As identity becomes the new perimeter, AI systems must be governed with the same strict access controls and runtime protections as any other business-critical cloud asset.

The specter of “shadow AI”: Unmasking hidden risks

Shadow AI, or the unsanctioned use of AI tools by employees, creates a massive, unknown attack surface. A financial analyst using a free online LLM for confidential documents can inadvertently leak proprietary data. As Rodriguez warned, queries to public models can “become another’s answers.” Addressing this requires a combination of clear policy, employee education, and technical controls like AI security posture management (AI-SPM) to discover and assess all AI assets, sanctioned or not.

Fortifying the future: Actionable defense strategies

While adversaries have weaponized AI, the tide is beginning to turn. As Mike Riemer, Field CISO at Ivanti, observes, defenders are beginning to “harness the full potential of AI for cybersecurity purposes to analyze vast amounts of data collected from diverse systems.” This proactive stance is essential for building a robust defense, which requires several key strategies:

Budget for inference security from day zero: The first step, according to Arora, is to begin with “a comprehensive risk-based assessment.” He advises mapping the entire inference pipeline to identify every data flow and vulnerability. “By linking these risks to possible financial impacts,” he explains, “we can better quantify the cost of a security breach” and build a realistic budget.

To structure this more systematically, CISOs and CFOs should start with a risk-adjusted ROI model. One approach:

Security ROI = (estimated breach cost × annual risk probability) – total security investment

For example, if an LLM inference attack could result in a $5 million loss and the likelihood is 10%, the expected loss is $500,000. A $350,000 investment in inference-stage defenses would yield a net gain of $150,000 in avoided risk. This model enables scenario-based budgeting tied directly to financial outcomes.

Enterprises allocating less than 8 to 12% of their AI project budgets to inference-stage security are often blindsided later by breach recovery and compliance costs. A Fortune 500 healthcare provider CIO, interviewed by VentureBeat and requesting anonymity, now allocates 15% of their total gen AI budget to post-training risk management, including runtime monitoring, AI-SPM platforms and compliance audits. A practical budgeting model should allocate across four cost centers: runtime monitoring (35%), adversarial simulation (25%), compliance tooling (20%) and user behavior analytics (20%).

Here’s a sample allocation snapshot for a $2 million enterprise AI deployment based on VentureBeat’s ongoing interviews with CFOs, CIOs and CISOs actively budgeting to support AI projects:

Budget categoryAllocationUse case exampleRuntime monitoring$300,000Behavioral anomaly detection (API spikes)Adversarial simulation$200,000Red team exercises to probe prompt injectionCompliance tooling$150,000EU AI Act alignment, SOC 2 inference validationsUser behavior analytics$150,000Detect misuse patterns in internal AI use

These investments reduce downstream breach remediation costs, regulatory penalties and SLA violations, all helping to stabilize AI TCO.

Implement runtime monitoring and validation: Begin by tuning anomaly detection to detect behaviors at the inference layer, such as abnormal API call patterns, output entropy shifts or query frequency spikes. Vendors like DataDome and Telesign now offer real-time behavioral analytics tailored to gen AI misuse signatures.

Teams should monitor entropy shifts in outputs, track token irregularities in model responses and watch for atypical frequency in queries from privileged accounts. Effective setups include streaming logs into SIEM tools (such as Splunk or Datadog) with tailored gen AI parsers and establishing real-time alert thresholds for deviations from model baselines.

Adopt a zero-trust framework for AI: Zero-trust is non-negotiable for AI environments. It operates on the principle of “never trust, always verify.” By adopting this architecture, Riemer notes, organizations can ensure that “only authenticated users and devices gain access to sensitive data and applications, regardless of their physical location.”

Inference-time zero-trust should be enforced at multiple layers:

Identity: Authenticate both human and service actors accessing inference endpoints.

Permissions: Scope LLM access using role-based access control (RBAC) with time-boxed privileges.

Segmentation: Isolate inference microservices with service mesh policies and enforce least-privilege defaults through cloud workload protection platforms (CWPPs).

A proactive AI security strategy requires a holistic approach, encompassing visibility and supply chain security during development, securing infrastructure and data and implementing robust safeguards to protect AI systems in runtime during production.

Protecting AI ROI: A CISO/CFO collaboration model

Protecting the ROI of enterprise AI requires actively modeling the financial upside of security. Start with a baseline ROI projection, then layer in cost-avoidance scenarios for each security control. Mapping cybersecurity investments to avoided costs including incident remediation, SLA violations and customer churn, turns risk reduction into a measurable ROI gain.

Enterprises should model three ROI scenarios that include baseline, with security investment and post-breach recovery to show cost avoidance clearly. For example, a telecom deploying output validation prevented 12,000-plus misrouted queries per month, saving $6.3 million annually in SLA penalties and call center volume. Tie investments to avoided costs across breach remediation, SLA non-compliance, brand impact and customer churn to build a defensible ROI argument to CFOs.

Checklist: CFO-Grade ROI protection model

CFOs need to communicate with clarity on how security spending protects the bottom line. To safeguard AI ROI at the inference layer, security investments must be modeled like any other strategic capital allocation: With direct links to TCO, risk mitigation and revenue preservation.

Use this checklist to make AI security investments defensible in the boardroom — and actionable in the budget cycle.

Link every AI security spend to a projected TCO reduction category (compliance, breach remediation, SLA stability).

Run cost-avoidance simulations with 3-year horizon scenarios: baseline, protected and breach-reactive.

Quantify financial risk from SLA violations, regulatory fines, brand trust erosion and customer churn.

Co-model inference-layer security budgets with both CISOs and CFOs to break organizational silos.

Present security investments as growth enablers, not overhead, showing how they stabilize AI infrastructure for sustained value capture.

This model doesn’t just defend AI investments; it defends budgets and brands and can protect and grow boardroom credibility.

Concluding analysis: A strategic imperative

CISOs must present AI risk management as a business enabler, quantified in terms of ROI protection, brand trust preservation and regulatory stability. As AI inference moves deeper into revenue workflows, protecting it isn’t a cost center; it’s the control plane for AI’s financial sustainability. Strategic security investments at the infrastructure layer must be justified with financial metrics that CFOs can act on.

The path forward requires organizations to balance investment in AI innovation with an equal investment in its protection. This necessitates a new level of strategic alignment. As Ivanti CIO Robert Grazioli told VentureBeat: “CISO and CIO alignment will be critical to effectively safeguard modern businesses.” This collaboration is essential to break down the data and budget silos that undermine security, allowing organizations to manage the true cost of AI and turn a high-risk gamble into a sustainable, high-ROI engine of growth.

Telesign’s Schreier added: “We view AI inference risks through the lens of digital identity and trust. We embed security across the full lifecycle of our AI tools — using access controls, usage monitoring, rate limiting and behavioral analytics to detect misuse and protect both our customers and their end users from emerging threats.”

He continued: “We approach output validation as a critical layer of our AI security architecture, particularly because many inference-time risks don’t stem from how a model is trained, but how it behaves in the wild.”



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleStyLit, Illumination-Guided Artistic Style Transfer | Two Minute Papers #91
Next Article Four OpenAI Lead Researchers Leave for Meta, Slack Profile Deactivated
Advanced AI Editor
  • Website

Related Posts

Can AI run a physical shop? Anthropic’s Claude tried and the results were gloriously, hilariously bad

June 29, 2025

Kumo’s ‘relational foundation model’ predicts the future your LLM can’t see

June 29, 2025

Model minimalism: The new AI strategy saving companies millions

June 29, 2025
Leave A Reply Cancel Reply

Latest Posts

Tituss Burgess Teams Up With Lyft To Offer Pride Weekend Discounts

‘Squid Game’ Director Hwang Dong-Hyuk On Making Seasons 2 And 3

Nathan Fielder’s The Rehearsal is One of Many Genre-Defying Projects.

From Royal Drawings To Rare Meteorites

Latest Posts

Can AI run a physical shop? Anthropic’s Claude tried and the results were gloriously, hilariously bad

June 29, 2025

3D Printing With Filigree Patterns | Two Minute Papers #89

June 29, 2025

Ryan Hall: Principles of Jiu Jitsu | Take It Uneasy Podcast

June 29, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • Can AI run a physical shop? Anthropic’s Claude tried and the results were gloriously, hilariously bad
  • 3D Printing With Filigree Patterns | Two Minute Papers #89
  • Ryan Hall: Principles of Jiu Jitsu | Take It Uneasy Podcast
  • Germany asks Apple, Google to ban Chinese AI app DeepSeek over privacy concerns
  • Kumo’s ‘relational foundation model’ predicts the future your LLM can’t see

Recent Comments

No comments to show.

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

YouTube LinkedIn
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.