Advanced AI News
This open-source LLM could redefine AI research, and it’s 100% public

By Advanced AI Editor · August 6, 2025 · 7 min read


What is the open-source LLM from ETH Zurich and EPFL?

ETH Zurich and EPFL’s open-weight LLM offers a transparent alternative to black-box AI, built on green compute and set for public release.

Large language models (LLMs), which are neural networks that predict the next word in a sentence, are powering today’s generative AI. Most remain closed, usable by the public, yet inaccessible for inspection or improvement. This lack of transparency conflicts with Web3’s principles of openness and permissionless innovation.

So everyone took notice when ETH Zurich and the Swiss Federal Institute of Technology in Lausanne (EPFL) announced a fully public model, trained on Switzerland’s carbon‑neutral “Alps” supercomputer and slated for release under Apache 2.0 later this year.

It is generally referred to as “Switzerland’s open LLM,” “a language model built for the public good,” or “the Swiss large language model,” but no specific brand or project name has been shared in public statements so far.

An open‑weight LLM is a model whose parameters can be downloaded, audited and fine‑tuned locally, unlike API‑only “black‑box” systems.

Anatomy of the Swiss public LLM

  • Scale: Two configurations, 8 billion and 70 billion parameters, trained on 15 trillion tokens.
  • Languages: Coverage of more than 1,500 languages, thanks to a 60/40 English–non‑English data set.
  • Infrastructure: 10,000 Nvidia Grace‑Hopper chips on “Alps,” powered entirely by renewable energy.
  • Licence: Open code and weights, enabling fork‑and‑modify rights for researchers and startups alike.
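As a quick sanity check, these figures pin down the training budget per model size. The sketch below uses only the numbers quoted above; the roughly 20-tokens-per-parameter "Chinchilla" compute-optimal guideline mentioned in the comment is an external rule of thumb, not part of the announcement:

```python
# Training-budget arithmetic from the figures above.
TOKENS = 15e12          # 15 trillion training tokens
PARAMS_LARGE = 70e9     # 70B configuration
PARAMS_SMALL = 8e9      # 8B configuration

ratio_large = TOKENS / PARAMS_LARGE   # ~214 tokens per parameter
ratio_small = TOKENS / PARAMS_SMALL   # 1875 tokens per parameter

english = TOKENS * 0.60   # ~9 trillion English tokens
other = TOKENS * 0.40     # ~6 trillion non-English tokens

print(round(ratio_large), round(ratio_small))  # 214 1875
```

Both configurations sit far above the ~20 tokens-per-parameter compute-optimal guideline, i.e. they are heavily over-trained for their size, a common choice for deployment-oriented open models.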

What makes Switzerland’s LLM stand out

Switzerland’s LLM blends openness, multilingual scale and green infrastructure to offer a radically transparent LLM.

  • Open-by-design architecture: Unlike GPT‑4, which offers only API access, the Swiss LLM will provide all of its neural-network parameters (weights), training code and data set references under an Apache 2.0 license, empowering developers to fine‑tune, audit and deploy without restrictions.
  • Dual model sizes: It will be released in 8‑billion and 70‑billion‑parameter versions, spanning lightweight to large-scale usage with consistent openness, something GPT‑4, estimated at 1.7 trillion parameters, does not offer publicly.
  • Massive multilingual reach: Trained on 15 trillion tokens across more than 1,500 languages (roughly 60% English, 40% non‑English), it challenges GPT‑4’s English-centric dominance with truly global inclusivity.
  • Green, sovereign compute: Built on the Swiss National Supercomputing Centre (CSCS)’s carbon-neutral Alps cluster (10,000 Nvidia Grace‑Hopper superchips delivering over 40 exaflops in FP8 mode), it combines scale with a sustainability absent from private cloud training.
  • Transparent data practices: Complying with Swiss data protection, copyright norms and EU AI Act transparency rules, the model respects crawler opt‑outs without sacrificing performance, underscoring a new ethical standard.
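The compute claim is easy to cross-check: 40 exaflops of FP8 spread over 10,000 chips works out to about 4 petaflops per chip, roughly in line with Nvidia's published FP8 peak for a Grace-Hopper superchip. A back-of-envelope sketch, not an official spec sheet:

```python
# Per-chip throughput implied by the cluster figures above.
CLUSTER_FP8_FLOPS = 40e18   # 40 exaflops, FP8 mode
NUM_CHIPS = 10_000          # Grace-Hopper superchips on Alps

per_chip_pflops = CLUSTER_FP8_FLOPS / NUM_CHIPS / 1e15
print(per_chip_pflops)  # 4.0 petaflops per chip
```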

What fully open AI model unlocks for Web3

Full model transparency enables onchain inference, tokenized data flows and oracle-safe DeFi integrations with no black boxes required.

  • Onchain inference: Running trimmed versions of the Swiss model inside rollup sequencers could enable real‑time smart‑contract summarization and fraud proofs.
  • Tokenized data marketplaces: Because the training corpus is transparent, data contributors can be rewarded with tokens and audited for bias.
  • Composability with DeFi tooling: Open weights allow deterministic outputs that oracles can verify, reducing manipulation risk when LLMs feed price models or liquidation bots.
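The oracle-verification point hinges on determinism: with open weights and deterministic decoding (e.g., temperature 0, fixed seed), any node can rerun the model and compare commitments. A minimal sketch; the function and field names here are illustrative, not from any existing protocol:

```python
import hashlib
import json

def inference_commitment(weights_hash: str, prompt: str, output: str) -> str:
    """Digest an (input, output) pair against a specific weight snapshot.

    Because the weights are open and decoding is deterministic, independent
    oracle nodes can rerun the model and reproduce this exact digest, which
    is what makes an LLM output verifiable onchain.
    """
    payload = json.dumps(
        {"weights": weights_hash, "prompt": prompt, "output": output},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Two independent verifiers computing the same commitment agree:
a = inference_commitment("abc123", "summarize tx 0x01", "transfer of 5 ETH")
b = inference_commitment("abc123", "summarize tx 0x01", "transfer of 5 ETH")
assert a == b
```

Any divergence in weights, prompt, or output changes the digest, so a mismatched commitment flags either a tampered model or a non-deterministic run.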


Did you know? Open-weight LLMs can run inside rollups, helping smart contracts summarize legal docs or flag suspicious transactions in real time.

AI market tailwinds you can’t ignore

  • The AI market is projected to surpass $500 billion, with more than 80% controlled by closed providers.
  • Blockchain‑AI is projected to grow from $550 million in 2024 to $4.33 billion by 2034 (22.9% CAGR).
  • 68% of enterprises already pilot AI agents, and 59% cite model flexibility and governance as top selection criteria, a vote of confidence for open weights.
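The blockchain-AI projection is internally consistent: compounding the 2024 base at the quoted CAGR reproduces the 2034 figure. A quick check of the article's own numbers:

```python
# $550M compounding at 22.9% a year over the ten years 2024 -> 2034.
start_usd = 550e6
cagr = 0.229
years = 10

projected = start_usd * (1 + cagr) ** years
print(round(projected / 1e9, 2))  # ~4.32, matching the quoted $4.33B to rounding
```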

Regulation: EU AI Act meets sovereign model

Public LLMs, like Switzerland’s upcoming model, are designed to comply with the EU AI Act, offering a clear advantage in transparency and regulatory alignment.

On July 18, 2025, the European Commission issued guidance for systemic‑risk foundation models. Requirements include adversarial testing, detailed training‑data summaries and cybersecurity audits, all effective Aug. 2, 2025. Open‑source projects that publish their weights and data sets can satisfy many of these transparency mandates out of the box, giving public models a compliance edge.

Swiss LLM vs GPT‑4


GPT‑4 still holds an edge in raw performance due to scale and proprietary refinements. But the Swiss model closes the gap, especially for multilingual tasks and non-commercial research, while delivering auditability that proprietary models fundamentally cannot.

Did you know? Starting Aug. 2, 2025, foundation models in the EU must publish data summaries, audit logs, and adversarial testing results, requirements that the upcoming Swiss open-source LLM already satisfies.

Alibaba Qwen vs Switzerland’s public LLM: A cross-model comparison

While Qwen emphasizes model diversity and deployment performance, Switzerland’s public LLM focuses on full-stack transparency and multilingual depth.

Switzerland’s public LLM is not the only serious contender in the open-weight LLM race. Alibaba’s Qwen series, particularly Qwen3 and Qwen3‑Coder, has rapidly emerged as a high-performing, fully open-source alternative.

While Switzerland’s public LLM shines with full-stack transparency, releasing its weights, training code and data set methodology in full, Qwen’s openness focuses on weights and code, with less clarity around training data sources. 

When it comes to model diversity, Qwen offers an expansive range, including dense models and a sophisticated Mixture-of-Experts (MoE) architecture boasting up to 235 billion parameters (22 billion active), along with hybrid reasoning modes for more context-aware processing. By contrast, Switzerland’s public LLM maintains a more academic focus, offering two clean, research-oriented sizes: 8 billion and 70 billion.

On performance, Alibaba’s Qwen3‑Coder has been independently benchmarked by sources including Reuters, Elets CIO and Wikipedia to rival GPT‑4 in coding and math-intensive tasks. Switzerland’s public LLM’s performance data is still pending public release. 

On multilingual capability, Switzerland’s public LLM takes the lead with support for over 1,500 languages, whereas Qwen’s coverage includes 119, still substantial but more selective. Finally, the infrastructure footprint reflects divergent philosophies: Switzerland’s public LLM runs on CSCS’s carbon-neutral Alps supercomputer, a sovereign, green facility, while Qwen models are trained and served via Alibaba Cloud, prioritizing speed and scale over energy transparency.

Below is a side-by-side look at how the two open-source LLM initiatives measure up across key dimensions:

  • Openness: Switzerland’s public LLM releases weights, training code and data set methodology; Qwen releases weights and code, with less clarity on training data.
  • Model lineup: Switzerland’s public LLM offers two research-oriented dense sizes (8B and 70B); Qwen spans dense models plus an MoE design of up to 235 billion parameters (22 billion active).
  • Performance: Qwen3‑Coder has been benchmarked to rival GPT‑4 on coding and math tasks; the Swiss model’s numbers are still pending public release.
  • Multilingual coverage: More than 1,500 languages for Switzerland’s public LLM versus Qwen’s 119.
  • Infrastructure: CSCS’s carbon-neutral Alps supercomputer versus Alibaba Cloud.

Did you know? Qwen3‑Coder uses an MoE setup with 235 billion total parameters, but only 22 billion are active at once, optimizing speed without incurring full compute cost.
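To make that trade-off concrete, here is the arithmetic behind the claim, plus a toy top-k router of the kind MoE layers use per token (the expert scores are made up for illustration):

```python
# Only a fraction of an MoE model's parameters run on any single token.
TOTAL_PARAMS = 235e9
ACTIVE_PARAMS = 22e9
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS   # ~0.094: under 10% of weights per pass

def top_k_experts(scores, k=2):
    """Pick the k highest-scoring experts for a token, as an MoE router does."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# 8 experts, 2 active per token: compute scales with k, capacity with all 8.
assert top_k_experts([0.1, 0.9, 0.2, 0.8, 0.0, 0.3, 0.05, 0.4], k=2) == [1, 3]
```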

Why builders should care

  • Full control: Own the model stack, weights, code and data provenance, with no vendor lock‑in or API restrictions.
  • Customizability: Tailor models through fine‑tuning to domain-specific tasks such as onchain analysis, DeFi oracle validation and code generation.
  • Cost optimization: Deploy on GPU marketplaces or rollup nodes; quantization to 4‑bit can reduce inference costs by 60%–80%.
  • Compliance by design: Transparent documentation aligns with EU AI Act requirements, reducing legal hurdles and time to deployment.
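The cost-optimization point follows directly from parameter storage. Here is the footprint arithmetic for the 70B configuration (a sketch; real 4-bit checkpoints carry a small extra overhead for quantization scales):

```python
# Memory footprint of a 70B-parameter model at different precisions.
PARAMS = 70e9
fp16_gb = PARAMS * 2.0 / 1e9    # 2 bytes per parameter  -> 140 GB
int4_gb = PARAMS * 0.5 / 1e9    # 4 bits per parameter   -> 35 GB

reduction = 1 - int4_gb / fp16_gb
print(fp16_gb, int4_gb, reduction)  # 140.0 35.0 0.75 (inside the quoted 60-80%)
```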

Pitfalls to navigate while working with open-source LLMs

Open-source LLMs offer transparency but face hurdles like instability, high compute demands and legal uncertainty.

Key challenges faced by open-source LLMs include:

  • Performance and scale gaps: Despite sizable architectures, community consensus questions whether open-source models can match the reasoning, fluency and tool-integration capabilities of closed models like GPT‑4 or Claude 4.
  • Implementation and component instability: LLM ecosystems often face software fragmentation, with issues like version mismatches, missing modules or crashes common at runtime.
  • Integration complexity: Users frequently encounter dependency conflicts, complex environment setups or configuration errors when deploying open-source LLMs.
  • Resource intensity: Model training, hosting and inference demand substantial compute and memory (e.g., multi‑GPU setups with 64 GB of RAM), making them less accessible to smaller teams.
  • Documentation deficiencies: The transition from research to deployment is often hindered by incomplete, outdated or inaccurate documentation, complicating adoption.
  • Security and trust risks: Open ecosystems can be susceptible to supply-chain threats (e.g., typosquatting via hallucinated package names), and relaxed governance can lead to vulnerabilities like backdoors, improper permissions or data leakage.
  • Legal and IP ambiguities: Using web-crawled data or mixed licenses may expose users to intellectual-property conflicts or violate usage terms, unlike thoroughly audited closed models.
  • Hallucination and reliability issues: Open models can generate plausible yet incorrect outputs, especially when fine-tuned without rigorous oversight; for example, developers report hallucinated package references in 20% of code snippets.
  • Latency and scaling challenges: Local deployments can suffer from slow response times, timeouts or instability under load, problems rarely seen in managed API services.
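The hallucinated-package risk in particular lends itself to a mechanical guard: before installing anything an LLM suggests, check whether the name already resolves locally and flag unknown names for human review. A minimal sketch; it screens local resolution only and does not detect typosquats that actually exist on PyPI:

```python
import importlib.util

def is_importable(module_name: str) -> bool:
    """True if the module resolves in the current environment."""
    return importlib.util.find_spec(module_name) is not None

print(is_importable("json"))                 # True: stdlib, safe to use
print(is_importable("totally_made_up_pkg"))  # False: flag for human review
```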


