Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

IBM research reveals sports fans like AI-enhanced content | News

Deepfakes in the wild, more big AI funding rounds, a mixed bag for earnings, and more layoffs

MIT study: 95% GenAI projects fail to show returns

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • OpenAI (GPT-4 / GPT-4o)
    • Anthropic (Claude 3)
    • Google DeepMind (Gemini)
    • Meta (LLaMA)
    • Cohere (Command R)
    • Amazon (Titan)
    • IBM (Watsonx)
    • Inflection AI (Pi)
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • AI Experts
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • The TechLead
    • Matt Wolfe AI
    • Andrew Ng
    • OpenAI
    • Expert Blogs
      • François Chollet
      • Gary Marcus
      • IBM
      • Jack Clark
      • Jeremy Howard
      • Melanie Mitchell
      • Andrew Ng
      • Andrej Karpathy
      • Sebastian Ruder
      • Rachel Thomas
      • IBM
  • AI Tools
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
  • AI Policy
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
  • Business AI
    • Advanced AI News Features
    • Finance AI
    • Healthcare AI
    • Education AI
    • Energy AI
    • Legal AI
LinkedIn Instagram YouTube Threads X (Twitter)
Advanced AI News
VentureBeat AI

ByteDance releases new open source Seed-OSS-36B model

By Advanced AI EditorAugust 21, 2025No Comments6 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now

TikTok is making headlines again today after the White House joined the popular social media application — but its parent company ByteDance, a Chinese web giant, also had a surprise announcement up its sleeve.

The company’s Seed Team of AI researchers today released Seed-OSS-36B on AI code sharing website Hugging Face.

Seed-OSS-36B is new line of open source, large language models (LLM) designed for advanced reasoning, and developer-focused usability with a longer token context — that is, how much information the models can accept as inputs and then output in a single exchange — than many competing LLMs from U.S. tech companies, even leaders such as OpenAI and Anthropic.

The collection introduces three main variants:

AI Scaling Hits Its Limits

Power caps, rising token costs, and inference delays are reshaping enterprise AI. Join our exclusive salon to discover how top teams are:

Turning energy into a strategic advantage

Architecting efficient inference for real throughput gains

Unlocking competitive ROI with sustainable AI systems

Secure your spot to stay ahead: https://bit.ly/4mwGngO

Seed-OSS-36B-Base with synthetic data

Seed-OSS-36B-Base without synthetic data

Seed-OSS-36B-Instruct

In releasing both synthetic and non-synthetic versions of the Seed-OSS-36B-Base model, the Seed Team sought to balance practical performance with research flexibility.

The synthetic-data variant, trained with additional instruction data, consistently delivers stronger scores on standard benchmarks and is intended as a higher-performing general-purpose option.

The non-synthetic model, by contrast, omits these augmentations, creating a cleaner foundation that avoids potential bias or distortion introduced by synthetic instruction data.

By providing both, the team gives applied users access to improved results while ensuring researchers retain a neutral baseline for studying post-training methods.

Meanwhile, the Seed-OSS-36B-Instruct model differs in that it is post-trained with instruction data to prioritize task execution and instruction following, rather than serving purely as a foundation model.

All three models are released under the Apache-2.0 license, allowing free use, modification, and redistribution by researchers and developers working for enterprises.

That means they can be used to power commercial applications, internal to a company or external/customer-facing, without paying ByteDance any licensing fees or for application programming interface (API) usage.

This continues the summer 2025 trend of Chinese companies shipping powerful open source models with OpenAI attempting to catch up with its own open source gpt-oss duet released earlier this month.

The Seed Team positions Seed-OSS for international applications, emphasizing versatility across reasoning, agent-like task execution, and multilingual settings.

The Seed Team, formed in 2023, has concentrated on building foundation models that can serve both research and applied use cases.

Design and core features

The architecture behind Seed-OSS-36B combines familiar design choices such as causal language modeling, grouped query attention, SwiGLU activation, RMSNorm, and RoPE positional encoding.

Each model carries 36 billion parameters across 64 layers and supports a vocabulary of 155,000 tokens.

One of the defining features is its native long-context capability, with a maximum length of 512,000 tokens, designed to process extended documents and reasoning chains without performance loss.

That’s twice the length of OpenAI’s new GPT-5 model family and is roughly equivalent to about 1,600 pages of text, the length of a Christian Bible.

Another distinguishing element is the introduction of a thinking budget, which lets developers specify how much reasoning the model should perform before delivering an answer.

It’s something we’ve seen from other recent open source models as well, including Nvidia’s new Nemotron-Nano-9B-v2, also available on Hugging Face.

In practice, this means teams can tune performance depending on the complexity of the task and the efficiency requirements of deployment.

Budgets are recommended in multiples of 512 tokens, with 0 providing a direct response mode/

Competitive performance on third-party benchmarks

Benchmarks published with the release position Seed-OSS-36B among the stronger large open-source models. The Instruct variant, in particular, posts state-of-the-art results in multiple areas.

Math and reasoning: Seed-OSS-36B-Instruct achieves 91.7 percent on AIME24 and 65 on BeyondAIME, both representing open-source “state-of-the-art” (SOTA).

Coding: On LiveCodeBench v6, the Instruct model records 67.4, another SOTA score.

Long-context handling: On RULER at 128K context length, it reaches 94.6, marking the highest open-source result reported.

Base model performance: The synthetic-data Base variant delivers 65.1 on MMLU-Pro and 81.7 on MATH, both state-of-the-art results in their categories.

The no-synthetic Base version, while slightly behind on many measures, proves competitive in its own right.

It outperforms its synthetic counterpart on GPQA-D, providing researchers with a cleaner, instruction-free baseline for experimentation.

For enterprises comparing open options, these results suggest Seed-OSS offers strong potential across math-heavy, coding, and long-context workloads while still providing flexibility for research use cases.

Access and deployment

Beyond performance, the Seed Team highlights accessibility for developers and practitioners. The models can be deployed using Hugging Face Transformers, with quantization support in both 4-bit and 8-bit formats to reduce memory requirements.

They also integrate with vLLM for scalable serving, including configuration examples and API server instructions.

To lower barriers further, the team includes scripts for inference, prompt customization, and tool integration.

For technical leaders managing small teams or working under budget constraints, these provisions are positioned to make experimentation with 36-billion-parameter models more approachable.

Licensing and considerations for enterprise decision-makers

With the models offered under Apache-2.0, organizations can adopt them without restrictive licensing terms, an important factor for teams balancing legal and operational concerns.

For decision makers evaluating the open-source landscape, the release brings three takeaways:

State-of-the-art benchmarks across math, coding, and long-context reasoning.

A balance between higher-performing synthetic-trained models and clean research baselines.

Accessibility features that lower operational overhead for lean engineering teams.

By placing strong performance and flexible deployment under an open license, ByteDance’s Seed Team has added new options for enterprises, researchers, and developers alike.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleYou can now talk to Google Photos to make your edits
Next Article OpenAI says GPT-6 is coming and it’ll be better than GPT-5 (obviously)
Advanced AI Editor
  • Website

Related Posts

CodeSignal’s new AI tutoring app Cosmo wants to be the ‘Duolingo for job skills’

August 20, 2025

LLMs generate ‘fluent nonsense’ when reasoning outside their training zone

August 20, 2025

Stop benchmarking in the lab: Inclusion Arena shows how LLMs perform in production

August 20, 2025

Comments are closed.

Latest Posts

Tanya Bonakdar Gallery to Close Los Angeles Space

Ancient Silver Coins Suggest New History of Trading in Southeast Asia

Dallas Museum of Art Names Brian Ferriso as Its Next Director

Rapa Nui’s Moai Statues Threatened by Rising Sea Levels, Flooding

Latest Posts

IBM research reveals sports fans like AI-enhanced content | News

August 21, 2025

Deepfakes in the wild, more big AI funding rounds, a mixed bag for earnings, and more layoffs

August 21, 2025

MIT study: 95% GenAI projects fail to show returns

August 21, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • IBM research reveals sports fans like AI-enhanced content | News
  • Deepfakes in the wild, more big AI funding rounds, a mixed bag for earnings, and more layoffs
  • MIT study: 95% GenAI projects fail to show returns
  • Y Combinator alum SRE.ai raises $7.2M for DevOps AI agents
  • mSCoRe: a Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning – Takara TLDR

Recent Comments

  1. JuliusRex on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  2. NathanFairl on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  3. choctaw casino hotel on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  4. Registrera dig on New York Sales Miss the Mark as Top Works and Young Artists Fall to Lower Levels
  5. NathanFairl on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

LinkedIn Instagram YouTube Threads X (Twitter)
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.