Advanced AI News
Tencent Hunyuan

Tencent Hunyuan Releases and Open Sources Image Model 2.1, Supporting Native 2K Images

By Advanced AI Editor, September 10, 2025


On the night of September 9, Tencent released and open-sourced the latest image model “Hunyuan Image 2.1”. This model boasts industry-leading capabilities and supports native 2K high-definition images.

After being open-sourced, the Hunyuan Image 2.1 model quickly climbed the Hugging Face model popularity chart, becoming the third most popular model globally. Among the top eight models on the list, Tencent’s Hunyuan model family occupies three positions.

At the same time, the Tencent Hunyuan team revealed that they will soon release a native multimodal image generation model.

Hunyuan Image 2.1 is a comprehensive upgrade from the 2.0 architecture, placing greater emphasis on balancing generation quality and performance. The new version not only supports native input in both Chinese and English but also enables high-quality generation of text with complex semantics in both languages. Additionally, there have been significant improvements in the overall aesthetic performance of generated images and the diversity of applicable scenarios.

This means that designers, illustrators, and other visual creators can more efficiently and conveniently translate their ideas into visuals. Whether generating high-fidelity creative illustrations, creating posters and packaging designs with Chinese and English slogans, or producing complex four-panel comics and graphic novels, Hunyuan Image 2.1 can provide creators with fast, high-quality support.

Hunyuan Image 2.1 is a fully open-source base model that not only achieves industry-leading generation results but can also flexibly adapt to the diverse derivative needs of the community. Currently, the model weights and code for Hunyuan Image 2.1 have been officially released in open-source communities such as Hugging Face and GitHub, allowing both individual and enterprise developers to conduct research or develop various derivative models and plugins based on this foundational model.

Thanks to a larger-scale image-text alignment dataset, Hunyuan Image 2.1 has significantly improved in complex semantic understanding and cross-domain generalization. It supports prompts of up to 1,000 tokens, allowing precise generation of scene details, character expressions, and actions, and enabling separate description and control of multiple objects. Furthermore, Hunyuan Image 2.1 can finely control text within images, integrating textual information naturally with the visuals.

(Highlight 1: Hunyuan Image 2.1 demonstrates strong understanding of complex semantics, supporting separate description and precise generation of multiple subjects.)

(Highlight 2: More stable control over text and scene details in images.)

(Highlight 3: Supports a rich variety of styles, such as realistic, comic, and vinyl figures, while possessing high aesthetic quality.)

Tencent’s Hunyuan Image Model 2.1 is at the SOTA level among open-source models.

According to the SSAE (Structured Semantic Alignment Evaluation) results, Tencent’s Hunyuan Image Model 2.1 currently achieves the best semantic alignment performance among open-source models, coming very close to the performance of closed-source commercial models (GPT-Image).

Additionally, GSB (Good Same Bad) evaluation results indicate that Hunyuan Image 2.1’s image generation quality is comparable to that of the closed-source commercial model Seedream 3.0, while being slightly superior to similar open-source models like Qwen-Image.

The Hunyuan Image 2.1 model not only utilizes a vast amount of training data but also employs structured, variable-length, and diverse content captions, greatly enhancing its understanding of textual descriptions. The caption model incorporates OCR and IP RAG expert models, effectively improving its handling of complex text recognition and world knowledge.

To significantly reduce computational load and improve training and inference efficiency, the model employs a VAE with a 32-fold ultra-high compression ratio, and uses DINOv2 feature alignment and REPA loss to ease training difficulty. As a result, the model can efficiently generate native 2K images.
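The practical effect of that 32x compression ratio can be seen with a quick back-of-envelope calculation. The sketch below compares the latent grid of a 2K image under Hunyuan's reported 32x VAE against the 8x downsampling typical of conventional diffusion VAEs (the 8x baseline and the latent channel count are assumptions for illustration; the article only states the 32x figure):

```python
# Illustrative only: spatial savings from a 32x-downsampling VAE versus
# a conventional 8x VAE. Latent channel counts are omitted since the
# article does not state them.
def latent_grid(image_side: int, downsample: int) -> int:
    """Side length of the latent grid for a square image."""
    return image_side // downsample

side_2k = 2048
conventional = latent_grid(side_2k, 8)   # typical diffusion VAEs: 8x
hunyuan = latent_grid(side_2k, 32)       # Hunyuan Image 2.1: 32x

print(conventional, hunyuan)             # 256 64
# Token count scales with the squared side, so attention over the latent
# grid gets roughly (256/64)^2 = 16x cheaper at the same resolution.
print((conventional / hunyuan) ** 2)     # 16.0
```

This is why an aggressive VAE matters for native 2K output: the diffusion backbone's attention cost grows with the number of latent tokens, not the pixel count.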

In terms of text encoding, Hunyuan Image 2.1 is equipped with dual text encoders: an MLLM module to further enhance image-text alignment, and a ByT5 model to boost text-rendering expressiveness. The overall architecture is a single- and dual-stream DiT model with 17 billion parameters.
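A 17-billion-parameter backbone has concrete hardware implications. The sketch below estimates the memory needed just to hold the weights at a few common precisions (the serving dtype is not stated in the article, so all three are assumptions; activations, KV/attention buffers, and the text encoders would add more):

```python
# Rough weight-only memory footprint for a 17B-parameter model.
# Precisions are illustrative assumptions, not stated by the article.
params = 17e9
footprint_gib = {
    name: params * nbytes / 2**30
    for name, nbytes in [("fp32", 4), ("bf16", 2), ("int8", 1)]
}
for name, gib in footprint_gib.items():
    print(f"{name}: {gib:.1f} GiB")
```

Even at half precision the weights alone occupy on the order of 30 GiB, which is why models of this size are usually served on data-center GPUs or sharded across devices.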

Moreover, Hunyuan Image 2.1 addresses the training stability issues of the average flow model (meanflow) at the 17 billion parameter level, reducing the model’s inference steps from 100 to 8, significantly improving inference speed while maintaining the original performance of the model.
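Since each sampling step is roughly one forward pass of the diffusion backbone, the step reduction translates almost directly into wall-clock speedup. A minimal sketch of the arithmetic (assuming per-step cost is unchanged, which the article implies by saying performance is maintained):

```python
# Sampling cost scales with the number of denoising steps, so going
# from 100 steps to 8 cuts sampling compute by 100/8 = 12.5x, assuming
# roughly constant cost per step.
baseline_steps, meanflow_steps = 100, 8
speedup = baseline_steps / meanflow_steps
print(f"~{speedup:.1f}x fewer denoising passes")  # ~12.5x
```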

The concurrently open-sourced Hunyuan prompt rewriting model, PromptEnhancer, is billed as the industry's first systematic, industrial-grade Chinese-and-English rewriting model. It structurally optimizes user prompts to enrich visual expression, markedly improving the semantic fidelity of images generated from the rewritten text.

Tencent Hunyuan continues to deepen its efforts in the field of image generation, having previously released the first open-source Chinese native DiT architecture large image model—Hunyuan DiT, as well as the industry’s first commercial-grade real-time image model—Hunyuan Image 2.0. The newly launched native 2K model, Hunyuan Image 2.1, achieves a better balance between quality and performance, meeting various needs of users and enterprises in diverse visual scenarios.

At the same time, Tencent Hunyuan is firmly committed to open source, continuously releasing language models of various sizes, comprehensive multimodal generation capabilities, and toolset plugins for images, videos, and 3D, providing open-source foundations that approach commercial-model performance. Derivative models of its image and video models now total 3,000, and community downloads of the Hunyuan 3D series have exceeded 2.3 million, making it the most popular open-source 3D model family globally.


