Meta FAIR advances human-like AI with five major releases

By Advanced AI Bot | April 17, 2025 | 7 min read


The Fundamental AI Research (FAIR) team at Meta has announced five projects advancing the company’s pursuit of advanced machine intelligence (AMI).

The latest releases from Meta focus heavily on enhancing AI perception – the ability of machines to process and interpret sensory information – alongside advancements in language modelling, robotics, and collaborative AI agents.

Meta stated its goal involves creating machines “that are able to acquire, process, and interpret sensory information about the world around us and are able to use this information to make decisions with human-like intelligence and speed.”

The five new releases represent diverse but interconnected efforts towards achieving this ambitious goal.

Perception Encoder: Meta sharpens the ‘vision’ of AI

Central to the new releases is the Perception Encoder, described as a large-scale vision encoder designed to excel across various image and video tasks.

Vision encoders function as the “eyes” for AI systems, allowing them to understand visual data.

Meta highlights the increasing challenge of building encoders that meet the demands of advanced AI, requiring capabilities that bridge vision and language, handle both images and videos effectively, and remain robust under challenging conditions, including potential adversarial attacks.

The ideal encoder, according to Meta, should recognise a wide array of concepts while distinguishing subtle details—citing examples like spotting “a stingray burrowed under the sea floor, identifying a tiny goldfinch in the background of an image, or catching a scampering agouti on a night vision wildlife camera.”

Meta claims the Perception Encoder achieves “exceptional performance on image and video zero-shot classification and retrieval, surpassing all existing open source and proprietary models for such tasks.”
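To make the zero-shot claim concrete, the snippet below shows the standard CLIP-style recipe: embed an image and a set of candidate captions, then rank the captions by similarity. The article does not document Perception Encoder's own loading interface, so the openly available openai/clip-vit-base-patch32 checkpoint from Hugging Face transformers stands in here; the pattern, not the particular model, is the point.

```python
# Zero-shot image classification with a CLIP-style encoder.
# A minimal sketch; the checkpoint below is a stand-in, not
# Meta's Perception Encoder.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("wildlife_frame.jpg")  # any local image
labels = [
    "a stingray burrowed under the sea floor",
    "a tiny goldfinch in the background of an image",
    "a scampering agouti on a night vision wildlife camera",
]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity scores, softmaxed into per-label probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
for label, p in zip(labels, probs[0]):
    print(f"{p.item():.3f}  {label}")
```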

Furthermore, its perceptual strengths reportedly translate well to language tasks. 

When aligned with a large language model (LLM), the encoder is said to outperform other vision encoders in areas like visual question answering (VQA), captioning, document understanding, and grounding (linking text to specific image regions). It also reportedly boosts performance on tasks traditionally difficult for LLMs, such as understanding spatial relationships (e.g., “if one object is behind another”) or camera movement relative to an object.

“As Perception Encoder begins to be integrated into new applications, we’re excited to see how its advanced vision capabilities will enable even more capable AI systems,” Meta said.

Perception Language Model (PLM): Open research in vision-language

Complementing the encoder is the Perception Language Model (PLM), an open and reproducible vision-language model aimed at complex visual recognition tasks. 

PLM was trained using large-scale synthetic data combined with open vision-language datasets, explicitly without distilling knowledge from external proprietary models.

Recognising gaps in existing video understanding data, the FAIR team collected 2.5 million new, human-labelled samples focused on fine-grained video question answering and spatio-temporal captioning. Meta claims this forms the “largest dataset of its kind to date.”

PLM is offered in 1, 3, and 8 billion parameter versions, catering to academic research needs requiring transparency.
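For a sense of how such a model is queried, here is a minimal visual question answering sketch using the transformers pipeline API. The ViLT checkpoint below is a familiar stand-in, not PLM; assuming the released PLM checkpoints expose a comparable interface, they would slot into the same pattern.

```python
# Visual question answering via the transformers pipeline.
# A hedged sketch: the model below is a stand-in for PLM, whose
# exact checkpoint names and integration path are not shown here.
from transformers import pipeline

vqa = pipeline("visual-question-answering",
               model="dandelin/vilt-b32-finetuned-vqa")

answers = vqa(image="kitchen_scene.jpg",  # a single video frame or photo
              question="What is the person doing with the whisk?")
for a in answers:
    print(f"{a['score']:.3f}  {a['answer']}")
```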

Alongside the models, Meta is releasing PLM-VideoBench, a new benchmark specifically designed to test capabilities often missed by existing benchmarks, namely “fine-grained activity understanding and spatiotemporally grounded reasoning.”

Meta hopes the combination of open models, the large dataset, and the challenging benchmark will empower the open-source community.

Meta Locate 3D: Giving robots situational awareness

Bridging the gap between language commands and physical action is Meta Locate 3D. This end-to-end model aims to allow robots to accurately localise objects in a 3D environment based on open-vocabulary natural language queries.

Meta Locate 3D processes 3D point clouds directly from RGB-D sensors (like those found on some robots or depth-sensing cameras). Given a textual prompt, such as “flower vase near TV console,” the system considers spatial relationships and context to pinpoint the correct object instance, distinguishing it from, say, a “vase on the table.”

The system comprises three main parts: a preprocessing step converting 2D features to 3D featurised point clouds; the 3D-JEPA encoder (a pretrained model creating a contextualised 3D world representation); and the Locate 3D decoder, which takes the 3D representation and the language query to output bounding boxes and masks for the specified objects.
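In outline, that pipeline reads as three function calls. The sketch below is a hypothetical reconstruction for illustration only: the article names the stages but no public API, so every function, shape, and return value here is a placeholder.

```python
# Schematic of the Meta Locate 3D flow: lift RGB-D input to a featurised
# point cloud, encode it with 3D-JEPA, then decode (scene, query) into
# boxes and masks. All names and values are hypothetical placeholders.
import numpy as np

def lift_to_3d(rgbd_frames):
    """Preprocessing: 2D features -> featurised 3D point cloud."""
    return np.random.rand(4096, 3 + 256)  # N points: xyz + feature vector

def encode_scene(point_cloud):
    """3D-JEPA encoder: contextualised representation of the scene."""
    return point_cloud.mean(axis=0)  # placeholder for a learned embedding

def locate(scene_repr, query):
    """Locate 3D decoder: (scene, language query) -> bounding box / mask."""
    return {"query": query,
            "bounding_box": {"center": [1.2, 0.3, 0.8],
                             "size": [0.2, 0.2, 0.4]}}  # dummy output

cloud = lift_to_3d(rgbd_frames=None)  # stands in for RGB-D sensor input
scene = encode_scene(cloud)
print(locate(scene, "flower vase near TV console"))
```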

Alongside the model, Meta is releasing a substantial new dataset for object localisation based on referring expressions. It includes 130,000 language annotations across 1,346 scenes from the ARKitScenes, ScanNet, and ScanNet++ datasets, effectively doubling existing annotated data in this area.

Meta sees this technology as crucial for developing more capable robotic systems, including its own PARTNR robot project, enabling more natural human-robot interaction and collaboration.

Dynamic Byte Latent Transformer: Efficient and robust language modelling

Following research published in late 2024, Meta is now releasing the model weights for its 8-billion-parameter Dynamic Byte Latent Transformer.

This architecture represents a shift away from traditional tokenisation-based language models, operating instead at the byte level. Meta claims this approach achieves comparable performance at scale while offering significant improvements in inference efficiency and robustness.

Traditional LLMs break text into ‘tokens’, which can struggle with misspellings, novel words, or adversarial inputs. Byte-level models process raw bytes, potentially offering greater resilience.
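The robustness argument is easy to see by comparing a subword tokeniser with a raw byte view of the same misspelling. The GPT-2 tokeniser below is used only as a familiar example of subword fragmentation; it is not the tokeniser behind any Meta model.

```python
# Subword tokens vs. raw bytes under a misspelling.
# GPT-2's tokenizer is a stand-in to illustrate subword brittleness.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

clean, typo = "transformer", "transfromer"
print(tok.tokenize(clean))  # a couple of familiar subwords
print(tok.tokenize(typo))   # the typo fragments into different pieces

# Byte-level view: the two strings differ in only two byte positions.
print(list(clean.encode("utf-8")))
print(list(typo.encode("utf-8")))
```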

Meta reports that the Dynamic Byte Latent Transformer “outperforms tokeniser-based models across various tasks, with an average robustness advantage of +7 points (on perturbed HellaSwag), and reaching as high as +55 points on tasks from the CUTE token-understanding benchmark.”

By releasing the weights alongside the previously shared codebase, Meta encourages the research community to explore this alternative approach to language modelling.

Collaborative Reasoner: Meta advances socially intelligent AI agents

The final release, Collaborative Reasoner, tackles the complex challenge of creating AI agents that can effectively collaborate with humans or other AIs.

Meta notes that human collaboration often yields superior results, and aims to imbue AI with similar capabilities for tasks like helping with homework or job interview preparation.

Such collaboration requires not just problem-solving but also social skills like communication, empathy, providing feedback, and understanding others’ mental states (theory-of-mind), often unfolding over multiple conversational turns.

Current LLM training and evaluation methods often neglect these social and collaborative aspects. Furthermore, collecting relevant conversational data is expensive and difficult.

Collaborative Reasoner provides a framework to evaluate and enhance these skills. It includes goal-oriented tasks requiring multi-step reasoning achieved through conversation between two agents. The framework tests abilities like disagreeing constructively, persuading a partner, and reaching a shared best solution.

Meta’s evaluations revealed that current models struggle to consistently leverage collaboration for better outcomes. To address this, Meta proposes a self-improvement technique using synthetic interaction data in which an LLM agent collaborates with itself.
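In outline, self-collaboration is a turn-taking loop in which a single model plays both agents until they converge. The sketch below is a hypothetical reconstruction of that idea, not Meta's framework: chat() is a placeholder for any chat-completion call, and the convergence check is deliberately naive.

```python
# A minimal self-collaboration loop: one LLM alternates between two
# roles on a shared task. Hypothetical sketch; chat() is a placeholder.
def chat(role: str, transcript: list[str]) -> str:
    """Stand-in for a real chat-completion call to an LLM."""
    return "PROPOSAL: ..."  # placeholder; a real model responds in role

def collaborate(task: str, max_turns: int = 6) -> list[str]:
    roles = [
        "Agent A: propose a solution and defend it.",
        "Agent B: critique it, disagree constructively, or agree.",
    ]
    transcript = [f"Task: {task}"]
    for turn in range(max_turns):
        reply = chat(roles[turn % 2], transcript)
        transcript.append(reply)
        if "AGREED" in reply:  # stop once the agents reach a shared answer
            break
    return transcript

for line in collaborate("Prepare answers for a job interview question."):
    print(line)
```

Transcripts like these, generated at scale, are the kind of synthetic interaction data the self-improvement step trains on.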

Generating this data at scale is enabled by a new high-performance model serving engine called Matrix. Using this approach on mathematical, scientific, and social reasoning tasks reportedly yielded improvements of up to 29.4% compared to the standard ‘chain-of-thought’ performance of a single LLM.

By open-sourcing the data generation and modelling pipeline, Meta aims to foster further research into creating truly “social agents that can partner with humans and other agents.”

These five releases collectively underscore Meta’s continued heavy investment in fundamental AI research, particularly focusing on building blocks for machines that can perceive, understand, and interact with the world in more human-like ways. 

See also: Meta will train AI models using EU user data

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.


