Author: Advanced AI Editor

The success of powerful open source Large Language Models (LLMs) has enabled the community to create a vast collection of post-trained models adapted to specific tasks and domains. However, navigating and understanding these models remains challenging due to inconsistent metadata and unstructured repositories. We introduce Delta Activations, a method to represent finetuned models as vector embeddings by measuring shifts in their internal activations relative to a base model. This representation allows for effective clustering by domain and task, revealing structure in the model landscape. Delta Activations also demonstrates desirable properties: it is robust across finetuning settings and exhibits an additive…
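The abstract suggests a concrete recipe: run the same probe inputs through the base and finetuned models, then embed the finetuned model as the shift in pooled hidden activations. Below is a minimal sketch of that idea, assuming Hugging Face transformers; the model names, probe prompts, layer choice, and pooling are illustrative placeholders, not the paper's exact setup.

```python
# Minimal sketch of the Delta Activations idea: embed a finetuned
# model as the shift in its hidden activations relative to its base
# model on a fixed set of probe prompts. Model names and probes are
# placeholders; the paper's layer choice and pooling may differ.
import torch
from transformers import AutoModel, AutoTokenizer

PROBES = ["Explain photosynthesis.", "Write a SQL query.", "Translate 'hello'."]

def mean_final_hidden_state(model_name: str) -> torch.Tensor:
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    vecs = []
    with torch.no_grad():
        for text in PROBES:
            inputs = tok(text, return_tensors="pt")
            out = model(**inputs, output_hidden_states=True)
            # Mean-pool the last hidden layer over the token dimension.
            vecs.append(out.hidden_states[-1].mean(dim=1).squeeze(0))
    return torch.stack(vecs).mean(dim=0)

base_vec = mean_final_hidden_state("base-model")        # placeholder name
tuned_vec = mean_final_hidden_state("finetuned-model")  # placeholder name
delta = tuned_vec - base_vec  # the model's Delta Activations embedding
```

With embeddings like `delta` for many finetuned models, clustering by domain and task reduces to standard vector similarity.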

Read More

A new kind of cybercrime has emerged in the online job market, and it is powered by Claude AI. Behind what looks like normal hiring, investigators have uncovered organized systems of fraudulent remote jobs, run by North Korean workers who use artificial intelligence (AI) to fake skills, pass interviews, and keep high-paying roles. According to Anthropic’s latest threat intelligence report, these jobs are organized by the state to bring in money, helping North Korea bypass international sanctions. The money flows directly into national programs, including the country’s weapons development. By lowering the barrier to complex technical work, Claude AI allows…

Read More

Common Sense Media, a kids-safety-focused nonprofit offering ratings and reviews of media and technology, released its risk assessment of Google’s Gemini AI products on Friday. While the organization found that Google’s AI clearly told kids it was a computer, not a friend (presenting a chatbot as a companion is associated with driving delusional thinking and psychosis in emotionally vulnerable individuals), it suggested that there was room for improvement across several other fronts. Notably, Common Sense said that Gemini’s “Under 13” and “Teen Experience” tiers both appeared to be the adult versions of Gemini under the hood, with only some additional safety…

Read More

Two major sources of training data exist for post-training modern language models: online data (model-generated rollouts) and offline data (human or other-model demonstrations). These two types of data are typically used by approaches like Reinforcement Learning (RL) and Supervised Fine-Tuning (SFT), respectively. In this paper, we show that these approaches are not in contradiction but are instances of a single optimization process. We derive a Unified Policy Gradient Estimator, and present the calculations of a wide spectrum of post-training approaches as the gradient of a common objective under different data distribution assumptions and various bias-variance tradeoffs. The gradient estimator is…
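To make the unification concrete, here is one standard way to write both updates as score-function gradients (an illustrative identity, not the paper's exact Unified Policy Gradient Estimator): the SFT gradient is the RL policy gradient with the sampling distribution swapped from the model's own rollouts to the demonstration data and the advantage fixed to a constant.

```latex
% Illustrative only: both updates as expectations of the score
% function \nabla_\theta \log \pi_\theta, differing in the sampling
% distribution and weighting; the paper's estimator may add terms.
\nabla_\theta J_{\mathrm{RL}}(\theta)
  = \mathbb{E}_{y \sim \pi_\theta}\bigl[\, A(y)\, \nabla_\theta \log \pi_\theta(y) \,\bigr],
\qquad
\nabla_\theta J_{\mathrm{SFT}}(\theta)
  = \mathbb{E}_{y \sim p_{\mathrm{data}}}\bigl[\, \nabla_\theta \log \pi_\theta(y) \,\bigr].
```

Under this reading, choosing between RL and SFT amounts to choosing a data distribution and a weighting, which is exactly the kind of bias-variance tradeoff the abstract describes.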

Read More

For as long as I’ve been alive, there have been bots of one kind or another on the internet. Whether it’s WoW gold bots, email spammers, SmarterChild (remember SmarterChild?), or something else, this glorious world wide web has been home to rickety, virtual facsimiles of human beings trying to wheedle money out of you for decades. But now it’s even worse. With the power of AI™ (not actually ™) we’ve successfully made the internet much worse for everyone, with social media,…

Read More

California Attorney General Rob Bonta and Delaware Attorney General Kathy Jennings met with and sent an open letter to OpenAI to express their concerns over the safety of ChatGPT, particularly for children and teens. The warning comes a week after Bonta and 44 other attorneys general sent a letter to 12 of the top AI companies, following reports of sexually inappropriate interactions between AI chatbots and children. “Since the issuance of that letter, we learned of the heartbreaking death by suicide of one young Californian after he had prolonged interactions with an OpenAI chatbot, as well as a similarly disturbing…

Read More

Deep research agents have attracted growing attention for their potential to orchestrate multi-stage research workflows, spanning literature synthesis, methodological design, and empirical verification. Despite these strides, evaluating their research capability faithfully is rather challenging due to the difficulty of collecting frontier research questions that genuinely capture researchers’ attention and intellectual curiosity. To address this gap, we introduce DeepResearch Arena, a benchmark grounded in academic seminars that capture rich expert discourse and interaction, better reflecting real-world research environments and reducing the risk of data leakage. To automatically construct DeepResearch Arena, we propose a Multi-Agent Hierarchical Task Generation (MAHTG) system that extracts…
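As a rough picture of what such a seminar-to-task pipeline could look like, here is a sketch under stated assumptions: the `llm()` helper, the prompts, and the two-stage propose-then-filter structure are invented for illustration and are not taken from the paper's MAHTG architecture.

```python
# Illustrative sketch of generating research tasks from seminar
# transcripts, in the spirit of the MAHTG system described above.
# The llm() helper, prompts, and filtering criterion are assumptions.
from dataclasses import dataclass

def llm(prompt: str) -> str:
    """Placeholder for a call to any chat-completion API."""
    raise NotImplementedError

@dataclass
class ResearchTask:
    question: str
    source_excerpt: str

def extract_tasks(seminar_transcript: str) -> list[ResearchTask]:
    # Stage 1: propose candidate research questions from expert discourse.
    raw = llm(
        "List open research questions raised in this seminar transcript, "
        "one per line:\n" + seminar_transcript
    )
    candidates = [line.strip() for line in raw.splitlines() if line.strip()]
    # Stage 2: keep only questions judged research-worthy and evaluable.
    tasks = []
    for q in candidates:
        verdict = llm(f"Is this a frontier-level, evaluable research "
                      f"question? Answer yes or no: {q}")
        if verdict.strip().lower().startswith("yes"):
            tasks.append(ResearchTask(q, seminar_transcript[:200]))
    return tasks
```

Grounding tasks in transcripts rather than public question banks is what gives the benchmark its leakage-resistance argument.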

Read More

ZDNET’s key takeaways: DeepSeek will reportedly launch an agent by the end of this year; agents have become a focal point in the ongoing AI race; and the company’s debut was a turning point in the global AI race. DeepSeek, the Chinese AI startup that sent shockwaves throughout Silicon Valley earlier this year with its sudden ascent onto the global tech scene, is reportedly gearing up to launch its most powerful AI system yet. The company aims to release an AI agent to compete with similar models from OpenAI, Google, and other tech…

Read More

On September 5, IT Home reported that Alibaba’s Tongyi Qianwen has launched the latest Qwen3-Max-Preview model on its official website and on OpenRouter. According to the official description, this model is the most powerful language model in the Tongyi Qianwen series. IT Home provides the following relevant links: Official Website: Qwen Chat; OpenRouter: Qwen3 Max – API, Providers, Stats. The model’s introduction on OpenRouter is summarized as follows: Input: $1.20 (approximately 8.6 RMB at the current exchange rate) per million tokens; Output: $6 (approximately 42.8 RMB at the current exchange rate) per million tokens. Qwen3-Max is an update based on the Qwen3 series, providing significant improvements in inference, instruction following,…
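For a sense of scale, the listed rates work out as follows; this is a back-of-the-envelope sketch, and the request sizes below are made up for illustration.

```python
# Back-of-the-envelope cost at the listed Qwen3-Max-Preview rates on
# OpenRouter: $1.20 per million input tokens, $6 per million output
# tokens. The request size below is an arbitrary example.
INPUT_RATE = 1.20 / 1_000_000   # USD per input token
OUTPUT_RATE = 6.00 / 1_000_000  # USD per output token

input_tokens, output_tokens = 20_000, 2_000  # hypothetical request
cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"${cost:.4f}")  # $0.0360 for this example
```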

Read More

OpenAI is reorganizing its Model Behavior team, a small but influential group of researchers who shape how the company’s AI models interact with people, TechCrunch has learned. In an August memo to staff seen by TechCrunch, OpenAI’s chief research officer Mark Chen said the Model Behavior team — which consists of roughly 14 researchers — would be joining the Post Training team, a larger research group responsible for improving the company’s AI models after their initial pre-training. As part of the changes, the Model Behavior team will now report to OpenAI’s Post Training lead Max Schwarzer. An OpenAI spokesperson confirmed…

Read More