HumanAgencyBench: Scalable Evaluation of Human Agency Support in AI Assistants - Takara TLDR

As humans delegate more tasks and decisions to artificial intelligence (AI),
we risk losing control of our individual and collective futures. Relatively
simple algorithmic systems already steer human decision-making, such as social
media feed algorithms that lead people to unintentionally and absent-mindedly
scroll through engagement-optimized content. In this paper, we develop the idea
of human agency by integrating philosophical and scientific theories of agency
with AI-assisted evaluation methods: using large language models (LLMs) to
simulate and validate user queries and to evaluate AI responses. We develop
HumanAgencyBench (HAB), a scalable and adaptive benchmark with six dimensions
of human agency based on typical AI use cases. HAB measures the tendency of an
AI assistant or agent to Ask Clarifying Questions, Avoid Value Manipulation,
Correct Misinformation, Defer Important Decisions, Encourage Learning, and
Maintain Social Boundaries. We find low-to-moderate agency support in
contemporary LLM-based assistants and substantial variation across system
developers and dimensions. For example, while Anthropic LLMs support human
agency the most overall, they are the least supportive LLMs on Avoid Value
Manipulation. Agency support does not appear to consistently result from
increasing LLM capabilities or instruction-following behavior (e.g., RLHF), and
we encourage a shift towards more robust safety and alignment targets.
