Lower Bound on Howard Policy Iteration for Deterministic Markov Decision Processes

arXiv:2506.12254v1
Abstract: Deterministic Markov Decision Processes (DMDPs) are a mathematical framework for decision-making in which the outcome of each action, and the set of actions available afterwards, is fully determined by the current action. A DMDP can be viewed as a finite directed weighted graph in which, at each step, the controller chooses an outgoing edge. An objective is a measurable function on runs (infinite trajectories) of the DMDP, and the value for an objective is the maximal cumulative reward (or weight) that the controller can guarantee. We consider the classical mean-payoff (also known as limit-average) objective, which is a fundamental objective.
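To make the graph view concrete, here is a minimal Python sketch (not taken from the paper) of the mean payoff obtained by a fixed positional policy. It assumes a policy stored as a dictionary `policy` mapping each vertex to its chosen successor, and edge weights stored as a dictionary `weight` keyed by `(vertex, successor)`; both names and the function `policy_mean_payoff` are illustrative. Because each vertex has exactly one chosen edge, the run from any start vertex eventually enters a cycle, and the limit-average of the weights equals the average weight of that cycle.

```python
from fractions import Fraction


def policy_mean_payoff(policy, weight, start):
    """Mean payoff of the run from `start` when every vertex follows `policy`.

    policy[v] is the successor chosen at v (hypothetical representation);
    weight[(v, policy[v])] is the weight of the chosen edge.
    """
    position = {}  # vertex -> index at which it first appears on the path
    path = []
    v = start
    # Follow the chosen edges until a vertex repeats: that closes the cycle.
    while v not in position:
        position[v] = len(path)
        path.append(v)
        v = policy[v]
    cycle = path[position[v]:]
    total = sum(weight[(u, policy[u])] for u in cycle)
    # The finite prefix before the cycle does not affect the limit-average.
    return Fraction(total, len(cycle))
```

For example, with `policy = {0: 1, 1: 2, 2: 1}` and weights `{(0, 1): 10, (1, 2): 1, (2, 1): 4}`, the run from vertex 0 enters the cycle between vertices 1 and 2, so the weight 10 on the first edge is irrelevant and the mean payoff is 5/2.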
Howard’s policy iteration algorithm is a popular method for solving DMDPs with mean-payoff objectives. Although experimental studies suggest that Howard’s algorithm performs well in practice, the best known upper bound on its number of iterations is exponential, and the best known lower bound is sub-linear in the input size: for input size $I$, the algorithm requires $\tilde{\Omega}(\sqrt{I})$ iterations, where $\tilde{\Omega}$ hides poly-logarithmic factors. Our main result is an improved lower bound for this fundamental algorithm: we show that for input size $I$, the algorithm requires $\tilde{\Omega}(I)$ iterations.
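The abstract does not describe the algorithm itself, so the following is only a sketch of one common formulation of Howard's policy iteration for the maximizing mean-payoff setting, not the paper's construction. It assumes integer vertex labels, no parallel edges, a `weight` dictionary keyed by `(vertex, successor)`, and an initial `policy` dictionary; the helper names `evaluate` and `howard` are hypothetical. Each iteration evaluates the current policy (the gain of the cycle each vertex reaches, plus a potential measured relative to the smallest vertex on that cycle) and then switches every vertex simultaneously to a lexicographically best edge. The returned iteration count is the quantity that the upper and lower bounds above refer to.

```python
from fractions import Fraction


def evaluate(policy, weight):
    """Gain and potential of every vertex under a positional policy."""
    gain, pot = {}, {}
    for start in policy:
        path, position = [], {}
        v = start
        # Walk until reaching an evaluated vertex or closing a new cycle.
        while v not in gain and v not in position:
            position[v] = len(path)
            path.append(v)
            v = policy[v]
        if v in gain:
            pending = path
        else:
            cycle = path[position[v]:]
            total = sum(weight[(u, policy[u])] for u in cycle)
            head = min(cycle)  # reference vertex: potential fixed to 0
            gain[head], pot[head] = Fraction(total, len(cycle)), Fraction(0)
            i = cycle.index(head)
            # Order vertices so that each one's successor is handled first
            # when the list is traversed in reverse.
            pending = path[:position[v]] + cycle[i + 1:] + cycle[:i]
        for u in reversed(pending):
            s = policy[u]
            gain[u] = gain[s]
            pot[u] = weight[(u, s)] - gain[u] + pot[s]
    return gain, pot


def howard(weight, policy):
    """Run policy iteration; return (policy, gain, number of iterations)."""
    succ = {}
    for (u, v) in weight:
        succ.setdefault(u, []).append(v)
    policy = dict(policy)
    iterations = 0
    while True:
        gain, pot = evaluate(policy, weight)

        def appraisal(u, v):
            # Compare by gain first, then by potential-adjusted weight.
            return (gain[v], weight[(u, v)] - gain[v] + pot[v])

        new_policy = {}
        for u in policy:
            best = max(succ[u], key=lambda v: appraisal(u, v))
            # Keep the current edge on ties so the loop can terminate.
            if appraisal(u, best) > appraisal(u, policy[u]):
                new_policy[u] = best
            else:
                new_policy[u] = policy[u]
        if new_policy == policy:
            return policy, gain, iterations
        policy = new_policy
        iterations += 1
```

Exact `Fraction` arithmetic is used so that tie detection does not depend on floating-point tolerances. On a three-vertex graph with weights `{(0, 0): 1, (0, 1): 0, (1, 2): 5, (2, 1): 3, (1, 0): 0, (2, 0): 0}` and the initial policy `{0: 0, 1: 0, 2: 0}`, this sketch reaches the policy `{0: 1, 1: 2, 2: 1}` with gain 4 at every vertex.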
