Sublinear Regret for a Class of Continuous-Time Linear-Quadratic Reinforcement Learning Problems
by Yilie Huang and 2 other authors
Abstract: We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ) control problems for diffusions, where states are scalar-valued, running control rewards are absent, and the volatilities of the state processes depend on both state and control variables. We apply a model-free approach that relies neither on knowledge of the model parameters nor on their estimation, and devise an RL algorithm that learns the optimal policy parameter directly. Our main contributions are the introduction of an exploration schedule and a regret analysis of the proposed algorithm. We provide the convergence rate of the policy parameter to the optimal one, and prove that the algorithm achieves a regret bound of $O(N^{\frac{3}{4}})$ up to a logarithmic factor, where $N$ is the number of learning episodes. We conduct a simulation study to validate the theoretical results and demonstrate the effectiveness and reliability of the proposed algorithm. We also perform numerical comparisons between our method and those of recent model-based stochastic LQ RL studies adapted to the state- and control-dependent volatility setting, demonstrating the better performance of the former in terms of regret bounds.
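As a rough illustration of the problem class described in the abstract (the coefficient names $A, B, C, D$ and the terminal-reward form below are assumptions for exposition, not taken from the paper), a scalar LQ diffusion with state- and control-dependent volatility and no running control reward can be sketched as
$$
dX_t = (A X_t + B a_t)\,dt + (C X_t + D a_t)\,dW_t, \qquad X_0 = x_0,
$$
where $a_t$ is the control and $W$ a standard Brownian motion, with the objective of maximizing a quadratic criterion free of running control costs, e.g. an expected terminal payoff such as $\mathbb{E}\!\left[-\tfrac{1}{2}(X_T - z)^2\right]$ for some target $z$.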
Submission history
From: Yilie Huang
[v1] Wed, 24 Jul 2024 12:26:21 UTC (247 KB)
[v2] Sat, 21 Sep 2024 16:48:58 UTC (263 KB)
[v3] Tue, 18 Mar 2025 14:55:51 UTC (204 KB)
[v4] Tue, 8 Apr 2025 19:11:31 UTC (205 KB)