The video shows an agent driving a racecar in the TORCS simulator using only raw pixels as input. The agent was trained with the Asynchronous Advantage Actor-Critic (A3C) algorithm and was rewarded during training for maintaining high velocity along the center of the racetrack.
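To make the training setup concrete, below is a minimal sketch of an A3C-style n-step actor-critic update for a single worker's rollout. The network sizes, hyperparameters, and the toy data standing in for pixel observations are illustrative assumptions, not the authors' actual TORCS configuration.

```python
# A minimal A3C-style n-step actor-critic update sketch (assumed setup, not the
# paper's exact architecture or hyperparameters).
import torch
import torch.nn as nn
from torch.distributions import Categorical

class ActorCritic(nn.Module):
    """Shared trunk with separate policy (actor) and value (critic) heads."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy = nn.Linear(hidden, n_actions)   # action logits
        self.value = nn.Linear(hidden, 1)            # state-value estimate

    def forward(self, obs):
        h = self.trunk(obs)
        return Categorical(logits=self.policy(h)), self.value(h).squeeze(-1)

def a3c_loss(model, obs, actions, rewards, bootstrap_value,
             gamma=0.99, value_coef=0.5, entropy_coef=0.01):
    """n-step advantage actor-critic loss for one worker's rollout."""
    dist, values = model(obs)
    # Accumulate n-step returns backwards from the bootstrap value V(s_{t+n}).
    returns, R = [], bootstrap_value
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    returns = torch.stack(list(reversed(returns)))
    advantages = returns - values.detach()
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = (returns - values).pow(2).mean()
    entropy_bonus = dist.entropy().mean()   # encourages exploration
    return policy_loss + value_coef * value_loss - entropy_coef * entropy_bonus

# Toy usage with random data standing in for (preprocessed) pixel observations.
model = ActorCritic(obs_dim=64, n_actions=4)
obs = torch.randn(5, 64)
actions = torch.randint(0, 4, (5,))
rewards = [torch.tensor(r) for r in [0.1, 0.2, 0.0, 0.3, 0.1]]
loss = a3c_loss(model, obs, actions, rewards, bootstrap_value=torch.tensor(0.0))
loss.backward()  # gradients are then applied asynchronously to shared parameters
```

In A3C, many such workers run in parallel, each computing gradients like these on its own rollout and applying them asynchronously to a shared set of parameters.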
Paper: Asynchronous Methods for Deep Reinforcement Learning (Mnih et al., 2016), arXiv:1602.01783
Asynchronous Methods for Deep Reinforcement Learning: TORCS