reinforcement-learning

What is the way to understand Proximal Policy Optimization Algorithm in RL?

旧街凉风 submitted on 2019-11-27 09:17:40
Question: I know the basics of reinforcement learning, but which terms do I need to understand to be able to read the arXiv PPO paper? What is the roadmap for learning and using PPO?

Answer 1: To better understand PPO, it helps to look at the paper's main contributions: (1) the Clipped Surrogate Objective and (2) the use of "multiple epochs of stochastic gradient ascent to perform each policy update". First, to ground these points in the original PPO paper: We have introduced [PPO], a family