Advantage Actor Critic (A2C)
Project Link: A2C
An implementation of the advantage actor critic algorithm to solve reinforcement learning problems.
The algorithm uses a two-headed neural network, with one head computing the agent’s policy and the other estimating the value of the given state.
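To make the two-headed layout concrete, here is a minimal NumPy sketch of a forward pass only. It is not the project's actual network: the class name `TwoHeadedNet`, the single shared tanh layer, the layer sizes, and the discrete-action softmax head are all assumptions chosen for illustration.

```python
import numpy as np


class TwoHeadedNet:
    """Hypothetical network: one shared hidden layer feeding two heads."""

    def __init__(self, obs_dim, n_actions, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        # Shared trunk used by both heads (assumed: a single tanh layer).
        self.w_shared = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.b_shared = np.zeros(hidden)
        # Policy head: one logit per discrete action.
        self.w_policy = rng.normal(0.0, 0.1, (hidden, n_actions))
        self.b_policy = np.zeros(n_actions)
        # Value head: a single scalar estimate per state.
        self.w_value = rng.normal(0.0, 0.1, (hidden, 1))
        self.b_value = np.zeros(1)

    def forward(self, obs):
        """Map obs of shape (batch, obs_dim) to (action_probs, state_values)."""
        h = np.tanh(obs @ self.w_shared + self.b_shared)
        logits = h @ self.w_policy + self.b_policy
        # Subtract the row maximum before exponentiating for numerical stability.
        logits = logits - logits.max(axis=1, keepdims=True)
        exp = np.exp(logits)
        probs = exp / exp.sum(axis=1, keepdims=True)
        values = (h @ self.w_value + self.b_value)[:, 0]
        return probs, values


net = TwoHeadedNet(obs_dim=4, n_actions=2)
probs, values = net.forward(np.zeros((3, 4)))
```

Sharing the trunk lets both heads reuse the same state features. A real implementation would also need gradients, which this forward-only sketch leaves out.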
It is an online, on-policy algorithm: at each training step it uses its current policy to collect a small batch of experience, then uses that batch to optimize the neural network parameters.
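The sketch below shows one common way to turn such a batch into training targets. It computes bootstrapped discounted returns, takes the advantage as the return minus the value estimate, and combines a policy term and a value term into one scalar loss. The function names, the `gamma` and `value_coef` defaults, and the absence of an entropy term are assumptions for illustration, not details taken from the project.

```python
import numpy as np


def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Bootstrapped returns for one rollout, computed backwards in time.

    bootstrap_value is the value estimate for the state after the last step;
    dones[t] == 1 marks an episode end and cuts the bootstrap at step t.
    """
    returns = np.zeros(len(rewards))
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running * (1.0 - dones[t])
        returns[t] = running
    return returns


def a2c_loss(log_probs, values, returns, value_coef=0.5):
    """Scalar A2C-style loss from per-step arrays (no gradients computed here).

    log_probs: log-probability of each action actually taken.
    values:    value-head estimates for the visited states.
    """
    # Advantage: how much better the observed return was than predicted.
    # In an autodiff framework it is treated as a constant in the policy term.
    advantages = returns - values
    policy_loss = -(log_probs * advantages).mean()
    value_loss = (advantages ** 2).mean()
    return policy_loss + value_coef * value_loss


rewards = np.array([1.0, 1.0, 1.0])
dones = np.array([0.0, 0.0, 0.0])
returns = discounted_returns(rewards, dones, bootstrap_value=0.0, gamma=0.5)
```

In an autodiff framework, the gradient of this scalar with respect to the network parameters would update both heads in one optimizer step.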