Advantage Actor Critic (A2C)

Friday, June 12, 2020 - 1 min

Cart balancing a pole with an A2C policy.

Project Link: A2C

An implementation of the advantage actor critic algorithm to solve reinforcement learning problems.

The algorithm uses a two-headed neural network, with one head computing the agent’s policy and the other estimating the value of the given state.
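To make the two-headed layout concrete, here is a minimal sketch in plain Python. It is not the project's actual network: the class name `TwoHeadedNet`, the layer sizes, the tanh activation and the omission of bias terms are all assumptions chosen for illustration. The observation size of 4 and the 2 actions are picked to resemble a cart-pole style task.

```python
import math
import random


class TwoHeadedNet:
    """Hypothetical shared-body network with a policy head and a value head."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small random weights; biases are left out to keep the sketch short.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_body = init(hidden_dim, obs_dim)      # shared feature layer
        self.w_policy = init(n_actions, hidden_dim)  # head 1: action logits
        self.w_value = init(1, hidden_dim)           # head 2: scalar state value

    @staticmethod
    def _matvec(weights, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    def forward(self, obs):
        # Both heads read the same hidden features.
        hidden = [math.tanh(h) for h in self._matvec(self.w_body, obs)]

        # Policy head: softmax over logits (shifted by the max for stability).
        logits = self._matvec(self.w_policy, hidden)
        top = max(logits)
        exps = [math.exp(logit - top) for logit in logits]
        total = sum(exps)
        probs = [e / total for e in exps]

        # Value head: a single number estimating the value of this state.
        value = self._matvec(self.w_value, hidden)[0]
        return probs, value


net = TwoHeadedNet(obs_dim=4, hidden_dim=8, n_actions=2)
probs, value = net.forward([0.0, 0.1, -0.02, 0.3])
```

Sharing the body means both heads learn from the same features, so one forward pass yields the action distribution and the value estimate together.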

It is an online algorithm: at each training step it uses its current policy to collect a small batch of experience, then uses that batch to optimize the neural network parameters.
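As a rough illustration of how a small batch of experience can be turned into training signals, the sketch below computes discounted returns, advantages, and the two loss terms usually associated with A2C. The function names, the discount factor and the sample numbers are assumptions for this example and are not taken from the project. Gradient computation and any entropy bonus are left out.

```python
def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Work backwards through a short rollout to get a return for each step.

    bootstrap_value is the critic's estimate for the state after the last
    step; it is ignored wherever an episode ended (done is True).
    """
    returns = []
    running = bootstrap_value
    for reward, done in zip(reversed(rewards), reversed(dones)):
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def a2c_loss_terms(log_probs, values, returns):
    """Return advantages plus the policy and value loss for one batch."""
    # Advantage: how much better the observed return was than the critic expected.
    advantages = [ret - val for ret, val in zip(returns, values)]
    n = len(advantages)
    # Policy loss: raise the log-probability of actions with positive advantage.
    # In a real implementation the advantage is treated as a constant here.
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    # Value loss: mean squared error between the critic and the returns.
    value_loss = sum(adv ** 2 for adv in advantages) / n
    return advantages, policy_loss, value_loss


# Hypothetical three-step rollout that ends the episode on the last step.
returns = discounted_returns(
    rewards=[1.0, 1.0, 1.0],
    dones=[False, False, True],
    bootstrap_value=5.0,
    gamma=0.5,
)
advantages, policy_loss, value_loss = a2c_loss_terms(
    log_probs=[-1.0, -1.0, -1.0],
    values=[1.0, 1.0, 1.0],
    returns=returns,
)
```

Because the batch comes from the current policy, it would typically be discarded after one update and a fresh batch collected with the updated parameters.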

Osmany Corteguera

Software Engineer
