Advantage Actor Critic (A2C)

Cart balancing a pole with an A2C policy.

Project Link: A2C

An implementation of the advantage actor-critic (A2C) algorithm for solving reinforcement learning problems.

The algorithm uses a two-headed neural network, with one head computing the agent’s policy and the other estimating the value of the given state.
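As a rough illustration of the two-headed layout, here is a minimal sketch in plain Python. It is not the project's actual network: the class name `TwoHeadedNetwork`, the single shared tanh layer, the layer sizes, and the 4-input, 2-action shape (chosen to resemble a cart-pole task) are all assumptions made for the example.

```python
import math
import random


class TwoHeadedNetwork:
    """Hypothetical tiny network: one shared layer feeding two heads."""

    def __init__(self, n_inputs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small random weights; biases are omitted to keep the sketch short.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_body = init(n_hidden, n_inputs)     # shared layer
        self.w_policy = init(n_actions, n_hidden)  # policy head (actor)
        self.w_value = init(1, n_hidden)           # value head (critic)

    @staticmethod
    def _matvec(weights, vector):
        return [sum(w * x for w, x in zip(row, vector)) for row in weights]

    def forward(self, state):
        # Both heads read the same hidden features.
        hidden = [math.tanh(h) for h in self._matvec(self.w_body, state)]
        logits = self._matvec(self.w_policy, hidden)
        # Numerically stable softmax turns logits into action probabilities.
        peak = max(logits)
        exps = [math.exp(logit - peak) for logit in logits]
        total = sum(exps)
        policy = [e / total for e in exps]
        # The value head outputs a single scalar estimate for the state.
        value = self._matvec(self.w_value, hidden)[0]
        return policy, value


net = TwoHeadedNetwork(n_inputs=4, n_hidden=8, n_actions=2)
policy, value = net.forward([0.0, 0.1, -0.05, 0.0])
```

Sharing the body means one forward pass yields both the action distribution and the state value, which is the usual motivation for this design.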

It is an online algorithm: at each training step it uses its current policy to collect a small batch of experience, then uses that batch to optimize the neural network parameters.
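The per-batch computation can be sketched as follows, assuming the common A2C formulation in which the advantage is the bootstrapped discounted return minus the critic's value estimate. The function names, the discount factor `gamma`, and the weight `value_coef` are illustrative assumptions, not values taken from the project.

```python
import math


def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Bootstrapped discounted returns, computed by walking the batch backwards."""
    returns = []
    running = bootstrap_value
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # An episode end cuts off any contribution from later states.
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def a2c_loss(log_probs, values, returns, value_coef=0.5):
    """Advantage-weighted policy loss plus a squared-error value loss."""
    n = len(returns)
    advantages = [ret - val for ret, val in zip(returns, values)]
    # In a real implementation the advantage is held constant in the policy term.
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    value_loss = sum(adv ** 2 for adv in advantages) / n
    return policy_loss + value_coef * value_loss, advantages


# A three-step batch with no episode end and a zero bootstrap value.
returns = discounted_returns([1.0, 1.0, 1.0], [False, False, False], 0.0, gamma=0.5)
loss, advantages = a2c_loss([math.log(0.5)] * 3, [1.0, 1.0, 1.0], returns)
```

In this toy batch the returns are `[1.75, 1.5, 1.0]`, so the advantages against a constant value estimate of 1.0 are `[0.75, 0.5, 0.0]`: actions that led to better-than-expected returns are reinforced more strongly.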

Osmany Corteguera

Software Engineer
