On Choosing a Deep Reinforcement Learning Library

By Thomas Simonini

As Deep Reinforcement Learning becomes one of the most hyped approaches to achieving AGI (Artificial General Intelligence), more and more libraries are being developed. But choosing the one that best fits your needs can be a daunting task.

In recent years, we've seen an acceleration of innovation in Deep Reinforcement Learning. Examples include AlphaGo beating the world champion at the game of Go in 2016, OpenAI's release of PPO in 2017, the resurgence of curiosity-driven learning agents in 2018 with Uber AI's Go-Explore and OpenAI's RND, and finally OpenAI Five, which beat the best Dota 2 players in the world.

 
[Image: benchmarking deep reinforcement learning libraries]

Criteria

Each library is scored from one to five checkmarks (✅) on the criteria used throughout this comparison: state-of-the-art RL methods, ease of getting started, ease of plugging in your own environment, ease of understanding and modifying the code, community and updates, Tensorboard support, and other features (such as vectorized environments).


KerasRL (2.3/5)

State of the art RL methods ✅ ✅ ❌ ❌ ❌

Easy to start ✅ ✅ ✅ ✅ ❌

The code is full of comments, which helps you understand even the most obscure functions.

The code is really easy to read and demonstrates a good separation between agents, policy, and memory. There is documentation, but it remains incomplete: explanations of each definable parameter are missing.

Easy to plug in your own environment ✅ ❌ ❌ ❌ ❌

Easy to modify the agents ✅ ✅ ✅ ✅ ✅

Community and updates ✅ ❌ ❌ ❌ ❌

Tensorboard support ❌ ❌ ❌ ❌ ❌

Other features ✅ ✅ ✅ ❌ ❌
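
To give a feel for how these pieces fit together, here is a minimal DQN sketch in the spirit of KerasRL's own CartPole example (network size and hyperparameters are illustrative, not tuned):

import gym
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

env = gym.make('CartPole-v1')
nb_actions = env.action_space.n

# KerasRL consumes any Keras model whose output size matches the action count
model = Sequential([
    Flatten(input_shape=(1,) + env.observation_space.shape),
    Dense(16, activation='relu'),
    Dense(nb_actions, activation='linear'),
])

# Agent, policy, and memory are separate objects: the clean separation noted above
dqn = DQNAgent(model=model, nb_actions=nb_actions,
               memory=SequentialMemory(limit=50000, window_length=1),
               policy=BoltzmannQPolicy(),
               nb_steps_warmup=10, target_model_update=1e-2)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)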

Tensorforce (4.1/5)

 

State of the art RL methods ✅ ✅ ✅ ✅ ❌

Easy to start ✅ ✅ ✅ ❌ ❌

Easy to plug in your own environment ✅ ✅ ✅ ✅ ✅

Easy to understand the code and modify it ✅ ✅ ✅ ✅ ❌

Community and updates ✅ ✅ ✅ ❌ ❌

Tensorboard support ✅ ✅ ✅ ✅ ✅
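
Getting started takes a bit more ceremony than a one-liner, as the rating suggests. Here is a minimal sketch, assuming a recent Tensorforce release where Agent.create and Environment.create are the entry points (batch_size and learning_rate are illustrative values):

from tensorforce import Agent, Environment

# Wrap any OpenAI Gym environment
environment = Environment.create(environment='gym', level='CartPole-v1')

agent = Agent.create(agent='ppo', environment=environment,
                     batch_size=10, learning_rate=1e-3)

# Tensorforce exposes an explicit act/execute/observe loop
for episode in range(100):
    states = environment.reset()
    terminal = False
    while not terminal:
        actions = agent.act(states=states)
        states, terminal, reward = environment.execute(actions=actions)
        agent.observe(terminal=terminal, reward=reward)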

OpenAI Baselines (2.2/5)

 

State of the art RL methods ✅ ✅ ✅ ✅ ❌

Easy to start ✅ ✅ ❌ ❌ ❌

python -m baselines.run --alg=ppo2 --env=BipedalWalker-v2 --num_timesteps=2500 --num_env=1 --save_path=./models/bipedalwalker --save_video_interval=100

Easy to plug in your own environment ✅ ❌ ❌ ❌ ❌

Easy to understand the code and modify it ✅ ❌ ❌ ❌ ❌

Community and updates ✅ ✅ ✅ ❌ ❌

Tensorboard support ✅ ❌ ❌ ❌ ❌
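
Once training finishes, the Baselines README documents reloading and watching the agent through the same entry point. A sketch mirroring the save_path used above (num_timesteps=0 skips training, and --play renders episodes):

python -m baselines.run --alg=ppo2 --env=BipedalWalker-v2 --num_timesteps=0 --load_path=./models/bipedalwalker --play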

Stable Baselines (4.6/5)

 

State of the art RL methods ✅ ✅ ✅ ✅ ✅

Easy to start ✅ ✅ ✅ ✅ ✅

from stable_baselines import PPO2

model = PPO2('MlpPolicy', 'CartPole-v1').learn(10000)
 
Stable Baselines RL training in one line

Easy to plug in your own environment ✅ ✅ ✅ ❌ ❌

Easy to understand the code and modify it ✅ ✅ ✅ ❌ ❌

Community and updates ✅ ✅ ✅ ✅ ✅

Tensorboard support ✅ ✅ ✅ ✅ ✅

Other features (vec env…) ✅ ✅ ✅ ✅ ✅
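
The pieces scored above compose naturally: a vectorized wrapper around any Gym-compatible environment (including your own), a tensorboard_log argument for logging, and save/load for serialization. A minimal sketch, with illustrative file names and step counts:

import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv

# DummyVecEnv wraps any Gym-compatible environment, including custom ones
env = DummyVecEnv([lambda: gym.make('CartPole-v1')])

# tensorboard_log enables the Tensorboard support scored above
model = PPO2('MlpPolicy', env, verbose=1, tensorboard_log='./ppo2_cartpole_tb/')
model.learn(total_timesteps=10000)

model.save('ppo2_cartpole')         # serializes the policy and hyperparameters
model = PPO2.load('ppo2_cartpole')  # reload later for evaluation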

TF Agents (4.3/5)

State of the art RL methods ✅ ✅ ✅ ✅ ✅

Easy to start ✅ ✅ ✅ ❌ ❌

Easy to plug in your own environment ✅ ✅ ✅ ❌ ❌

Easy to understand the code and modify it ✅ ✅ ✅ ✅ ❌

Community and updates ✅ ✅ ✅ ✅ ✅

Tensorboard support ✅ ✅ ✅ ✅ ✅
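
Getting started takes more boilerplate than with Stable Baselines, as the rating suggests. A minimal DQN setup in the spirit of the TF Agents tutorials (layer sizes and learning rate are illustrative):

import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import suite_gym, tf_py_environment
from tf_agents.networks import q_network
from tf_agents.utils import common

# Load a Gym environment and wrap it for TensorFlow
env = tf_py_environment.TFPyEnvironment(suite_gym.load('CartPole-v1'))

# The Q-network is built from the environment's observation and action specs
q_net = q_network.QNetwork(env.observation_spec(), env.action_spec(),
                           fc_layer_params=(100,))

agent = dqn_agent.DqnAgent(env.time_step_spec(), env.action_spec(),
                           q_network=q_net,
                           optimizer=tf.compat.v1.train.AdamOptimizer(learning_rate=1e-3),
                           td_errors_loss_fn=common.element_wise_squared_loss)
agent.initialize()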

Part 2: Implement an agent that learns to walk with BipedalWalker-v2

BipedalWalker-v2 with TF Agents

BipedalWalker-v2 with Stable Baselines
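
With Stable Baselines, the whole training script stays compact. A sketch of the setup (the timestep budget is an assumption; learning to walk typically takes on the order of millions of steps):

import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv

env = DummyVecEnv([lambda: gym.make('BipedalWalker-v2')])

# PPO2 with an MLP policy handles BipedalWalker's continuous action space
model = PPO2('MlpPolicy', env, tensorboard_log='./ppo2_bipedalwalker_tb/')
model.learn(total_timesteps=2000000)
model.save('ppo2_bipedalwalker')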

[Screenshot: the BipedalWalker documentation (look at this beautiful documentation 😍)]
[Screenshot: Tensorboard visualization of the results]
[GIF: agent that learns to walk, trained with the two best deep reinforcement learning libraries]
