Reinforcement Learning And Bayesian Statistics: A Winning Combination?

Reinforcement Learning, Bayesian Statistics And Tensorflow Probability: A Child's Game (part 1)

Dr. Michael Allgöwer

Published on 15.8.2019

Updated on 8.5.2025

Data Science & AI

Reinforcement learning has a reputation for being extremely data-hungry: so data-hungry that it can realistically be trained only on simulation-generated data, e.g. in a computer game. Using a small, easily accessible example, we discuss how Bayesian statistics can help to cure this. In the second part of this blog series, we will see how this can be done in practice using TensorFlow Probability, a hot new tool from Google.

What’s this Bayes stuff all about?

In a recent podcast interview, Andrew Gelman, a leading researcher and practitioner in Bayesian statistics, characterized the Bayesian way of working like this: “There are two approaches to statistics. One is to make very minimal assumptions, and the other is to make maximal assumptions.” Bayesian statistics takes the second approach, he then explains. This doesn’t sound attractive, right? Many in the data science community are used to thinking of assumptions as the dirty secret that your models need in order to work, but that also makes you vulnerable to errors.

A dirty secret, turned into a modelling tool

The Bayesian approach to assumptions is very different: instead of trying to avoid assumptions, it embraces them as modelling tools. Bayesian modelling is very flexible in accommodating domain knowledge, which becomes an integral part of the model calculations. This achieves two things at once. First, it is much easier to make use of knowledge about the problem domain, and to do so in a well-documented and transparent manner. Second, it is much easier to check and modify your assumptions when they are part of the model itself. Doing a lot of checks is crucial to the Bayesian approach, because it minimizes the risk of errors.

Reinforcement Learning: The Strange New Kid On The Block

If Bayesian statistics is the black sheep of the statistics family (and some people think it is), reinforcement learning is the strange new kid on the data science and machine learning block. It employs many of the familiar techniques from machine learning, but the setting is fundamentally different. You don’t follow the usual ritual of taking a big bunch of data, splitting it into partitions, and then training, evaluating and improving your model. In reinforcement learning, the data your model works with is not an entity that is separate from the model itself. Instead, your model must choose from a set of actions, and it gets a reward that depends on this choice. Then it chooses the next action, gets the next reward, and so on, all the while trying to maximize the reward. Hence, the data is not given. It is produced while the model interacts with its environment.
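To make the act-then-get-rewarded loop concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: the two-action toy environment `TwoArmedBandit`, its win probabilities, and the simple epsilon-greedy learning rule are not taken from this post or from any particular library. The point is only that no data set exists up front; the agent creates its own data by acting.

```python
import random

ACTIONS = ["left", "right"]


class TwoArmedBandit:
    """Hypothetical toy environment: each action pays 1 with a fixed probability."""

    def __init__(self, win_prob, seed=0):
        self.win_prob = win_prob
        self.rng = random.Random(seed)

    def step(self, action):
        # The reward depends only on the chosen action (plus chance).
        return 1 if self.rng.random() < self.win_prob[action] else 0


class EpsilonGreedyAgent:
    """Keeps a running average reward per action and mostly picks the best one."""

    def __init__(self, actions, epsilon=0.1, seed=1):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the current estimates.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def learn(self, action, reward):
        # Incremental update of the average reward observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(steps=2000):
    env = TwoArmedBandit({"left": 0.3, "right": 0.7})
    agent = EpsilonGreedyAgent(ACTIONS)
    total_reward = 0
    for _ in range(steps):
        action = agent.choose()        # the model chooses an action ...
        reward = env.step(action)      # ... the environment answers with a reward ...
        agent.learn(action, reward)    # ... and that pair is the training data.
        total_reward += reward
    return agent, total_reward
```

Calling `run()` returns the trained agent and the total reward it collected; `agent.values` then holds its estimate of the average reward per action.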

Reinforcement Learning: Why Bayes?

The best-known applications of reinforcement learning are connected to games. The defeat of an e-sports champion in the computer game Dota 2 by OpenAI’s deep reinforcement learning agents attracted a lot of attention. The same is true for DeepMind’s board game program AlphaZero, which is also based on reinforcement learning. The computational resources invested in this kind of approach are huge: OpenAI’s agents have played a total of 45,000 years of Dota in fast-forward mode. And the importance of games and simulations in reinforcement learning is not restricted to high-profile cases that make the headlines. When you look at OpenAI Gym, a popular environment for training reinforcement learning agents, you see lots of computer game classics such as Pong and several other Atari games, along with physics simulations in which an agent can learn to balance a pole on a cart.

There is an interesting connection here to the Bayesian approach: in reinforcement learning, we often assume that we know the rules of the environment and their interactions so well that we can set up a simulation as a training environment for the agents. In other words, reinforcement learning routinely works with strong assumptions. They are so strong that it is often applied to a purely simulated game setting that is isolated from the “real world”. What if we could use that other thing that works with strong assumptions, Bayesian statistics, to break through this isolation and use reinforcement learning in the real world?

Reinforcement Learning And Bayesian Statistics: A Child’s Game

Let’s try out these abstract ideas and build something concrete. We will stay in the reinforcement learning tradition by using a game, but we’ll break with tradition in another way: the learning environment will not be simulated. Instead, it will be the interaction with a real human, like you, for example. As this is intended to be as simple as possible, the game we use will be the childhood classic rock, paper, scissors. Game theory says this game has a single equilibrium, in which both players choose their actions uniformly at random. In plain English: against such an opponent, you can’t do better than choosing randomly. But game theory also makes strong assumptions, and they are rarely fulfilled when humans are involved. Humans are not good at being truly random, and so it is interesting to design a reinforcement learning agent that exploits the biases of its human counterpart.
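As a rough illustration of what "exploiting biases" could mean, here is a small sketch in plain Python. It is not the model built in part 2 of this series, and it uses no TensorFlow Probability. It rests on a loudly simplified assumption: the human draws each move independently from fixed but unknown probabilities. The agent starts from a Dirichlet prior, expressed as pseudo-counts, updates it after every round, and plays the move with the highest expected payoff under the posterior mean. All names (`DirichletAgent`, `payoff`, and so on) are made up for this example.

```python
import random

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value


def payoff(mine, theirs):
    """+1 for a win, 0 for a draw, -1 for a loss, seen from the agent's side."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


class DirichletAgent:
    """Assumes i.i.d. opponent moves with a Dirichlet prior over their probabilities."""

    def __init__(self, prior=1.0, seed=0):
        # prior=1.0 per move is a uniform prior: no bias is assumed up front.
        self.pseudo_counts = {m: prior for m in MOVES}
        self.rng = random.Random(seed)

    def posterior_mean(self):
        total = sum(self.pseudo_counts.values())
        return {m: c / total for m, c in self.pseudo_counts.items()}

    def expected_payoff(self, mine):
        p = self.posterior_mean()
        return sum(p[theirs] * payoff(mine, theirs) for theirs in MOVES)

    def choose(self):
        # Best response to the posterior mean; ties are broken at random.
        scores = {m: self.expected_payoff(m) for m in MOVES}
        best = max(scores.values())
        return self.rng.choice([m for m in MOVES if abs(scores[m] - best) < 1e-12])

    def observe(self, opponent_move):
        # Conjugate update: the posterior is again a Dirichlet, with one more count.
        self.pseudo_counts[opponent_move] += 1
```

With the uniform prior, every move has an expected payoff of zero, which mirrors the game-theoretic statement above. As soon as the observed moves are lopsided, for example an opponent who plays rock five times in a row, the expected payoffs differ and the agent settles on paper. A real human's bias usually sits in the sequence of moves rather than in their overall frequencies, which this sketch deliberately ignores.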

TensorFlow Probability, Practical Bayesian Statistics, Rock, Paper And Scissors

Stay tuned for the next part, where we…

…unwrap our new toy, TensorFlow Probability.
…build a Bayesian model.
…venture into the dark art of mathemagic.
…get used to losing at rock, paper, scissors against our computer.

Want To Learn More? Contact Us!

Dr. Sebastian Petry

Domain Lead Data Science & AI
