simontamayo / tutorial_rl

This repository proposes a tutorial on reinforcement learning for beginners, in which the main concepts of this type of learning are introduced in a straightforward and applied way. A Python implementation of the Q-learning algorithm is used to solve a "maze-world" problem.



Reinforcement learning 101: a beginner's tutorial in Python

What is reinforcement learning?

Reinforcement learning (RL) refers to a class of problems where the goal is to learn, from experience, what to do in different situations (or states of a system) so as to optimize a quantitative reward over time.

Mathematically, this amounts to identifying the function f(s,a) which gives, for each state s of the system, the best action a (the policy), so as to maximize a reward r. The theoretical foundations of this type of learning are Markov decision processes and optimal control theory.
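One common way to write this objective is sketched below. The symbols are assumptions that do not appear in the text above: the policy is written as a function π from states to actions, r_t is the reward received at step t, and γ is a discount factor that weights future rewards less than immediate ones.

```latex
\[
\pi^{*} = \arg\max_{\pi} \;
\mathbb{E}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \;\middle|\; a_{t} = \pi(s_{t}) \right],
\qquad 0 \le \gamma < 1
\]
```

With γ close to 0 the agent only cares about immediate rewards; with γ close to 1 it gives almost as much weight to rewards far in the future.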

Typical reinforcement scenario

Reinforcement learning differs from other types of machine learning because the machine is not trained on an input dataset; instead, it learns by trial and error. We often talk about an "agent" instead of a model, because the goal here is to learn how to choose actions over time.

In the absence of input data, the agent learns to make choices through experience. By exploring the states of a given environment and the available actions, the agent builds its own learning examples ("this action was good", "this action was bad"); then, by trial and error, it identifies the policy that maximizes its long-term reward.

Figure 1 shows a typical reinforcement learning scenario: an agent performs an action on the environment, this action is interpreted into a reward and a representation of the new state, and both are fed back to the agent.

Figure 1. Typical reinforcement scenario
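The loop in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CorridorEnv`, its `reset`/`step` methods and `run_episode` are hypothetical names invented for this example, and the environment is a toy corridor rather than the maze used later in the tutorial.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: positions 0..4, the exit is at position 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Put the agent back at the start and return the initial state.
        self.state = 0
        return self.state

    def step(self, action):
        # action: 0 = move left, 1 = move right (moves are clipped at the walls).
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one agent-environment loop and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # the agent chooses an action
        state, reward, done = env.step(action)  # the environment returns reward and new state
        total_reward += reward
        if done:
            break
    return total_reward


def random_policy(state):
    # An agent that has learned nothing yet: it picks a direction at random.
    return random.choice([0, 1])


random.seed(0)
print(run_episode(CorridorEnv(), random_policy))
```

A policy is just a function from state to action here, so replacing `random_policy` with a learned one changes the agent without touching the loop.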

What is Q-learning?

Q-learning is probably the most widely used reinforcement learning algorithm because of its simplicity. It is based on learning an action-value function. This function represents the expected utility of taking a given action in a given state and then following an optimal policy. A policy is a set of rules that an agent follows to choose its actions based on states. The problem is thus composed of an agent, which moves through a set of states S and chooses from a set of actions A. Executing an action in a state yields a reward r. The goal of the agent is to maximize its rewards. It must therefore identify the optimal action for each state, i.e. the one giving the greatest reward in the long run.
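In the standard textbook formulation, the action-value function and the Q-learning update are usually written as below. The symbols α (learning rate), γ (discount factor), s' (next state) and a' (next action) are assumptions taken from that standard formulation, not from the text above.

```latex
\[
Q^{*}(s, a) = \mathbb{E}\left[ r + \gamma \max_{a'} Q^{*}(s', a') \;\middle|\; s, a \right]
\]

\[
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
\]
```

The first line defines the optimal value of an action recursively. The second is the update applied after each observed transition (s, a, r, s'): the current estimate is moved a fraction α toward the observed reward plus the discounted value of the best next action.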

The Q-learning algorithm therefore seeks to construct the Q-value function (also called the Q-table), which contains the maximum expected future reward for each action in each state. To illustrate this, let's take the example of a mouse (our agent) learning the shortest way out of a maze with 11 rooms, as shown in Figure 2.

Figure 2. Maze
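As a rough illustration of how a Q-table can be built, here is a minimal tabular Q-learning sketch. It makes several assumptions: the room graph `DOORS` is hypothetical and is not the layout of Figure 2, the reward of 1 for reaching the exit is an arbitrary choice, and the hyperparameters `alpha`, `gamma` and `epsilon` are illustrative values, not those of the notebook in this repository.

```python
import random

# Hypothetical room graph (NOT the layout of Figure 2): room -> rooms reachable through a door.
DOORS = {
    0: [1],
    1: [0, 2, 3],
    2: [1],
    3: [1, 4],
    4: [3, 5],
    5: [5],  # room 5 is the exit
}
EXIT = 5


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    # Q-table: one value per (room, next room) pair, initialised to zero.
    q = {s: {a: 0.0 for a in nxt} for s, nxt in DOORS.items()}
    for _ in range(episodes):
        state = rng.choice([s for s in DOORS if s != EXIT])
        while state != EXIT:
            # Epsilon-greedy: explore sometimes, otherwise follow the current table.
            if rng.random() < epsilon:
                action = rng.choice(DOORS[state])
            else:
                action = max(q[state], key=q[state].get)
            next_state = action  # taking a door leads to the room behind it
            reward = 1.0 if next_state == EXIT else 0.0
            # No future value is added once the exit has been reached.
            best_next = 0.0 if next_state == EXIT else max(q[next_state].values())
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q


def greedy_path(q, start, max_steps=20):
    """Follow the highest-valued action from each room until the exit."""
    path = [start]
    while path[-1] != EXIT and len(path) <= max_steps:
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path


q_table = train()
print(greedy_path(q_table, start=0))
```

Because the only reward is at the exit and future rewards are discounted, rooms closer to the exit end up with higher values, so following the largest value from each room traces a shortest way out of this toy graph.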
