A Crash Course on Reinforcement Learning for Control Problems Using TensorFlow 2

This is a self-contained repository that explains two basic Reinforcement Learning (RL) algorithms, namely Policy Gradient (PG) and Q-learning, and shows how to apply them to control problems. Dynamical systems may have a discrete action space, like the cartpole where the two possible actions are +1 and -1, or a continuous action space, like linear Gaussian systems. Usually, you can find code for only one of these cases, and it may not be obvious how to extend it to the other.
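To make the distinction concrete, below is a minimal sketch (illustrative, not taken from the repo's notebooks) of how a policy network typically differs between the two cases in TensorFlow 2: logits for a categorical distribution over discrete actions versus the mean of a Gaussian for continuous actions. The layer sizes and the state dimension of 4 are assumptions for the example.

import tensorflow as tf

# Discrete action space (e.g., cartpole with two possible actions):
# the network outputs one logit per action, and an action is sampled
# from the resulting categorical distribution.
discrete_policy = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(2),  # logits for the 2 actions
])

def sample_discrete_action(state):
    logits = discrete_policy(state[None, :])
    return int(tf.random.categorical(logits, num_samples=1)[0, 0])

# Continuous action space (e.g., a linear Gaussian system):
# the network outputs the mean of a Gaussian, the (log) standard
# deviation is a trainable variable, and actions are sampled from
# N(mean, std^2).
continuous_policy = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(1),  # mean of the action distribution
])
log_std = tf.Variable(tf.zeros(1))

def sample_continuous_action(state):
    mean = continuous_policy(state[None, :])
    return mean[0] + tf.exp(log_std) * tf.random.normal([1])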

In this repository, we explain how to formulate PG and Q-learning for each of these cases and provide implementations of both as Jupyter notebooks. You can also find the pure code for these algorithms (and a few more algorithms that I have implemented but not discussed). The code is easy to follow and read, and it is written in a modular way: for example, a reader interested in the implementation of an algorithm is not distracted by how the environment is defined in gym, how the results are plotted, and so on. The theoretical material in this repo is summarized in a handout, which can be downloaded from arXiv at https://arxiv.org/abs/2103.04910.

Citing this repo

Here is a BibTeX entry that you can use to cite the handout in a publication:

@misc{yaghmaie2021crash,
      title={A Crash Course on Reinforcement Learning}, 
      author={Farnaz Adib Yaghmaie and Lennart Ljung},
      year={2021},
      eprint={2103.04910},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

If you use this repo, please consider citing the following relevant papers:

How to use this repo

This repository contains presentation files and code.

The presentation files are related to the LINK-SIC workshop on Reinforcement Learning. The first day will be Friday March 12, 2021, 13.15 - 16.30, and the second day will be Tuesday April 6, 2021, 13.15 - 16.30. You can find the presentation files in pdf in the folder presentation.

The code is given as Jupyter notebooks and python files. If you want to run the Jupyter notebooks, I suggest using Google Colab. If you want to extend the results and examine more systems, I suggest cloning this repository and running it on your computer.

Running on Google Colab

  • Go to https://colab.research.google.com/notebooks/intro.ipynb and sign in with a Google account.
  • Click File and then Upload notebook. If you get the webpage in Swedish, click Arkiv and then Ladda upp anteckningsbok.
  • Select GitHub and paste the following link: https://github.com/FarnazAdib/Crash_course_on_RL.git
  • Then, a list of files of type .ipynb appears. These are Jupyter notebooks, which can contain both text and code, and the code can be run. As an example, scroll down and open pg_on_cartpole_notebook.ipynb.
  • The file contains some cells with text and some cells with code. The cells that contain code have [ ] on the left. If you move your mouse over [ ], a play button appears; click it to run the cell. Make sure not to skip a cell, as doing so causes errors.
  • You can continue like this and run all code cells one by one up to the end.

Running on a local computer
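To run on your own machine, clone the repository from https://github.com/FarnazAdib/Crash_course_on_RL.git, install the dependencies used throughout the repo (Python 3 with TensorFlow 2 and gym), and open the Jupyter notebooks with Jupyter or run the python files directly.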

Where to start

The theoretical material in this repo is nicely summarized in our handout, available in pdf format at https://arxiv.org/abs/2103.04910. If you wish to go through the material in this repo, you can start by reading about Reinforcement Learning.

Dynamical systems

You can read about the dynamical systems (or environments, in RL terminology) that we consider in this repo here.
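As a quick orientation, the environments follow the standard gym interface. A minimal interaction loop looks like the sketch below (assuming the classic gym API, where step returns a 4-tuple):

import gym

# Create the cartpole environment and run one episode with random actions.
env = gym.make('CartPole-v0')
state = env.reset()
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()             # pick a random action
    state, reward, done, info = env.step(action)   # advance the dynamics one step
    total_reward += reward
print('Episode return:', total_reward)
env.close()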

Policy Gradient

Policy Gradient is a popular RL routine that relies on optimizing the policy directly. Below, you can see the Jupyter notebooks regarding the Policy Gradient (PG) algorithm.

You can also see the pure code for PG.
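At its core, PG in its simplest (REINFORCE-style) form increases the log-probability of an action in proportion to the return that followed it. Below is a minimal sketch of one update step for the discrete-action case in TensorFlow 2; it is illustrative, and the function and variable names are not the repo's:

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)

def reinforce_step(policy, states, actions, returns):
    # One REINFORCE update: ascend the gradient of
    # sum_t log pi(a_t | s_t) * G_t  (the loss is its negative mean).
    with tf.GradientTape() as tape:
        logits = policy(states)                       # [T, num_actions]
        log_probs = tf.nn.log_softmax(logits)
        mask = tf.one_hot(actions, logits.shape[-1])  # select the taken actions
        taken = tf.reduce_sum(log_probs * mask, axis=1)
        loss = -tf.reduce_mean(taken * returns)
    grads = tape.gradient(loss, policy.trainable_variables)
    optimizer.apply_gradients(zip(grads, policy.trainable_variables))
    return loss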

Q-learning

Q-learning is another popular RL routine that relies on dynamic programming. Below, you can see the Jupyter notebooks regarding the Q-learning algorithm.

You can also see the pure code for Q-learning and experience replay Q-learning.
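For reference, the core of Q-learning is the temporal-difference update, which the notebooks combine with function approximation and, in the experience replay variant, apply to minibatches drawn from a buffer of stored transitions. A minimal tabular sketch (illustrative names, not the repo's):

import numpy as np

def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    # Tabular Q-learning update:
    # Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q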

Presentation files

The presentation files for the LINK-SIC workshop can be downloaded from the folder called presentation. There, you can find the presentation files for day1 and day2.
