zhoubolei / introrl

Intro to Reinforcement Learning (强化学习纲要)

License: MIT License

introrl's Introduction

Overview

This short course introduces the fundamentals of reinforcement learning (RL). The slides are in English, and the lectures are given by Bolei Zhou in Mandarin. The course is for personal educational use only. Please open an issue if you spot any typos or errors in the slides.

Course Schedule

The course is scheduled as follows. There are 10 lectures in total; the first premiered on 16 March 2020 and the last was completed on 25 May 2020. Thanks for watching, and may ReinForce be with you!

Lecture	Topic	Resources
Lecture 1	Overview (课程概括与RL基础)	slide, YouTube (part 1, part 2), Bilibili (part 1, part 2)
Lecture 2	Markov Decision Process (马尔科夫决策过程)	slide, YouTube (part 1, part 2), Bilibili (part 1, part 2)
Lecture 3	Model-free Prediction and Control (无模型的预测和控制)	slide, YouTube (part 1, part 2), Bilibili (part 1, part 2)
Lecture 4	Value Function Approximation (价值函数近似)	slide, YouTube (part 1, part 2), Bilibili (part 1, part 2)
Lecture 5	Policy Optimization: Foundation (策略优化基础篇)	slide, YouTube (part 1, part 2), Bilibili (part 1, part 2)
Lecture 6	Policy Optimization: State of the Art (策略优化进阶篇)	slide, YouTube (part 1, part 2), Bilibili (part 1, part 2)
Lecture 7	Model-based RL (基于环境模型的RL)	slide, YouTube, Bilibili
Lecture 8	Imitation Learning (模仿学习)	slide, YouTube, Bilibili
Lecture 9	Distributed Systems for RL (分布式系统)	slide, YouTube, Bilibili
Lecture 10	RL in a Nutshell (课程结局篇)	slide, YouTube, Bilibili
Bonus 1	DeepMind's AlphaStar Explained (剖析星际争霸AI) by Zhenghao Peng	slide, YouTube, Bilibili

introrl's Issues

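Inconsistent slide page sizes

Hello Prof. Zhou, the slides you released use two different page sizes: one for chapters 1, 8, and 9, and another for chapters 2-7. Merging all the slides into a single PDF makes chapters 2-7 come out too narrow. Could you provide PDFs with a uniform page size? Thank you!

Hello, Professor

I really like your videos, but the examples in the textbook and the slides seem to give the results directly. Are there worked steps or code for them?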

possible typo in lecture 5 slides

Should it be Q(s, a) here?

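Some questions about details in the lecture notes

I spent a few days writing a brief Chinese version: https://github.com/foocker/RL
I wrote it while watching and learning, so time was a bit tight. In chapter 11 I listed some questions; some of them are quite possibly minor issues in the lecture notes, and I would appreciate any guidance. Anyone who has gone through the course is welcome to join the discussion.

Hello Professor, the week 3 folder does not contain the model-free slides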

possible typos in lecture 2

Hi Prof. Zhou,
On p31, you write

As I understand it, the action should be summed out for the stochastic policy, although it is fine to omit the summation over actions for the deterministic policy on p30. That might be a typo, or am I missing something?
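
For context, the distinction raised in this issue can be written out with the standard textbook form of the Bellman expectation equation. This is a sketch for reference, not the formula from the slides (which is not reproduced here); the symbols R(s, a), P(s' | s, a), and the discount factor gamma are assumed notation and may differ from the lecture's.

```latex
% Stochastic policy \pi(a \mid s): the action is summed out.
v^{\pi}(s) = \sum_{a} \pi(a \mid s) \left[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, v^{\pi}(s') \right]

% Deterministic policy a = \pi(s): \pi(a \mid s) puts all its mass on one action,
% so the sum over actions collapses to a single term.
v^{\pi}(s) = R(s, \pi(s)) + \gamma \sum_{s'} P(s' \mid s, \pi(s))\, v^{\pi}(s')
```

Under these assumptions, the second equation is the special case of the first, which is why dropping the sum over actions is harmless only when the policy is deterministic.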

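Possible error in a formula

Hello Prof. Zhou, is there a problem with formula (16) in lecture2.pdf?

Should it be written like this instead?

Where are the assignments for the Intro to Reinforcement Learning course?

Hello, Intro to Reinforcement Learning is a high-quality course, and I would like to master its content in depth. Does the course have assignments? If so, where can I find them?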
