60_days_rl_challenge's Introduction

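60 Days of Reinforcement Learning

English version

I designed this challenge for you and for me: spend these 60 days learning deep reinforcement learning in depth.

You have surely heard of the amazing results achieved by DeepMind with AlphaGo Zero and by OpenAI in Dota 2! Don't you want to know how they work? Now is the right time for you and me to finally learn deep reinforcement learning and apply it to existing projects.

The ultimate goal is to use these general-purpose technologies and apply them to all sorts of important real-world problems. - Demis Hassabis

This project guides you from the most basic deep reinforcement learning algorithms up to advanced ones such as AlphaGo Zero. Topics are organized by week, each with suggested learning resources. Every week I will also provide applied examples implemented in Python to help you digest the theory.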

Prerequisites

  • Knowledge of Python and PyTorch
  • Machine learning
  • Knowledge of deep learning (MLP, CNN, and RNN)

Projects (to be determined)

  • Q-learning
  • DQN
  • A2C
  • ES
  • AlphaGo Zero

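Week 1 - Introduction to Reinforcement Learning

Week 2 - RL Basics: MDP, Dynamic Programming and Model-Free Control

Those who cannot remember the past are condemned to repeat it. - George Santayana

This week we cover the basic building blocks of reinforcement learning: how the problem is defined, and how to estimate and optimize the functions that express the quality of a policy or a state.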

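Theory

  • Markov Decision Process - RL by David Silver

    Markov decision processes define the reinforcement learning problem

    • Markov processes
    • Markov decision processes
  • Planning by Dynamic Programming - RL by David Silver

    How to solve a Markov decision process

    • Policy iteration
    • Value iteration
  • Model-Free Prediction - RL by David Silver

    Estimate the value function of an unknown Markov decision process

    • Monte Carlo learning
    • Temporal-difference learning
    • TD(λ)
  • Model-Free Control - RL by David Silver

    Optimize the value function of an unknown Markov decision process

    • ε-greedy policy iteration
    • GLIE Monte Carlo control
    • SARSA
    • Importance sampling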
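
As a rough illustration of the dynamic-programming topic listed above, the sketch below runs value iteration on a tiny, made-up MDP. The four-state chain, its rewards, the discount `GAMMA`, and the names `step` and `value_iteration` are assumptions chosen for illustration; they do not come from the course material.

```python
# Hypothetical 4-state chain MDP: states 0..3, state 3 is terminal.
# Actions: 0 = left, 1 = right. Transitions are deterministic.
GAMMA = 0.9  # discount factor (assumed value)
N_STATES = 4
ACTIONS = (0, 1)
TERMINAL = 3


def step(state, action):
    """Return (next_state, reward) for the toy chain."""
    if state == TERMINAL:
        return state, 0.0
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(tol=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue
            # Best one-step lookahead over all actions.
            best = max(r + GAMMA * values[s2]
                       for s2, r in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = []
    for s in range(N_STATES):
        q = [step(s, a)[1] + GAMMA * values[step(s, a)[0]] for a in ACTIONS]
        policy.append(q.index(max(q)))
    return values, policy


values, policy = value_iteration()
print("state values:", values)
print("greedy policy:", policy)
```

Policy iteration would alternate a full evaluation of the current policy with a greedy improvement step; value iteration, as sketched here, folds both into a single backup per state.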

Project of the Week

Q-learning applied to FrozenLake. In this exercise you will learn to use SARSA or Q-learning.
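
A minimal sketch of tabular Q-learning for this kind of task is shown below. To stay self-contained it does not use any environment library: the 4x4 grid is a hand-written, deterministic stand-in for FrozenLake (no slipping), and the hyperparameters and the names `env_step`, `q_learning`, and `greedy_rollout` are assumptions made for illustration, not the project's actual code.

```python
import random

# Hand-written stand-in for a 4x4 FrozenLake grid (assumption: deterministic,
# no slipping). S = start, F = frozen, H = hole, G = goal.
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
SIZE = 4
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)


def env_step(state, action):
    """Return (next_state, reward, done) for the toy grid."""
    row, col = divmod(state, SIZE)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), SIZE - 1)  # bumping into a wall keeps the agent in place
    col = min(max(col + d_col, 0), SIZE - 1)
    cell = GRID[row][col]
    return row * SIZE + col, float(cell == "G"), cell in "HG"


def q_learning(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(MOVES) for _ in range(SIZE * SIZE)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(MOVES))
            else:
                best = max(q[state])
                # Break ties at random so the untrained agent still explores.
                action = rng.choice([a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = env_step(state, action)
            # Off-policy target: bootstrap from the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=100):
    """Follow the greedy policy from the start; True if the goal is reached."""
    state = 0
    for _ in range(max_steps):
        action = q[state].index(max(q[state]))
        state, reward, done = env_step(state, action)
        if done:
            return reward == 1.0
    return False


q_table = q_learning()
print("greedy policy reaches the goal:", greedy_rollout(q_table))
```

SARSA differs only in the target: instead of `max(q[next_state])` it bootstraps from the Q-value of the action the epsilon-greedy policy actually takes in the next state, which makes it on-policy.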


Learn more

Week 3 - Value Function Approximation and DQN

Week 4 - A2C and A3C

Week 5 - RL in continuous space - TRPO/PPO

Week 6 - Evolution Strategies and Genetic Algorithms

Week 7 - I2A

Week 8 - AlphaGo Zero + Bonus

Last 4 days - Review + sharing

Reinforcement Learning Papers

Reinforcement Learning Resources

📺 Deep Reinforcement Learning - UC Berkeley class by Levine; see the course site for details.

📺 Reinforcement Learning course - by David Silver, DeepMind. Great introductory lectures by Silver, a lead researcher on AlphaGo. They follow the book Reinforcement Learning by Sutton & Barto.

📓 Reinforcement Learning: An Introduction - by Sutton & Barto. The "Bible" of reinforcement learning. A PDF draft of the second edition is available.

Additional Resources

📚 Awesome Reinforcement Learning. A curated list of resources dedicated to reinforcement learning

60_days_rl_challenge's People

Contributors

huang-jack, zhyongquan
