ashwinpn / reinforcement-learning

Research on and implementations of Deep Learning / Reinforcement Learning agents and their applications

License: MIT License

Python 100.00%
machine-learning machine-learning-algorithms machinelearning deep-learning deep-neural-networks deep-reinforcement-learning reinforcement-learning reinforcement-learning-algorithms reinforcement-learning-agent reinforcement-learning-environments

reinforcement-learning's Introduction


Contents


deep_rl


RL Landscape

[Image: https://planspace.org/20170830-berkeley_deep_rl_bootcamp/img/annotated.jpg]


reinforcement-learning

Source: eleurent/phd-bibliography


RL Agents Implementation

algorithms

  • Value Optimization
    • [QR-DQN]
    • [DQN] - [Slides] [Code] [rainbow]
    • [Bootstrapped DQN]
    • [DDQN]
    • [NEC]
    • [MMC]
    • [N-step Q Learning]
    • [PAL]
    • [Categorical DQN]
    • [NAF]
  • Policy Optimization
    • [Policy Gradient]
    • [Actor Critic]
      • [DDPG] [Code]
        • [HAC DDPG]
        • [DDPG with HER]
      • [Clipped PPO]
      • [PPO]
  • [DFP]
  • Imitation
    • [Behavioural cloning]
    • [Inverse Reinforcement Learning] [Code] [irl-imitation-code]
    • [Generative Adversarial Imitation Learning]

Value Optimization Agents

Policy Optimization Agents

General Agents

Imitation Learning Agents

  • Behavioral Cloning (BC) (code)

Hierarchical Reinforcement Learning Agents

Memory Types

Exploration Techniques


RL History

  • Temporal difference (TD) learning (1988)
  • Q‐learning (1989)
  • BayesRL (2002)
  • RMAX (2002)
  • CBPI (2002)
  • PEGASUS (2002)
  • Least‐Squares Policy Iteration (2003)
  • Fitted Q‐Iteration (2005)
  • GTD (2009)
  • UCRL (2010)
  • REPS (2010)
  • DQN (2014) - DeepMind

awesome


landscape

RL Environments

  • [Acrobot]
  • [Bike]
  • [Blackjack]
  • [Cartpole]
  • [ContextBandit]
  • [Continuous Chain]
  • [Corridor]
  • [Discrete Chain]
  • [Discretiser (for continuous environments)]
  • [Double Loop]
  • [Environment]
  • [Gridworld]
  • [Inventory management]
  • [Linear context bandit]
  • [Linear dynamic quadratic]
  • [Mountaincar (2d and 3d)]
  • [POMDP Maze]
  • [Optimistic Task]
  • [Puddleworld]
  • [Random MDPs]
  • [Riverswim]

RL Mechanisms

  • [Attention and Memory]
  • [Unsupervised learning]
    • [GANs]
    • [GQN]
    • [UNREAL]
  • [Hierarchical RL]
    • [FuNs]
    • [Option-Critic]
    • [STRAW]
    • [h-DQN]
    • [Stochastic Neural Networks]
  • [Multi-agent RL]
  • [Relational RL]
  • [Learning to Learn, a.k.a. Meta-Learning]
    • [Few/One/Zero-shot Learning]
      • [MAML]
    • [Transfer and Multi-Task Learning]
    • [Learning to Optimize]
    • [Learning to Reinforcement Learn]
    • [Learning Combinatorial Optimization]
    • [AutoML]

RL Games

  • Chinook (1997;2007) for Checkers,
  • Deep Blue (2002) for chess,
  • Logistello (1999) for Othello,
  • TD-Gammon (1994) for Backgammon,
  • GIB (2001) for contract bridge,
  • MoHex (2017) for Hex,
  • DQN (2016)(2018) for Atari 2600 games,
  • AlphaGo (2016a) and AlphaGo Zero (2017) for Go,
  • Alpha Zero (2017) for chess, shogi, and Go,
  • Cepheus (2015), DeepStack (2017), and Libratus (2017a;b) for heads-up Texas Hold’em Poker,
  • Jaderberg et al. (2018) for Quake III Arena Capture the Flag,
  • OpenAI Five, for Dota 2 at 5v5, https://openai.com/five/,
  • Zambaldi et al. (2018), Sun et al. (2018), and Pang et al. (2018) for StarCraft II

  • [Board Games]
    • [Computer Go]
    • [AlphaGo: Training pipeline with MCTS]
    • [AlphaGo Zero]
    • [Alpha Zero]
  • [Card Games]
    • [DeepStack]
  • [Video Games]
    • [Atari 2600 games]
    • [StarCraft]
    • [StarCraft II mini-games]
    • [Quake III Arena]
    • [Minecraft]
    • [Super Smash Bros]
    • [Doom]
    • [ViZDoom]

DRL applied to Robotics

  • [Sim-to-Real]
    • [MuJoCo]
  • [Imitation Learning]
  • [Value-based Learning]
  • [Policy-based Learning]
  • [Model-based Learning]
  • [Autonomous Driving Vehicles]

DRL applied to NLP

  • [Sequence Generation]
  • [Machine Translation]
  • [Dialogue Systems]

DRL applied to Vision

  • [Recognition]
  • [Motion Analysis]
  • [Scene Understanding]
  • [Vision + NLP]
  • [Visual Control]
  • [Interactive Perception]

References


