
a.i's Introduction

Study Log

  • Reinforcement Learning #1

     1. Mathematical Background 
          - Probability                 - ok
          - Random Variable             - ok
          - Random Process              - ok
    
      2. Basic RL Algorithm
          - Markov Process              - ok             
          - Markov Reward Process       - ok
          - Markov Decision Process     - ok 
              - Bellman Expectation Eqn 
              - Bellman Optimality  Eqn  
    
          - Dynamic Programming         - ok
              - Value iteration         - ok
              - Policy iteration        - ok
    
          - Model-free Approaches
              # MF Prediction                
                - Monte Carlo           - ok
                - Temporal Difference   - ok
                
                (Example : Random Walk)
                
              # MF Control                
                - Sarsa                 - ok
                - Q-Learning            - ok
    
                (Example : Cliff Walking)
                (Example : Windy Grid)
                (Example : Windy Cliff)
    
      3. ML-based Reinforcement Learning
              - Function Approximation  - ok 
              - DQN                     - ok
    
                (Example : Cartpole - DQN)
      
      4. Policy-based Reinforcement Learning
              - REINFORCE               - ok
              - A2C                     - ok
    
    
      # Term Project
    
         Cartpole A2C          
         Cartpole DQN         
         Cartpole REINFORCE   
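
As a minimal sketch of the Value iteration item listed under Dynamic Programming above, the Bellman optimality backup can be applied repeatedly to a tiny MDP until the values stop changing. The three-state chain, its rewards, and the names `CHAIN_MDP` and `value_iteration` are assumptions chosen for illustration; this is not code from the study log or the Cartpole term project.

```python
# Hypothetical toy MDP: states 0 and 1 are non-terminal, state 2 is terminal.
# CHAIN_MDP[state][action] is a list of (probability, next_state, reward).
CHAIN_MDP = {
    0: {"left": [(1.0, 0, -1.0)], "right": [(1.0, 1, -1.0)]},
    1: {"left": [(1.0, 0, -1.0)], "right": [(1.0, 2, -1.0)]},
    2: {},  # terminal state: no actions, value stays 0
}


def q_value(outcomes, values, gamma):
    """Expected one-step return for a single action."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(mdp, gamma=0.9, tol=1e-8):
    """Sweep the Bellman optimality backup in place until the largest change is below tol."""
    values = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s, actions in mdp.items():
            if not actions:  # skip terminal states
                continue
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in mdp.items()
        if actions
    }
    return values, policy
```

Calling `value_iteration(CHAIN_MDP)` on this toy chain should give values close to -1.9, -1.0 and 0.0 for states 0, 1 and 2, with "right" as the greedy action in both non-terminal states.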
    

  • Reinforcement Learning #2

    • Week 1 : Dynamic Programming

        - Policy Iteration
        - Value Iteration
      
        # Proof of Convergence 
      
    • Week 2 : Monte Carlo

        - On Policy Monte Carlo  : Batch / Recursive 
        - Off Policy Monte Carlo : Batch / Recursive
      
        # Law of Large Numbers
        # Empirical Mean 
        # Importance Sampling 
      
    • Week 3 : Temporal Difference

        - Temporal difference(0)
        - Temporal difference(1)
        - Temporal difference(λ)
        - SARSA
        - Q Learning
        - Double Q Learning
        - Deep Q Learning
        - Function Approximation 
      
        # Robbins-Monro rule
        # Sherman-Morrison formula
        # Projected Bellman Eqn
        # Maximization bias 
      
    • Week 4 : Policy Gradient

        - REINFORCE
        - A2C
        - DPG
        - DDPG
      
        # PG Proof
        # Information Theory 
            - Self Information
            - Shannon-Entropy
            - KL divergence
            - Cross Entropy
      
    • Week 5 : Advanced RL

        - D3QN
        - Double DQN
        - Dueling DQN
        - TD3
        - TRPO
        - PPO
      
    • Week 6 : Project

        # Solve Bipedal
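
For the Information Theory items listed under Week 4 above (self-information, Shannon entropy, KL divergence, cross entropy), a small sketch can make the relationship between them concrete. The function names and the example distributions are assumptions for illustration. Natural logarithms are used, so results are in nats, and `q` is assumed to be nonzero wherever `p` is nonzero.

```python
import math


def self_information(prob):
    """Information content of a single event with probability prob."""
    return -math.log(prob)


def entropy(p):
    """Shannon entropy H(p): the expected self-information under p."""
    return sum(pi * self_information(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """H(p, q): expected self-information under q when events follow p."""
    return sum(pi * self_information(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q): the extra cost of coding samples from p with a code built for q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

With these definitions, `cross_entropy(p, q)` equals `entropy(p) + kl_divergence(p, q)` up to floating-point error. This is why minimizing cross entropy against a fixed target distribution is equivalent to minimizing the KL divergence.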
    

  • Machine Learning

    1. Linear Regression       - ok
    2. Logistic Regression     - ok
    3. K-nearest neighbors     - ok
    4. K-means clustering      - ok      
    5. Naive Bayes             - ok      
    6. SVM                     - ok
    7. PCA
    8. Decision Tree           - ok
    9. Perceptron              - ok      
           1. SLP              - ok
           2. MLP              - ok
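
As a minimal sketch of the single-layer perceptron (SLP) item above, the classic perceptron update rule can be trained on a linearly separable toy problem. The AND-gate data set and the names `AND_SAMPLES`, `predict`, and `train_perceptron` are assumptions made for this example, not code from the study log.

```python
# Hypothetical toy data set: the logical AND of two binary inputs.
AND_SAMPLES = [
    ((0.0, 0.0), 0),
    ((0.0, 1.0), 0),
    ((1.0, 0.0), 0),
    ((1.0, 1.0), 1),
]


def predict(weights, bias, x):
    """Step activation: output 1 when the weighted sum is positive, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron rule: move the weights toward each misclassified sample."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly, so stop early
            break
    return weights, bias
```

Because AND is linearly separable, the rule settles on a separating line within a few epochs. A single-layer perceptron cannot do the same for XOR, which is the usual motivation for moving on to the MLP.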
    

  • Deep Learning

     1. Linear Regression      - ok    
     2. Logistic Regression            
          - Logistic Regression(Binary Classification)    - ok
          - Softmax Regression(MultiClass Classification) - ok
         
     3. Auto Encoder            
           - AE                - ok
           - CAE               - ok
    
     4. Modern CNN
           - LeNet             - ok
           - AlexNet           - ok
           - VGG Nets          - ok
           - GoogLeNet         - ok
           - ResNet            - ok
    
    
     5. Semantic Segmentation
           - FCN               - ok
           - DeConvNet         - ok
           - SegNet            - ok     
           - U-Net             - ok
           - DeepLab v1, v2    - ok
    
     6. Object Detection
           - RCNN
           - Fast RCNN
           - Faster RCNN
           - SPP Net
           - YOLO
           - SSD
           - Attention Net
    
     7. NLP
           - RNN                       - ok
           - LSTM / GRU                - ok
           - Sequence Prediction       - ok
           - Sequence Classification   - ok
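
For the Softmax Regression (MultiClass Classification) item listed earlier in this section, the core step is turning a vector of class scores into probabilities. The sketch below is a numerically stable version under stated assumptions: the names `softmax` and `predict_class` are illustrative, and the logits would normally come from a trained linear layer that is not shown here.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Return the index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)
```

The shift by the maximum logit matters in practice: without it, scores such as 1000.0 would overflow `math.exp`, while the shifted version handles them without error.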
    

a.i's People

Contributors

dldnxks12
