rl_tutorial's Introduction

RL_Tutorial

These are the author's reinforcement learning notes, which can also be read on the blog.

1. Download

git clone [email protected]:NovemberChopin/RL_Tutorial.git

Recommended environment:

  • tensorflow: 2.2.0
  • tensorlayer: 2.2.3
  • tensorflow-probability: 0.6.0
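
As a quick sanity check, the installed package versions can be compared with the recommended ones listed above. The following is only a sketch: it assumes Python 3.8 or newer (for `importlib.metadata`), and the names `RECOMMENDED` and `check_versions` are hypothetical helpers, not part of this repository.

```python
from importlib import metadata

# Versions copied from the recommended environment list above.
RECOMMENDED = {
    'tensorflow': '2.2.0',
    'tensorlayer': '2.2.3',
    'tensorflow-probability': '0.6.0',
}


def check_versions(recommended):
    """Return {package: (installed_or_None, recommended)} without importing the packages."""
    report = {}
    for name, wanted in recommended.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            installed = None
        report[name] = (installed, wanted)
    return report


for name, (installed, wanted) in check_versions(RECOMMENDED).items():
    status = 'ok' if installed == wanted else 'differs'
    print('{}: installed={} recommended={} ({})'.format(name, installed, wanted, status))
```

A "differs" line does not necessarily mean the code will fail; it only flags a mismatch with the versions the author recommends.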

2. Tutorial

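The Doc directory contains the reinforcement learning tutorial, which is the same material published on the blog. It is meant as a concise introduction to reinforcement learning: the classical part is covered in detail, with formula derivations, followed by the commonly used reinforcement learning algorithms. Newer algorithms such as SAC and PPO are not included yet; I hope to add them later.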

3. Code

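The code directory contains implementations of some of the algorithms, one algorithm per file, which keeps things clear. The code structure should be fairly easy to follow, and the harder parts are commented.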

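The code is written with the tensorlayer framework, which wraps the layers in tensorflow to make them more convenient to use. If you can read tensorflow code, tensorlayer will be no trouble at all; there is almost no learning cost, so there is no need to worry. tensorlayer also provides some APIs for reinforcement learning that make the algorithms easier to write. Here is the introduction from the official documentation: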

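TensorLayer is a deep learning and reinforcement learning library built on Google TensorFlow for researchers and engineers. It provides higher-level deep learning APIs, which both speeds up experimentation for researchers and reduces repetitive work for engineers in real-world development. TensorLayer is easy to modify and extend, so it can be used for both machine learning research and applications. In addition, TensorLayer provides a large number of examples and tutorials to help beginners understand deep learning, along with many official example programs so that developers can quickly find one that fits their project. More details are available in the official TensorLayer documentation.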

Note: by default, all code runs in test mode, and at startup the program loads the corresponding model parameters from the model folder.

  • Running the main function in PyCharm, or running the following command, starts the program in the default test mode.
>>python ./code/DQN.py

To run in training mode instead, change the default value of the --train argument in the code to True:

parser.add_argument('--train', dest='train', default=False)
  • Alternatively, pass --train=True on the command line:
(tf2.X) D:\Algorithm\RL_Tutorial>python ./code/DQN.py --train=True

The program then runs in training mode. After each training run, it automatically saves the model parameters.
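
One caveat about the --train argument: as declared above it has no `type=`, so argparse stores whatever follows `--train=` as a string, and any non-empty string is truthy in a check such as `if args.train:`. Passing `--train=False` would therefore still select training mode. The sketch below shows one possible way to parse the flag as a real boolean; `str2bool` and `build_parser` are hypothetical helpers, not part of the repository.

```python
import argparse


def str2bool(value):
    """Convert common textual spellings of a boolean into a real bool."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ('true', '1', 'yes', 'y'):
        return True
    if text in ('false', '0', 'no', 'n'):
        return False
    raise argparse.ArgumentTypeError('expected a boolean, got {!r}'.format(value))


def build_parser():
    """Same flag as in the README, plus a type converter (an assumption, not the repo's code)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--train', dest='train', type=str2bool, default=False)
    return parser


# Declaration as shown in the README: the value stays a string.
naive = argparse.ArgumentParser()
naive.add_argument('--train', dest='train', default=False)
naive_args = naive.parse_args(['--train=False'])
print(bool(naive_args.train))  # True, because the string 'False' is non-empty

# With the converter, the same command line yields a real boolean.
args = build_parser().parse_args(['--train=False'])
print(args.train)  # False
```

With this converter, `--train=True` still enables training mode, and omitting the flag keeps the default test mode.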

4. Afterword

My knowledge is limited, so the tutorial will inevitably contain mistakes; corrections are very welcome.

rl_tutorial's People

Contributors

novemberchopin

rl_tutorial's Issues

DQN experience replay is in the wrong place

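In DQN, an experience replay step is performed each time an action is selected. In the code, the replay step runs only after an episode has finished; the indentation was probably forgotten (see the lines between the two dashed lines).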

    def train(self, train_episodes=200):
        if args.train:
            for episode in range(train_episodes):
                total_reward, done = 0, False
                state = self.env.reset().astype(np.float32)
                while not done:
                    action = self.choose_action(state)
                    next_state, reward, done, _ = self.env.step(action)
                    next_state = next_state.astype(np.float32)
                    self.buffer.push(state, action, reward, next_state, done)
                    total_reward += reward
                    state = next_state
                    # self.render()
#---------------------------------------------------------------------------#
                if len(self.buffer.buffer) > args.batch_size:
                    self.replay()
                    self.target_update()
#---------------------------------------------------------------------------#
                print('EP{} EpisodeReward={}'.format(episode, total_reward))
                # if episode % 10 == 0:
                #     self.test_episode()
            self.saveModel()
        if args.test:
            self.loadModel()
            self.test_episode(test_episodes=args.test_episodes)
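
The change suggested by this issue can be illustrated with a toy skeleton in which the replay check is indented into the `while` loop, so that it runs once per environment step. This is a minimal sketch, not the repository's DQN: `ToyEnv` and `ToyAgent` are hypothetical stand-ins, `replay()` only counts its calls instead of training a network, and `target_update()` is left out because how often to sync the target network is a separate design choice.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: every episode lasts exactly `length` steps."""

    def __init__(self, length=5):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        # Returns (next_state, reward, done, info) like the loop in the issue expects.
        return float(self.t), 1.0, done, {}


class ToyAgent:
    """Skeleton of the training loop only; replay() just counts its calls."""

    def __init__(self, env, batch_size=4):
        self.env = env
        self.batch_size = batch_size
        self.buffer = deque(maxlen=1000)
        self.replay_calls = 0

    def choose_action(self, state):
        return random.choice([0, 1])

    def replay(self):
        # A real agent would sample a minibatch and take a gradient step here.
        self.replay_calls += 1

    def train(self, train_episodes=3):
        for episode in range(train_episodes):
            total_reward, done = 0.0, False
            state = self.env.reset()
            while not done:
                action = self.choose_action(state)
                next_state, reward, done, _ = self.env.step(action)
                self.buffer.append((state, action, reward, next_state, done))
                total_reward += reward
                state = next_state
                # Replay sits inside the while loop: one update per environment step.
                if len(self.buffer) > self.batch_size:
                    self.replay()
            print('EP{} EpisodeReward={}'.format(episode, total_reward))


agent = ToyAgent(ToyEnv(length=5), batch_size=4)
agent.train(train_episodes=3)
```

In this toy run, `agent.replay_calls` ends at 11 (1 in the first episode, once the buffer exceeds the batch size, then 5 in each of the next two). With the replay check placed after the `while` loop, as in the quoted code, the same run would call `replay()` only 3 times, once per episode.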
