
rlpy's Introduction


RLPy - Reinforcement Learning Framework

RLPy is a framework for conducting sequential decision-making experiments. The current focus of the project is value-function-based reinforcement learning. The project is distributed under the 3-Clause BSD License.


Install

Installation instructions can be found at http://rlpy.readthedocs.org/en/latest/install.html

rlpy's People

Contributors

amarack, anandtrex, bobklein2, chrodan, edgera, epruesse, gadgy, jannschu, meyerd, perimosocordiae, pixelgruff, smcgregor, timotheemonceaux, tobytoyuito, zyc9012


rlpy's Issues

Errors installing rlpy through PyPI

Hello,
I tried to install RLPy for the first time using the command:

pip install -U rlpy

But I got the following errors:

Error compiling Cython file:
------------------------------------------------------------
...
from libcpp cimport bool
cimport numpy as np
import numpy as np
cimport cython
ctypedef unsigned int uint
cimport c_kernels
       ^
------------------------------------------------------------

rlpy\Representations\kernels.pyx:15:8: 'c_kernels.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
    cdef np.ndarray[double, ndim=2, mode="c"] s1 = widths
    x1 = np.ascontiguousarray(x, dtype=np.float64)
    cdef int i

    for i in xrange(len(centers)):
        res[i] = c_kernels.gaussian_kernel(&x1[0], &y1[i,0], dimv, &s1[i, 0])
                                          ^
------------------------------------------------------------

rlpy\Representations\kernels.pyx:59:43: Cannot convert 'double *' to Python object

I also tried to install it using setup.py file without any success due to:

rlpy/Representations/c_kernels.cc(37) : error C3861: 'fmin': identifier not found

My Python environment:
64-bit Windows 7 Professional
Anaconda with conda version 3.18.6
Microsoft Visual C++ Compiler for Python 2.7

I tried to look up the errors on Google; error C3861 'fmin' seems to be caused by the lack of C99 support in Visual Studio. Does this mean I need to install Visual Studio as well, or have I missed any dependencies?

Including Policy Gradient Techniques

Pierre-Luc Bacon

The project description suggests that RLPy is mainly about value-function-based algorithms. However, I think it'd be nice to add Will Dabney's implementations of some of the popular policy gradient methods.
https://github.com/amarack/python-rl/blob/master/pyrl/agents/policy_gradient.py

Christoph Dann

We totally agree with you. This is definitely a near-future goal for RLPy. Which specific method would you suggest addressing first?
Btw: there is an implementation of Natural Actor-Critic in RLPy, but unfortunately it has seen very little testing so far (cf. the simple example in examples/gridworld/nac.py).

Pierre-Luc Bacon

I think that all of Will's code should be included!
Having an implementation of REINFORCE would also be a useful baseline.

Can't install in Python 3.6

Is there any update on this?

RLPy requires Python 2.7 to run. We do not support Python 3 at the moment since most scientific libraries still require Python 2.

RBF representation error

The attribute state_dimensions is a list in the RBF representation. This list is used as an index in the phi_nonTerminal method, which causes this error:

  File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 350, in run
    self.evaluate(total_steps, episode_number, visualize_performance)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 445, in evaluate
    total_steps, visualize=visualize > j)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 212, in performanceRun
    a = self.agent.policy.pi(s, eps_term, p_actions)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Policies/eGreedy.py", line 44, in pi
    b_actions = self.representation.bestActions(s, terminal, p_actions)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 358, in bestActions
    Qs = self.Qs(s, terminal, phi_s)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 170, in Qs
    phi_s = self.phi(s, terminal)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 222, in phi
    return self.phi_nonTerminal(s)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/RBF.py", line 112, in phi_nonTerminal
    s = s[self.state_dimensions]
TypeError: list indices must be integers, not list
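
A minimal reproduction of the failure and a possible workaround, assuming the state arrives as a plain Python list (the cast to ndarray is a suggestion, not a confirmed patch):

import numpy as np

s = [0.3, -1.2, 4.0]          # state arrives as a plain Python list
state_dimensions = [0, 2]     # RBF stores this as a list, too

# s[state_dimensions] then raises "list indices must be integers, not list".
# Casting both sides to ndarrays makes the fancy indexing work as intended:
s = np.asarray(s)[np.asarray(state_dimensions, dtype=int)]
print(s)  # [0.3 4. ]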

Pinball domain can violate its statespace_limits

This should be easy to fix, but I just ran into it. In Pinball, the ball's dx/dy variables can leave the limits defined in statespace_limits, due to collisions and the fact that in one location the value is not being clipped. Specifically, lines 451 and 452 in Pinball.py need to be wrapped in a call to self.ball._clip, as sketched below.
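
A sketch of the fix in spirit; only self.ball._clip and the line numbers come from the issue, while the helper below and its bounds are assumptions:

import numpy as np

def _clip(value, lo=-2.0, hi=2.0):
    # Clamp a velocity component back into its declared range. The
    # [-2, 2] bounds are an assumption standing in for the dx/dy rows
    # of Pinball's statespace_limits.
    return float(np.clip(value, lo, hi))

# The proposed fix: wrap the two post-collision velocity assignments
# (Pinball.py lines 451-452) so dx/dy cannot leave the limits.
ball_xdot = _clip(2.4)    # -> 2.0 instead of an out-of-range 2.4
ball_ydot = _clip(-0.3)   # unchanged, already within limits
print(ball_xdot, ball_ydot)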

Is this project still in use?

I am trying to implement a university project using RL, and I wanted to ask if RLPy is still in use and/or under development. I tried to download the latest version and follow the instructions given in the documentation, but I cannot run gridworld.py: TypeError: __init__() takes exactly 2 arguments (1 given)

If the project is not supported any more, can you suggest an alternative?

Thank you

Style Guide?

Do you have a style guide we should follow for newly developed code we intend to contribute? It looks like you aren't following PEP8 dogmatically. If you use something else, I would like to reformat before my line count increases. Thanks!

statespace_limits with Tabular and IncrTabular issue

Hi, I've been testing rlpy for a few days now. I've run into a problem when I set different statespace_limits in my domain: when the ranges of the dimensions are not the same, it just doesn't work. I tried RBF and Fourier and I don't have that problem with them. Here are the state-space limits I use:

 statespace_limits = np.array([[-10., 10.], [-5, 5]])  # not working
 # statespace_limits = np.array([[-10., 10.], [-10., 10.]]) # working

Here is the error trace so you can see it:

Traceback (most recent call last):
  File "test.py", line 103, in <module>
    visualize_performance=False)  # show performance runs?
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 350, in run
    self.evaluate(total_steps, episode_number, visualize_performance)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 445, in evaluate
    total_steps, visualize=visualize > j)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 212, in performanceRun
    a = self.agent.policy.pi(s, eps_term, p_actions)
  File "/home/aronnax/dev/rlpy_test/tuto/eGreedyTut.py", line 26, in pi
    b_actions = self.representation.bestActions(s, terminal, p_actions)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 358, in bestActions
    Qs = self.Qs(s, terminal, phi_s)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 170, in Qs
    phi_s = self.phi(s, terminal)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 222, in phi
    return self.phi_nonTerminal(s)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Tabular.py", line 31, in phi_nonTerminal
    hashVal = self.hashState(s)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 299, in hashState
    ds = self.binState(s)
  File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 335, in binState
    assert (np.all(s >= limits[:, 0]))
AssertionError
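
For reference, a minimal sketch of binning that scales each dimension by its own limits, which is what asymmetric statespace_limits requires; bin_state and bins_per_dim are illustrative names, not rlpy's code:

import numpy as np

def bin_state(s, limits, bins_per_dim):
    # Discretize a continuous state, handling per-dimension limits.
    # A sketch of what a binState-style method needs to do; rlpy's
    # actual implementation differs in details.
    s = np.asarray(s, dtype=float)
    lo, hi = limits[:, 0], limits[:, 1]
    assert np.all(s >= lo) and np.all(s <= hi), "state outside limits"
    frac = (s - lo) / (hi - lo)          # scale each dimension independently
    return np.minimum((frac * bins_per_dim).astype(int), bins_per_dim - 1)

limits = np.array([[-10., 10.], [-5., 5.]])
print(bin_state([3.0, -4.5], limits, np.array([20, 20])))  # -> [13  1]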

Can't install on OS X

Hi, I have been trying to install and here is what I get:
<<<<<<<<<<<<<<<<<<<
Exception: Cython-generated file 'rlpy/Representations/c_kernels.pxd' not found.
Cython is required to compile rlpy from a development branch.
Please install Cython or download a release package of rlpy.

Command "/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python -u -c "import setuptools, tokenize;file='/private/tmp/pip-build-0IehzP/rlpy/setup.py';exec(compile(getattr(tokenize, 'open', open)(file).read().replace('\r\n', '\n'), file, 'exec'))" install --record /tmp/pip-_rXqnl-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/tmp/pip-build-0IehzP/rlpy/
<<<<<<<<<<<<<<<<<<<
I have Cython and Xcode installed.

Best,

TD-Error not meaningful?

We exposed the class member td_error in the TDControlAgent in an attempt to log algorithm progress, but the values seem noisy and do not converge even when the agent finds an excellent representation. Is td_error hidden for a reason; are we missing something by trying to track it?
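
One likely explanation: the per-step TD error mixes stochastic rewards, transitions, and exploration, so it need not shrink to zero even under a good representation. A smoothed magnitude is usually a more readable progress signal; a minimal sketch, with synthetic errors standing in for td_error:

import numpy as np

# Exponentially smoothed |TD error| as a learning-progress proxy.
# Synthetic, shrinking-noise errors stand in for the agent's td_error.
rng = np.random.default_rng(0)
scales = 1.0 / np.sqrt(np.arange(1, 1001))
td_errors = rng.standard_normal(1000) * scales

beta, smoothed, curve = 0.01, 0.0, []
for delta in td_errors:
    smoothed += beta * (abs(delta) - smoothed)
    curve.append(smoothed)
print(curve[99], curve[-1])   # the smoothed curve trends downward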

Importance sampling in Greedy GQ(lambda)

First of all, thanks a lot for making this wonderful framework. I've started looking at it and I feel really good about it.

I have a question about the Greedy-GQ implementation. I think the implementation you have is basically for lambda = 0, and for lambda > 0 you have heuristically added the eligibility trace vector.

However, there is a more principled Greedy GQ(lambda) with eligibility traces in Maei's (2011) thesis, which uses importance sampling ratios. I am wondering whether you considered using such ratios in your implementation.
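
For reference, a sketch of the trace update with importance sampling as described in Maei's thesis (my reading; the function and variable names are illustrative, not rlpy's implementation):

import numpy as np

def gq_trace(e, phi, rho, gamma, lam):
    # Trace update from Maei's (2011) GQ(lambda):
    #   e_t = phi_t + gamma * lam * rho_t * e_{t-1},
    # with rho_t = pi(a_t | s_t) / mu(a_t | s_t) the importance-sampling
    # ratio between target and behavior policies.
    return phi + gamma * lam * rho * e

# With a greedy target policy, rho is 0 for non-greedy actions and
# 1 / mu(greedy | s) for the greedy one, so traces are cut on exploration.
e = np.zeros(4)
phi = np.array([1.0, 0.0, 0.0, 0.0])
e = gq_trace(e, phi, rho=1.25, gamma=0.9, lam=0.8)
print(e)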

How to set grid_bins and resolution in the RBFs representation?

I ran the cart-pole and mountain car experiments. Both perform badly with the RBFs representation when the variable grid_bins is set, but they get good results when grid_bins is None, meaning the RBF centers are chosen uniformly at random, with resolution set to 10 to 20. They still perform badly when resolution is set too small. I don't know how to choose these parameters so that experiments reliably perform well. Thanks a lot.
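
For intuition about these two knobs, here is a plain-numpy sketch of how center placement (a regular grid, as with grid_bins, versus uniform random, as with grid_bins=None) and kernel width (what resolution controls) shape Gaussian RBF features; all names here are illustrative rather than rlpy's API:

import numpy as np

def rbf_features(s, centers, widths):
    # Gaussian RBF features: exp(-sum(((s - c) / w)^2)) per center.
    d = (s - centers) / widths
    return np.exp(-np.sum(d * d, axis=1))

rng = np.random.default_rng(0)
limits = np.array([[-1.2, 0.6], [-0.07, 0.07]])   # mountain car-like ranges
span = limits[:, 1] - limits[:, 0]

# grid_bins-style placement: centers on a regular 5x5 grid
g = np.stack(np.meshgrid(*[np.linspace(lo, hi, 5) for lo, hi in limits]), -1)
grid_centers = g.reshape(-1, 2)

# grid_bins=None-style placement: uniform random centers
rand_centers = limits[:, 0] + rng.random((25, 2)) * span

# resolution controls the kernel width: higher resolution -> narrower RBFs
for resolution in (2, 10, 50):
    widths = span / resolution
    phi = rbf_features(np.array([-0.5, 0.0]), rand_centers, widths)
    # too-wide kernels make features indistinguishable; too-narrow ones
    # leave most features ~0, so mid-range resolutions tend to work best
    print(resolution, phi.max(), (phi > 1e-3).sum())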

Deterministic Hyperopt [Migration from BitBucket]

William Dabney created an issue 2014-03-20

The parameters generated by hyperopt appear to come from a fixed seed value plus the current job id. The result is that the parameter settings considered are essentially deterministic.
The current development version of hyperopt (0.0.3-dev) in their git repository has fixed this problem, but there does not appear to be any pip-based installation available yet.
It doesn't look like this is something that can easily be fixed on our side, but I just want to make you guys aware of it (in case you aren't already), because using hyperopt might yield some weird results. I can't test whether this occurs on Condor as well, but it happens when using joblib for parallelization.
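
For anyone hitting this on a newer hyperopt: later releases let you pass an explicit random state to fmin, so each parallel job can be seeded differently. A hedged sketch (job_id and the toy objective are illustrative; depending on the hyperopt version, rstate expects an np.random.RandomState or an np.random.Generator):

import numpy as np
from hyperopt import fmin, tpe, hp

job_id = 3                                     # e.g. the parallel worker index
rstate = np.random.RandomState(1234 + job_id)  # distinct stream per job
# (on hyperopt >= 0.2.7 use: rstate = np.random.default_rng(1234 + job_id))

best = fmin(
    fn=lambda x: (x - 0.5) ** 2,               # toy objective
    space=hp.uniform("x", -1.0, 1.0),
    algo=tpe.suggest,
    max_evals=50,
    rstate=rstate,
)
print(best)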

Example

Hello, I have been using this library, and I could not find a basic linear function approximation model (only tabular and more advanced ones such as iFDD) or examples of its use. My case is a very simple one: I have a binary feature vector of size 300 where each position tells something about the state (is it near the stairs? is it far from the origin? etc.), and I would like to train linear Q-learning with it.
Thanks for reading!
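
A minimal self-contained sketch of linear Q-learning with a binary feature vector, in the spirit of the question; the environment stub, sizes, and hyperparameters are all illustrative, not rlpy API:

import numpy as np

n_features, n_actions = 300, 4
w = np.zeros((n_actions, n_features))          # one weight vector per action
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def q_values(phi):
    return w @ phi                             # Q(s, a) = w_a . phi(s)

def step(phi, a):
    # Stand-in for the real environment: returns (reward, phi', terminal).
    return rng.random() - 0.5, rng.integers(0, 2, n_features).astype(float), False

phi = rng.integers(0, 2, n_features).astype(float)
for t in range(1000):
    # epsilon-greedy action selection over the linear Q estimates
    a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(q_values(phi)))
    r, phi_next, terminal = step(phi, a)
    target = r if terminal else r + gamma * np.max(q_values(phi_next))
    w[a] += alpha * (target - q_values(phi)[a]) * phi   # TD(0) update
    phi = phi_next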
