rlpy / rlpy
RLPy Reinforcement Learning Framework
License: BSD 3-Clause "New" or "Revised" License
Do you have a style guide we should follow for newly developed code we intend to contribute? It looks like you aren't following PEP8 dogmatically. If you use something else, I would like to reformat before my line count increases. Thanks!
Hi,
I'm curious whether there are plans to add support for POMDPs via the Cassandra file format (http://www.pomdp.org/code/pomdp-file-spec.html) so that these models can be solved with rlpy, or is this beyond the scope? Thanks.
The parameters generated by hyperopt appear to come from a fixed seed value plus the current job id, so the parameter settings it considers are essentially deterministic.
The current development version of hyperopt (0.0.3-dev) in their git repository has fixed this problem, but there does not appear to be a pip-based installation available yet.
This doesn't look like something that can easily be fixed on our side, but I want to make you aware of it (in case you aren't already), because using hyperopt might yield some odd results. I can't test whether this also occurs on Condor, but it happens when using joblib for parallelization.
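For anyone parallelizing around this: the failure mode is that seed = fixed_value + job_id makes every job's parameter draws a deterministic function of its id. Below is a sketch of one way to get independent per-job streams with modern NumPy; this is illustrative only, not how hyperopt seeds itself:

```python
import numpy as np

# Failure mode: seed = FIXED_SEED + job_id makes every job's parameter
# draws a deterministic function of its job id.
# One alternative: derive independent child seeds from OS entropy.
ss = np.random.SeedSequence()            # seeded from the OS entropy pool
child_seeds = ss.spawn(4)                # one independent seed per parallel job
rngs = [np.random.default_rng(s) for s in child_seeds]

# Each job now draws from its own statistically independent stream.
samples = [rng.uniform(0, 1, size=3) for rng in rngs]
```

`SeedSequence.spawn` guarantees the child streams are statistically independent without needing to coordinate seeds across jobs.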
Hi, I've been testing rlpy for a few days now. I've run into a problem when I set different statespace_limits
per dimension in my domain. When the limits differ across dimensions, it simply doesn't work; with the RBF and Fourier representations I don't have that problem. Here are the limits I use:
statespace_limits = np.array([[-10., 10.], [-5, 5]]) # not working
# statespace_limits = np.array([[-10., 10.], [-10., 10.]]) # working
Here is the error trace so you can see it:
Traceback (most recent call last):
File "test.py", line 103, in <module>
visualize_performance=False) # show performance runs?
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 350, in run
self.evaluate(total_steps, episode_number, visualize_performance)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 445, in evaluate
total_steps, visualize=visualize > j)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 212, in performanceRun
a = self.agent.policy.pi(s, eps_term, p_actions)
File "/home/aronnax/dev/rlpy_test/tuto/eGreedyTut.py", line 26, in pi
b_actions = self.representation.bestActions(s, terminal, p_actions)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 358, in bestActions
Qs = self.Qs(s, terminal, phi_s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 170, in Qs
phi_s = self.phi(s, terminal)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 222, in phi
return self.phi_nonTerminal(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Tabular.py", line 31, in phi_nonTerminal
hashVal = self.hashState(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 299, in hashState
ds = self.binState(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 335, in binState
assert (np.all(s >= limits[:, 0]))
AssertionError
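For what it's worth, the assertion that fires in binState boils down to a bounds check along the lines sketched below; clipping the state into the declared limits before it reaches the representation avoids the crash, though that's a workaround rather than a fix for whatever produces the out-of-range state:

```python
import numpy as np

statespace_limits = np.array([[-10., 10.], [-5., 5.]])

def in_limits(s, limits):
    # A bounds check along the lines of Representation.binState's assertion
    return bool(np.all(s >= limits[:, 0]) and np.all(s <= limits[:, 1]))

def clip_to_limits(s, limits):
    # Workaround: force the state back inside the declared box
    return np.clip(s, limits[:, 0], limits[:, 1])

s = np.array([0.0, -6.0])   # second dimension below its lower bound of -5
print(in_limits(s, statespace_limits))                                     # False
print(in_limits(clip_to_limits(s, statespace_limits), statespace_limits))  # True
```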
Hello,
I tried to install RLPy for the first time using the command:
pip install -U rlpy
But I got the following errors:
Error compiling Cython file:
------------------------------------------------------------
...
from libcpp cimport bool
cimport numpy as np
import numpy as np
cimport cython
ctypedef unsigned int uint
cimport c_kernels
^
------------------------------------------------------------
rlpy\Representations\kernels.pyx:15:8: 'c_kernels.pxd' not found
Error compiling Cython file:
------------------------------------------------------------
...
cdef np.ndarray[double, ndim=2, mode="c"] s1 = widths
x1 = np.ascontiguousarray(x, dtype=np.float64)
cdef int i
for i in xrange(len(centers)):
res[i] = c_kernels.gaussian_kernel(&x1[0], &y1[i,0], dimv, &s1[i, 0])
^
------------------------------------------------------------
rlpy\Representations\kernels.pyx:59:43: Cannot convert 'double *' to Python object
I also tried installing it from the setup.py file, without success, due to:
rlpy/Representations/c_kernels.cc(37) : error C3861: 'fmin': identifier not found
My Python environment is:
64-bit Windows 7 Professional
Anaconda with conda version 3.18.6
Microsoft Visual C++ Compiler for Python 2.7
From searching online, the C3861 'fmin' error seems to be caused by the lack of C99 support in Visual Studio. Does this mean I need to install Visual Studio as well, or have I missed a dependency?
This should be easy to fix, but I just ran into it. In Pinball, the ball's dx/dy variables can leave the limits defined in statespace_limits after collisions, because in one location the value is not being clipped. Specifically, lines 451 and 452 in Pinball.py need to be wrapped in a call to self.ball._clip.
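The fix amounts to clamping the velocity components back into range after the collision response. A minimal sketch of the idea; the limit value and names below are illustrative, not Pinball's actual constants:

```python
def clip_component(v, limit=1.0):
    """Clamp one velocity component to [-limit, limit] (limit is illustrative)."""
    return max(-limit, min(limit, v))

# After a collision updates the ball's velocity (illustrative names):
xdot, ydot = 1.7, -0.4
xdot = clip_component(xdot)   # 1.7 exceeds the limit, gets clamped to 1.0
ydot = clip_component(ydot)   # already in range, unchanged
```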
Hello, I have been using this library, but I could not find a basic linear function approximation model (only Tabular and more advanced ones such as iFDD) or examples of its use. My case is very simple: I have a binary feature vector of size 300 where each position encodes a fact about the state (is it near the stairs? is it far from the origin? etc.), and I would like to train linear Q-learning with it.
Thanks for reading!
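If you just want plain linear Q-learning over a binary feature vector, it is compact enough to sketch outside the framework. Everything below (names, sizes, hyperparameters) is illustrative and does not use the rlpy API:

```python
import numpy as np

N_FEATURES, N_ACTIONS = 300, 4
rng = np.random.default_rng(0)
w = np.zeros((N_ACTIONS, N_FEATURES))   # one weight vector per action

def q_values(phi, w):
    return w @ phi                      # Q(s, a) = w_a . phi(s)

def q_update(w, phi, a, r, phi_next, terminal, alpha=0.1, gamma=0.99):
    """One Q-learning step on a linear function approximator."""
    target = r if terminal else r + gamma * np.max(q_values(phi_next, w))
    delta = target - q_values(phi, w)[a]
    w[a] += alpha * delta * phi         # gradient step on the chosen action only
    return delta

# One toy transition with sparse binary features
phi = (rng.random(N_FEATURES) < 0.1).astype(float)
phi_next = (rng.random(N_FEATURES) < 0.1).astype(float)
delta = q_update(w, phi, a=2, r=1.0, phi_next=phi_next, terminal=False)
```

With binary features this is exactly linear Q-learning: the update only touches the weights of the active features for the chosen action.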
I ran the cart-pole and mountain car experiments. Both perform badly with the RBFs representation when the variable grid_bins is set. They get good results when grid_bins is None, which means the RBF centers are chosen uniformly at random, and resolution is set to 10 to 20; but they still perform badly when resolution is set too small. I don't know how to choose these parameters so that the experiments reliably perform well. Thanks a lot.
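Some intuition on why center placement and width matter: with Gaussian RBFs, a state only receives meaningful activation from centers within roughly one width of it, so too few centers (or widths much narrower than the center spacing) leave regions of the state space with near-zero features. A small illustrative sketch, not rlpy's exact parameterization:

```python
import numpy as np

def rbf_features(s, centers, widths):
    """Gaussian RBF activations for state s (illustrative, not rlpy's code)."""
    d2 = np.sum(((s - centers) / widths) ** 2, axis=1)
    return np.exp(-d2)

# A 1-D state space [-1, 1] covered by a grid of 10 centers.
centers = np.linspace(-1, 1, 10).reshape(-1, 1)
# Rule of thumb: width on the order of the center spacing, so neighboring
# RBFs overlap and every state activates at least one feature strongly.
widths = np.full((10, 1), 2.0 / 10)
phi = rbf_features(np.array([0.3]), centers, widths)
```

If you shrink `widths` well below the spacing, `phi` collapses to near-zero almost everywhere between centers, which matches the bad behavior you see with too-small resolution.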
I am trying to implement a university project using RL. I wanted to ask whether RLPy is still in use and/or under development. I tried to download the latest version and follow the instructions in the documentation, but I cannot run gridworld.py: TypeError: __init__() takes exactly 2 arguments (1 given)
If the project is no longer supported, can you suggest an alternative?
Thank you
Is there any update on this?
RLPy requires Python 2.7 to run. We do not support Python 3 at the moment since most scientific libraries still require Python 2.
The attribute state_dimensions is a list in the RBF representation. This list is used as an index in the phi_nonTerminal method, which causes this error:
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 350, in run
self.evaluate(total_steps, episode_number, visualize_performance)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 445, in evaluate
total_steps, visualize=visualize > j)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 212, in performanceRun
a = self.agent.policy.pi(s, eps_term, p_actions)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Policies/eGreedy.py", line 44, in pi
b_actions = self.representation.bestActions(s, terminal, p_actions)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 358, in bestActions
Qs = self.Qs(s, terminal, phi_s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 170, in Qs
phi_s = self.phi(s, terminal)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 222, in phi
return self.phi_nonTerminal(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/RBF.py", line 112, in phi_nonTerminal
s = s[self.state_dimensions]
TypeError: list indices must be integers, not list
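The root cause is that a plain Python list is being indexed with a list, which NumPy's fancy indexing allows but Python's built-in list type does not. A minimal reproduction and the usual fix, converting one operand to an ndarray:

```python
import numpy as np

s = [0.5, -1.2, 3.0]    # state delivered as a plain Python list
dims = [0, 2]           # e.g. RBF's state_dimensions attribute

# s[dims] raises: TypeError: list indices must be integers, not list
s_sub = np.asarray(s)[dims]   # ndarray fancy indexing works
```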
First of all, thanks a lot for making this wonderful framework. I started looking at it and feeling really good about it.
I have a question about the Greedy-GQ implementation. I think the implementation is essentially for lambda = 0, and for lambda > 0 the eligibility trace vector has been added heuristically.
However, there is a more principled Greedy-GQ(lambda) with eligibility traces in Maei's (2011) thesis, which uses importance sampling ratios. I am wondering whether you considered using such ratios in your implementation.
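For reference, the per-decision importance-sampled eligibility trace used in off-policy TD(lambda) has the general shape below; this is the generic form, not necessarily the exact GQ(lambda) equations from Maei's thesis, which also involve a secondary weight vector:

```latex
e_t = \rho_t \left( \gamma \lambda \, e_{t-1} + \phi(s_t, a_t) \right),
\qquad
\rho_t = \frac{\pi(a_t \mid s_t)}{b(a_t \mid s_t)}
```

Here pi is the target policy and b the behavior policy. With a greedy target policy, rho_t is zero whenever the behavior action disagrees with the greedy one, which cuts the trace rather than letting stale credit propagate.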
Hi, I have been trying to install rlpy and here is what I get:
<<<<<<<<<<<<<<<<<<<
Exception: Cython-generated file 'rlpy/Representations/c_kernels.pxd' not found.
Cython is required to compile rlpy from a development branch.
Please install Cython or download a release package of rlpy.
Command "/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python -u -c "import setuptools, tokenize;file='/private/tmp/pip-build-0IehzP/rlpy/setup.py';exec(compile(getattr(tokenize, 'open', open)(file).read().replace('\r\n', '\n'), file, 'exec'))" install --record /tmp/pip-_rXqnl-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/tmp/pip-build-0IehzP/rlpy/
<<<<<<<<<<<<<<<<<<<
I have Cython and Xcode installed.
Best,
We exposed the class member td_error in the TDControlAgent in an attempt to log algorithm progress, but the values seem noisy and do not converge even when the agent finds an excellent representation. Is td_error hidden for a reason; are we missing something by trying to track it?
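On the noise itself: the per-step TD error stays nonzero even with a perfect representation, because individual transitions are stochastic and exploration injects off-policy steps; it's the expectation of the TD error that goes to zero, not each sample. A common way to get a readable learning curve is to plot a running average of |delta|. A sketch, not part of rlpy's API:

```python
def smoothed_abs_td(deltas, beta=0.99):
    """Exponential moving average of |delta|; beta value is illustrative."""
    avg, out = 0.0, []
    for d in deltas:
        avg = beta * avg + (1 - beta) * abs(d)
        out.append(avg)
    return out

# A noisy but bounded TD-error sequence smooths into a stable curve.
trace = smoothed_abs_td([1.0, -0.5, 0.25, -0.125] * 50)
```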
The project description suggests that RLPy is mainly about value function based algorithms. However, I think it'd be nice to add Will Dabney's implementation of some of the popular policy gradient methods.
https://github.com/amarack/python-rl/blob/master/pyrl/agents/policy_gradient.py
We totally agree with you. This is definitely a near-future goal for RLPy. Which specific method would you suggest addressing first?
Btw: there is an implementation of Natural Actor-Critic in RLPy, but unfortunately it has seen very little testing so far (cf. the simple example in examples/gridworld/nac.py).
I think that all of Will's code should be included!
Having an implementation of REINFORCE would also be a useful baseline.
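For the REINFORCE baseline specifically, vanilla REINFORCE with a linear-softmax policy is compact enough to sketch directly. Everything below (names, sizes, hyperparameters) is illustrative and not tied to the rlpy API:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, episode, alpha=0.01, gamma=0.99):
    """One REINFORCE update. episode: list of (phi, action, reward);
    theta has shape (n_actions, n_features)."""
    # Compute discounted returns G_t for every step of the episode.
    G, returns = 0.0, []
    for _, _, r in reversed(episode):
        G = r + gamma * G
        returns.append(G)
    returns.reverse()
    for (phi, a, _), G in zip(episode, returns):
        pi = softmax(theta @ phi)
        # grad log pi(a|s) w.r.t. theta[b] is (1[b == a] - pi[b]) * phi
        grad = -np.outer(pi, phi)
        grad[a] += phi
        theta += alpha * G * grad
    return theta

# Toy two-action, three-feature episode
theta = np.zeros((2, 3))
episode = [(np.array([1.0, 0.0, 0.0]), 0, 1.0),
           (np.array([0.0, 1.0, 0.0]), 1, 0.0)]
theta = reinforce_update(theta, episode)
```

The rewarded action's weights move up and the alternative's move down, which is the expected direction for the policy-gradient step.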