rlpy / rlpy
RLPy Reinforcement Learning Framework
License: BSD 3-Clause "New" or "Revised" License
Do you have a style guide we should follow for newly developed code we intend to contribute? It looks like you aren't following PEP8 dogmatically. If you use something else, I would like to reformat before my line count increases. Thanks!
Hi,
I'm curious whether there are plans to add support for POMDPs via the Cassandra file format (http://www.pomdp.org/code/pomdp-file-spec.html) so that these models can be solved with rlpy, or is this beyond the scope? Thanks.
The parameters generated by hyperopt appear to come from a fixed seed value plus the current job id, so the parameter settings it considers are essentially deterministic.
The current development version of hyperopt (0.0.3-dev) in their git repository has fixed this problem, but there does not appear to be a pip-based installation available yet.
This doesn't look like something that can easily be fixed on our side, but I want to make you aware of it (in case you aren't already), because using hyperopt might yield some odd results. I can't test whether this also occurs on Condor, but it happens when using joblib for parallelization.
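For anyone parallelizing around this: the failure mode is that seed = fixed_value + job_id makes every job's parameter draws a deterministic function of its id. Below is a sketch of one way to get independent per-job streams with modern NumPy; this is illustrative only, not how hyperopt seeds itself:

```python
import numpy as np

# Failure mode: seed = FIXED_SEED + job_id makes every job's parameter
# draws a deterministic function of its job id.
# One alternative: derive independent child seeds from OS entropy.
ss = np.random.SeedSequence()            # seeded from the OS entropy pool
child_seeds = ss.spawn(4)                # one independent seed per parallel job
rngs = [np.random.default_rng(s) for s in child_seeds]

# Each job now draws from its own statistically independent stream.
samples = [rng.uniform(0, 1, size=3) for rng in rngs]
```

`SeedSequence.spawn` guarantees the child streams are statistically independent without needing to coordinate seeds across jobs.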
Hi, I've been testing rlpy for a few days now. I've run into a problem when I set different statespace_limits
per dimension in my domain. When the limits differ across dimensions, it simply doesn't work; with the RBF and Fourier representations I don't have that problem. Here are the limits I use:
statespace_limits = np.array([[-10., 10.], [-5, 5]]) # not working
# statespace_limits = np.array([[-10., 10.], [-10., 10.]]) # working
Here is the error trace so you can see it:
Traceback (most recent call last):
File "test.py", line 103, in <module>
visualize_performance=False) # show performance runs?
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 350, in run
self.evaluate(total_steps, episode_number, visualize_performance)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 445, in evaluate
total_steps, visualize=visualize > j)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 212, in performanceRun
a = self.agent.policy.pi(s, eps_term, p_actions)
File "/home/aronnax/dev/rlpy_test/tuto/eGreedyTut.py", line 26, in pi
b_actions = self.representation.bestActions(s, terminal, p_actions)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 358, in bestActions
Qs = self.Qs(s, terminal, phi_s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 170, in Qs
phi_s = self.phi(s, terminal)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 222, in phi
return self.phi_nonTerminal(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Tabular.py", line 31, in phi_nonTerminal
hashVal = self.hashState(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 299, in hashState
ds = self.binState(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 335, in binState
assert (np.all(s >= limits[:, 0]))
AssertionError
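For what it's worth, the assertion that fires in binState boils down to a bounds check along the lines sketched below; clipping the state into the declared limits before it reaches the representation avoids the crash, though that's a workaround rather than a fix for whatever produces the out-of-range state:

```python
import numpy as np

statespace_limits = np.array([[-10., 10.], [-5., 5.]])

def in_limits(s, limits):
    # A bounds check along the lines of Representation.binState's assertion
    return bool(np.all(s >= limits[:, 0]) and np.all(s <= limits[:, 1]))

def clip_to_limits(s, limits):
    # Workaround: force the state back inside the declared box
    return np.clip(s, limits[:, 0], limits[:, 1])

s = np.array([0.0, -6.0])   # second dimension below its lower bound of -5
print(in_limits(s, statespace_limits))                                     # False
print(in_limits(clip_to_limits(s, statespace_limits), statespace_limits))  # True
```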
Hello,
I tried to install RLPy for the first time using the command:
pip install -U rlpy
But I got the following errors:
Error compiling Cython file:
------------------------------------------------------------
...
from libcpp cimport bool
cimport numpy as np
import numpy as np
cimport cython
ctypedef unsigned int uint
cimport c_kernels
^
------------------------------------------------------------
rlpy\Representations\kernels.pyx:15:8: 'c_kernels.pxd' not found
Error compiling Cython file:
------------------------------------------------------------
...
cdef np.ndarray[double, ndim=2, mode="c"] s1 = widths
x1 = np.ascontiguousarray(x, dtype=np.float64)
cdef int i
for i in xrange(len(centers)):
res[i] = c_kernels.gaussian_kernel(&x1[0], &y1[i,0], dimv, &s1[i, 0])
^
------------------------------------------------------------
rlpy\Representations\kernels.pyx:59:43: Cannot convert 'double *' to Python object
I also tried installing it from the setup.py file, without success, due to:
rlpy/Representations/c_kernels.cc(37) : error C3861: 'fmin': identifier not found
My Python environment is:
64-bit Windows 7 Professional
Anaconda with conda version 3.18.6
Microsoft Visual C++ Compiler for Python 2.7
From searching online, the C3861 'fmin' error seems to be caused by the lack of C99 support in Visual Studio. Does this mean I need to install Visual Studio as well, or have I missed a dependency?
This should be easy to fix, but I just ran into it. In Pinball, the ball's dx/dy variables can leave the limits defined in statespace_limits after collisions, because in one location the value is not being clipped. Specifically, lines 451 and 452 in Pinball.py need to be wrapped in a call to self.ball._clip.
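The fix amounts to clamping the velocity components back into range after the collision response. A minimal sketch of the idea; the limit value and names below are illustrative, not Pinball's actual constants:

```python
def clip_component(v, limit=1.0):
    """Clamp one velocity component to [-limit, limit] (limit is illustrative)."""
    return max(-limit, min(limit, v))

# After a collision updates the ball's velocity (illustrative names):
xdot, ydot = 1.7, -0.4
xdot = clip_component(xdot)   # 1.7 exceeds the limit, gets clamped to 1.0
ydot = clip_component(ydot)   # already in range, unchanged
```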
Hello, I have been using this library, but I could not find a basic linear function approximation model (only Tabular and more advanced ones such as iFDD) or examples of its use. My case is very simple: I have a binary feature vector of size 300 where each position encodes a fact about the state (is it near the stairs? is it far from the origin? etc.), and I would like to train linear Q-learning with it.
Thanks for reading!
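If you just want plain linear Q-learning over a binary feature vector, it is compact enough to sketch outside the framework. Everything below (names, sizes, hyperparameters) is illustrative and does not use the rlpy API:

```python
import numpy as np

N_FEATURES, N_ACTIONS = 300, 4
rng = np.random.default_rng(0)
w = np.zeros((N_ACTIONS, N_FEATURES))   # one weight vector per action

def q_values(phi, w):
    return w @ phi                      # Q(s, a) = w_a . phi(s)

def q_update(w, phi, a, r, phi_next, terminal, alpha=0.1, gamma=0.99):
    """One Q-learning step on a linear function approximator."""
    target = r if terminal else r + gamma * np.max(q_values(phi_next, w))
    delta = target - q_values(phi, w)[a]
    w[a] += alpha * delta * phi         # gradient step on the chosen action only
    return delta

# One toy transition with sparse binary features
phi = (rng.random(N_FEATURES) < 0.1).astype(float)
phi_next = (rng.random(N_FEATURES) < 0.1).astype(float)
delta = q_update(w, phi, a=2, r=1.0, phi_next=phi_next, terminal=False)
```

With binary features this is exactly linear Q-learning: the update only touches the weights of the active features for the chosen action.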
I ran the cart-pole and mountain car experiments. Both perform badly with the RBFs representation when the variable grid_bins is set. They get good results when grid_bins is None, which means the RBF centers are chosen uniformly at random, and resolution is set to 10 to 20; but they still perform badly when resolution is set too small. I don't know how to choose these parameters so that the experiments reliably perform well. Thanks a lot.
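Some intuition on why center placement and width matter: with Gaussian RBFs, a state only receives meaningful activation from centers within roughly one width of it, so too few centers (or widths much narrower than the center spacing) leave regions of the state space with near-zero features. A small illustrative sketch, not rlpy's exact parameterization:

```python
import numpy as np

def rbf_features(s, centers, widths):
    """Gaussian RBF activations for state s (illustrative, not rlpy's code)."""
    d2 = np.sum(((s - centers) / widths) ** 2, axis=1)
    return np.exp(-d2)

# A 1-D state space [-1, 1] covered by a grid of 10 centers.
centers = np.linspace(-1, 1, 10).reshape(-1, 1)
# Rule of thumb: width on the order of the center spacing, so neighboring
# RBFs overlap and every state activates at least one feature strongly.
widths = np.full((10, 1), 2.0 / 10)
phi = rbf_features(np.array([0.3]), centers, widths)
```

If you shrink `widths` well below the spacing, `phi` collapses to near-zero almost everywhere between centers, which matches the bad behavior you see with too-small resolution.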
I am trying to implement a university project using RL. I wanted to ask whether RLPy is still in use and/or under development. I tried to download the latest version and follow the instructions in the documentation, but I cannot run gridworld.py: TypeError: __init__() takes exactly 2 arguments (1 given)
If the project is no longer supported, can you suggest an alternative?
Thank you
Is there any update on this?
RLPy requires Python 2.7 to run. We do not support Python 3 at the moment since most scientific libraries still require Python 2.
The attribute state_dimensions is a list in the RBF representation. This list is used as an index in the phi_nonTerminal method, which causes this error:
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 350, in run
self.evaluate(total_steps, episode_number, visualize_performance)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 445, in evaluate
total_steps, visualize=visualize > j)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Experiments/Experiment.py", line 212, in performanceRun
a = self.agent.policy.pi(s, eps_term, p_actions)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Policies/eGreedy.py", line 44, in pi
b_actions = self.representation.bestActions(s, terminal, p_actions)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 358, in bestActions
Qs = self.Qs(s, terminal, phi_s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 170, in Qs
phi_s = self.phi(s, terminal)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/Representation.py", line 222, in phi
return self.phi_nonTerminal(s)
File "/usr/local/lib/python2.7/dist-packages/rlpy/Representations/RBF.py", line 112, in phi_nonTerminal
s = s[self.state_dimensions]
TypeError: list indices must be integers, not list
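The root cause is that a plain Python list is being indexed with a list, which NumPy's fancy indexing allows but Python's built-in list type does not. A minimal reproduction and the usual fix, converting one operand to an ndarray:

```python
import numpy as np

s = [0.5, -1.2, 3.0]    # state delivered as a plain Python list
dims = [0, 2]           # e.g. RBF's state_dimensions attribute

# s[dims] raises: TypeError: list indices must be integers, not list
s_sub = np.asarray(s)[dims]   # ndarray fancy indexing works
```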
First of all, thanks a lot for making this wonderful framework. I started looking at it and feeling really good about it.
I have a question about the Greedy-GQ implementation. I think the implementation is essentially for lambda = 0, and for lambda > 0 the eligibility trace vector has been added heuristically.
However, there is a more principled Greedy-GQ(lambda) with eligibility traces in Maei's (2011) thesis, which uses importance sampling ratios. I am wondering whether you considered using such ratios in your implementation.
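For reference, the per-decision importance-sampled eligibility trace used in off-policy TD(lambda) has the general shape below; this is the generic form, not necessarily the exact GQ(lambda) equations from Maei's thesis, which also involve a secondary weight vector:

```latex
e_t = \rho_t \left( \gamma \lambda \, e_{t-1} + \phi(s_t, a_t) \right),
\qquad
\rho_t = \frac{\pi(a_t \mid s_t)}{b(a_t \mid s_t)}
```

Here pi is the target policy and b the behavior policy. With a greedy target policy, rho_t is zero whenever the behavior action disagrees with the greedy one, which cuts the trace rather than letting stale credit propagate.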
Hi, I have been trying to install rlpy and here is what I get:
<<<<<<<<<<<<<<<<<<<
Exception: Cython-generated file 'rlpy/Representations/c_kernels.pxd' not found.
Cython is required to compile rlpy from a development branch.
Please install Cython or download a release package of rlpy.
Command "/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python -u -c "import setuptools, tokenize;file='/private/tmp/pip-build-0IehzP/rlpy/setup.py';exec(compile(getattr(tokenize, 'open', open)(file).read().replace('\r\n', '\n'), file, 'exec'))" install --record /tmp/pip-_rXqnl-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/tmp/pip-build-0IehzP/rlpy/
<<<<<<<<<<<<<<<<<<<
I have Cython and Xcode installed.
Best,
We exposed the class member td_error in the TDControlAgent in an attempt to log algorithm progress, but the values seem noisy and do not converge even when the agent finds an excellent representation. Is td_error hidden for a reason; are we missing something by trying to track it?
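On the noise itself: the per-step TD error stays nonzero even with a perfect representation, because individual transitions are stochastic and exploration injects off-policy steps; it's the expectation of the TD error that goes to zero, not each sample. A common way to get a readable learning curve is to plot a running average of |delta|. A sketch, not part of rlpy's API:

```python
def smoothed_abs_td(deltas, beta=0.99):
    """Exponential moving average of |delta|; beta value is illustrative."""
    avg, out = 0.0, []
    for d in deltas:
        avg = beta * avg + (1 - beta) * abs(d)
        out.append(avg)
    return out

# A noisy but bounded TD-error sequence smooths into a stable curve.
trace = smoothed_abs_td([1.0, -0.5, 0.25, -0.125] * 50)
```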
The project description suggests that RLPy is mainly about value function based algorithms. However, I think it'd be nice to add Will Dabney's implementation of some of the popular policy gradient methods.
https://github.com/amarack/python-rl/blob/master/pyrl/agents/policy_gradient.py
We totally agree with you. This is definitely a near-future goal for RLPy. Which specific method would you suggest addressing first?
Btw: there is an implementation of Natural Actor-Critic in RLPy, but unfortunately it has seen very little testing so far (cf. the simple example in examples/gridworld/nac.py).
I think that all of Will's code should be included!
Having an implementation of REINFORCE would also be a useful baseline.
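For the REINFORCE baseline specifically, vanilla REINFORCE with a linear-softmax policy is compact enough to sketch directly. Everything below (names, sizes, hyperparameters) is illustrative and not tied to the rlpy API:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, episode, alpha=0.01, gamma=0.99):
    """One REINFORCE update. episode: list of (phi, action, reward);
    theta has shape (n_actions, n_features)."""
    # Compute discounted returns G_t for every step of the episode.
    G, returns = 0.0, []
    for _, _, r in reversed(episode):
        G = r + gamma * G
        returns.append(G)
    returns.reverse()
    for (phi, a, _), G in zip(episode, returns):
        pi = softmax(theta @ phi)
        # grad log pi(a|s) w.r.t. theta[b] is (1[b == a] - pi[b]) * phi
        grad = -np.outer(pi, phi)
        grad[a] += phi
        theta += alpha * G * grad
    return theta

# Toy two-action, three-feature episode
theta = np.zeros((2, 3))
episode = [(np.array([1.0, 0.0, 0.0]), 0, 1.0),
           (np.array([0.0, 1.0, 0.0]), 1, 0.0)]
theta = reinforce_update(theta, episode)
```

The rewarded action's weights move up and the alternative's move down, which is the expected direction for the policy-gradient step.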