
Comments (4)

boris-il-forte commented on July 19, 2024

You can use CartPole from Gym through the Gym class of mushroom_rl, which simply interfaces any OpenAI Gym environment with mushroom_rl.

Gym CartPole and mushroom_rl CartPole are different environments. Look at the mushroom_rl documentation to find the related paper.
You may not be using enough features: try GaussianRBF.generate, which places the RBF centers on a uniform grid over the state space.
Another problem may be exploration: if your initial policy doesn't explore the state space sufficiently, LSPI may fail. A common trick is to reuse the learned policy to collect a better dataset.
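To make the "uniform grid" idea concrete, here is a minimal sketch in plain Python of how such a grid of RBF centers can be enumerated. It is an illustration only, not the mushroom_rl implementation; the helper name `uniform_centers` and the bounds used are hypothetical.

```python
import itertools
import math


def uniform_centers(n_centers, low, high):
    """Return every point of a uniform grid with n_centers[i] points along dimension i."""
    axes = []
    for n, lo, hi in zip(n_centers, low, high):
        if n == 1:
            # A single center sits in the middle of the interval.
            axes.append([(lo + hi) / 2.0])
        else:
            step = (hi - lo) / (n - 1)
            axes.append([lo + k * step for k in range(n)])
    # Cartesian product of the per-dimension axes gives the full grid.
    return list(itertools.product(*axes))


# Hypothetical bounds for a 4-dimensional state (illustrative values only).
centers = uniform_centers([3, 3, 3, 3],
                          [-4.0, -3.0, -math.pi, -3.0],
                          [4.0, 3.0, math.pi, 3.0])
print(len(centers))
```

The number of centers is the product of the per-dimension counts, so it grows quickly with the state dimension: three points per dimension in four dimensions already gives 81 features.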

from mushroom-rl.

kishanpb commented on July 19, 2024

I am using the learned policy to get a new dataset as follows:

# Train
core.learn(n_episodes=1500, n_episodes_per_fit=100)

This refits the policy every 100 episodes, so each new batch of 100 episodes is collected with the most recently learned policy, until 1500 episodes have been executed.
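As a rough illustration of that schedule, the sketch below enumerates the collect-then-fit rounds implied by those two arguments. It only does the arithmetic; `fit_schedule` is a hypothetical helper and does not call mushroom_rl.

```python
def fit_schedule(n_episodes, n_episodes_per_fit):
    """Yield (fit_index, first_episode, last_episode) for each collect-then-fit round."""
    starts = range(0, n_episodes, n_episodes_per_fit)
    for fit_index, start in enumerate(starts, start=1):
        end = min(start + n_episodes_per_fit, n_episodes)
        yield fit_index, start + 1, end


schedule = list(fit_schedule(1500, 100))
print(len(schedule))   # number of fits
print(schedule[0])     # first batch, collected with the initial policy
print(schedule[-1])    # last batch
```

Only the first batch is collected with the initial policy, so if that policy explores poorly, the first fit starts from poor data.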

I'll try this suggestion:

Maybe you are not using a sufficient amount of features (try to use generate from GaussianRBF, it will generate a uniform grid in the space).


kishanpb commented on July 19, 2024

This is hard! I tried various bases, but nothing seems to work. Any more help?

import numpy as np
from mushroom_rl.features.basis import PolynomialBasis, GaussianRBF

# basis 1: constant term plus RBFs centered on a hand-built 4-D grid
basis = [PolynomialBasis()]

s1 = np.array([-4, -3, 0, 3, 4])
s2 = np.array([-1, 0, 1])
s3 = np.array([-2*np.pi, -np.pi, 0, np.pi, 2*np.pi]) * .25
s4 = np.array([-1, 0, 1])
s = np.array(np.meshgrid(s1, s2, s3, s4)).T.reshape(-1, 4)
for i in s:
    basis.append(GaussianRBF(i, np.array([1.])))

# basis 2: constant term plus three hand-picked RBF centers
basis = [PolynomialBasis()]
s = ([1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 1, 0])
s = np.array(s)
for i in s:
    basis.append(GaussianRBF(i, np.array([2.])))

# basis 3: RBFs on a generated uniform grid, plus a constant term
basis = GaussianRBF.generate(n_centers=[3, 3, 3, 3], low=[-4, -3, -np.pi, -3],
                             high=[4, 3, np.pi, 3], dimensions=[1, 1, 1, 1])
basis.append(PolynomialBasis())

# basis 4: polynomial features up to degree 10
basis = PolynomialBasis.generate(max_degree=10, input_size=4)


boris-il-forte commented on July 19, 2024

Basis 3 is wrong: remove the dimensions parameter so that each RBF uses all state dimensions.
Basis 4 uses unreasonable parameters: a polynomial of degree 10 is far more complex than needed. Try a lower-degree polynomial (degree 2 or 3).
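To see how fast the polynomial basis grows, the sketch below counts the monomials of total degree at most d in n variables, which is the binomial coefficient C(n + d, d). It assumes the generated basis contains every such monomial, constant term included; that is an assumption about PolynomialBasis.generate, and `n_polynomial_features` is a hypothetical helper.

```python
import math


def n_polynomial_features(max_degree, input_size):
    """Number of monomials of total degree <= max_degree in input_size variables."""
    return math.comb(input_size + max_degree, max_degree)


# Feature counts for a 4-dimensional state at the degrees discussed above.
counts = {d: n_polynomial_features(d, 4) for d in (2, 3, 10)}
print(counts)  # {2: 15, 3: 35, 10: 1001}
```

Under that assumption, degree 10 yields about a thousand features, many of them very high powers of the state variables, whereas degree 2 or 3 keeps the basis to a few dozen.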

If nothing works, try a different algorithm, such as SARSA(λ) or DQN.
