
mbcd's People

Contributors

jupilogy, lucasalegre

mbcd's Issues

Out of bound log prob

Hi,
as the linked screenshot shows, there may be a problem with the log-probability calculation in get_logprob2(self, x, means, variances), at line 65 of mbcd.py.
After a few thousand steps, the log probability is consistently above zero. Shouldn't it lie in the interval (-inf, 0]?
In the debugger, log_prob is 0.92 and prob is 2.52.
Thank you very much for your time.

[Screenshot: log_prob_out_of_bound_2]
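
One point that may be relevant: for a continuous Gaussian, the computed quantity is a log density rather than the log of a probability, so it is not bounded above by 0 when the variance is small. The sketch below illustrates this with the diagonal-Gaussian formula quoted in the "Log-likelihood doubt" issue on this page. The helper name `gaussian_logpdf` is hypothetical and is not the repository's `get_logprob2`.

```python
import numpy as np

def gaussian_logpdf(x, mean, variance):
    """Log density of a diagonal Gaussian, summed over the last axis."""
    x, mean, variance = (np.asarray(a, dtype=float) for a in (x, mean, variance))
    k = x.shape[-1]
    return -0.5 * (k * np.log(2 * np.pi)
                   + np.log(variance).sum(-1)
                   + (np.power(x - mean, 2) / variance).sum(-1))

# With a small variance the density at the mean exceeds 1, so its log is positive.
narrow = gaussian_logpdf([0.0], [0.0], [0.01])
# With unit variance the density at the mean is below 1, so its log is negative.
wide = gaussian_logpdf([0.0], [0.0], [1.0])
print(narrow, wide)
```

Whether this explains the values in the screenshot depends on the variances the models predict at that point in training, which the issue does not show.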

Errors on imports

Hi,
when I run the code, some imports in the file mbcd/models/bnn.py cannot be resolved:

from drl_cd.models.utils import get_required_argument, TensorStandardScaler
from drl_cd.models.fc import FC

from drl_cd.utils.logger import Progress, Silent

The drl_cd package doesn't exist. I suppose it should be replaced with mbcd. Is that right?
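
If the substitution suggested above is indeed the right fix, it could be applied mechanically. The sketch below is only an illustration under that assumption: `rewrite_package` is a hypothetical helper, and it rewrites the package name only where it appears as the top-level package of an import statement.

```python
import re

def rewrite_package(source: str, old: str = "drl_cd", new: str = "mbcd") -> str:
    """Replace `old` with `new` only where it is the top-level package of an import."""
    pattern = re.compile(
        rf"^(\s*(?:from|import)\s+){re.escape(old)}(?=[.\s])", re.MULTILINE
    )
    return pattern.sub(rf"\g<1>{new}", source)

# The three import lines quoted in the issue.
sample = (
    "from drl_cd.models.utils import get_required_argument, TensorStandardScaler\n"
    "from drl_cd.models.fc import FC\n"
    "from drl_cd.utils.logger import Progress, Silent\n"
)
fixed = rewrite_package(sample)
print(fixed)
```

Identifiers that merely contain the old name, such as a variable called `drl_cd_value`, are left untouched.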

Furthermore, in the file experiments/mbcd_run.py, at lines 70 and 71:

70 model.deepRLCD.save_current()
71 model.deepRLCD.save_models()

the deepRLCD attribute can't be found and execution stops:

Traceback (most recent call last):
  File "/home/valerio/PycharmProjects/mbcd/experiments/mbcd_run.py", line 95, in <module>
    main(config)
  File "/home/valerio/PycharmProjects/mbcd/experiments/mbcd_run.py", line 71, in main
    model.deepRLCD.save_current()
AttributeError: 'SAC' object has no attribute 'deepRLCD'

Log-likelihood doubt

Hi,
I have some doubts about the likelihood estimation performed by get_logprob2(self, x, means, variances), at line 77 of mbcd.py.
According to the comment at line 76, log_prob should have shape [num_networks, batch_size], but the result of line 77 is a scalar. The following lines (for example line 80) assume that the result is a matrix:

76    ## [ num_networks, batch_size ]
77    log_prob = -1/2 * (k*np.log(2*np.pi) + np.log(variance).sum(-1) + (np.power(x-mean, 2)/variance).sum(-1))  # [1,]
78
79    ## [ batch_size ]
80    prob = np.exp(log_prob).sum(axis=0)
81
82    ## [ batch_size ]
83    log_prob = np.log(prob + 1e-8)  # Avoid log of zero

Is that a problem?
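
A small shape check may help here. The sketch below assumes `means` and `variances` are stacked as [num_networks, batch_size, k] and `x` is [batch_size, k]. These shapes and the helper name `per_network_logprob` are assumptions, not taken from mbcd.py. Under them, the expression on line 77 yields [num_networks, batch_size]; with 1-D inputs the same expression collapses to a scalar.

```python
import numpy as np

def per_network_logprob(x, means, variances):
    """Diagonal-Gaussian log density, summed over the last (feature) axis."""
    k = x.shape[-1]
    return -0.5 * (k * np.log(2 * np.pi)
                   + np.log(variances).sum(-1)
                   + (np.power(x - means, 2) / variances).sum(-1))

rng = np.random.default_rng(0)
num_networks, batch, k = 5, 4, 3
x = rng.normal(size=(batch, k))
means = rng.normal(size=(num_networks, batch, k))
variances = rng.uniform(0.5, 2.0, size=(num_networks, batch, k))

# x broadcasts against the leading network axis: shape (5, 4).
per_net = per_network_logprob(x, means, variances)
# Sum the densities over networks, as lines 80 and 83 do: shape (4,).
combined = np.log(np.exp(per_net).sum(axis=0) + 1e-8)
# A single sample and a single network give a 0-d result instead.
single = per_network_logprob(x[0], means[0, 0], variances[0, 0])
print(per_net.shape, combined.shape, np.ndim(single))
```

So whether line 77 returns a scalar or a matrix depends on the shapes actually passed in, which would need to be checked in the debugger.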

Furthermore, I would like to ask for clarification about the var_mean formulas at lines 85 and 86.
The code contains two formulas, which lead to different results.

85    var_mean = np.var(means, axis=0).max(axis=-1)
86    var_mean = np.linalg.norm(np.std(means, axis=0), axis=-1)

Which of the two is the right one to use?
Thank you in advance
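
To make the difference between the two formulas concrete, the sketch below evaluates both on random data. It assumes `means` has shape [num_networks, batch_size, k], which is not confirmed by the excerpt, and the names `var_max` and `std_norm` are introduced only for this comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
means = rng.normal(size=(5, 4, 3))  # assumed [num_networks, batch_size, k]

# Line 85: largest per-dimension variance across networks.
var_max = np.var(means, axis=0).max(axis=-1)
# Line 86: Euclidean norm of the per-dimension standard deviations.
std_norm = np.linalg.norm(np.std(means, axis=0), axis=-1)

# The squared norm is the sum of the per-dimension variances,
# so it is never smaller than the largest single variance.
print(var_max, std_norm ** 2)
```

Note that if both lines run in sequence, as in the excerpt, the second assignment replaces the first, so only the value from line 86 is used afterwards.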

Reproducibility issues

Hi,
I'm trying to get deterministic results with a fixed seed.
Currently I pass the seed argument at startup and set the 'n_cpu_tf_sess' variable to 1, as suggested in the code comments.
With this setup, the results are repeatable from timestep 0 up to 'learning_starts' timesteps. Once policy learning starts, after the initial exploration phase, the results are no longer deterministic.
Is there anything else I need to set in order to get deterministic results?
Thank you for your time,

Valerio
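
One thing that could be checked is whether every random source used during learning is seeded. The sketch below covers only Python's `random` module and NumPy's global generator; the helper names are hypothetical, and whether either source is the cause here is not known from the issue. Framework-level seeding (for example the TensorFlow graph seed) and any non-deterministic GPU operations are outside its scope.

```python
import random
import numpy as np

def seed_python_and_numpy(seed: int) -> None:
    """Seed the two global generators that library code most often relies on."""
    random.seed(seed)
    np.random.seed(seed)

def sample_run(seed: int):
    """Stand-in for a training run: draws from both global generators."""
    seed_python_and_numpy(seed)
    return random.random(), np.random.rand(3).tolist()

# Two runs with the same seed produce identical draws.
first = sample_run(42)
second = sample_run(42)
print(first == second)
```

Comparing the first diverging value between two seeded runs, for instance the first sampled replay batch, may help narrow down which component introduces the randomness.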

Doubt on rollout length calculation

Hi,
I have a doubt about the following method.

def set_rollout_length(self):

If I'm not mistaken, this method should set the rollout length based on the self.rollout_schedule attribute.
I don't understand why line 649 creates a new attribute named self._next_idx. The attribute is not used anywhere else, so its value is never read. Was the line perhaps meant to be self.replay_buffer._next_idx = len(self.replay_buffer)?

Thank you in advance for your time
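
For reference, a schedule-based rollout length can be computed without touching any buffer index. The sketch below assumes the schedule has the form [min_epoch, max_epoch, min_length, max_length] with linear interpolation in between. That layout and the function name `rollout_length` are assumptions for illustration and may not match what set_rollout_length does in this repository.

```python
def rollout_length(epoch, rollout_schedule):
    """Linearly interpolate the rollout length between two epochs (assumed layout)."""
    min_epoch, max_epoch, min_length, max_length = rollout_schedule
    if epoch <= min_epoch:
        return int(min_length)
    # Fraction of the way from min_epoch to max_epoch, capped at 1.
    frac = min((epoch - min_epoch) / (max_epoch - min_epoch), 1.0)
    return int(frac * (max_length - min_length) + min_length)

schedule = [20, 100, 1, 5]  # hypothetical values
lengths = [rollout_length(e, schedule) for e in (0, 60, 100, 200)]
print(lengths)
```

Under this reading the method only needs the current epoch and the schedule, which is consistent with the observation that self._next_idx is never read.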
