
finenvs's People

Contributors: hmomin, mugiwarakaizoku

Forkers: mugiwarakaizoku

finenvs's Issues

SAC Issues

@mugiwarakaizoku

I'm having trouble getting SAC to learn Cartpole effectively. Below is sample output from one of the better trials; in most trials, it can't even break above a total reward of 10.

Also, there is a memory leak somewhere that triggers after about 2.2 million samples for me. Based on the error message, it looks like it results from not detaching the output of mean = self.forward(states) in the actor file, but I'll let you see to it.
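In case it helps, here's a minimal sketch of the kind of fix I have in mind: sampling actions for environment steps under torch.no_grad(), so the forward pass doesn't build an autograd graph that stays alive once the actions are buffered. The network and function names below are illustrative, not the actual FinEnvs code:

import torch
import torch.nn as nn

# Illustrative stand-in for the actor network (not the FinEnvs actor).
policy = nn.Sequential(nn.Linear(4, 64), nn.ELU(), nn.Linear(64, 1), nn.Tanh())

def select_actions(states: torch.Tensor) -> torch.Tensor:
    # Environment interaction doesn't need gradients; without no_grad()
    # (or .detach() on the result), every call keeps its computation
    # graph alive for as long as the returned tensor is referenced.
    with torch.no_grad():
        return policy(states)

states = torch.randn(4096, 4)    # a batch of observations
actions = select_actions(states)
assert not actions.requires_grad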

Importing module 'gym_37' (/home/momin/Documents/isaacgym/python/isaacgym/_bindings/linux-x86_64/gym_37.so)
Setting GYM_USD_PLUG_INFO_PATH to /home/momin/Documents/isaacgym/python/isaacgym/_bindings/linux-x86_64/usd/plugInfo.json
PyTorch version 1.10.2+cu113
Device count 1
/home/momin/Documents/isaacgym/python/isaacgym/_bindings/src/gymtorch
Using /home/momin/.cache/torch_extensions/py37_cu113 as PyTorch extensions root...
Emitting ninja build file /home/momin/.cache/torch_extensions/py37_cu113/gymtorch/build.ninja...
Building extension module gymtorch...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module gymtorch...
/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/gym/spaces/box.py:112: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
num samples: 48640 - evaluation return: 42.664555 - mean training return: 19.240843 - std dev training return: 21.298613
num samples: 75264 - evaluation return: 17.117111 - mean training return: 37.337551 - std dev training return: 24.090464
num samples: 96256 - evaluation return: 9.022424 - mean training return: 34.875938 - std dev training return: 26.308626
num samples: 116224 - evaluation return: 7.290071 - mean training return: 33.133209 - std dev training return: 25.416498
num samples: 136192 - evaluation return: 8.945022 - mean training return: 32.170544 - std dev training return: 25.197166
num samples: 157184 - evaluation return: 11.210396 - mean training return: 32.678501 - std dev training return: 22.590132
num samples: 178176 - evaluation return: 8.651594 - mean training return: 33.387737 - std dev training return: 24.583792
num samples: 203264 - evaluation return: 19.072805 - mean training return: 33.129204 - std dev training return: 22.507452
num samples: 224256 - evaluation return: 10.693352 - mean training return: 33.824829 - std dev training return: 22.631231
num samples: 242688 - evaluation return: 7.536970 - mean training return: 35.717468 - std dev training return: 26.653385
num samples: 263168 - evaluation return: 9.159291 - mean training return: 39.677971 - std dev training return: 26.461964
num samples: 284672 - evaluation return: 13.753101 - mean training return: 38.719288 - std dev training return: 25.779493
num samples: 306688 - evaluation return: 11.256489 - mean training return: 41.800442 - std dev training return: 27.398203
num samples: 326656 - evaluation return: 7.761618 - mean training return: 41.407909 - std dev training return: 30.619041
num samples: 349184 - evaluation return: 12.650271 - mean training return: 39.605946 - std dev training return: 26.463860
num samples: 368640 - evaluation return: 7.406431 - mean training return: 43.363243 - std dev training return: 31.207508
num samples: 391168 - evaluation return: 13.261372 - mean training return: 40.798859 - std dev training return: 28.053375
num samples: 411136 - evaluation return: 8.921464 - mean training return: 47.900318 - std dev training return: 34.384724
num samples: 431616 - evaluation return: 10.965508 - mean training return: 41.790977 - std dev training return: 30.708284
num samples: 459264 - evaluation return: 24.945358 - mean training return: 42.953075 - std dev training return: 33.013519
num samples: 480256 - evaluation return: 10.809840 - mean training return: 41.681019 - std dev training return: 29.536516
num samples: 500736 - evaluation return: 10.699786 - mean training return: 42.309608 - std dev training return: 30.666838
num samples: 519680 - evaluation return: 7.754902 - mean training return: 38.211960 - std dev training return: 29.142509
num samples: 538112 - evaluation return: 7.239990 - mean training return: 40.609222 - std dev training return: 30.046165
num samples: 557568 - evaluation return: 7.792455 - mean training return: 41.511486 - std dev training return: 29.129290
num samples: 579584 - evaluation return: 15.246098 - mean training return: 43.569519 - std dev training return: 30.588730
num samples: 598528 - evaluation return: 6.837287 - mean training return: 44.639370 - std dev training return: 32.618675
num samples: 616960 - evaluation return: 8.379328 - mean training return: 44.910011 - std dev training return: 30.007103
num samples: 635392 - evaluation return: 7.636024 - mean training return: 41.894653 - std dev training return: 29.709085
num samples: 654848 - evaluation return: 10.105590 - mean training return: 42.790398 - std dev training return: 29.942787
num samples: 674304 - evaluation return: 9.454279 - mean training return: 42.145344 - std dev training return: 30.102503
num samples: 696832 - evaluation return: 15.113770 - mean training return: 42.603848 - std dev training return: 26.551289
num samples: 743936 - evaluation return: 69.809341 - mean training return: 44.716377 - std dev training return: 31.713352
num samples: 765952 - evaluation return: 14.800228 - mean training return: 48.687096 - std dev training return: 35.265812
num samples: 784384 - evaluation return: 8.898764 - mean training return: 44.878021 - std dev training return: 29.701893
num samples: 810496 - evaluation return: 22.929743 - mean training return: 42.003948 - std dev training return: 29.101030
num samples: 830464 - evaluation return: 8.730614 - mean training return: 46.895416 - std dev training return: 30.673267
num samples: 850944 - evaluation return: 10.750460 - mean training return: 44.366295 - std dev training return: 32.735119
num samples: 869376 - evaluation return: 7.646038 - mean training return: 42.031437 - std dev training return: 30.974838
num samples: 888320 - evaluation return: 8.660542 - mean training return: 45.897411 - std dev training return: 35.273087
num samples: 910336 - evaluation return: 14.657757 - mean training return: 42.573399 - std dev training return: 29.213062
num samples: 951808 - evaluation return: 59.844833 - mean training return: 44.369228 - std dev training return: 32.552788
num samples: 972288 - evaluation return: 10.970460 - mean training return: 42.581337 - std dev training return: 26.832909
num samples: 990208 - evaluation return: 8.688063 - mean training return: 42.989204 - std dev training return: 27.803591
num samples: 1009664 - evaluation return: 10.115323 - mean training return: 44.869339 - std dev training return: 32.852955
num samples: 1028608 - evaluation return: 7.315423 - mean training return: 41.035736 - std dev training return: 32.797501
num samples: 1051648 - evaluation return: 17.410482 - mean training return: 43.608242 - std dev training return: 32.394970
num samples: 1070080 - evaluation return: 8.257707 - mean training return: 44.351231 - std dev training return: 29.345806
num samples: 1089024 - evaluation return: 7.072944 - mean training return: 44.150719 - std dev training return: 31.034515
num samples: 1107968 - evaluation return: 7.315763 - mean training return: 45.740803 - std dev training return: 29.843706
num samples: 1126912 - evaluation return: 8.030341 - mean training return: 48.802032 - std dev training return: 32.735664
num samples: 1147904 - evaluation return: 12.481560 - mean training return: 46.902039 - std dev training return: 30.762377
num samples: 1165824 - evaluation return: 7.350004 - mean training return: 49.774536 - std dev training return: 34.108013
num samples: 1184768 - evaluation return: 8.855827 - mean training return: 48.475475 - std dev training return: 33.205433
num samples: 1203200 - evaluation return: 6.800958 - mean training return: 43.822147 - std dev training return: 27.918304
num samples: 1249280 - evaluation return: 60.188492 - mean training return: 48.652798 - std dev training return: 32.888950
num samples: 1267200 - evaluation return: 7.280651 - mean training return: 43.635883 - std dev training return: 29.472729
num samples: 1314816 - evaluation return: 68.751907 - mean training return: 45.681065 - std dev training return: 32.199825
num samples: 1334784 - evaluation return: 10.479751 - mean training return: 46.177887 - std dev training return: 33.436707
num samples: 1363456 - evaluation return: 27.123913 - mean training return: 45.143280 - std dev training return: 32.398781
num samples: 1382912 - evaluation return: 10.328647 - mean training return: 43.507858 - std dev training return: 35.936104
num samples: 1401856 - evaluation return: 7.638084 - mean training return: 44.668758 - std dev training return: 31.289669
num samples: 1432576 - evaluation return: 32.943344 - mean training return: 44.900688 - std dev training return: 31.360880
num samples: 1452544 - evaluation return: 9.221864 - mean training return: 41.564133 - std dev training return: 27.927759
num samples: 1473536 - evaluation return: 11.704432 - mean training return: 48.011837 - std dev training return: 34.653778
num samples: 1496064 - evaluation return: 15.954937 - mean training return: 50.346596 - std dev training return: 35.377712
num samples: 1514496 - evaluation return: 8.035228 - mean training return: 47.771240 - std dev training return: 33.395077
num samples: 1533952 - evaluation return: 8.281386 - mean training return: 41.216488 - std dev training return: 30.946314
num samples: 1553408 - evaluation return: 10.508433 - mean training return: 44.966591 - std dev training return: 29.735842
num samples: 1575936 - evaluation return: 16.217566 - mean training return: 44.983177 - std dev training return: 33.251244
num samples: 1594368 - evaluation return: 8.081646 - mean training return: 44.372837 - std dev training return: 34.626404
num samples: 1615872 - evaluation return: 12.964675 - mean training return: 45.627056 - std dev training return: 30.419598
num samples: 1636864 - evaluation return: 11.474453 - mean training return: 44.596386 - std dev training return: 30.335295
num samples: 1656320 - evaluation return: 9.743287 - mean training return: 48.475723 - std dev training return: 34.589176
num samples: 1676288 - evaluation return: 9.889929 - mean training return: 45.983326 - std dev training return: 32.190174
num samples: 1706496 - evaluation return: 31.201733 - mean training return: 44.044250 - std dev training return: 29.999483
num samples: 1751552 - evaluation return: 67.660858 - mean training return: 47.377201 - std dev training return: 33.140793
num samples: 1771520 - evaluation return: 11.536253 - mean training return: 48.449409 - std dev training return: 33.765598
num samples: 1788928 - evaluation return: 7.400703 - mean training return: 46.131039 - std dev training return: 34.952114
num samples: 1810944 - evaluation return: 14.474745 - mean training return: 42.301899 - std dev training return: 35.764179
num samples: 1831936 - evaluation return: 10.688743 - mean training return: 46.439331 - std dev training return: 31.543478
num samples: 1860096 - evaluation return: 26.456173 - mean training return: 45.223267 - std dev training return: 31.939152
num samples: 1881600 - evaluation return: 10.944726 - mean training return: 44.479214 - std dev training return: 27.474779
num samples: 1905152 - evaluation return: 18.194380 - mean training return: 50.965813 - std dev training return: 35.656303
num samples: 1925120 - evaluation return: 9.182425 - mean training return: 46.331676 - std dev training return: 33.462132
num samples: 1943040 - evaluation return: 7.370675 - mean training return: 49.256039 - std dev training return: 32.469273
num samples: 1963520 - evaluation return: 10.486354 - mean training return: 46.099960 - std dev training return: 32.586018
num samples: 1980416 - evaluation return: 6.084354 - mean training return: 45.396633 - std dev training return: 33.130478
num samples: 2000384 - evaluation return: 11.129582 - mean training return: 45.089146 - std dev training return: 33.360134
num samples: 2021376 - evaluation return: 11.500680 - mean training return: 46.762657 - std dev training return: 32.203106
num samples: 2042880 - evaluation return: 12.560404 - mean training return: 49.993290 - std dev training return: 33.039204
num samples: 2061824 - evaluation return: 9.376321 - mean training return: 45.685165 - std dev training return: 33.842228
num samples: 2081792 - evaluation return: 11.399903 - mean training return: 44.692337 - std dev training return: 33.136116
num samples: 2101760 - evaluation return: 10.711651 - mean training return: 44.341721 - std dev training return: 31.529741
num samples: 2128896 - evaluation return: 24.776752 - mean training return: 47.567459 - std dev training return: 30.484776
num samples: 2200064 - evaluation return: 87.675751 - mean training return: 46.126362 - std dev training return: 31.622690
Traceback (most recent call last):
  File "examples/SAC_MLP_Isaac_Gym.py", line 30, in <module>
    train_SAC_MLP_on_environiment("Cartpole")
  File "examples/SAC_MLP_Isaac_Gym.py", line 20, in train_SAC_MLP_on_environiment
    actions = agent.step(states)
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/SAC/SAC_agent.py", line 117, in step
    actions = self.actor.get_actions_and_log_probs(states)[0]
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/SAC/actor.py", line 58, in get_actions_and_log_probs
    distribution = self.get_distribution(states)
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/SAC/actor.py", line 51, in get_distribution
    mean = self.forward(states)
  File "/home/momin/Documents/GitHub/FinEnvs/finenvs/agents/networks/multilayer_perceptron.py", line 26, in forward
    return self.network(inputs)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/modules/activation.py", line 499, in forward
    return F.elu(input, self.alpha, self.inplace)
  File "/home/momin/anaconda3/envs/rlgpu/lib/python3.7/site-packages/torch/nn/functional.py", line 1391, in elu
    result = torch._C._nn.elu(input, alpha)
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 10.75 GiB total capacity; 8.52 GiB already allocated; 31.38 MiB free; 8.60 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

real	0m36.206s
user	0m37.513s
sys	0m3.575s
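For what it's worth, one way to confirm it's graph accumulation rather than fragmentation is to log allocated GPU memory periodically during training; if it grows roughly linearly with the sample count, something is holding onto autograd graphs. A minimal sketch (where to hook this into the training loop is up to you):

import torch

def log_gpu_memory(num_samples: int) -> None:
    # torch.cuda.memory_allocated() reports bytes currently held by tensors;
    # steady growth across training suggests retained computation graphs.
    allocated_gb = torch.cuda.memory_allocated() / 1024**3
    reserved_gb = torch.cuda.memory_reserved() / 1024**3
    print(
        f"samples: {num_samples} - "
        f"allocated: {allocated_gb:.2f} GiB - reserved: {reserved_gb:.2f} GiB"
    )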
