deeprl's People

Contributors

duyunshu, gabrieledcjr, itsukara, miyosuda

deeprl's Issues

typo

File "/mnt/sdb1/NCBR_grywalizacja/kod/RL/DeepRL/pretrain/common/replay_memory/replay_memory.py", line 517, in _sample_by_indices
dtype=np.float3)
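
NumPy has no `float3` attribute, so the reported line fails as soon as it is reached. The sketch below only demonstrates that; using `np.float32` as the replacement is an assumption, since the issue shows just this fragment of `_sample_by_indices`.

```python
import numpy as np

# `np.float3` is not defined by NumPy, so looking it up raises AttributeError.
has_float3 = hasattr(np, "float3")

# Assumption: the intended dtype was float32 (the issue does not confirm this).
rewards = np.zeros(4, dtype=np.float32)

print(has_float3)
print(rewards.dtype)
```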

Test Score Too Low

When I run the pretraining for Ms. Pac-Man, the test of the pretrained network only reaches a score of 60 or 90, even though my human demo scores are around 5000. The log shows the demo scores being loaded correctly, and training reaches about 94% train/test accuracy. Shouldn't the test score reported by "classify_demo" be much higher than 60?

Here's the info it shows:

python3 pretrain/run_experiment.py --gym-env=MsPacmanNoFrameskip-v4 --cpu-only --classify-demo --use-mnih-2015 --train-max-steps=2 --batch-size=32 --grad-norm-clip=0.5 --demo-ids=1,2,3,4,5,6,7,8

Colocations handled automatically by placer.
2019-04-28 11:23:48,385 classify_demo INFO optimizer: RMSPropOptimizer
2019-04-28 11:23:48,385 classify_demo INFO learning_rate: 0.0007
2019-04-28 11:23:48,385 classify_demo INFO epsilon: 1e-05
2019-04-28 11:23:48,385 classify_demo INFO decay: 0.99
2019-04-28 11:23:48,386 classify_demo INFO train_max_steps: 10000
2019-04-28 11:23:48,386 classify_demo INFO batch_size: 32
2019-04-28 11:23:48,386 classify_demo INFO eval_freq: 5000
2019-04-28 11:23:48,386 classify_demo INFO use_onevsall: False
2019-04-28 11:23:48,386 classify_demo INFO sampling_type: None
2019-04-28 11:23:48,386 classify_demo INFO clip_norm: 0.5
2019-04-28 11:23:48,387 classify_demo INFO reward_constant: 2.0
2019-04-28 11:23:48,387 util INFO Loading data from memory
2019-04-28 11:23:48,387 util INFO memory_folder: collected_demo/MsPacmanNoFrameskip_v4
2019-04-28 11:23:48,387 util INFO demo_ids: 1,2,3,4,5,6,7,8
2019-04-28 11:23:48,388 replay_memory INFO Load memory from collected_demo/MsPacmanNoFrameskip_v4/data/desktop/20190427_194019...
2019-04-28 11:23:48,415 replay_memory INFO Load memory from collected_demo/MsPacmanNoFrameskip_v4/data/desktop/20190427_194110...
2019-04-28 11:23:48,438 replay_memory INFO Load memory from collected_demo/MsPacmanNoFrameskip_v4/data/desktop/20190427_194151...
2019-04-28 11:23:48,483 replay_memory INFO Load memory from collected_demo/MsPacmanNoFrameskip_v4/data/desktop/20190427_194324...
2019-04-28 11:23:48,531 replay_memory INFO Load memory from collected_demo/MsPacmanNoFrameskip_v4/data/desktop/20190427_194506...
2019-04-28 11:23:48,568 util INFO replay_buffers size: 5
2019-04-28 11:23:48,568 util INFO total_rewards: {1: 2650.0, 2: 940.0, 3: 3590.0, 4: 5780.0, 5: 4110.0}
2019-04-28 11:23:48,568 util INFO total_memory: 5068
2019-04-28 11:23:48,569 util INFO total_steps: 5068
2019-04-28 11:23:48,569 util INFO action_distribution: {0: 4339, 1: 161, 2: 149, 3: 257, 4: 158, 7: 1, 8: 3}
2019-04-28 11:23:48,569 util INFO Data loaded!

2019-04-28 11:29:42,943 classify_demo DEBUG i=10000 train_acc=0.9375 test_acc=0.9434 loss=0.2340 max_val=14.886861801147461

2019-04-28 11:31:01,645 classify_demo DEBUG test: trial=77 score=60.0 steps=316 total_steps=24231
2019-04-28 11:31:02,629 classify_demo DEBUG test: trial=78 score=60.0 steps=319 total_steps=24547
2019-04-28 11:31:03,136 classify_demo INFO test: final score=60.0 final steps=318 # trials=78
2019-04-28 11:31:03,137 classify_demo INFO Now saving data. Please wait
2019-04-28 11:31:03,661 classify_demo INFO Data saved!
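
One thing the log itself hints at is how unbalanced the demo actions are. The sketch below uses only the counts from the `action_distribution` line and computes the accuracy a classifier would get by always predicting the most frequent action. Whether the trained network actually behaves this way is not something the log confirms; this is only a baseline to compare the 94% figure against.

```python
# Action counts copied from the "action_distribution" log line above.
action_counts = {0: 4339, 1: 161, 2: 149, 3: 257, 4: 158, 7: 1, 8: 3}

total = sum(action_counts.values())

# The action that appears most often in the demonstrations.
majority_action = max(action_counts, key=action_counts.get)

# Accuracy of a classifier that ignores the frame and always picks that action.
baseline_acc = action_counts[majority_action] / total

print(f"total samples: {total}")
print(f"majority action: {majority_action}")
print(f"always-predict-majority accuracy: {baseline_acc:.4f}")
```

The total matches the `total_steps: 5068` line, and the baseline comes out at roughly 0.86. High classification accuracy on this data therefore does not by itself show that the network learned when to move.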

use-transfer path error

The supervised pre-training part works fine on my Ubuntu 16.04 server at
/home/ubuntu/newrl/DeepRL/
python3 pretrain/run_experiment.py --gym-env=MsPacmanNoFrameskip-v4 --cpu-only --classify-demo --use-mnih-2015 --train-max-steps=10000 --batch-size=32 --grad-norm-clip=0.5 --demo-ids=1,2,3,4,5,6,7,8

but when I then try to use the pre-trained network in A3C-TB like:
/home/ubuntu/newrl/DeepRL/
python3 a3c/run_experiment.py --gym-env=MsPacmanNoFrameskip-v4 --parallel-size=1 --max-time-step-fraction=1.0 --initial-learn-rate=0.0007 --rmsp-epsilon=1e-5 --grad-norm-clip=0.5 --use-mnih-2015 --unclipped-reward --transformed-bellman --use-transfer --append-experiment-num=1

it gives this error:

Traceback (most recent call last):
  File "a3c/run_experiment.py", line 159, in <module>
    main()
  File "a3c/run_experiment.py", line 155, in main
    run_a3c(args)
  File "/home/ubuntu/newrl/DeepRL/a3c/a3c.py", line 410, in run_a3c
    var_list=transfer_var_list,
  File "/home/ubuntu/newrl/DeepRL/a3c/game_ac_network.py", line 245, in load_transfer_model
    assert folder.is_dir()
AssertionError

I know "--use-transfer" is supposed to find the pretrained model automatically. Since that did not work, I also tried passing the path explicitly at the end of the command, but it gave the same error:
python3 a3c/run_experiment.py --gym-env=MsPacmanNoFrameskip-v4 --parallel-size=1 --max-time-step-fraction=1.0 --initial-learn-rate=0.0007 --rmsp-epsilon=1e-5 --grad-norm-clip=0.5 --use-mnih-2015 --unclipped-reward --transformed-bellman --use-transfer --append-experiment-num=1 --pretrained-model-folder='/home/ubuntu/newrl/DeepRL/results/pretrain_models/MsPacmanNoFrameskip_v4_mnih2015_l2beta1E-04_clipnorm5E-01/transfer_model/'
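
The traceback stops at `assert folder.is_dir()`, so whatever path `load_transfer_model` ends up checking is not a directory from the process's point of view. A minimal sketch for narrowing that down outside of DeepRL follows. The helper `first_missing_component` is hypothetical and not part of the repository; it walks a path from the root down and reports the first component that does not exist. The demo uses a temporary directory rather than the real results folder.

```python
import tempfile
from pathlib import Path


def first_missing_component(folder):
    """Return the shallowest component of `folder` that does not exist, or None."""
    path = Path(folder)
    # Check ancestors from the root downwards, then the path itself.
    for candidate in list(path.parents)[::-1] + [path]:
        if not candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Only "results" exists; the deeper folders are deliberately missing.
        (Path(root) / "results").mkdir()
        target = Path(root) / "results" / "pretrain_models" / "transfer_model"
        print(first_missing_component(target))
```

Running the same helper on the folder given to `--pretrained-model-folder` would show which part of the path is absent, for example if the run name on disk differs from the one typed on the command line.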
