
deep-briscola's Introduction

deep-briscola

TensorFlow deep reinforcement learning agent playing the Briscola card game.

What is Briscola?

This repository contains a Briscola game environment where different agents can play.

  • RandomAgent: chooses each move at random (a minimal sketch follows this list)
  • AIAgent: knows the rules and the strategies for winning the game
  • DeepAgent: an agent trained using deep reinforcement learning
  • HumanAgent: yourself
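
As a rough illustration, here is a minimal sketch of a random agent along these lines; the select_action(available_actions) method name matches the agent interface visible in the issue tracebacks further down, while everything else is an assumption rather than this repository's actual code.

    # Sketch only, not the repository's implementation: an agent that picks
    # uniformly among the cards it is currently allowed to play.
    import random

    class RandomAgent:
        def select_action(self, available_actions):
            return random.choice(available_actions)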

Dependencies

sudo apt-get update && sudo apt-get install -y \
  python-dev \
  python3-pip
  
sudo pip3 install \
  tensorflow \
  hyperopt \
  matplotlib \
  pandas

Alternatively, a Dockerfile with all the dependencies installed is provided in this repo. To use it:

Install Docker on Ubuntu

$ bash docker/build.sh
$ bash docker/run.sh

Usage

Train a model:
$ python3 train.py --network dqn --saved_model saved_model_dir

Play against the trained deep agent:
$ python3 human_vs_ai.py --network dqn --saved_model saved_model_dir

Play against the AIAgent:
$ python3 human_vs_ai.py

Features

Different networks implemented

Specify the network type using the --network command line argument:

  • Deep Q Network
  • Deep Recurrent Q Network
  • Synchronous Advantage Actor-Critic (A2C) (work in progress)

Self Play

Train multiple agents via self play using the self_train.py script.

$ python3 self_train.py --network dqn --saved_model saved_model_dir

Results

  • Training a Deep Q Network model for 75k epochs: achieved an 85% win rate against a random player.

  • Training a Deep Q Network model for 100k epochs using Self Play: achieved a 90% win rate against a random player.

deep-briscola's People

Contributors

alsora, michelangeloconserva


deep-briscola's Issues

check_end_game

Before merging self play into master, we must update the check_end_game function.

old one

    def check_end_game(self):
        ''' check if the game is ended'''
        return self.deck.end_deck

new one

    def check_end_game(self):
        ''' check if the game is ended: every card in hand has been played '''
        return len(self.players[0].hand) == 0

With the old check, the game was considered finished as soon as the deck ran out, three turns earlier than it should be: after the deck is empty each player still has three cards in hand to play.

Full of Errors

Hi there, interesting project. Thanks for the effort you put into this.

I'm not sure if you're still interested in maintaining it, but I am struggling quite a bit to use it.

I have followed your instructions in the README using the pip install (not docker). When I try to run the train command:

python3 train.py --network dqn --saved_model saved_model_dir

I run into numerous errors:

  1. There is no such option as --saved_model; I believe you meant --model_dir.
  2. I think you used TensorFlow 1, but that version is no longer available on pip and I don't know how else to get it on Ubuntu.

When I try to run it on the latest version of TensorFlow, I get an error (AttributeError: module 'tensorflow' has no attribute 'app') which I was able to solve in my fork by changing the tensorflow import to "import tensorflow.compat.v1 as tf": https://github.com/1337micro/deep-briscola
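
For reference, this is roughly what that compatibility shim looks like; a minimal sketch of the idea, not the exact change in the fork.

    # Replace "import tensorflow as tf" at the top of the affected modules.
    import tensorflow.compat.v1 as tf

    # Disabling v2 behavior keeps the TF1 graph/session code (including tf.app)
    # working under TensorFlow 2; it also turns off eager execution, which may
    # avoid the ResourceVariable Saver error shown at the end of the log below.
    tf.disable_v2_behavior()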

I then was able to run the command to start the training:
python3 train.py --network dqn --model_dir saved_model_dir

The training worked, but the output is full of errors and the saving step at the end failed. Here's the output:

bluezone@DESKTOP-7T8A4VL:~/deep-briscola$ python3 train.py --network dqn --model_dir mod
2023-01-31 23:13:29.049476: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-01-31 23:13:29.050890: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-01-31 23:13:30.155620: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-01-31 23:13:30.157471: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-01-31 23:13:30.157936: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2023-01-31 23:13:31.971426: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-01-31 23:13:31.973196: W tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: UNKNOWN ERROR (303)
2023-01-31 23:13:31.973899: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (DESKTOP-7T8A4VL): /proc/driver/nvidia/version does not exist
/home/bluezone/deep-briscola/networks/dqn.py:97: UserWarning: tf.layers.dense is deprecated and will be removed in a future version. Please use tf.keras.layers.Dense instead.
last_tensor = tf.layers.dense(last_tensor, layer_size, tf.nn.relu, kernel_initializer=w_initializer,
/home/bluezone/deep-briscola/networks/dqn.py:100: UserWarning: tf.layers.dense is deprecated and will be removed in a future version. Please use tf.keras.layers.Dense instead.
self.q = tf.layers.dense(last_tensor, self.n_actions, kernel_initializer=w_initializer,
/home/bluezone/deep-briscola/networks/dqn.py:107: UserWarning: tf.layers.dense is deprecated and will be removed in a future version. Please use tf.keras.layers.Dense instead.
last_tensor = tf.layers.dense(last_tensor, layer_size, tf.nn.relu, kernel_initializer=w_initializer,
/home/bluezone/deep-briscola/networks/dqn.py:110: UserWarning: tf.layers.dense is deprecated and will be removed in a future version. Please use tf.keras.layers.Dense instead.
self.q_next = tf.layers.dense(last_tensor, self.n_actions, kernel_initializer=w_initializer,
2023-01-31 23:13:32.357391: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
Epoch: 1000
Total wins: [299, 201]
QAgent 0 won 59.80% with average points 65.35
RandomAgent 1 won 40.20% with average points 54.65
Traceback (most recent call last):
  File "/home/bluezone/deep-briscola/train.py", line 99, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/bluezone/deep-briscola/train.py", line 60, in main
    train(game, agents, FLAGS.num_epochs, FLAGS.evaluate_every, FLAGS.num_evaluations, FLAGS.model_dir)
  File "/home/bluezone/deep-briscola/train.py", line 32, in train
    agents[0].save_model(model_dir)
  File "/home/bluezone/deep-briscola/agents/q_agent.py", line 135, in save_model
    self.q_learning.save_model(output_dir)
  File "/home/bluezone/deep-briscola/networks/base_network.py", line 36, in save_model
    self.saver.save(self.session, './' + output_dir + '/')
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/saver.py", line 1280, in save
    self._build_eager(
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/saver.py", line 946, in _build_eager
    self._build(
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/saver.py", line 971, in _build
    self.saver_def = self._builder._build_internal( # pylint: disable=protected-access
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/saver.py", line 514, in _build_internal
    saveables = saveable_object_util.validate_and_slice_inputs(
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/saving/saveable_object_util.py", line 371, in validate_and_slice_inputs
    for converted_saveable_object in saveable_objects_for_op(op, name):
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/training/saving/saveable_object_util.py", line 222, in saveable_objects_for_op
    raise ValueError("Can only save/restore ResourceVariables when "
ValueError: Can only save/restore ResourceVariables when executing eagerly, got type: <class 'tensorflow.python.framework.ops.Tensor'>.

Let me know if you're interested in helping me out. I can try to find a way to get tensorflow v1 in the meantime.

Copy agent broken

  File "<ipython-input-21-3398d2efe1fe>", line 5, in <module>
    best_total_wins = self_train(game, agent1, agent2,
                                    FLAGS.num_epochs,
                                    FLAGS.evaluate_every,
                                    FLAGS.num_evaluations,
                                    FLAGS.model_dir)

  File "<ipython-input-14-8bba47156315>", line 22, in self_train
    brisc.play_episode(game, agents)

  File "/home/torquato/Desktop/Deep_Briscola/environment.py", line 356, in play_episode
    action = agent.select_action(available_actions)

  File "/home/torquato/Desktop/Deep_Briscola/agents/q_agent.py", line 88, in select_action
    q = self.q_learning.get_q_table(self.state)

  File "/home/torquato/Desktop/Deep_Briscola/networks/drqn.py", line 177, in get_q_table
    q = self.session.run([q_op], feed_dict={states_op: input_state, self.events_length : 1})

  File "/home/torquato/miniconda3/envs/ProbProg/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)

  File "/home/torquato/miniconda3/envs/ProbProg/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1095, in _run
    'Cannot interpret feed_dict key as Tensor: ' + e.args[0])

TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", dtype=int32) is not an element of this graph.

The error occurs when the copy agent tries to pick an action in the self-play script on master.
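
One plausible way around this kind of error (a sketch against the tf.compat.v1 API; the class name, layer sizes and state size are illustrative assumptions, not this repository's code) is to give each agent's network its own graph and session, so that feed_dict keys are always looked up in the graph they were created in:

    # Sketch only: one tf.Graph and tf.Session per network, so placeholders fed
    # through feed_dict always belong to the graph being run.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    class IsolatedNetwork:
        def __init__(self, n_inputs=70, n_actions=3):  # sizes are assumptions
            self.graph = tf.Graph()
            with self.graph.as_default():
                self.states = tf.placeholder(tf.float32, [None, n_inputs], name="states")
                hidden = tf.layers.dense(self.states, 64, tf.nn.relu)
                self.q = tf.layers.dense(hidden, n_actions)
                init = tf.global_variables_initializer()
            self.session = tf.Session(graph=self.graph)
            self.session.run(init)

        def get_q_table(self, state):
            # self.states was created in self.graph, so this feed_dict key is
            # always an element of the graph the session runs.
            return self.session.run(self.q, feed_dict={self.states: state})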

Attempting to use uninitialized value eval_net_id2/e1_id2/bias

I'm trying to implement self play with DRQN agents. To do that I noticed that the names of the TensorFlow objects had to be changed, so now they are generated dynamically for the two different agents. Everything works except for

agents[0].save_model(model_dir)

line 110 of self_train.py, which is on the self_play branch.

The error given is:

Attempting to use uninitialized value eval_net_id2/e1_id2/bias

At least two variables have the same name: eval_net/e1/bias

I've made some changes to self play. Now the agent plays against an opponent randomly chosen from past copies of itself at different levels of training.
I implemented this by saving the agent's model at every evaluation step.
The problem is that when I try to save the current model and restore it as a copy:

    agent.save_model('cur_model_copy')
    new_old_agent = QAgent()
    new_old_agent.load_model('cur_model_copy')
    old_agents.append(new_old_agent)

ValueError: At least two variables have the same name: eval_net/e1/bias

It seems that when I call the save function on the agent, it tries to save the graph for both the agent and its copy. Since they have the same variable names, this error occurs.
Is there a way to store old copies of the agent?
Or should we create another type of agent which is just a frozen copy that can only choose actions but not update its weights?

I've also tried to use deepcopy to copy the agent, instead of saving the model, but it doesn't work.
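
One option (a sketch against the tf.compat.v1 API; the scope names, layer sizes and state size are illustrative assumptions, not this repository's code) is to build each copy under its own variable scope and save/restore it with a Saver restricted to that scope's variables, so duplicate names and the other agent's uninitialized variables never get involved:

    # Sketch only: one variable scope per agent copy, and a Saver that handles
    # only the variables created in that scope.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    def build_agent(scope_name, n_inputs=70, n_actions=3):  # sizes are assumptions
        with tf.variable_scope(scope_name):
            states = tf.placeholder(tf.float32, [None, n_inputs], name="states")
            hidden = tf.layers.dense(states, 64, tf.nn.relu, name="e1")
            q = tf.layers.dense(hidden, n_actions, name="q")
        scope_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=scope_name)
        saver = tf.train.Saver(var_list=scope_vars)  # sees only this agent's variables
        return states, q, saver

    states_a, q_a, saver_a = build_agent("agent_main")
    states_b, q_b, saver_b = build_agent("agent_copy")

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Saving agent_main never touches agent_copy's variables, so there is no
        # duplicate-name error and no attempt to read uninitialized values.
        saver_a.save(sess, "./cur_model_main")
        saver_a.restore(sess, "./cur_model_main")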
