This repository collects the state-of-the-art literature I am reviewing on Reinforcement Learning.
- Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto. MIT Press, 1st edition 1998; 2nd edition 2017 (in progress).
- Algorithms for Reinforcement Learning, Csaba Szepesvari, 2009
- Multi-Robot Formation Control Using Reinforcement Learning. Abhay Rawat, and Kamalakar Karlapalem. arXiv:2001.04527, submitted January 2020.
- Reinforcement Learning of Control Policy for Linear Temporal Logic Specifications Using Limit-Deterministic Büchi Automata. Ryohei Oura, Ami Sakakibara, and Toshimitsu Ushio. arXiv:2001.04669, submitted January 2020.
- Inducing Cooperation in Multi-Agent Games Through Status-Quo Loss. Pinkesh Badjatiya, Mausoom Sarkar, Abhishek Sinha, Siddharth Singh, Nikaash Puri, and Balaji Krishnamurthy. arXiv:2001.05458, submitted January 2020.
- PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning. Dor Livne, and Kobi Cohen. arXiv:2001.05012, submitted January 2020.
- Continuous-action Reinforcement Learning for Playing Racing Games: Comparing SPG to PPO. Mario S. Holubar, and Marco A. Wiering. arXiv:2001.05270, submitted January 2020.
- Lipschitz Lifelong Reinforcement Learning. Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, and Michael L. Littman. arXiv:2001.05411, submitted January 2020.
- Robotic Grasp Manipulation Using Evolutionary Computing and Deep Reinforcement Learning. Priya Shukla, Hitesh Kumar, and G. C. Nandi. arXiv:2001.05443, submitted January 2020.
- Reward Engineering for Object Pick and Place Training. Raghav Nagpal, Achyuthan Unni Krishnan, and Hanshen Yu. arXiv:2001.03792, submitted January 2020.
- Sparse Black-box Video Attack with Reinforcement Learning. Huanqian Yan, Xingxing Wei, and Bo Li. arXiv:2001.03754, submitted January 2020.
- Learning to drive via Apprenticeship Learning and Deep Reinforcement Learning. Wenhui Huang, Francesco Braghin, and Zhuo Wang. arXiv:2001.03864, submitted January 2020.
- Addressing Value Estimation Errors in Reinforcement Learning with a State-Action Return Distribution Function. Jingliang Duan, Yang Guan, Yangang Ren, Shengbo Eben Li, and Bo Cheng. arXiv:2001.02811, submitted January 2020.
- Population-Guided Parallel Policy Search for Reinforcement Learning. Whiyoung Jung, Giseung Park, and Youngchul Sung. arXiv:2001.02907, submitted January 2020.
- Deep Interactive Reinforcement Learning for Path Following of Autonomous Underwater Vehicle. Qilei Zhang, Jinying Lin, Qixin Sha, Bo He, and Guangliang Li. arXiv:2001.03359, submitted January 2020.
- Information Theoretic Model Predictive Q-Learning. Mohak Bhardwaj, Ankur Handa, Dieter Fox, and Byron Boots. arXiv:2001.02153, submitted January 2020.
- Deep Reinforcement Learning for Active Human Pose Estimation. Erik Gärtner, Aleksis Pirinen, and Cristian Sminchisescu. arXiv:2001.02024, submitted January 2020.
- Hierarchical Reinforcement Learning as a Model of Human Task Interleaving. Christoph Gebhardt, Antti Oulasvirta, and Otmar Hilliges. arXiv:2001.02122, submitted January 2020.
- Optimal Options for Multi-Task Reinforcement Learning Under Time Constraints. Manuel Del Verme, Bruno Castro da Silva, and Gianluca Baldassarre. arXiv:2001.01620, submitted January 2020.
- High-speed Autonomous Drifting with Deep Reinforcement Learning. Peide Cai, Xiaodong Mei, Lei Tai, Yuxiang Sun, and Ming Liu. arXiv:2001.01377, submitted January 2020.
- Represented Value Function Approach for Large Scale Multi Agent Reinforcement Learning. Weiya Ren. arXiv:2001.01096, submitted January 2020.
- Learning Reusable Options for Multi-Task Reinforcement Learning. Francisco M. Garcia, Chris Nota, and Philip S. Thomas. arXiv:2001.01577, submitted January 2020.
- Intelligent Roundabout Insertion using Deep Reinforcement Learning. Alessandro Paolo Capasso, Giulio Bacchiani, and Daniele Molinari. arXiv:2001.00786, submitted January 2020.
- Making Sense of Reinforcement Learning and Probabilistic Inference. Brendan O'Donoghue, Ian Osband, and Catalin Ionescu. arXiv:2001.00805, submitted January 2020.
- Fairness in Multi-agent Reinforcement Learning for Stock Trading. Wenhang Bao. arXiv:2001.00918, submitted January 2020.
- Zero-Shot Reinforcement Learning with Deep Attention Convolutional Neural Networks. Sahika Genc, Sunil Mallya, Sravan Bodapati, Tao Sun, and Yunzhe Tao. arXiv:2001.00605, submitted January 2020.
- Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics. Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe, Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli, Nicolas Heess, and Martin Riedmiller. arXiv:2001.00449, submitted January 2020.
- Uncertainty-Based Out-of-Distribution Classification in Deep Reinforcement Learning. Andreas Sedlmeier, Thomas Gabor, Thomy Phan, Lenz Belzner, and Claudia Linnhoff-Popien. arXiv:2001.00496, submitted January 2020.
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies. Sungryull Sohn, Hyunjae Woo, Jongwook Choi, and Honglak Lee. arXiv:2001.00248, submitted January 2020.
- Deep Reinforced Self-Attention Masks for Abstractive Summarization (DR.SAS). Ankit Chadha, and Mohamed Masoud. arXiv:2001.00009, submitted January 2020.
- Reinforcement Learning with Goal-Distance Gradient. Kai Jiang, and XiaoLong Qin. arXiv:2001.00127, submitted January 2020.
- Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards. Xingyu Lu, Stas Tiomkin, and Pieter Abbeel. arXiv:1912.13414, submitted December 2019.
- A New Framework for Query Efficient Active Imitation Learning. Daniel Hsu. arXiv:1912.13037, submitted December 2019.
- Augmented Replay Memory in Reinforcement Learning With Continuous Control. Mirza Ramicic, and Andrea Bonarini. arXiv:1912.12719, submitted December 2019.
- Real-time Policy Distillation in Deep Reinforcement Learning. Yuxiang Sun, and Pooyan Fazli. arXiv:1912.12630, submitted December 2019.
- Speeding up reinforcement learning by combining attention and agency features. Berkay Demirel, and Martí Sánchez-Fibla. arXiv:1912.12623, submitted December 2019.
- SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning. Keng Wah Loon, Laura Graesser, and Milan Cvitkovic. arXiv:1912.12482, submitted December 2019.
- Weak Supervision for Fake News Detection via Reinforcement Learning. Yaqing Wang, Weifeng Yang, Fenglong Ma, Jin Xu, Bin Zhong, Qiang Deng, and Jing Gao. arXiv:1912.12520, submitted December 2019.
- Individual specialization in multi-task environments with multiagent reinforcement learners. Marco Jerome Gasparrini, Ricard Solé, and Martí Sánchez-Fibla. arXiv:1912.12671, submitted December 2019.
- Loss aversion fosters coordination among independent reinforcement learners. Marco Jerome Gasparrini, and Martí Sánchez-Fibla. arXiv:1912.12633, submitted December 2019.
- Crowdfunding Dynamics Tracking: A Reinforcement Learning Approach. Jun Wang, Hefu Zhang, Qi Liu, Zhen Pan, and Hanqing Tao. arXiv:1912.12016, submitted December 2019.
- Quasi-Newton Trust Region Policy Optimization. Devesh Jha, Arvind Raghunathan, and Diego Romeres. arXiv:1912.11912, submitted December 2019.
- Pseudo Random Number Generation: a Reinforcement Learning approach. Luca Pasqualini, and Maurizio Parton. arXiv:1912.11531, submitted December 2019.
- Defining AI in Policy versus Practice. P. M. Krafft, Meg Young, Michael Katell, Karen Huang, and Ghislain Bugingo. arXiv:1912.11095, submitted December 2019.
- Discrete and Continuous Action Representation for Practical RL in Video Games. Olivier Delalleau, Maxim Peter, Eloi Alonso, and Adrien Logut. arXiv:1912.11077, submitted December 2019.
- Monte-Carlo Tree Search for Policy Optimization. Xiaobai Ma, Katherine Driggs-Campbell, Zongzhang Zhang, and Mykel J. Kochenderfer. arXiv:1912.10648, submitted December 2019.
- Direct and indirect reinforcement learning. Yang Guan, Shengbo Eben Li, Jingliang Duan, Jie Li, Yangang Ren, and Bo Cheng. arXiv:1912.10600, submitted December 2019.
- Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning. Tian Tan, Zhihan Xiong, and Vikranth R. Dwaracherla. arXiv:1912.10577, submitted December 2019.
- Optimizing Collision Avoidance in Dense Airspace using Deep Reinforcement Learning. Sheng Li, Maxim Egorov, and Mykel Kochenderfer. arXiv:1912.1014, submitted December 2019.
- Teaching robots to perceive time -- A reinforcement learning approach (Extended version). Inês Lourenço, Bo Wahlberg, and Rodrigo Ventura. arXiv:1912.10113, submitted December 2019.
- Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning. Richard Li, Allan Jabri, Trevor Darrell, and Pulkit Agrawal. arXiv:1912.11032, submitted December 2019.
- A Survey of Deep Reinforcement Learning in Video Games. Kun Shao, Zhentao Tang, Yuanheng Zhu, Nannan Li, and Dongbin Zhao. arXiv:1912.10944, submitted December 2019.
- Does AlphaGo actually play Go? Concerning the State Space of Artificial Intelligence. Holger Lyre. arXiv:1912.10005, submitted December 2019.
- Mastering Complex Control in MOBA Games with Deep Reinforcement Learning. Deheng Ye, Zhao Liu, Mingfei Sun, Bei Shi, Peilin Zhao, Hao Wu, Hongsheng Yu, Shaojie Yang, Xipeng Wu, Qingwei Guo, Qiaobo Chen, Yinyuting Yin, Hao Zhang, Tengfei Shi, Liang Wang, Qiang Fu, Wei Yang, and Lanxiao Huang. arXiv:1912.09729, submitted December 2019.
- Extendable NFV-Integrated Control Method Using Reinforcement Learning. Akito Suzuki, Ryoichi Kawahara, Masahiro Kobayashi, Shigeaki Harada, Yousuke Takahashi, and Keisuke Ishibashi. arXiv:1912.09022, submitted December 2019.
- Deep Reinforcement Learning for Motion Planning of Mobile Robots. Leonid Butyrev, Thorsten Edelhäußer, and Christopher Mutschler. arXiv:1912.09260, submitted December 2019.
- Interestingness Elements for Explainable Reinforcement Learning: Understanding Agents' Capabilities and Limitations. Pedro Sequeira, and Melinda Gervasio. arXiv:1912.09007, submitted December 2019.
- Deep Reinforcement Learning Designed RF Pulse: DeepRFSLR. Dongmyung Shin, Sooyeon Ji, Doohee Lee, Jieun Lee, Se-Hong Oh, and Jongho Lee. arXiv:1912.09015, submitted December 2019.
- Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep RL. Akanksha Atrey, Kaleigh Clary, and David Jensen. arXiv:1912.05743, submitted December 2019.
- Text as Environment: A Deep Reinforcement Learning Text Readability Assessment Model. Hamid Mohammadi, and Seyed Hossein Khasteh. arXiv:1912.05957, submitted December 2019.
- Online Deep Reinforcement Learning for Autonomous UAV Navigation and Exploration of Outdoor Environments. Bruna G. Maciel-Pearson, Letizia Marchegiani, Samet Akcay, Amir Atapour-Abarghouei, James Garforth, and Toby P. Breckon. arXiv:1912.05684, submitted December 2019.
- Biases for Emergent Communication in Multi-agent Reinforcement Learning. Tom Eccles, Yoram Bachrach, Guy Lever, Angeliki Lazaridou, and Thore Graepel. arXiv:1912.05676, submitted December 2019.
- Measuring the Reliability of Reinforcement Learning Algorithms. Stephanie C.Y. Chan, Sam Fishman, John Canny, Anoop Korattikara, and Sergio Guadarrama. arXiv:1912.05663, submitted December 2019.
- Learning To Reach Goals Without Reinforcement Learning. Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devin, Benjamin Eysenbach, and Sergey Levine. arXiv:1912.06088, submitted December 2019.
- The PlayStation Reinforcement Learning Environment (PSXLE). Carlos Purves, Cătălina Cangea, and Petar Veličković. arXiv:1912.06101, submitted December 2019.
- Control-Tutored Reinforcement Learning. Francesco De Lellis, Fabrizia Auletta, Giovanni Russo, Piero De Lellis, and Mario di Bernardo. arXiv:1912.06085, submitted December 2019.
- Recruitment-imitation Mechanism for Evolutionary Reinforcement Learning. Shuai Lü, Shuai Han, Wenbo Zhou, and Junwei Zhang. arXiv:1912.06310, submitted December 2019.
- Spatial Influence-aware Reinforcement Learning for Intelligent Transportation System. Wenhang Bao, and Xiao-yang Liu. arXiv:1912.06880, submitted December 2019.
- Long-Term Planning and Situational Awareness in OpenAI Five. Jonathan Raiman, Susan Zhang, and Filip Wolski. arXiv:1912.06721, submitted December 2019.
- PixelRL: Fully Convolutional Network with Reinforcement Learning for Image Processing. Ryosuke Furuta, Naoto Inoue, and Toshihiko Yamasaki. arXiv:1912.07190, submitted December 2019.
- Coordination in Adversarial Sequential Team Games via Multi-Agent Deep Reinforcement Learning. Andrea Celli, Marco Ciccone, Raffaele Bongo, and Nicola Gatti. arXiv:1912.07712, submitted December 2019.
- UNAS: Differentiable Architecture Search Meets Reinforcement Learning. Arash Vahdat, Arun Mallya, Ming-Yu Liu, and Jan Kautz. arXiv:1912.07651, submitted December 2019.
- Taming an autonomous surface vehicle for path following and collision avoidance using deep reinforcement learning. Eivind Meyer, Haakon Robinson, Adil Rasheed, and Omer San. arXiv:1912.08578, submitted December 2019.
- Hierarchical Deep Q-Network with Forgetting from Imperfect Demonstrations in Minecraft. Alexey Skrynnik, Aleksey Staroverov, Ermek Aitygulov, Kirill Aksenov, Vasilii Davydov, and Aleksandr I. Panov. arXiv:1912.08664, submitted December 2019.
- Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation. Tianhong Dai, Kai Arulkumaran, Samyakh Tukra, Feryal Behbahani, and Anil Anthony Bharath. arXiv:1912.08324, submitted December 2019.
- Learning to grow: control of materials self-assembly using evolutionary reinforcement learning. Stephen Whitelam, and Isaac Tamblyn. arXiv:1912.08333, submitted December 2019.
- Quality of syntactic implication of RL-based sentence summarization. Hoa T. Le, Christophe Cerisara, and Claire Gardent. arXiv:1912.05493, submitted December 2019.
- Energy-aware Scheduling of Jobs in Heterogeneous Cluster Systems Using Deep Reinforcement Learning. Amirhossein Esmaili, and Massoud Pedram. arXiv:1912.05160, submitted December 2019.
- Efficient Robotic Task Generalization Using Deep Model Fusion Reinforcement Learning. Tianying Wang, Hao Zhang, Wei Qi Toh, Hongyuan Zhu, Cheston Tan, Yan Wu, Yong Liu, and Wei Jing. arXiv:1912.05205, submitted December 2019.
- Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods. Riashat Islam, Raihan Seraj, Pierre-Luc Bacon, and Doina Precup. arXiv:1912.05104, submitted December 2019.
- Flow Rate Control in Smart District Heating Systems Using Deep Reinforcement Learning. Tinghao Zhang, Jing Luo, Ping Chen, and Jie Liu. arXiv:1912.05313, submitted December 2019.
- Efficient and Robust Reinforcement Learning with Uncertainty-based Value Expansion. Bo Zhou, Hongsheng Zeng, Fan Wang, Yunxiang Li, and Hao Tian. arXiv:1912.05328, submitted December 2019.
- Value-of-Information based Arbitration between Model-based and Model-free Control. Krishn Bera, Yash Mandilwar, and Bapi Raju. arXiv:1912.05453, submitted December 2019.
- SMiRL: Surprise Minimizing RL in Dynamic Environments. Glen Berseth, Daniel Geng, Coline Devin, Chelsea Finn, Dinesh Jayaraman, and Sergey Levine. arXiv:1912.05510, submitted December 2019.
- Surface Following using Deep Reinforcement Learning and a GelSight Tactile Sensor. Chen Lu, Jing Wang, and Shan Luo. arXiv:1912.00745, submitted December 2019.
- Automated curriculum generation for Policy Gradients from Demonstrations. Anirudh Srinivasan, Dzmitry Bahdanau, Maxime Chevalier-Boisvert, and Yoshua Bengio. arXiv:1912.00444, submitted December 2019.
- Optimization for Reinforcement Learning: From Single Agent to Cooperative Agents. Donghwan Lee, Niao He, Parameswaran Kamalaruban, and Volkan Cevher. arXiv:1912.00498, submitted December 2019.
- Multi-Agent Deep Reinforcement Learning with Adaptive Policies. Yixiang Wang, and Feng Wu. arXiv:1912.00949, submitted December 2019.
- Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning. Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, and Zhen Xiao. arXiv:1912.01160, submitted December 2019.
- SafeLife 1.0: Exploring Side Effects in Complex Environments. Carroll L. Wainwright, and Peter Eckersley. arXiv:1912.01217, submitted December 2019.
- On-policy Reinforcement Learning with Entropy Regularization. Jingbin Liu, Xinyang Gu, Dexiang Zhang, and Shuai Liu. arXiv:1912.01557, submitted December 2019.
- AlgaeDICE: Policy Gradient from Arbitrary Experience. Ofir Nachum, Bo Dai, Ilya Kostrikov, Yinlam Chow, Lihong Li, and Dale Schuurmans. arXiv:1912.02074, submitted December 2019.
- Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning. Hengyuan Hu, and Jakob N Foerster. arXiv:1912.02288, submitted December 2019.
- Improving Policies via Search in Cooperative Partially Observable Games. Adam Lerer, Hengyuan Hu, Jakob Foerster, and Noam Brown. arXiv:1912.02318, submitted December 2019.
- Inter-Level Cooperation in Hierarchical Reinforcement Learning. Abdul Rahman Kreidieh, Samyak Parajuli, Nathan Lichtle, Yiling You, Rayyan Nasr, and Alexandre M. Bayen. arXiv:1912.02368, submitted December 2019.
- Reinforcement Learning with Non-Markovian Rewards. Maor Gaon, and Ronen I. Brafman. arXiv:1912.02552, submitted December 2019.
- Cooperative Reasoning on Knowledge Graph and Corpus: A Multi-agent Reinforcement Learning Approach. Yunan Zhang, Xiang Cheng, Heting Gao, and Chengxiang Zhai. arXiv:1912.02206, submitted December 2019.
- Iterative Policy-Space Expansion in Reinforcement Learning. Jan Malte Lichtenberg, and Özgür Şimşek. arXiv:1912.02532, submitted December 2019.
- Reinforcement Learning Upside Down: Don't Predict Rewards -- Just Map Them to Actions. Juergen Schmidhuber. arXiv:1912.02875, submitted December 2019.
- Training Agents using Upside-Down Reinforcement Learning. Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski, and Jürgen Schmidhuber. arXiv:1912.02877, submitted December 2019.
- Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems. Guannan Qu, Adam Wierman, and Na Li. arXiv:1912.02906, submitted December 2019.
- Observational Overfitting in Reinforcement Learning. Xingyou Song, Yiding Jiang, Yilun Du, and Behnam Neyshabur. arXiv:1912.02975, submitted December 2019.
- A pedestrian path-planning model in accordance with obstacle's danger with reinforcement learning. Thanh-Trung Trinh, Dinh-Minh Vu, and Masaomi Kimura. arXiv:1912.02945, submitted December 2019.
- From Reinforcement Learning to Optimal Control: A unified framework for sequential decisions. Warren B Powell. arXiv:1912.03513, submitted December 2019.
- Intelligent Coordination among Multiple Traffic Intersections Using Multi-Agent Reinforcement Learning. Ujwal Padam Tewari, Vishal Bidawatka, Varsha Raveendran, and Vinay Sudhakaran. arXiv:1912.03851, submitted December 2019.
- Transformer Based Reinforcement Learning For Games. Uddeshya Upadhyay, Nikunj Shah, Sucheta Ravikanti, and Mayanka Medhe. arXiv:1912.03918, submitted December 2019.
- Reinforcement Learning with Convolutional Reservoir Computing. Hanten Chang, and Katsuya Futagami. arXiv:1912.04161, submitted December 2019.
- Unsupervised Curricula for Visual Meta-Reinforcement Learning. Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, and Chelsea Finn. arXiv:1912.04226, submitted December 2019.
- Efficient Object Detection in Large Images using Deep Reinforcement Learning. Burak Uzkent, Christopher Yeh, and Stefano Ermon. arXiv:1912.03966, submitted December 2019.
- Deep Reinforcement Learning for Routing a Heterogeneous Fleet of Vehicles. Jose Manuel Vera, and Andres G. Abad. arXiv:1912.03341, submitted December 2019.
- Effects of a Social Force Model reward in Robot Navigation based on Deep Reinforcement Learning. Óscar Gil Viyuela, and Alberto Sanfeliu. arXiv:1912.03747, submitted December 2019.
- Reinforcement Learning based Visual Navigation with Information-Theoretic Regularization. Qiaoyun Wu, Kai Xu, Jun Wang, Mingliang Xu, and Dinesh Manocha. arXiv:1912.04078, submitted December 2019.
- Decentralized Multi-Agent Reinforcement Learning with Networked Agents: Recent Advances. Kaiqing Zhang, Zhuoran Yang, and Tamer Başar. arXiv:1912.03821, submitted December 2019.
- ChainerRL: A Deep Reinforcement Learning Library. Yasuhiro Fujita, Toshiki Kataoka, Prabhat Nagarajan, and Takahiro Ishikawa. arXiv:1912.03905, submitted December 2019.
- CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning. Jiachen Yang, Alireza Nakhaei, David Isele, Kikuo Fujimura, and Hongyuan Zha. arXiv:1809.05188, submitted September 2018.
- Integrating independent and centralized multi-agent reinforcement learning for traffic signal network optimization. Zhi Zhang, Jiachen Yang, and Hongyuan Zha. arXiv:1909.10651, submitted September 2019.
- Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery. Jiachen Yang, Igor Borovikov, and Hongyuan Zha. arXiv:1912.03558, submitted December 2019.
- ColosseumRL: A Framework for Multiagent Reinforcement Learning in N-Player Games. Alexander Shmakov, John Lanier, Stephen McAleer, Rohan Achar, Cristina Lopes, and Pierre Baldi. arXiv:1912.04451, submitted December 2019.
- Combined Model for Partially-Observable and Non-Observable Task Switching: Solving Hierarchical Reinforcement Learning Problems. Nibraas Khan, and Joshua Phillips. arXiv:1911.10425, submitted November 2019.
- Induction of Subgoal Automata for Reinforcement Learning. Daniel Furelos-Blanco, Mark Law, Alessandra Russo, Krysia Broda, and Anders Jonsson. arXiv:1911.13152, submitted November 2019.
- Class Teaching for Inverse Reinforcement Learners. Manuel Lopes, and Francisco Melo. arXiv:1911.13009, submitted November 2019.
- Simulation-based reinforcement learning for real-world autonomous driving. Błażej Osiński, Adam Jakubowski, Piotr Miłoś, Paweł Zięcina, Christopher Galias, and Henryk Michalewski. arXiv:1911.12905, submitted November 2019.
- Augmented Random Search for Quadcopter Control: An alternative to Reinforcement Learning. Ashutosh Kumar Tiwari, and Sandeep Varma Nadimpalli. arXiv:1911.12553, submitted November 2019.
- Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation. Dmitry Akimov. arXiv:1911.13056, submitted November 2019.
- Playing Games in the Dark: An approach for cross-modality transfer in reinforcement learning. Rui Silva, Miguel Vasco, Francisco S. Melo, Ana Paiva, and Manuela Veloso. arXiv:1911.12851, submitted November 2019.
- Option-critic in cooperative multi-agent systems. Jhelum Chakravorty, Nadeem Ward, Julien Roy, Maxime Chevalier-Boisvert, Sumana Basu, Andrei Lupu, and Doina Precup. arXiv:1911.12825, submitted November 2019.
- Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction. Vishal Jain, William Fedus, Hugo Larochelle, Doina Precup, and Marc G. Bellemare. arXiv:1911.12511, submitted November 2019.
- Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration. Xu Xing, Li Rongpeng, Zhao Zhifeng, and Zhang Honggang. arXiv:1911.12504, submitted November 2019.
- Which Channel to Ask My Question? Personalized Customer Service Request Stream Routing using Deep Reinforcement Learning. Zining Liu, Chong Long, Xiaolu Lu, Zehong Hu, Jie Zhang, and Yafang Wang. arXiv:1911.10521, submitted November 2019.
- Iteratively-Refined Interactive 3D Medical Image Segmentation with Multi-Agent Reinforcement Learning. Xuan Liao, Wenhao Li, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Ya Zhang, and Yanfeng Wang. arXiv:1911.10334, submitted November 2019.
- Mitigate Bias in Face Recognition using Skewness-Aware Reinforcement Learning. Mei Wang, and Weihong Deng. arXiv:1911.10692, submitted November 2019.
- DeepSynth: Program Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning. Mohammadhosein Hasanbeig, Natasha Yogananda Jeppu, Alessandro Abate, Tom Melham, and Daniel Kroening. arXiv:1911.10244, submitted November 2019.
- Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms. Kaiqing Zhang, Zhuoran Yang, and Tamer Başar. arXiv:1911.10635, submitted November 2019.
- ORL: Reinforcement Learning Benchmarks for Online Stochastic Optimization Problems. Bharathan Balaji, Jordan Bell-Masterson, Enes Bilgin, Andreas Damianou, Pablo Moreno Garcia, Arpit Jain, Runfei Luo, Alvaro Maggiar, Balakrishnan Narayanaswamy, and Chun Ye. arXiv:1911.10641, submitted November 2019.
- A Deep Reinforcement Learning Architecture for Multi-stage Optimal Control. Yuguang Yang. arXiv:1911.10684, submitted November 2019.
- End-to-End Model-Free Reinforcement Learning for Urban Driving using Implicit Affordances. Marin Toromanoff, Emilie Wirbel, and Fabien Moutarde. arXiv:1911.10868, submitted November 2019.
- Natural Language Generation Using Reinforcement Learning with External Rewards. Vidhushini Srinivasan, Sashank Santhanam, and Samira Shaikh. arXiv:1911.11404, submitted November 2019.
- Multi-Vehicle Mixed-Reality Reinforcement Learning for Autonomous Multi-Lane Driving. Rupert Mitchell, Jenny Fletcher, Jacopo Panerati, and Amanda Prorok. arXiv:1911.11699, submitted November 2019.
- Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem. John Holler, Risto Vuorio, Zhiwei Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, and Jieping Ye. arXiv:1911.11260, submitted November 2019.
- Behavior Regularized Offline Reinforcement Learning. Yifan Wu, George Tucker, and Ofir Nachum. arXiv:1911.11361, submitted November 2019.
- Control-Tutored Reinforcement Learning: an application to the Herding Problem. Francesco De Lellis, Fabrizia Auletta, Giovanni Russo, and Mario di Bernardo. arXiv:1911.11444, submitted November 2019.
- From Persistent Homology to Reinforcement Learning with Applications for Retail Banking. Jeremy Charlier. arXiv:1911.11573, submitted November 2019.
- Join Query Optimization with Deep Reinforcement Learning Algorithms. Jonas Heitz, and Kurt Stockinger. arXiv:1911.11689, submitted November 2019.
- A selected review on reinforcement learning based control for autonomous underwater vehicles. Yachu Hsu, Hui Wu, Keyou You, and Shiji Song. arXiv:1911.11991, submitted November 2019.
- Improving Fictitious Play Reinforcement Learning with Expanding Models. Rong-Jun Qin, Jing-Cheng Pang, and Yang Yu. arXiv:1911.11928, submitted November 2019.
- Deep Reinforcement Learning based Adaptive Moving Target Defense. Taha Eghtesad, Yevgeniy Vorobeychik, and Aron Laszka. arXiv:1911.11972, submitted November 2019.
- Towards Similarity Graphs Constructed by Deep Reinforcement Learning. Dmitry Baranchuk, and Artem Babenko. arXiv:1911.12122, submitted November 2019.
- Information-Theoretic Confidence Bounds for Reinforcement Learning. Xiuyuan Lu, and Benjamin Van Roy. arXiv:1911.09724, submitted November 2019.
- Efficient Exploration through Intrinsic Motivation Learning for Unsupervised Subgoal Discovery in Model-Free Hierarchical Reinforcement Learning. Jacob Rafati, and David C. Noelle. arXiv:1911.10164, submitted November 2019.
- Contextual Reinforcement Learning of Visuo-tactile Multi-fingered Grasping Policies. Visak Kumar, Tucker Herman, Dieter Fox, Stan Birchfield, and Jonathan Tremblay. arXiv:1911.09233, submitted November 2019.
- Accelerating Reinforcement Learning with Suboptimal Guidance. Eivind Bøhn, Signe Moe, and Tor Arne Johansen. arXiv:1911.09391, submitted November 2019.
- Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means. Andrea Agostinelli, Kai Arulkumaran, Marta Sarrico, Pierre Richemond, and Anil Anthony Bharath. arXiv:1911.09560, submitted November 2019.
- Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control. Marta Sarrico, Kai Arulkumaran, Andrea Agostinelli, Pierre Richemond, and Anil Anthony Bharath. arXiv:1911.09615, submitted November 2019.
- Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning. Xiang Ni, Jing Li, Mo Yu, Wang Zhou, and Kun-Lung Wu. arXiv:1911.08517, submitted November 2019.
- Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning. Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, and Milind Tambe. arXiv:1911.08799, submitted November 2019.
- Efficient decorrelation of features using Gramian in Reinforcement Learning. Borislav Mavrin, Daniel Graves, and Alan Chan. arXiv:1911.08610, submitted November 2019.
- Corruption Robust Exploration in Episodic Reinforcement Learning. Thodoris Lykouris, Max Simchowitz, Aleksandrs Slivkins, and Wen Sun. arXiv:1911.08689, submitted November 2019.
- Deep Reinforcement Learning with Explicitly Represented Knowledge and Variable State and Action Spaces. Jaromír Janisch, Tomáš Pevný, and Viliam Lisý. arXiv:1911.08756, submitted November 2019.
- Hierarchical Average Reward Policy Gradient Algorithms. Akshay Dharmavaram, Matthew Riemer, and Shalabh Bhatnagar. arXiv:1911.08826, submitted November 2019.
- Adversarial Inverse Reinforcement Learning for Decision Making in Autonomous Driving. Pin Wang, Dapeng Liu, Jiayu Chen, and Ching-Yao Chan. arXiv:1911.08044, submitted November 2019.
- Attention Privileged Reinforcement Learning For Domain Transfer. Sasha Salter, Dushyant Rao, Markus Wulfmeier, Raia Hadsell, and Ingmar Posner. arXiv:1911.08363, submitted November 2019.
- Deep Tile Coder: an Efficient Sparse Representation Learning Approach with applications in Reinforcement Learning. Yangchen Pan. arXiv:1911.08068, submitted November 2019.
- Placement Optimization of Aerial Base Stations with Deep Reinforcement Learning. Jin Qiu, Jiangbin Lyu, and Liqun Fu. arXiv:1911.08111, submitted November 2019.
- Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning. Runsheng Yu, Zhenyu Shi, Xinrun Wang, Rundong Wang, Buhong Liu, Xinwen Hou, Hanjiang Lai, and Bo An. arXiv:1911.07712, submitted November 2019.
- Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation. Juncheng Li, Xin Wang, Siliang Tang, Haizhou Shi, Fei Wu, Yueting Zhuang, and William Yang Wang. arXiv:1911.07450, submitted November 2019.
- Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient. Kevin Sebastian Luck, Mel Vecerik, Simon Stepputtis, Heni Ben Amor, and Jonathan Scholz. arXiv:1911.06833, submitted November 2019.
- Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning. Cameron Voloshin, Hoang M. Le, Nan Jiang, and Yisong Yue. arXiv:1911.06854, submitted November 2019.
- Inverse Reinforcement Learning with Missing Data. Tien Mai, Quoc Phong Nguyen, Kian Hsiang Low, and Patrick Jaillet. arXiv:1911.06930, submitted November 2019.
- Off-Policy Policy Gradient Algorithms by Constraining the State Distribution Shift. Riashat Islam, Komal K. Teru, and Deepak Sharma. arXiv:1911.06970, submitted November 2019.
- Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance. Mingxuan Jing, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Chao Yang, Bin Fang, and Huaping Liu. arXiv:1911.07109, submitted November 2019.
- Automated Augmentation with Reinforcement Learning and GANs for Robust Identification of Traffic Signs using Front Camera Images. Sohini Roy Chowdhury, Lars Tornberg, Robin Halvfordsson, Jonatan Nordh, Adam Suhren Gustafsson, Joel Wall, Mattias Westerberg, Adam Wirehed, Louis Tilloy, Zhanying Hu, Haoyuan Tan, Meng Pan, and Jonas Sjoberg. arXiv:1911.06486, submitted November 2019.
- Mastering the Game of Go with Deep Neural Networks and Tree Search, David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel and Demis Hassabis. Nature, vol. 529, January 2016.
- A Brief Survey of Deep Reinforcement Learning, Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, and Anil Anthony Bharath. IEEE Signal Processing Magazine, Special Issue on Deep Learning for Image Understanding.
- Deep Learning for Video Game Playing, Niels Justesen, Philip Bontrager, Julian Togelius, and Sebastian Risi.
- Towards Deep Symbolic Reinforcement Learning, Marta Garnelo, Kai Arulkumaran, Murray Shanahan.
- Multi-task learning with deep model based reinforcement learning, Asier Mujika.
- Imagination-Augmented Agents for Deep Reinforcement Learning, Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, David Silver, Daan Wierstra.
- Playing FPS Games with Deep Reinforcement Learning, Guillaume Lample, Devendra Singh Chaplot.
- Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search, Ali Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine.
- Learning to compose words into sentences with Reinforcement Learning, Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling.
- Reinforcement Learning approach for Real Time Strategy Games Battle City and S3, Harshit Sethy, Amit Patel.
- Extending the OpenAI Gym for robotics: a toolkit for reinforcement learning using ROS and Gazebo, Iker Zamora, Nestor Gonzalez Lopez, Victor Mayoral Vilches, Alejandro Hernandez Cordero.
- Deep Reinforcement Learning with Macro-Actions, Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan.
- Deep Reinforcement Learning: an Overview, Yuxi Li.
- Learning model-based planning from scratch, Razvan Pascanu, Yujia Li, Oriol Vinyals, Nicolas Heess, Lars Buesing, Sebastien Racanière, David Reichert, Théophane Weber, Daan Wierstra, Peter Battaglia.
- Learning from Demonstrations for Real World Reinforcement Learning, Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys.
- Unifying Task Specification in Reinforcement Learning, Martha White.
- ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games, Yuandong Tian, Qucheng Gong, Wenling Shang, Yuxin Wu, Larry Zitnick.
- Learning Macromanagement in StarCraft from Replays using Deep Learning, Niels Justesen, Sebastian Risi.
- The Atari Grand Challenge Dataset, Vitaly Kurin, Sebastian Nowozin, Katja Hofmann, Lucas Beyer, Bastian Leibe.
- Experience Replay Using Transition Sequences, Thommen George Karimpanal, Roland Bouffanais.
- Reinforcement Learning with a Corrupted Reward Channel, Tom Everitt, Victoria Krakovna, Laurent Orseau, Marcus Hutter, Shane Legg.
- Bridging the Gap Between Value and Policy Based Reinforcement Learning, Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans.
- Algorithm Selection for Reinforcement Learning, Romain Laroche, Raphael Feraud.
- On-line Building Energy Optimization using Deep Reinforcement Learning, Elena Mocanu, Decebal Constantin Mocanu, Phuong H. Nguyen, Antonio Liotta, Michael E. Webber, Madeleine Gibescu, J.G. Slootweg.
- Task-Oriented Query Reformulation with Reinforcement Learning, Rodrigo Nogueira, Kyunghyun Cho.
- Inverse Reinforcement Learning under Noisy Observations, Shervin Shahryari, Prashant Doshi. In Proceedings of AAMAS 2017, pp. 1733-1735.
- Speeding up Tabular Reinforcement Learning Using State-Action Similarities, Ariel Rosenfeld, Matthew E. Taylor, Sarit Kraus. In Proceedings of AAMAS 2017, pp. 1722-1724.
- Analysing Congestion Problems in Multi-agent Reinforcement Learning, Roxana Radulescu, Peter Vrancx, Ann Nowé. In Proceedings of AAMAS 2017, pp. 1705-1707.
- Autonomous Model Management via Reinforcement Learning, Elad Liebman, Eric Zavesky, Peter Stone. In Proceedings of AAMAS 2017, pp. 1601-1603.
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Chelsea Finn, Pieter Abbeel, Sergey Levine.
- Trial without Error: Towards Safe Reinforcement Learning via Human Intervention, William Saunders, Girish Sastry, Andreas Stuhlmueller, Owain Evans.
- Reverse Curriculum Generation for Reinforcement Learning, Carlos Florensa, David Held, Markus Wulfmeier, Pieter Abbeel.
- Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning, James Steven Supancic III, Deva Ramanan.
- Control of a Quadrotor with Reinforcement Learning, Jemin Hwangbo, Inkyu Sa, Roland Siegwart, Marco Hutter.
- Reinforcement Learning for Architecture Search by Network Transformation, Han Cai, Tianyao Chen, Weinan Zhang, Yong Yu, Jun Wang.
- Sentence Simplification with Deep Reinforcement Learning, Xingxing Zhang, Mirella Lapata.
- Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning, Paweł Budzianowski, Stefan Ultes, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Iñigo Casanueva, Lina Rojas-Barahona, Milica Gašić.
- Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations, Hongjia Li, Tianshu Wei, Ao Ren, Qi Zhu, Yanzhi Wang.
- Learning to Act by Predicting the Future, Alexey Dosovitskiy and Vladlen Koltun. ICLR 2017.
- The Predictron: End-to-End Learning and Planning, David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, André Barreto and Thomas Degris.
- Pick and Place Without Geometric Object Models, Marcus Gualtieri, Andreas ten Pas, Robert Platt.
- Multi-agent Reinforcement Learning: An Overview, L. Busoniu, R. Babuska, and B. De Schutter. Chapter 7 in Innovations in Multi-Agent Systems and Applications – 1 (D. Srinivasan and L.C. Jain, eds.), vol. 310 of Studies in Computational Intelligence, Berlin, Germany: Springer, pp. 183–221, 2010.
- A multi-agent reinforcement learning model of common-pool resource appropriation, Julien Perolat, Joel Z. Leibo, Vinicius Zambaldi, Karl Tuyls, Thore Graepel.
- Counterfactual Multi-agent Policy Gradients, Jakob Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson.
- Inverse Reinforcement Learning in Swarm Systems, Adrian Šošić, Wasiur R. KhudaBukhsh, Abdelhak M. Zoubir, and Heinz Koeppl. In Proceedings of AAMAS 2017, pp. 1413-1421.
- Simultaneously Learning and Advising in Multiagent Reinforcement Learning, Felipe Leno da Silva, Ruben Glatt, and Anna Helena Reali Costa. In Proceedings of AAMAS 2017, pp. 1100-1108.
- Reinforcement Learning for Multi-Step Expert Advice, Patrick Philipp, Achim Rettinger. In Proceedings of AAMAS 2017, pp. 962-971.
- Reward Shaping in Episodic Reinforcement Learning, Marek Grzes. In Proceedings of AAMAS 2017, pp. 565-573.
- Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning, Xue Bin Peng, Glen Berseth, Michiel van de Panne. ACM Transactions on Graphics, volume 35, number 4, 2016.
- Virtual-to-real Deep Reinforcement Learning: Continuous Control of Mobile Robots for Mapless Navigation, Lei Tai, Giuseppe Paolo, Ming Liu.
- https://github.com/chncyhn/flappybird-qlearning-bot
- https://github.com/yenchenlin/DeepLearningFlappyBird
- https://yanpanlau.github.io/2016/07/10/FlappyBird-Keras.html
- http://www.danielslater.net/2016/03/deep-q-learning-pong-with-tensorflow.html
- https://medium.com/@dhruvp/how-to-write-a-neural-network-to-play-pong-from-scratch-956b57d4f6e0