harryzhangog / deep-rl-notes
A collection of comprehensive notes on Deep Reinforcement Learning, customized for UC Berkeley's CS 285 (prev. CS 294-112)
Hi Harry,
Thanks for taking these notes! They are clear and very helpful. I found a typo in 2.3.1, which should be
It should be an easy fix at L20 of https://github.com/harryzhangOG/Deep-RL-Notes/blob/master/imitation.tex
Thanks a lot!
Hi Harry, great notes! I just found some small typos in Chapter 10:
1. The sigma (page 60)
$p(x|z) = \mathcal{N}(\mu_{nn}(z),\mu_{nn}(z))$
The correct version should be:
$p(x|z) = \mathcal{N}(\mu_{nn}(z),\sigma_{nn}(z))$
2. The theta display (page 60)
\[
theta\leftarrow \argmaxA_\theta\frac{1}{N}\sum_i\mathbb{E}_{z\sim p(z|x_i)}\log p_\theta(x_i)
\]
Hi, thanks for your great notes.
I just found a typo in Chapter 10.1.1:
The KL-divergence
The correct version should be:
Hi,
First of all, thanks so much for the notes! They are extremely useful.
I just wanted to point out a few typos:
$J(\theta)$ after last grad on the denominator.
$\max_a$ in front of Q (since we are finding the best state-action function by training a NN to learn the best action $\mu_{\theta}$). Also, although this is not a typo per se, I think after the summation in line 6 there should be a $\frac{d \mu_{\theta}}{d \theta}$ rather than $\frac{d a}{d \theta}$, as that "a" in that derivative is the output from the NN above.
$\sum_i$ as we are updating one value at a time.
I hope this helps! Thanks!
Javier.
The first equation seems to have its first parenthesis in the wrong place: https://youtu.be/VgdSubQN35g?list=PL_iWQOsE6TfU9DwANRsUZf0YUUJS_ySr0&t=152