MasterXiong - Jianshu (簡書)

MasterXiong

IP location: England

Chapter 9
Chapter 9: On-policy Prediction with Approximation From this chapter, we...

Chapter 7
Chapter 7: n-step Bootstrapping n-step TD methods span a spectrum with M...


Chapter 6
Chapter 6: Temporal-Difference Learning Temporal-difference (TD) learnin...

Chapter 5
Chapter 5: Monte Carlo Methods Monte Carlo (MC) methods are learning met...

Chapter 4
Chapter 4: Dynamic Programming Dynamic programming computes optimal poli...

Chapter 3
Chapter 3: Finite Markov Decision Processes Basic Definitions MDP is the...

Chapter 2
Chapter 2: Multi-armed Bandits Multi-armed bandits can be seen as the si...


Pointer Networks
Pointer Networks Oriol Vinyals, Meire Fortunato, Navdeep Jaitly Google, B...

Neural Computation of Decisions in Optimization Problems
Neural Computation of Decisions in Optimization Problems J. J. Hopfield,...
