From 0 to 1: Code the Classic Reinforcement Learning Algorithms
Programming Course, Bilibili, 2025
Q-Learning
Click the web link and start learning: Bilibili - 强化学习 Q-learning玩21点纸牌 纯白板逐行代码Python实现
*Note: This course is in Chinese, but it won’t affect learning.
Double Q-Learning
Click the web link and start learning: Bilibili - 强化学习 Double Q-learning 纯白板逐行代码Python实现
*Note: This course is in Chinese, but it won’t affect learning.
Deep Q-Network (DQN)
This is a step-by-step tutorial on implementing a Deep Q-Network (DQN) in Python.
Click the web link and start learning: Bilibili - 深度强化学习 DQN 纯白板逐行代码Python实现
*Note: This course is in Chinese, but it won’t affect learning.
Deep Deterministic Policy Gradient (DDPG)
This video walks through implementing the Deep Deterministic Policy Gradient (DDPG) method step by step in Python.
Click the web link and start learning: Bilibili - 深度强化学习 DDPG 纯白板逐行代码Python实现
*Note: This course is in Chinese, but it won’t affect learning.
Multi-Agent Deep Deterministic Policy Gradient (MADDPG)
Click the web link and start learning: Bilibili - 多智能体深度强化学习 MADDPG 纯白板逐行代码Python实现
*Note: This course is in Chinese, but it won’t affect learning.
Proximal Policy Optimization (PPO)
Click the web link and start learning: Bilibili - 深度强化学习 PPO 纯白板逐行代码Python实现
*Note: This course is in Chinese, but it won’t affect learning.
Soft Actor-Critic (SAC)
Click the web link and start learning: Bilibili - 深度强化学习 SAC 纯白板逐行代码Python实现
*Note: This course is in Chinese, but it won’t affect learning.
