From 0 to 1: Code the Classic Reinforcement Learning Algorithms

Programming Course, Bilibili, 2025

Q-Learning

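Click the web link and start learning: Bilibili - 强化学习 Q-learning玩21点纸牌 纯白板逐行代码Python实现 (Reinforcement Learning: Q-learning Plays Blackjack, a Line-by-Line Python Implementation from a Blank Slate)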

*Note: This course is in Chinese, but it won’t affect learning.

Double Q-Learning

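Click the web link and start learning: Bilibili - 强化学习 Double Q-learning 纯白板逐行代码Python实现 (Reinforcement Learning: Double Q-learning, a Line-by-Line Python Implementation from a Blank Slate)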

*Note: This course is in Chinese, but it won’t affect learning.

Deep Q-Network (DQN)

This is a step-by-step tutorial on implementing a Deep Q-Network (DQN) in Python.

Click the web link and start learning: Bilibili - 深度强化学习 DQN 纯白板逐行代码Python实现 (Deep Reinforcement Learning: DQN, a Line-by-Line Python Implementation from a Blank Slate)

*Note: This course is in Chinese, but it won’t affect learning.

Deep Deterministic Policy Gradient (DDPG)

This video walks you through implementing the Deep Deterministic Policy Gradient (DDPG) method step by step in Python.

Click the web link and start learning: Bilibili - 深度强化学习 DDPG 纯白板逐行代码Python实现 (Deep Reinforcement Learning: DDPG, a Line-by-Line Python Implementation from a Blank Slate)

*Note: This course is in Chinese, but it won’t affect learning.

Multi-Agent Deep Deterministic Policy Gradient (MADDPG)

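Click the web link and start learning: Bilibili - 多智能体深度强化学习 MADDPG 纯白板逐行代码Python实现 (Multi-Agent Deep Reinforcement Learning: MADDPG, a Line-by-Line Python Implementation from a Blank Slate)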

*Note: This course is in Chinese, but it won’t affect learning.

Proximal Policy Optimization (PPO)

Click the web link and start learning: Bilibili - 深度强化学习 PPO 纯白板逐行代码Python实现 (Deep Reinforcement Learning: PPO, a Line-by-Line Python Implementation from a Blank Slate)

*Note: This course is in Chinese, but it won’t affect learning.

Soft Actor-Critic (SAC)

Click the web link and start learning: Bilibili - 深度强化学习 SAC 纯白板逐行代码Python实现 (Deep Reinforcement Learning: SAC, a Line-by-Line Python Implementation from a Blank Slate)

*Note: This course is in Chinese, but it won’t affect learning.