Reinforcement learning studies how agents learn to make better decisions through interaction with an environment. Agents act, observe consequences, receive feedback, and adapt future behavior. This specialization develops reinforcement learning as a framework for sequential decision-making under uncertainty, progressing from classical foundations to scalable deep learning methods and reward design.
The first course, Classical Reinforcement Learning, introduces finite-state decision problems, Markov chains, Markov decision processes, discounted rewards, Bellman equations, planning with known models, and learning from sampled experience. Learners study value iteration, policy iteration, Monte Carlo methods, temporal-difference learning, SARSA, and Q-learning.
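As a rough illustration of planning with a known model, the following is a minimal sketch of value iteration on a two-state Markov decision process. The toy problem, the array layout (`P[a, s, s2]` for transition probabilities, `R[a, s]` for expected rewards), and the function name `value_iteration` are assumptions made for this example and are not taken from the course materials.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Repeatedly apply the Bellman optimality backup until values settle.

    P[a, s, s2]: probability of moving from state s to s2 under action a.
    R[a, s]:     expected immediate reward for taking action a in state s.
    (Both layouts are assumptions chosen for this sketch.)
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iters):
        # Q[a, s] = R[a, s] + gamma * sum over s2 of P[a, s, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)  # act greedily over actions in every state
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Extract a greedy policy from the final value estimate.
    policy = (R + gamma * (P @ V)).argmax(axis=0)
    return V, policy


# Hypothetical toy MDP: action 0 = "stay", action 1 = "move" (deterministic).
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # stay: remain in the current state
    [[0.0, 1.0], [1.0, 0.0]],  # move: switch to the other state
])
R = np.array([
    [0.0, 1.0],  # staying in state 1 earns reward 1
    [0.0, 0.0],  # moving earns nothing
])

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)  # V is close to [9, 10]; the greedy policy is [1, 0]
```

In this toy problem the agent learns to move out of state 0 and then stay in state 1, where it collects a reward of 1 per step; with a discount factor of 0.9 that stream of rewards is worth 1 / (1 - 0.9) = 10.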
The second course, Deep Reinforcement Learning, shows how reinforcement learning scales beyond tabular settings using neural-network-based function approximation. Learners study Deep Q-Networks, replay buffers, target networks, policy-gradient methods, actor–critic algorithms, and modern methods such as Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), and Soft Actor-Critic (SAC), with attention to stability, diagnosis, evaluation, and reproducibility.
The third course, Reward Programming, addresses how to design, infer, monitor, and revise objectives so agents learn intended behavior. Learners study temporal logic, automata, reward machines, reward shaping, inverse reinforcement learning, preference feedback, safety, shielding, auditing, and stress testing.
Applied Learning Project
Learners complete conceptual quizzes throughout the specialization to check their understanding of reinforcement learning foundations, deep RL methods, and reward-design principles. These assessments emphasize interpreting algorithms, diagnosing learning behavior, and reasoning about how agents make decisions under uncertainty.