I learn by building. This site captures the journey of my projects as I move from high-level theory to working systems. My focus is twofold: the “how” (the implementation details often left out of academic papers) and the “why” (the mathematical intuition and architectural decisions that make these systems work).
Current Work
- Annotated PPO: A deep dive into Proximal Policy Optimization, focusing on implementation details and the transition from theory to PyTorch code.
Areas of Interest
I am currently exploring the following domains:
- Reinforcement Learning: Policy gradients, environment design, and agent stability.
- Large Language Models: Architecture internals and optimization strategies.
- Systems for ML: Efficient training and deployment workflows.
Knowledge Graph
The connections between these topics are visualized in the Graph View. As I explore new papers or build out new projects, new nodes will appear there automatically.
NOTE
This site is a reflection of my learning process. Everything here is subject to “Gradient Descent”—it gets better and more refined with every iteration.