I learn by building. This site documents my projects as I move from high-level theory to working systems. My focus is twofold: the “how” (the implementation details often left out of academic papers) and the “why” (the mathematical intuition and architectural decisions that make these systems work).


Current Work

  • Annotated PPO: A deep dive into Proximal Policy Optimization, focusing on implementation details and the transition from theory to PyTorch code.

Areas of Interest

I am currently exploring the following domains:

  • Reinforcement Learning: Policy gradients, environment design, and agent stability.
  • Large Language Models: Architecture internals and optimization strategies.
  • Systems for ML: Efficient training and deployment workflows.

Knowledge Graph

The connections between these topics are visualized in the Graph View. As I read more papers and build out projects, new nodes will appear there automatically.


GitHub | X/Twitter


NOTE

This site is a reflection of my learning process. Everything here is subject to “Gradient Descent”: it gets more refined with every iteration.