Research and Engineering Log

I learn by building. This site documents my projects as I move from high-level theory to working systems. My focus is twofold: the “how” (the implementation details often left out of academic papers) and the “why” (the mathematical intuition and architectural decisions that make these systems work).

Current Work

Annotated PPO: A deep dive into Proximal Policy Optimization, focusing on implementation details and the transition from theory to PyTorch code.

Areas of Interest

My work currently involves exploring the following domains:

Reinforcement Learning: Policy gradients, environment design, and agent stability.
Large Language Models: Architecture internals and optimization strategies.
Systems for ML: Efficient training and deployment workflows.

Knowledge Graph

The connections between these topics are visualized in the Graph View. As I explore new papers or build out new projects, new nodes will appear there automatically.

GitHub | X/Twitter

NOTE

This site is a reflection of my learning process. Everything here is subject to “Gradient Descent”—it gets better and more refined with every iteration.

anirudh.blog
