Technical Writing
A chronological list of essays, tutorials, and technical notes on systems architecture and frontend engineering.
2025
Efficient Large Language Models: Techniques for Production Deployment
Deploying large language models in production requires careful optimization. This article covers quantization, distillation, pruning, and other techniques that make LLMs faster, smaller, and more efficient while preserving most of their accuracy.
Federated Learning: Privacy-Preserving Machine Learning at Scale
Federated learning enables training machine learning models across decentralized data without exposing the raw data. This article explores the techniques, challenges, and real-world applications of this privacy-preserving approach.
Transformer Architectures: Beyond the Basics
Transformers have revolutionized natural language processing and are now making waves in computer vision, audio, and multimodal applications. This deep dive explores the architectural innovations that make transformers so powerful.