Technical Writing

Deep dives and how-tos at the intersection of backend, ML, and the cloud.

2025

Efficient Large Language Models: Techniques for Production Deployment

Deploying large language models in production requires careful optimization. This article covers quantization, distillation, pruning, and other techniques that make LLMs faster, smaller, and cheaper to serve without a significant loss in accuracy.

Jun 10

Federated Learning: Privacy-Preserving Machine Learning at Scale

Federated learning trains machine learning models across decentralized datasets without moving or exposing the raw data. This article explores the techniques, challenges, and real-world applications of this privacy-preserving approach.

Mar 20

Transformer Architectures: Beyond the Basics

Transformers have revolutionized natural language processing and are now making waves in computer vision, audio, and multimodal applications. This deep dive explores the architectural innovations that make transformers so powerful.