Writing

Blogs

“Knowledge brings liberation.”

March 30, 2025

Making CPUs Go Brrr

From clock frequencies to ILP, SIMD, multithreading, and multiprocessing — how CPUs evolved to execute programs faster and why the free lunch ended.

SYSTEMS
CPU
PARALLEL COMPUTING
May 20, 2024

The Aya Project: My Experience and Learnings

A personal account of contributing to the Aya multilingual NLP project at Cohere for AI — from UI development to paper writing and everything in between.

NLP
EXPERIENCE
RESEARCH
February 26, 2024

Writing a Compiler in Rust #1: Lexical Analysis

Building a lexer from scratch in Rust — understanding tokens, regex, and how compilers break source code into meaningful pieces.

RUST
COMPILER
May 31, 2023

LIMA: Less is More for Alignment

Exploring the LIMA paper and the Superficial Alignment Hypothesis — why 1,000 carefully curated examples might be all you need to align an LLM.

NLP
PAPER DISSECTION
LLM
December 10, 2021

Transformers: Attention is all you need

A deep dive into the Transformer architecture — breaking down multi-headed attention, positional encodings, encoder-decoder structure, and how it all fits together.

DEEP LEARNING
PAPER DISSECTION
November 15, 2021

Activation Functions - The why you never asked!

Deep Learning is probably one of those things that everyone thinks is magical, and I usually take immense pleasure in seeing their reaction when I tell them it's essentially matrices rather than the nodes they expect. But Neural Networks aren't limited to that, and neither is that the thing that makes them special...

DEEP LEARNING
PYTORCH
July 04, 2021

Computing the Mean and Std of a Dataset in PyTorch

PyTorch provides various built-in mathematical utilities for monitoring the descriptive statistics of a dataset, among them the mean and standard deviation.

PYTORCH
June 8, 2021

PyTorch Lightning: DataModules, Callbacks, TPU, and Loggers

A walkthrough of PyTorch Lightning's DataModules, callbacks, TPU training, and experiment logging with Weights & Biases.

DEEP LEARNING
PYTORCH LIGHTNING
June 5, 2021

Class Imbalance comes in Like a Lion

A practical guide to handling class imbalance — from choosing the right metrics to SMOTE, ADASYN, cost-sensitive learning, and data augmentation strategies.

DATA PRE-PROCESSING
MACHINE LEARNING
May 16, 2021

Reading rpt files with Pandas

In most cases, we have a CSV file to load the data from, but there are other formats such as JSON, rpt, TSV, etc. that can be used to store data. Pandas provides us with utilities to load data from them. In this article, we'll see how we can load data from an rpt file using Pandas.

DATA LOADING
MACHINE LEARNING
April 29, 2021

Training SVM over Custom Kernels

How to create and train SVMs with custom kernel functions and precomputed Gram matrices — from theory to sklearn implementation.

SVM
MACHINE LEARNING
April 21, 2021

Adding Mean and Median to Histogram in R

Visualizing data can help gather insights from it that descriptive statistics can't. Anscombe's Quartet shows us how those statistics can be misleading, hence it becomes important to analyze the data visually. We'll see how we can create histograms in the R programming language and how to add mean and median lines to them.

STATISTICS
R LANGUAGE
April 16, 2021

Y Scrambling for Model Validation

Y Scrambling is a method one can use to test that the predictions made by a model aren't just due to chance. It is used in the validation of multiple linear regression QSPR models. The process is remarkably simple to execute, and we'll learn about it in detail.

MODEL VALIDATION
MACHINE LEARNING
March 12, 2021

Training Neural Networks with Validation using PyTorch

It's important that our network performs well not only on the data it's trained on but also on data that it has never seen before. Let's see how we can keep track of the validation metric at each training step and also save the model weights with the best performance.

PYTORCH
DEEP LEARNING
January 22, 2021

Adjusting Learning Rate of a Neural Network in PyTorch

The learning rate determines how fast a neural network converges to a minimum. We usually tune our parameters to find the best value for the learning rate. But is there a way we can improve this process?

PYTORCH
DEEP LEARNING
January 4, 2021

Data Exploration using Pandas GUI

Pandas is a tool that we use very often for manipulating data, along with seaborn and matplotlib for data visualization. PandasGUI is a library that makes this task much easier by providing a graphical interface.

DATA ANALYSIS
AUTOMATED EDA
January 4, 2021

Deploying ML Models as API using FastAPI

Apart from the two mentioned, there is another framework that is becoming quite popular, so much so that companies like Netflix and Uber are using it: FastAPI.

DEPLOYMENT
MACHINE LEARNING
January 3, 2021

Data and its Types

Hello, fellow readers. If you are here, you're probably interested in Data Science and willing to take a dive into a wide golden ocean called Statistics. If I had to explain what stats is, I'd say it is playing with data, and boy is it fun.

STATISTICS
December 08, 2020

Understanding Lightning DataModules

We can use DataLoaders in Lightning to train the model, but Lightning also provides us with a better approach called DataModules. A DataModule is a reusable and shareable class that encapsulates the DataLoaders along with the steps required to process data.

DEEP LEARNING
PYTORCH LIGHTNING
November 26, 2020

Training Neural Networks using PyTorch Lightning

In PyTorch, every time you start a project you have to rewrite the training and testing loops. Lightning fixes the problem by not only reducing boilerplate code but also providing added functionality that might come in handy while training.

DEEP LEARNING
PYTORCH LIGHTNING
October 30, 2020

Training and Testing a Basic Neural Network using PyTorch

Google and Facebook have blessed us with two of the most popular neural network libraries, TensorFlow and PyTorch, which make the job quite easy. For this article I'll be using PyTorch, and I'll use TensorFlow in the next one.

DEEP LEARNING
PYTORCH