Writing
Blogs
“Knowledge brings liberation.”
Making CPUs Go Brrr
From clock frequencies to ILP, SIMD, multithreading, and multiprocessing — how CPUs evolved to execute programs faster and why the free lunch ended.
The Aya Project: My Experience and Learnings
A personal account of contributing to the Aya multilingual NLP project at Cohere for AI — from UI development to paper writing and everything in between.
Writing a Compiler in Rust #1: Lexical Analysis
Building a lexer from scratch in Rust — understanding tokens, regex, and how compilers break source code into meaningful pieces.
LIMA: Less is More for Alignment
Exploring the LIMA paper and the Superficial Alignment Hypothesis — why 1,000 carefully curated examples might be all you need to align an LLM.
Transformers: Attention is all you need
A deep dive into the Transformer architecture — breaking down multi-headed attention, positional encodings, encoder-decoder structure, and how it all fits together.
November 15, 2021
Activation Functions - The why you never asked!
Deep Learning is one of those things that everyone thinks is magical, and I take immense pleasure in seeing their reaction when I tell them it's essentially matrices, unlike the nodes they expect. But Neural Networks aren't limited to that, and neither is that the thing that makes them special...
July 04, 2021
Computing the Mean and Std of a Dataset in Pytorch
PyTorch provides various built-in mathematical utilities for monitoring the descriptive statistics of a dataset, among them the mean and standard deviation.
PyTorch Lightning: DataModules, Callbacks, TPU, and Loggers
A walkthrough of PyTorch Lightning's DataModules, callbacks, TPU training, and experiment logging with Weights & Biases.
Class Imbalance comes in Like a Lion
A practical guide to handling class imbalance — from choosing the right metrics to SMOTE, ADASYN, cost-sensitive learning, and data augmentation strategies.
May 16, 2021
Reading rpt files with Pandas
In most cases we have a CSV file to load the data from, but other formats such as JSON, rpt, TSV, etc. can also be used to store data. Pandas provides us with utilities to load data from them. In this article, we'll see how to load data from an rpt file using Pandas.
Training SVM over Custom Kernels
How to create and train SVMs with custom kernel functions and precomputed Gram matrices — from theory to sklearn implementation.
April 21, 2021
Adding Mean and Median to Histogram in R
Visualizing data can help gather insights from it that descriptive statistics can't. Anscombe's Quartet shows us how those statistics can be misleading, hence it becomes important to analyze the data visually. We'll see how to create histograms in the R programming language and how to add mean and median lines to them.
April 16, 2021
Y Scrambling for Model Validation
Y Scrambling is a method for testing whether a model's predictions are made just by chance. It is used in the validation of multiple linear regression QSPR models. The process is remarkably simple to execute, and we'll learn about it in detail.
March 12, 2021
Training Neural Networks with Validation using PyTorch
It's important that our network performs well not only on the data it's trained on but also on data it has never seen before. Let's see how we can keep track of the validation metric at each training step and also save the model weights with the best performance.
January 22, 2021
Adjusting Learning Rate of a Neural Network in PyTorch
The learning rate determines how fast a Neural Network converges to a minimum. We usually tune our parameters to find the best value for the learning rate. But is there a way we can improve this process?
January 4, 2021
Data Exploration using Pandas GUI
Pandas is a tool that we use very often for manipulating data, along with seaborn and matplotlib for data visualization. PandasGUI is a library that makes these tasks much easier by providing a graphical interface.
January 4, 2021
Deploying ML Models as API using FastAPI
Apart from the two mentioned, there is another framework that is becoming quite popular, so much so that companies like Netflix and Uber are using it, and that framework is FastAPI.
January 3, 2021
Data and its Types
Hello Fellow Readers. If you are here, you're probably interested in Data Science and probably willing to take a dive into a wide golden ocean called Statistics. If I had to explain what stats is, I'd say it is playing with data, and boy is it fun.
December 08, 2020
Understanding Lightning DataModules
We can use DataLoaders in Lightning to train the model, but Lightning also provides us with a better approach called DataModules. A DataModule is a reusable and shareable class that encapsulates the DataLoaders along with the steps required to process data.
November 26, 2020
Training Neural Networks using Pytorch Lightning
In PyTorch, every time you start a project you have to rewrite the training and testing loops. Lightning fixes the problem by not only reducing boilerplate code but also providing added functionality that might come in handy while training.
October 30, 2020
Training and Testing a Basic Neural Network using Pytorch
Google and Facebook have blessed us with two of the most popular neural network libraries, TensorFlow and PyTorch, which make the job quite easy. For this article I'll be using PyTorch, and I'll use TensorFlow in the next one.