10 Steps to Building a Million Parameter ChatGPT in Python

Published on March 5, 2025

Building a Perfect Million Parameter LLM Like ChatGPT in Python

Quick Note — We will first train a tokenizer and then build a 29-million-parameter LLM from scratch, which gives us a base model that generates coherent sentences. Next, we will fine-tune it with supervised fine-tuning (SFT) to improve its knowledge and response style, making it behave more like ChatGPT. I have deployed my trained tiny model on a Hugging Face Space, so you can chat with it there: Web app link
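
Since the first step is training a tokenizer, here is a minimal sketch of what that can look like using the Hugging Face tokenizers library. The vocabulary size, special tokens, and corpus.txt file are illustrative assumptions, not the article's exact settings.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Build a byte-level BPE tokenizer and train it on a raw text corpus.
tokenizer = Tokenizer(models.BPE(unk_token="<unk>"))
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)

trainer = trainers.BpeTrainer(
    vocab_size=4096,  # small vocabulary, fitting a tiny model (assumed value)
    special_tokens=["<unk>", "<|endoftext|>"],
)
tokenizer.train(files=["corpus.txt"], trainer=trainer)  # corpus.txt is a placeholder
tokenizer.save("tokenizer.json")
```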

Take a look at a few chat conversations between me and our trained LLM. Rather than covering all the theory up front, we will learn it as we code, so each concept lands in context. Everything, from the dataset to the model weights, is swappable.
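
To preview the SFT step mentioned above: supervised fine-tuning amounts to training on conversations serialized into single strings with role markers. Below is a minimal sketch of one way to do that; the <|user|> and <|assistant|> tokens are hypothetical placeholders, and the article's actual chat template may differ.

```python
# Serialize one chat turn into a training string with role markers.
# The special tokens here are illustrative assumptions, not the
# article's confirmed format.
def format_chat(user_msg: str, assistant_msg: str) -> str:
    return (
        f"<|user|>\n{user_msg}\n"
        f"<|assistant|>\n{assistant_msg}<|endoftext|>"
    )

sample = format_chat(
    "What is a tokenizer?",
    "A tokenizer splits text into subword units the model can embed.",
)
print(sample)
```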


I write about AI: https://www.linkedin.com/in/fareed-khan-dev/
