Troubleshoot LLM Training with MosaicML Foundation Models

Published on May 8, 2023

LLM Training Code for MosaicML Foundation Models

If you're looking for code to train, finetune, evaluate, and deploy LLMs for inference with Composer and the MosaicML platform, you've come to the right place. This repository contains everything you need to get started.

The codebase here is designed with ease-of-use, efficiency, and flexibility in mind, enabling rapid experimentation with the latest techniques. Here's what you'll find:

  • MPT-7B, the first model in the MosaicML Foundation Series
  • The base model and several variants, including a fine-tuned model with a 65k-token context length
  • Instructions and scripts for running the models locally

To get started, clone this repo and install the requirements. Here's an end-to-end workflow for preparing a subset of the C4 dataset, training an MPT-125M model for 10 batches, converting the model to HuggingFace format, evaluating the model on the Winograd challenge, and generating responses to prompts:
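Below is a sketch of what that workflow can look like on the command line. The script paths, YAML config names, and flags shown here are assumptions about the repo layout and may not match the current code exactly; treat this as an outline and check each script's --help output and the repo's own quickstart for the authoritative commands.

```bash
# NOTE: script paths, YAML names, and flags are illustrative assumptions;
# consult the repo's quickstart for the exact interface.

# 1) Prepare a small subset of the C4 dataset in streaming format.
python data_prep/convert_dataset_hf.py \
  --dataset c4 --data_subset en \
  --out_root my-copy-c4 --splits train_small val_small \
  --concat_tokens 2048 --tokenizer EleutherAI/gpt-neox-20b --eos_text '<|endoftext|>'

# 2) Train an MPT-125M model for 10 batches with Composer.
composer train/train.py \
  train/yamls/mpt/125m.yaml \
  data_local=my-copy-c4 \
  train_loader.dataset.split=train_small \
  eval_loader.dataset.split=val_small \
  max_duration=10ba \
  save_folder=mpt-125m

# 3) Convert the Composer checkpoint to HuggingFace format.
python inference/convert_composer_to_hf.py \
  --composer_path mpt-125m/ep0-ba10-rank0.pt \
  --hf_output_path mpt-125m-hf \
  --output_precision bf16 \
  # --hf_repo_for_upload user-org/repo-name

# 4) Evaluate the converted model on the Winograd challenge.
composer eval/eval.py \
  eval/yamls/hf_eval.yaml \
  icl_tasks=eval/yamls/winograd.yaml \
  model_name_or_path=mpt-125m-hf

# 5) Generate responses to a prompt with the converted model.
python inference/hf_generate.py \
  --name_or_path mpt-125m-hf \
  --max_new_tokens 256 \
  --prompts "The answer to life, the universe, and happiness is"
```

In this sketch, each step writes its outputs to a local folder (my-copy-c4, mpt-125m, mpt-125m-hf) that the next step consumes, so individual steps can be rerun without repeating the whole pipeline.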

If you have a write-enabled HuggingFace auth token, you can optionally upload your model to the Hub: export your token and uncomment the line containing --hf_repo_for_upload in the conversion step. Keep in mind that this is a quickstart demo; an LLM needs to be trained for far longer than 10 batches to produce good-quality output.
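As a rough sketch of that step (HUGGING_FACE_HUB_TOKEN is HuggingFace's standard Hub token environment variable; the paths and repo name are placeholders carried over from the workflow above):

```bash
# Export a write-enabled HuggingFace token so the upload can authenticate.
export HUGGING_FACE_HUB_TOKEN=hf_xxx   # replace with your own token

# Re-run the conversion step with the upload flag uncommented
# (checkpoint path and repo name below are placeholders):
python inference/convert_composer_to_hf.py \
  --composer_path mpt-125m/ep0-ba10-rank0.pt \
  --hf_output_path mpt-125m-hf \
  --output_precision bf16 \
  --hf_repo_for_upload your-username/mpt-125m-quickstart
```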

If you run into any problems with the code, please file GitHub issues directly on this repo. And if you're interested in training LLMs on the MosaicML platform, reach out to us at [email protected]!