Boost Your Model's Accuracy with Microsoft Phi-3.5 Series

Published On Mon Oct 14 2024
Boost Your Model's Accuracy with Microsoft Phi-3.5 Series

Fine-Tuning Phi-3.5 on E-Commerce Classification Dataset

Listen Share Discover Microsoft’s new LLM series and boost your model’s accuracy from 65% to 86% by fine-tuning it on the E-commerce classification dataset. Microsoft has joined the competitive landscape of large language models alongside Meta AI with the introduction of the Phi-3.5 series. This series includes a small language model, a vision language model, and employs a Mixture-of-Experts approach to achieve top-tier performance.

Exploring Microsoft Phi-3.5 Models

In this tutorial, we will explore the Microsoft Phi-3.5 family of models. We will load the Phi-3.5-mini-instruct model and fine-tune it to classify e-commerce products based on their text descriptions. In the final steps, we will demonstrate how to merge the LoRA (Low-Rank Adaptation) with the base model and push it to Hugging Face. This will enable efficient cloud deployment, making the model accessible for various applications.

Microsoft Launches Open-Source Phi-3.5 Models for Advanced AI ...

Take the Master Large Language Models (LLMs) Concepts course and learn about LLM applications, training methodologies, ethical considerations, and the latest research.

Introducing Microsoft Phi-3.5 Models

The Microsoft Phi-3.5 release introduces three innovative models: Phi-3.5-mini, Phi-3.5-vision, and the latest addition, Phi-3.5-MoE, a Mixture-of-Experts model.

Phi-3.5-mini is optimized for enhanced multi-lingual support with an impressive 128K context length. Despite its smaller size, it delivers performance that rivals larger models, thanks to rigorous enhancements through supervised fine-tuning, proximal policy optimization, and direct preference optimization, ensuring precise instruction adherence.

Phi-3.5-vision is a cutting-edge, lightweight multimodal model which was trained on the datasets composed of synthetic data and filtered public websites. It excels in multi-frame image understanding and reasoning, making it ideal for detailed image comparison, multi-image summarization/storytelling, and video summarization, with broad application potential.

Phi-3.5-MoE features a Mixture-of-Experts architecture with 16 experts and 6.6 billion active parameters. It offers exceptional performance with reduced latency and robust safety, alongside comprehensive multi-lingual support.

Marketplace · GitHub

Phi-3.5 Model Applications

The Phi-3.5 model family offers cost-effective, high-performance solutions for the open-source community, advancing small language models and generative AI. To learn about Phi-3 architecture, features, and applications, follow the Phi-3 Tutorial: Hands-On With Microsoft’s Smallest AI Model guide.

Fine-Tuning the Model

In this section, we will load the Phi-3.5-mini-instruct model and run the model inference in the Kaggle platform. We got an accurate and detailed result. As we can see, the model has performed quite well, flagging the call as fraudulent and providing an explanation.

If you are experiencing issues running the model on the Kaggle platform, please refer to the Simple Model Inference of Phi-3.5 Kaggle notebook. It comes with a pre-built setup and code along with outputs.