Generative AI-based style recommendation using fashion item analysis
This research develops a style recommendation system that uses generative AI and deep learning to analyze fashion photos. The system processes a range of input images, from selfies to studio-quality photos, and generates a text file with detailed feedback on a person's style along with suggestions for improving it.
Key Components of the System
The system comprises two key components:
- The YOLOv8 convolutional neural network, trained on the DeepFashion2 dataset, is responsible for detecting and cropping different clothing items.
- The GPT-4.0 large language model, accessible via the OpenAI API, is utilized to produce comprehensive style commentary and recommendations for users.
In this system, YOLOv8 is briefly trained to recognize ten distinct types of clothing, while GPT-4.0 is responsible for producing coherent and concise style recommendations.
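The hand-off between the two components can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Detection` record, the `crop_box` and `build_style_prompt` helpers, the confidence threshold, the crop margin, and the prompt wording are all assumptions made for the example. The detector and the language model calls are deliberately left out, so the sketch only shows how detections might be filtered, cropped, and turned into a text prompt.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected garment (hypothetical record, not a YOLOv8 type)."""
    label: str                               # clothing class name
    confidence: float                        # detector score in [0, 1]
    box: Tuple[float, float, float, float]   # pixel box: x1, y1, x2, y2


def crop_box(det: Detection, width: int, height: int,
             margin: float = 0.05) -> Tuple[int, int, int, int]:
    """Expand the box by a relative margin and clamp it to the image bounds."""
    x1, y1, x2, y2 = det.box
    dx, dy = (x2 - x1) * margin, (y2 - y1) * margin
    return (
        max(0, int(x1 - dx)),
        max(0, int(y1 - dy)),
        min(width, int(x2 + dx)),
        min(height, int(y2 + dy)),
    )


def build_style_prompt(detections: List[Detection],
                       min_confidence: float = 0.5) -> str:
    """Turn confident detections into a text prompt for a language model."""
    kept = sorted({d.label for d in detections if d.confidence >= min_confidence})
    items = ", ".join(kept) if kept else "no clearly detected garments"
    return (
        f"The outfit contains: {items}. "
        "Give concise feedback on the overall style and suggest improvements."
    )


# Example detections with made-up labels, scores, and boxes.
detections = [
    Detection("short sleeve top", 0.91, (120, 80, 380, 400)),
    Detection("trousers", 0.88, (130, 390, 370, 900)),
    Detection("skirt", 0.12, (0, 0, 50, 50)),  # low confidence, filtered out
]
prompt = build_style_prompt(detections)
crops = [crop_box(d, width=500, height=1000)
         for d in detections if d.confidence >= 0.5]
print(prompt)
print(crops)
```

In a full pipeline, the crop coordinates would be applied to the input image and the prompt, possibly together with the cropped images, would be sent to the language model through the OpenAI API.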
Evaluation and Comparison
To assess the effectiveness of the proposed solution, experimental trials were conducted at multiple events in Madrid and Tallinn. The study compared three well-known AI models for fashion recommendation:
- OpenAI’s GPT-4.0 Vision
- Google’s Gemini 1.5 Pro
- Anthropic’s Claude 3 Opus
![A random sample input, the intermediate YOLOv8 output, and the Claude 3 Opus-based fashion advice](https://www.researchgate.net/publication/381448625/figure/fig5/AS:11431281252036466@1718478369266/A-random-sample-input-intermediate-Yolo-V8-and-Claude-3-Opus-based-fashion-advisor_Q320.jpg)
Participants rated the quality of the fashion recommendations generated by each model. GPT-4.0 Vision and Gemini 1.5 Pro received similar average ratings, and both were rated higher than Claude 3 Opus, indicating higher perceived quality.
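Comparing models by average rating amounts to grouping the participants' scores by model and taking the mean of each group. The sketch below shows that calculation under stated assumptions: the `average_ratings` helper, the generic model names, and the 1 to 5 scale are hypothetical, and the scores are placeholders for illustration, not the study's data.

```python
from collections import defaultdict
from statistics import mean


def average_ratings(ratings):
    """Map each model name to the mean of its participant scores.

    `ratings` is an iterable of (model_name, score) pairs.
    """
    by_model = defaultdict(list)
    for model, score in ratings:
        by_model[model].append(score)
    return {model: mean(scores) for model, scores in by_model.items()}


# Placeholder scores on an assumed 1-5 scale; NOT results from the study.
example_ratings = [
    ("model_a", 4), ("model_a", 5),
    ("model_b", 3), ("model_b", 4), ("model_b", 5),
]
averages = average_ratings(example_ratings)
print(averages)
```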
Implications and Future Prospects
This research shows the potential of current computer vision and natural language processing technologies for personalized fashion advisory services. By making style recommendations more accurate and relevant, these technologies could change how such advice is delivered across the fashion industry.