Investigating ChatGPT Phishing Detection Capabilities
As large language models find their way into more and more fields, a natural question is whether they can replace the complex, multi-layered detection systems used in cybersecurity, for example in detecting threats such as phishing. To find out, a team of cybersecurity professionals ran an experiment testing whether ChatGPT, a conversational large language model, can analyze URLs and flag malicious links. In this article, we discuss the experiment's findings and explore how an LLM can be applied to cybersecurity tasks.
The Experiment
To test ChatGPT's phishing detection capabilities, the team used the OpenAI API to query the gpt-3.5-turbo model, which serves as the backend for ChatGPT. They gathered a corpus of a few thousand links that their detection technologies had flagged as phishing and mixed in a few thousand safe URLs. They then asked ChatGPT to judge whether each URL was a phishing attempt.
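The exact prompt the team used is not given, but a minimal sketch of this kind of query with the OpenAI Python SDK might look like the following (the prompt wording and the example URL are illustrative assumptions, not the team's actual setup):

    # Sketch: ask gpt-3.5-turbo for a phishing verdict on a single URL.
    # Assumes the openai package (v1+) and an OPENAI_API_KEY environment variable.
    from openai import OpenAI

    client = OpenAI()

    def ask_if_phishing(url: str) -> str:
        """Return the model's raw answer to a yes/no phishing question."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            temperature=0,  # keep answers as deterministic as possible
            messages=[
                {
                    "role": "user",
                    "content": (
                        "Is this link a phishing attempt? Answer 'yes' or 'no' "
                        f"and briefly explain why.\nURL: {url}"
                    ),
                }
            ],
        )
        return response.choices[0].message.content

    # Hypothetical example link, used only for illustration.
    print(ask_if_phishing("http://example.com/login-verify-account"))

Running a loop of such queries over the labeled corpus and comparing the answers against the known verdicts is what yields the detection and false positive rates discussed below.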
The team found that ChatGPT was good at recognizing potentially malicious links and could explain why it considered a link a phishing attempt. It could also deliver the verdict in a variety of formats, including machine-readable JSON and even poems. However, the false positive rate was too high for it to replace current detection systems: the well-known URLNet paper achieved a comparable detection rate with a false positive rate of about 0.4% using a convolutional neural network.
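The article does not show the JSON format the model produced; a hedged sketch of how one might request and parse a machine-readable verdict could look like this (the "is_phishing" and "reason" field names are assumptions made for illustration):

    # Sketch: request a structured verdict and parse it into a Python dict.
    import json
    from openai import OpenAI

    client = OpenAI()

    def ask_for_json_verdict(url: str) -> dict:
        """Ask the model for a JSON-formatted phishing verdict (field names illustrative)."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            temperature=0,
            messages=[
                {
                    "role": "user",
                    "content": (
                        "Decide whether this URL is a phishing attempt. "
                        'Reply only with JSON of the form {"is_phishing": true|false, "reason": "..."}.\n'
                        f"URL: {url}"
                    ),
                }
            ],
        )
        # In practice the reply may include extra text around the JSON,
        # so real code would need more defensive parsing than this.
        return json.loads(response.choices[0].message.content)

A structured reply like this is what makes it possible to score thousands of URLs automatically rather than reading free-form answers by hand.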
Limitations and Future Work
As with any emerging technology, large language models have limitations that need to be considered. OpenAI has specifically stated that GPT-4 has significant limitations for cybersecurity operations due to its "hallucination" tendency and limited context window, and it is natural to assume that these limitations also apply to ChatGPT.
The team also encountered several limitations of their own during the experiment, such as the inability to adjust a threshold to trade false positives for detection rate. The prompt they used was probably too specific and cued the language model to view every link with suspicion; instead of asking whether a link was phishing, they could have asked whether it was safe to visit. One possible workaround for the missing threshold is sketched below.
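One way to approximate a tunable threshold, offered here only as a hedged idea rather than something the team tested, is to ask the model for a numeric suspicion score and apply a cutoff to it (the 0-100 scale, the cutoff value, and the example URL are all assumptions):

    # Sketch: turn the model's answer into a score so a cutoff can be applied.
    from openai import OpenAI

    client = OpenAI()

    def suspicion_score(url: str) -> int:
        """Ask the model to rate a URL from 0 (certainly safe) to 100 (certainly phishing)."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            temperature=0,
            messages=[
                {
                    "role": "user",
                    "content": (
                        "On a scale from 0 (certainly safe) to 100 (certainly phishing), "
                        f"how suspicious is this URL? Reply with a single integer.\nURL: {url}"
                    ),
                }
            ],
        )
        # Real code should parse defensively; the model may not return a bare integer.
        return int(response.choices[0].message.content.strip())

    # Raising the cutoff trades detection rate for fewer false positives.
    THRESHOLD = 80
    is_flagged = suspicion_score("http://example.com/secure-update") >= THRESHOLD

Whether such self-reported scores are calibrated enough to act as a real decision threshold is exactly the kind of question further experiments would need to answer.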
In conclusion, while ChatGPT shows promise in detecting malicious links, it cannot replace current detection systems due to its high false positive rate. Nonetheless, further experiments are needed to explore the full potential of large language models in cybersecurity.