The National Cyber Security Centre (NCSC) in the UK has issued a warning about the potential cybersecurity risks associated with chatbots. According to the agency, hackers can manipulate chatbots through a technique called “prompt injection” to cause real-world consequences. Prompt injection involves creating an input or prompt that causes the language model behind the chatbot to behave in unintended ways. As chatbots rely on artificial intelligence to provide responses to user queries, these manipulations can lead to unintended actions and potentially compromise data security.
Chatbots are designed to mimic human-like conversations and are commonly used in online banking and shopping platforms to handle simple requests. Large language models (LLMs) like OpenAI’s ChatGPT and Google’s AI chatbot Bard are trained using vast amounts of data to generate human-like responses to user prompts. However, since chatbots pass data to third-party applications and services, the NCSC warns that the risks of malicious prompt injection attacks will continue to grow.
If a user inputs a statement or question that the language model is not familiar with, or if they find a combination of words to override the model’s original script or prompts, the chatbot can perform unintended actions. For example, this could result in the generation of offensive content or the disclosure of confidential information. The NCSC highlights the case of a Stanford University student who was able to reveal the entire prompt of Microsoft’s Bing Chat by using a prompt injection. This prompt injection allowed the student to access the list of statements that determine how the chatbot interacts with users, which is typically hidden from users.
Another security researcher discovered that Google’s ChatGPT could respond to new prompts through a third party that was not initially requested. By running a prompt injection through YouTube transcripts, the researcher found that ChatGPT could access these transcripts, potentially leading to more indirect prompt injection vulnerabilities. The NCSC emphasizes that prompt injection attacks can have real-world consequences if systems are not designed with security in mind.
The agency suggests that designing the entire system with security in mind is crucial to mitigating these risks. By being aware of the vulnerabilities associated with the machine learning component of chatbots, it is possible to prevent exploitation and catastrophic failures. One approach is to apply a rules-based system on top of the machine learning model to prevent it from taking damaging actions, even when prompted to do so. The NCSC also emphasizes the importance of understanding attack techniques that exploit inherent vulnerabilities in machine learning algorithms to enhance system security.
In conclusion, the NCSC’s warning about the cybersecurity risks posed by chatbots highlights the potential for malicious prompt injection attacks. As chatbots become increasingly integrated with third-party applications and services, the risks of unintended actions and data compromises grow. Designing the entire system with security in mind and understanding the vulnerabilities of machine learning algorithms are crucial steps in mitigating these risks.