Mark Breitenbach,

Adrian Wood,

Win Suen

and

Po-Ning Tseng

Don’t you (forget NLP): prompt injection using repeated sequences in ChatGPT (pdf, video)

In April 2023, we observed unusual behavior with OpenAI’s GPT-3.5 and GPT-4 models where control characters (such as backspace and carriage returns) are interpreted as tokens. If user input is incorporated into an existing prompt with instructions, the behavior we discovered provides user-controlled input the ability circumvent system instructions designed to constrain the question and information context. In extreme cases, the models will also hallucinate or respond with an answer to a completely different question. Given the peculiar responses returned, it suggested the possibility that our input thwarted server-side model controls or highlighted edge cases not addressed during model training. Because of the closed-box nature of the vendor API solution, however, we could not confirm intended server-side behavior. The prompt injection susceptibility is also not well documented by OpenAI and appears to be a novel technique for prompt injection.