OpenAI is working on a way to identify text produced by its AI language models, such as GPT. The proposed safeguard resembles a watermark and is intended to help educators detect students who use text-generating AI to complete academic assignments.
The technique exploits the small element of randomness in the AI’s token-selection process, replacing it with a cryptographic function that embeds a signature invisible to human readers. OpenAI has not disclosed exactly how the watermark tool works, but the company plans to publish its research in the near future, and acknowledges that workarounds may exist.
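To make the idea concrete, here is a minimal sketch of how such a scheme could work, based on publicly described approaches of this kind. It is not OpenAI's actual implementation: the key, the context window, and the scoring rule are all illustrative assumptions. A keyed pseudorandom function (HMAC here) assigns each candidate token a hidden score, the sampler picks the token maximizing `r ** (1/p)` (which still samples from the model's distribution on average), and a detector holding the same key can check whether a passage's tokens are suspiciously well-aligned with the hidden scores.

```python
import hmac
import hashlib
import math

SECRET_KEY = b"demo-key"  # hypothetical key; in practice it would stay private

def prf(key, context, token):
    """Keyed pseudorandom value in (0, 1) for a (context, token) pair."""
    msg = repr((context, token)).encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / (2**64 + 2)

def watermarked_choice(probs, context, key=SECRET_KEY):
    """Pick the token maximizing r_i ** (1 / p_i).

    Averaged over keys this samples from `probs`, but for a fixed key the
    choice is deterministic -- that hidden correlation is the watermark.
    """
    return max(probs, key=lambda tok: prf(key, context, tok) ** (1.0 / probs[tok]))

def score(tokens, key=SECRET_KEY, window=4):
    """Detection score: average of -ln(1 - r) over the chosen tokens.

    Unwatermarked text averages about 1.0; watermarked text scores
    noticeably higher because its tokens tend to have large r values.
    """
    total = 0.0
    for i, tok in enumerate(tokens):
        context = tuple(tokens[max(0, i - window):i])
        total += -math.log(1.0 - prf(key, context, tok))
    return total / max(len(tokens), 1)
```

A detector would flag a passage whose score is far above the baseline expected for ordinary text, which is why the signal can be read from a relatively short segment.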
According to a post by Scott Aaronson, OpenAI has developed a working prototype that can detect the watermark signature in a short segment of text. The eventual goal is a public website where users can check whether a passage was generated by OpenAI’s GPT model.
The watermark signature is designed to be an unnoticeable signal within GPT’s word choices, making it harder to pass off GPT output as human-written text. However, Aaronson noted that students and other users could still find ways to circumvent the system.
Mark Ryan from the University of Birmingham cautioned that plagiarism detection tools are only approximate and should be used with care.
Bill Buchanan from Edinburgh Napier University pointed out that inserting detectable signatures into text is more challenging than doing so with images.