Users are able to trick ChatGPT into writing code for malicious software applications by entering a prompt that makes the artificial intelligence chatbot respond as if it were in developer mode, Japanese cybersecurity experts said Thursday.

The discovery has highlighted the ease with which safeguards put in place by developers to prevent criminal and unethical use of the tool can be circumvented.

Amid growing concerns that AI chatbots will lead to more crime and social fragmentation, calls are mounting for discussions on appropriate regulations at the Group of Seven summit in Hiroshima next month and other international forums.

G-7 digital ministers also plan to call for accelerated research and increased governance of generative AI systems as part of their two-day meeting in Takasaki, Gunma Prefecture, at the end of this month.

Photo taken April 20, 2023, shows ChatGPT obeying instructions to create malware after it was prompted to act in developer mode. (Kyodo)

Meanwhile, Yokosuka in Kanagawa Prefecture, south of Tokyo, on Thursday started trial use of ChatGPT across all of its offices in a first among local governments in Japan.

While ChatGPT is trained to decline unethical requests, such as instructions for writing a virus or making a bomb, these restrictions can be evaded by telling it to act in developer mode, according to Takashi Yoshikawa, an analyst at Mitsui Bussan Secure Directions.

When further prompted to write code for ransomware, a type of malware that encrypts data and demands payment in exchange for restoring access, it completed the task in a few minutes, with the application successfully infecting an experimental PC.

"It is a threat (to society) that a virus can be created in a matter of minutes while conversing purely in Japanese. I want AI developers to place importance on measures to prevent misuse," Yoshikawa said.

OpenAI, the U.S. venture that developed ChatGPT, said that while it is impossible to predict all the ways the tool could be abused, it would endeavor to create a safer AI based on feedback from real-world use.

ChatGPT, launched in November 2022 as a prototype, is driven by a machine learning model that works much like the human brain. It was trained on massive amounts of data, enabling it to process and simulate human-like conversations with users.

Cybercriminals have been studying prompts they can use to trick AI for nefarious purposes, with the information actively shared on the dark web.

