Deepfakes remain a pressing concern for law enforcement and cybersecurity experts, and the United Nations has sounded the alarm over their role in spreading hate and misinformation online. A research team at MIT now says it has developed a novel defense against the weaponization of real photos.
During a presentation at the 2023 International Conference on Machine Learning on Tuesday, the researchers explained that small, imperceptible changes to an image can cause meaningful distortions in any AI-generated images derived from it.
The team specifically proposed mitigating the risk of deepfakes created with large diffusion models by adding tiny perturbations, or "attacks," to images that are hard for the eye to see but change how the models process them, causing the models to generate images that don't look real.
“The key idea is to immunize images so as to make them resistant to manipulation by these models,” the researchers said. “This immunization relies on the injection of imperceptible adversarial perturbations designed to disrupt the operation of the targeted diffusion models, forcing them to generate unrealistic images.”
Such an "encoder attack" would, in theory, derail the diffusion model's entire generation process and prevent the creation of realistic fake images.
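The researchers' own code isn't reproduced here, but the general idea, a targeted adversarial perturbation aimed at a diffusion model's image encoder, can be sketched in a few lines. In the snippet below, the `encode` function is a stand-in for the model's latent encoder, and the target latent, perturbation budget, step size, and iteration count are illustrative assumptions rather than the team's actual implementation or settings.

```python
# Minimal sketch of an "encoder attack" in the spirit described above.
# Assumptions (not from the article): PyTorch; `encode` is a differentiable
# stand-in for a latent diffusion model's image encoder; all hyperparameters
# are placeholders chosen for illustration only.
import torch
import torch.nn.functional as F

def immunize(image, encode, target_latent, eps=0.06, step=0.01, iters=100):
    """Search for a perturbation bounded by `eps` (L-infinity) that pushes the
    image's latent representation toward `target_latent`, so that edits built
    on top of that latent come out looking unrealistic."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(iters):
        # Distance between the perturbed image's latent and the decoy target
        loss = F.mse_loss(encode(image + delta), target_latent)
        loss.backward()
        with torch.no_grad():
            delta -= step * delta.grad.sign()   # step toward the decoy latent
            delta.clamp_(-eps, eps)             # keep the change imperceptible
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()  # the "immunized" photo
```

Because the perturbation stays within a small bound, the immunized photo looks unchanged to a person, but the encoder maps it to a latent the model cannot turn into a convincing edit.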
The MIT researchers acknowledge, however, that these methods would require AI platform developers to implement them and could not be left to individual users.
“The abundance of readily available data on the Internet has played a significant role in recent breakthroughs in deep learning, but has also raised concerns about the potential misuse of such data when training models,” the researchers said.
More conventional image protections like watermarking have also been proposed as a way to make deepfakes more detectable. Photo libraries like Getty, Shutterstock, and Canva use watermarks to prevent the use of unpaid, unlicensed content.
Leading generative AI firms OpenAI, Google, and Microsoft recently floated the possibility of a coordinated watermarking initiative to make it easier to identify AI-generated images.
Echoing the AI firms, the MIT researchers also proposed using watermarking but acknowledged that neither deepfake detection software nor watermarks can prevent images from being manipulated in the first place.
“While some deepfake detection methods are more effective than others, no single method is foolproof,” the researchers said.
The team also acknowledged that image and text generators will continue to advance, and that preventative measures will need to keep improving or risk eventually being easily circumvented.
MIT has not yet responded to Decrypt's request for comment.