OpenAI, a pioneer in the field of generative AI, is stepping up to the challenge of detecting deepfake imagery amid the rising prevalence of misleading content on social media. At the Wall Street Journal's recent Tech Live conference in Laguna Beach, California, the company's chief technology officer, Mira Murati, unveiled a new deepfake detector.
Murati said OpenAI's new tool boasts "99% reliability" in determining if a picture was produced using AI.
AI-generated images can include everything from light-hearted creations like Pope Francis sporting a puffy Balenciaga coat to deceptive images that can cause financial havoc. The potential and pitfalls of AI are evident. As these tools become more sophisticated, distinguishing between what's real and what's AI-generated is proving to be a challenge.
While the tool's release date remains under wraps, its announcement has stirred significant interest, especially in light of OpenAI's past endeavors.
In January 2023, the company unveiled a text classifier that purportedly distinguished human writing from machine-generated text produced by models like ChatGPT. But by July, OpenAI had quietly shut down the tool, posting an update citing an unacceptably high error rate: the classifier incorrectly labeled genuine human writing as AI-generated 9% of the time.
If Murati's claim is true, this would be a significant moment for the industry, as current methods of detecting AI-generated images are not typically automated. Usually, enthusiasts rely on gut feeling and focus on well-known challenges that stymie generative AI, like depicting hands, teeth, and patterns. The difference between AI-generated images and AI-edited images remains blurry, especially if one tries to use AI to detect AI.
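Murati's figure also deserves context. As a back-of-the-envelope illustration, assume "99% reliability" means 99% sensitivity and 99% specificity (an interpretation OpenAI has not confirmed); even then, the share of flagged images that are genuinely AI-made depends heavily on how common AI imagery is in the feed being scanned:

```python
# Back-of-the-envelope Bayes check. Illustrative assumptions only:
# "99% reliability" is read as 99% sensitivity and 99% specificity,
# and the base rate of AI-generated images in the feed is varied.
def precision(base_rate, sensitivity=0.99, specificity=0.99):
    """Probability that an image flagged as AI-generated really is."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

for rate in (0.50, 0.10, 0.01):
    print(f"AI share {rate:>4.0%} -> flags that are correct: {precision(rate):.1%}")
# AI share  50% -> flags that are correct: 99.0%
# AI share  10% -> flags that are correct: 91.7%
# AI share   1% -> flags that are correct: 50.0%
```

At a 1% base rate, half of all flags would be false alarms, the same base-rate trap that contributed to the demise of OpenAI's text classifier.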
OpenAI is not only working on detecting harmful AI images but is also setting guardrails to censor its own model, even beyond what is publicly stated in its content guidelines.
As Decrypt found, OpenAI's DALL-E tool seems to be configured to modify prompts without notice, to quietly throw errors when asked to generate specific outputs even if they comply with published guidelines, and to avoid creating sensitive content involving specific names, artist styles, and ethnicities.
[Image: part of what could be DALL-E 3's prompt in ChatGPT. Source: Decrypt]
Detecting deepfakes isn't solely OpenAI's endeavor. One company developing the capability is DeepMedia, which works specifically with government customers.
Big names like Microsoft and Adobe are also rolling up their sleeves. They've introduced what's been dubbed an "AI watermarking" system. This mechanism, driven by the Coalition for Content Provenance and Authenticity (C2PA), incorporates a distinct "cr" symbol inside a speech bubble, signaling AI-generated content. The symbol is intended to act as a beacon of transparency, allowing users to discern the origin of the content.
As with any technology, however, it's not foolproof: the metadata carrying the symbol can simply be stripped away. As an antidote, Adobe has also come up with a cloud service capable of recovering the lost metadata, thereby ensuring the symbol's presence. It, too, is not hard to circumvent.
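How easy is that loophole to exploit? Here is a minimal sketch, assuming Pillow is installed and that labeled.jpg is a hypothetical image carrying provenance metadata. Pillow neither parses nor preserves the APP11 JUMBF segments where C2PA manifests live, and a plain re-save carries no metadata forward unless explicitly told to, so a single round trip through an ordinary image pipeline discards the mark:

```python
# Minimal sketch of the metadata-stripping loophole, assuming Pillow
# ("pip install Pillow"). "labeled.jpg" is a hypothetical image carrying
# provenance metadata (EXIF/XMP and, for C2PA, an APP11 JUMBF segment).
from PIL import Image

with Image.open("labeled.jpg") as im:
    # im.info exposes the metadata Pillow understands; C2PA manifests sit
    # in segments Pillow doesn't parse at all.
    print("metadata keys before:", sorted(im.info))  # e.g. exif, icc_profile, xmp
    # A plain re-encode: save() copies nothing from im.info unless asked,
    # so EXIF, XMP, ICC, and any embedded manifest are silently dropped.
    im.save("stripped.jpg", quality=95)

with Image.open("stripped.jpg") as im:
    print("metadata keys after:", sorted(im.info))  # provenance marks are gone
```

That fragility is presumably why Adobe's recovery service matches the image content itself against cloud-stored records rather than trusting anything embedded in the file.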
With regulators inching towards criminalizing deepfakes, these innovations are not just technological feats but societal necessities. The recent moves by OpenAI, Microsoft, and Adobe underscore a collective endeavor to ensure authenticity in the digital age. But even as these tools improve, their effectiveness hinges on widespread adoption, involving not just tech giants but also content creators, social media platforms, and end users.
With generative AI evolving rapidly, detectors continue to struggle to reliably distinguish authentic from synthetic text, images, and audio. For now, human judgment and vigilance are our best line of defense against AI misuse. Humans, however, are not infallible. Lasting solutions will require tech leaders, lawmakers, and the public to work together in navigating this complex new frontier.
Edited by Ryan Ozawa.