After seemingly lurking on the sidelines most of last year, Apple is starting to shake things up in the field of artificial intelligence—and open-source AI in particular.
The Cupertino-based tech giant has partnered with the University of California, Santa Barbara to develop an AI model that can edit images based on natural language instructions, the same way people interact with ChatGPT. Apple calls it Multimodal Large Language Model-Guided Image Editing (MGIE).
MGIE interprets text instructions provided by users, processing and refining them to generate precise image editing commands. Integrating a diffusion model enhances the process, enabling MGIE to apply edits based on the characteristics of the original image.
Multimodal Large Language Models (MLLMs), which can process both text and images, form the foundation of the MGIE method. Unlike traditional single-modality AIs that focus solely on text or images, MLLMs can process complex instructions and work in a wider range of situations. For example, a model may parse a text instruction, analyze the elements of a specific photo, and then produce a new picture with an unwanted element removed.
To perform these actions, an AI system must combine several capabilities in a single process: text generation, image generation, segmentation, and CLIP-style image-text analysis.
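The flow described above can be sketched as two chained stages. To be clear, every function here is an illustrative stand-in written for this article, not Apple's actual API: a real system would put an MLLM behind `refine_instruction` and a diffusion model behind `edit_image`.

```python
# Toy sketch of an MGIE-style pipeline. All functions are illustrative
# placeholders, not code from Apple's MGIE repository.

def refine_instruction(user_text: str) -> str:
    """Stand-in for the MLLM step: expand a terse user request
    into an explicit, concrete editing instruction."""
    refinements = {
        "make the sky dramatic": "replace the sky with dark storm clouds",
    }
    return refinements.get(user_text, user_text)

def edit_image(image: dict, instruction: str) -> dict:
    """Stand-in for the diffusion step: apply the refined instruction
    while preserving the original image's other characteristics."""
    edited = dict(image)  # keep original traits untouched
    edited["applied_edits"] = image.get("applied_edits", []) + [instruction]
    return edited

photo = {"subject": "landscape", "applied_edits": []}
result = edit_image(photo, refine_instruction("make the sky dramatic"))
print(result["applied_edits"])  # ['replace the sky with dark storm clouds']
```

The point of the two-stage design is that the user's terse request is first made explicit by the language model, and only the refined instruction reaches the image model.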
The introduction of MGIE brings Apple closer to achieving capabilities akin to OpenAI's ChatGPT Plus, enabling users to engage in conversational interactions with AI models to create customized images based on text input. With MGIE, users can provide detailed instructions in natural language—"remove the traffic cone from the foreground"—which is translated into image editing commands and executed.
In other words, users can start with a photo of a blonde person and turn them into a ginger just by saying, "make this person a redhead." Under the hood, the model would understand the instruction, segment the person's hair, generate a command like "red hair, highly detailed, photorealistic, ginger tone," and then execute the changes via inpainting.
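Conceptually, that redhead edit traces through three steps: segment, expand the prompt, inpaint. The sketch below walks those steps with hypothetical placeholder functions; none of the names come from MGIE itself.

```python
# Conceptual trace of the "make this person a redhead" edit described
# above. Every function is a hypothetical stand-in, not MGIE code.

def segment(image: dict, region: str) -> dict:
    """Stand-in for segmentation: isolate the region to be edited."""
    return {"image": image, "mask": region}

def expand_prompt(instruction: str) -> str:
    """Stand-in for the MLLM turning a terse instruction
    into a detailed diffusion prompt."""
    return "red hair, highly detailed, photorealistic, ginger tone"

def inpaint(masked: dict, prompt: str) -> dict:
    """Stand-in for diffusion inpainting: regenerate only the
    masked region according to the prompt."""
    edited = dict(masked["image"])
    edited[masked["mask"]] = prompt
    return edited

portrait = {"hair": "blonde"}
edited = inpaint(segment(portrait, "hair"),
                 expand_prompt("make this person a redhead"))
print(edited["hair"])  # red hair, highly detailed, photorealistic, ginger tone
```

Because only the masked region is regenerated, the rest of the portrait is left untouched, which is what lets the edit stay faithful to the original photo.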
Apple's approach aligns with existing tools like Stable Diffusion, which can be augmented with a rudimentary interface for text-guided image editing. Leveraging third-party tools like InstructPix2Pix, users can interact with the Stable Diffusion interface using natural language commands and see the effects on edited images in real time.
Apple's researchers report, however, that MGIE is more accurate than comparable instruction-based editing methods.
Besides generative edits, Apple's MGIE can perform conventional image editing tasks like color grading, resizing, rotation, style changes, and sketching.
Why would Apple make it open source?
Apple's open-source forays are a clear strategic move—with a scope beyond mere licensing requirements.
To build MGIE, Apple uses open-source models such as LLaVA and Vicuna. Due to the licensing requirements of these models, which limit commercial use by big corporate entities, Apple was likely compelled to share its improvements openly on GitHub.
But open-sourcing also allows Apple to leverage a worldwide pool of developers to harden and extend the model. This kind of collaboration moves things forward far faster than Apple working entirely on its own from scratch. In addition, this openness invites a wider spectrum of ideas and draws diverse technical talent, allowing MGIE to evolve faster.
Apple's engagement with the open-source community through projects like MGIE also gives the brand a boost among developers and tech enthusiasts. This strategy is no secret: Meta and Microsoft are both heavily investing in open-source AI.
It's possible that releasing MGIE as open-source software will give Apple a head start in setting still-evolving industry standards for AI and AI-based image editing in particular. With MGIE, Apple has likely given AI artists and developers a solid foundation with which to build the next big thing, providing more accuracy and efficiency than what's available elsewhere.
MGIE will certainly make Apple's products better: it wouldn't be too difficult to synthesize a voice command sent to Siri and use that text to edit a photo on the user's smartphone, computer, or immersive headset.
Technically savvy AI developers can use MGIE right now. Just visit the project’s GitHub repository.
Edited by Ryan Ozawa.