How to Run the Powerful Flux AI Image Generator on Less Powerful PCs

Flux, the open-source image generator launched last week, required huge amounts of computational power to run. The community has pared it down so you can run it, too. Here's how.

Aug 12, 2024

When Flux burst onto the scene a few days ago, it quickly earned a reputation as the crown jewel of open-source image generators. It matched Midjourney's aesthetic prowess while absolutely crushing it in prompt understanding and text generation. The catch? You needed a beefy GPU with 24GB of VRAM or more just to get it running. That's more horsepower than most gaming rigs offer, let alone your average work laptop.

But the AI community, never one to back down from a challenge, rolled up its collective sleeves and got to work. Through the magic of quantization—a fancy term for compressing the model's data—they've managed to shrink Flux down to a more manageable size without sacrificing too much of its artistic mojo.

Let's break it down: The original Flux model used full 32-bit precision (FP32), which is like driving a Formula 1 car to the grocery store—overkill for most. The first round of optimizations brought us FP16 and FP8 versions, each trading a smidge of accuracy for a big boost in efficiency. The FP8 version was already a game-changer, letting folks with 6GB GPUs (think RTX 2060) join the party.
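To get a feel for why dropping precision matters so much, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at each precision. The 12-billion parameter count is an assumption brought in for illustration (it is not stated in this article), and real VRAM use is higher because text encoders, the VAE, and activations also need room.

```python
# Rough weight-memory estimate at the precisions mentioned above.
# ASSUMPTION: a 12-billion-parameter model; this figure is not from the article.
PARAM_COUNT = 12_000_000_000

BITS_PER_WEIGHT = {"FP32": 32, "FP16": 16, "FP8": 8}


def weight_size_gib(param_count: int, bits: int) -> float:
    """Return the size of the weights alone, in GiB (no activations, no extras)."""
    return param_count * bits / 8 / 1024**3


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {weight_size_gib(PARAM_COUNT, bits):.1f} GiB")
```

Each halving of the bit width halves the weight footprint, which is the whole trick behind squeezing a model like this onto smaller cards.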

To do this, you need to disable System Memory Fallback for Stable Diffusion, so your GPU can offload some of its work from its internal VRAM to your system RAM. This avoids the infamous OOM (out-of-memory) error, albeit at the cost of running considerably slower. To disable this option, follow this tutorial by Nvidia.

But hold onto your hats, because it gets even better.

The real MVPs of the AI world have pushed the envelope further, releasing 4-bit quantized models. These bad boys use something called "Normal Point" (NP) quantization, which delivers a sweet spot of quality and speed that'll make your potato PC feel like it just got a turbo boost. NP quantization does not degrade quality as much as FP quantization, so in general terms, running this model gives great results at high speeds while requiring few resources.

It's almost too good to be true, but it is true.
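If you're wondering what "4-bit" actually means in practice, the toy sketch below may help. To be clear, this is not the scheme the Flux checkpoints use: it is a plain block-wise absmax integer quantizer, swapped in purely to illustrate the general idea of storing a tiny code per weight plus one scale per block. The block size and function names are hypothetical.

```python
# Toy block-wise 4-bit quantization, for illustration only.
# NOT the quantization used by the Flux models discussed in this article.

BLOCK_SIZE = 4  # hypothetical block size, kept small for readability


def quantize_block(values):
    """Map floats to integer codes in [-7, 7] plus one float scale for the block."""
    scale = max(abs(v) for v in values) / 7 or 1.0  # fall back to 1.0 for all-zero blocks
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize_block(codes, scale):
    """Recover approximate floats from the integer codes and the block scale."""
    return [c * scale for c in codes]


def quantize(weights):
    """Split the weights into blocks and quantize each block independently."""
    blocks = [weights[i:i + BLOCK_SIZE] for i in range(0, len(weights), BLOCK_SIZE)]
    return [quantize_block(block) for block in blocks]


def dequantize(packed):
    """Rebuild the full (approximate) weight list from the quantized blocks."""
    restored = []
    for codes, scale in packed:
        restored.extend(dequantize_block(codes, scale))
    return restored
```

Each weight ends up as one of 15 possible values (which fits in 4 bits), and the rounding error per weight is at most half of that block's scale. Smarter schemes pick their code values more carefully to lose less quality at the same size.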

Results obtained with Flux Dev in fp8 and np4. Same prompt, different seeds.

How to run Flux on lower-end GPUs

So, how do you actually run this streamlined version of Flux? First, you'll need to grab an interface like SwarmUI, ComfyUI, or Forge. We love ComfyUI for its versatility, but in our tests, Forge gave around a 10-20% speed boost over the others, so that's what we're rolling with here.

Head over to the Forge GitHub repository (https://github.com/lllyasviel/stable-diffusion-webui-forge) and download the one-click installation package. It's open-source and vetted by the community, so no sketchy business here.

For the NP4 Flux models themselves, Civitai is your one-stop shop. You've got two flavors to choose from: Schnell (for speed) and Dev (for quality). Both can be downloaded from this page.

Once you've got everything downloaded, it's installation time:

Unzip the Forge file and open the Forge folder.
Run update.bat to get all the dependencies.
Fire up run.bat to complete the setup.

Now, drop those shiny new Flux models into the \webui\models\Stable-diffusion folder within your Forge installation. Refresh the Forge web interface (or restart if you're feeling old school), and boom—you're in business.
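If you'd rather script the file shuffling than drag and drop, something like the sketch below could do it. The helper name is hypothetical, and it assumes the downloaded models are .safetensors files sitting in a single folder; adjust both to match what you actually downloaded.

```python
from pathlib import Path
import shutil


def install_models(download_dir: Path, forge_dir: Path) -> list[Path]:
    """Move model files from download_dir into Forge's Stable-diffusion folder.

    ASSUMPTION: models use the .safetensors extension. Existing files in the
    target folder are never overwritten.
    """
    target = forge_dir / "webui" / "models" / "Stable-diffusion"
    target.mkdir(parents=True, exist_ok=True)

    moved = []
    for src in sorted(download_dir.glob("*.safetensors")):
        dest = target / src.name
        if dest.exists():
            continue  # skip anything that is already installed
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

A call might look like `install_models(Path.home() / "Downloads", Path("C:/forge"))`, where both paths are placeholders for wherever your downloads and your unzipped Forge folder live.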

Pro tip: To really squeeze every last drop of performance out of your resurrected rig, dial back the resolution. Instead of going for full SDXL (1024x1024) resolutions, try the more modest SD1.5 sizes (768x768, 512x768, and similar). You can always upscale later and use Adetailer for those crispy details.
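To see why smaller canvases help, compare the raw pixel counts. The quick sketch below does only that arithmetic; how generation time scales with pixel count depends on the model and your hardware, so treat the ratios as a rough guide rather than a speed prediction.

```python
# Pixel-count comparison between the resolutions mentioned above.
SDXL_SIZE = (1024, 1024)
SD15_SIZES = [(768, 768), (512, 768)]


def pixel_ratio(size, reference=SDXL_SIZE):
    """Return the fraction of the reference resolution's pixels that `size` contains."""
    width, height = size
    ref_width, ref_height = reference
    return (width * height) / (ref_width * ref_height)


if __name__ == "__main__":
    for size in SD15_SIZES:
        print(f"{size[0]}x{size[1]}: {pixel_ratio(size):.1%} of the pixels in 1024x1024")
```

A 768x768 image has a bit over half the pixels of a 1024x1024 one, and 512x768 has well under half, so your GPU has far less to chew on per step.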

Let's talk numbers: On a humble RTX 2060 with 6GB of VRAM, Flux Schnell in NP4 mode can churn out a 512x768 image in about 30 seconds, versus the 107 seconds required by the FP8 version. Upscaling that image to 1536x1024 with a high-res fix takes about five minutes.
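Put those two timings side by side and the gap is easier to appreciate. The snippet below is just arithmetic on the figures quoted above from our RTX 2060 test; the helper names are made up for the example.

```python
# Speed comparison using the timings quoted above (seconds per 512x768 image).
NP4_SECONDS = 30
FP8_SECONDS = 107


def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new timing is than the baseline."""
    return baseline_seconds / new_seconds


def images_per_hour(seconds_per_image: float) -> float:
    """Return how many images fit in one hour at a constant pace."""
    return 3600 / seconds_per_image


if __name__ == "__main__":
    print(f"NP4 speedup over FP8: {speedup(FP8_SECONDS, NP4_SECONDS):.2f}x")
    print(f"NP4: {images_per_hour(NP4_SECONDS):.0f} images/hour")
    print(f"FP8: {images_per_hour(FP8_SECONDS):.0f} images/hour")
```

That works out to roughly three and a half times faster, or about 120 images an hour instead of about 34.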

Want to go big without breaking your GPU? A better option is to start with Flux Schnell at SD1.5 resolutions, then send that creation through img2img. Upscale using a standard Stable Diffusion model (SD1.5 or SDXL) with low denoise strength. The whole process clocks in around 50 seconds, rivaling Midjourney's output on a sluggish day. You'll get impressive large-scale results without melting your graphics card.

The real kicker? Some mad lads have reportedly got Flux Schnell NP4 running on a GTX 1060 with 3GB of VRAM, with Flux Dev taking 7.90s per iteration. We're talking about a GPU that's wheezing on life support, and it's out here generating cutting-edge AI art. Not too shabby for hardware that's practically eligible for a pension.
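A per-iteration figure only tells you how long an image takes once you know the step count. The sketch below multiplies it out; the 20-step setting is purely a hypothetical value for illustration and was not part of the reported result.

```python
# Turn a per-iteration timing into a per-image estimate.
REPORTED_SECONDS_PER_ITERATION = 7.90  # figure quoted above
HYPOTHETICAL_STEPS = 20  # ASSUMPTION: illustrative step count, not from the report


def seconds_per_image(seconds_per_iteration: float, steps: int) -> float:
    """Estimate sampling time for one image, ignoring model load and decode time."""
    return seconds_per_iteration * steps


if __name__ == "__main__":
    total = seconds_per_image(REPORTED_SECONDS_PER_ITERATION, HYPOTHETICAL_STEPS)
    print(f"{HYPOTHETICAL_STEPS} steps at 7.90 s/it: about {total:.0f} seconds")
```

At that hypothetical setting you'd be waiting a little over two and a half minutes per image, which is slow but very much alive.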
