Three images created with Stable Diffusion. Did you know that Abraham Lincoln was a cowboy? Stable Diffusion does.

Artificial intelligence is rapidly transforming image generation. Stable Diffusion is an open source image synthesis model that anyone with a PC and a decent graphics card can run. Give it a descriptive phrase, and the resulting images appear on your screen like magic.

Some artists are delighted by the prospect, others aren't happy about it, and society at large still seems largely unaware of the rapidly evolving tech revolution taking place through communities on social media. Image synthesis arguably carries implications as big as the invention of the camera, or perhaps the creation of visual art itself. Even our sense of history may be at stake. Either way, Stable Diffusion leads a new wave of deep learning creative tools that are poised to change the way visual media is created.

The rise of deep learning image synthesis

Stable Diffusion is the brainchild of Emad Mostaque, a London-based former hedge fund manager who aims to bring novel applications of deep learning to the masses. But Stable Diffusion was not the first image synthesis model to make waves this year.

DALL-E 2 shocked social media with its ability to transform a scene written in words into a variety of visual styles. Astronauts on horseback, teddy bears buying bread in ancient Egypt, and novel sculptures in the style of famous artists were created by people with privileged access to the closed-off tool.

A screenshot of the OpenAI DALL-E 2 website.

After DALL-E 2, Google and Meta announced their own text-to-image models. Midjourney, which opened to the public a few months later, charges for access but produces images with a more illustrative and artistic quality.

Stable Diffusion came next. The open source image generation model, released by Stability AI, produces results similar to DALL-E 2. While Stability AI sells compute time for generating images through its DreamStudio website, the model itself is open source, and anyone can run it.

Dozens of projects that take Stable Diffusion in new directions have sprung up in the past week. Using a technique called "img2img," which guides generation with an existing input image, people have achieved unexpected results: "upgrading" MS-DOS game art, converting a scene from Aladdin into 3D, and more. Much as Adobe Photoshop did in the 1990s, image synthesis may bring the ability to richly visualize ideas to a mass audience.
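To give a concrete sense of what img2img involves, here is a minimal sketch. It assumes the Hugging Face diffusers library, which is just one of several ways to run Stable Diffusion and not necessarily what the projects above used; the model ID, file names, prompt, and strength value are illustrative placeholders.

```python
# Hedged sketch of Stable Diffusion's img2img mode via the Hugging Face diffusers library.
# Assumes diffusers, transformers, and torch are installed and a CUDA GPU is available.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder model ID
    torch_dtype=torch.float16,
).to("cuda")

# Start from an existing picture, such as a low-resolution game portrait.
init_image = Image.open("input.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="detailed painted portrait, dramatic lighting",  # illustrative prompt
    image=init_image,      # the source image that guides generation
    strength=0.6,          # 0 keeps the original, 1 ignores it entirely
    guidance_scale=7.5,    # how strongly to follow the text prompt
).images[0]

result.save("output.png")
```

Lower strength values preserve more of the source image's composition, which is the basic idea behind the "upgraded" retro game art.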

Portraits from Duke Nukem, The Secret of Monkey Island, King's Quest VI, and Star Control II received Stable Diffusion-powered fan upgrades.

If you follow a few steps, you can run Stable Diffusion yourself. For the past two weeks, we've been running it on a Windows PC with an Intel Core i9-9900K processor, and it can create a 512×512 image in about 10 seconds. On a 3090 Ti, that time drops to around four seconds. The interfaces are also evolving rapidly, going from crude command-line tools to more polished GUI front ends. If you aren't technically inclined, hold onto your hat: easier solutions are on the way. And if all else fails, you can try a demo online.
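For reference, a plain text-to-image run can be as short as the sketch below. This again assumes the Hugging Face diffusers library rather than the original command-line scripts; the model ID and prompt are placeholders, and settings like step count and guidance scale are common defaults rather than anything prescribed here.

```python
# Hedged sketch of a basic Stable Diffusion text-to-image run via diffusers.
# Assumes diffusers, transformers, and torch are installed and a CUDA GPU
# with enough VRAM is available.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder model ID
    torch_dtype=torch.float16,         # half precision to fit consumer GPUs
).to("cuda")

# The descriptive phrase (the "prompt") drives the result; 512x512 is the
# model's native resolution.
image = pipe(
    "Abraham Lincoln as a cowboy, detailed portrait",  # illustrative prompt
    height=512,
    width=512,
    num_inference_steps=50,  # more steps trade speed for cleaner output
    guidance_scale=7.5,      # how strongly to follow the prompt
).images[0]

image.save("lincoln_cowboy.png")
```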