Platformer is an independent newsletter from Casey Newton that follows the intersection of Silicon Valley and democracy. Subscribe here.
Artificial intelligence was used to generate all the images in this story.
Every so often, a technology comes along that splits the world into before and after. I remember the first time I saw a video playing on a web page; the first time I summoned a file to my phone from the cloud; the first time I held up my phone at a concert to identify the song playing; the first time I streamed myself live from my phone. What makes these moments stand out is the sense that some unpredictable new set of possibilities had been unlocked. What would the web look like now that you could add video clips to it? What could you build now that you could summon your files from the cloud? What would happen now that you could broadcast yourself to the world?
It has been a while since I encountered the kind of technology that made me want to call my friends and say, "You have to see this." This week, I added another one to the list. I still have no idea how DALL-E will ultimately be used, but it is one of the most compelling new products I have seen in a long time.
The technology is called DALL-E 2, and it comes from OpenAI, a seven-year-old San Francisco company whose stated mission is to create safe and useful artificial general intelligence. OpenAI's earlier projects include GPT-3, a powerful tool for generating sophisticated text passages from simple prompts, and the technology behind Copilot, which helps software engineers automate the writing of code.
DALL-E builds on those projects, generating images from text prompts. The first version of the tool was limited to 512-by-512-pixel squares.
The second version feels like a dramatic leap forward from the first. It introduces new techniques such as "inpainting," which lets you replace one element of an image with another: take a photo of an orange in a bowl and swap it out for an apple, say. DALL-E has also gotten better at understanding the relationships between objects, which helps it depict ever more fantastical scenes.
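The alpha I'm using is entirely point-and-click, but for the curious, here is a rough sketch of what inpainting looks like in code through the images.edit call in OpenAI's developer library. The model name, file names, prompt, and mask are placeholders I made up for illustration; this is not how the research alpha itself works.

```python
# Minimal inpainting sketch using OpenAI's image-edit endpoint (illustrative only).
# The mask is a PNG whose transparent pixels mark the region to be repainted.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("bowl_of_oranges.png", "rb") as image, open("orange_mask.png", "rb") as mask:
    result = client.images.edit(
        model="dall-e-2",
        image=image,   # the original photo
        mask=mask,     # transparent where the orange should become an apple
        prompt="a bowl of fruit with a shiny red apple in it",
        n=1,
        size="512x512",
    )

print(result.data[0].url)  # the edited image comes back as a hosted URL
```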
DALL-E-generated images have been taking over my timelines. After I mused about what I might do with the technology, a kind person at OpenAI took pity on me and invited me into the private research alpha. A company spokeswoman told me today that OpenAI hopes to add 1,000 people a week.
DALL-E's content policy is designed to prevent most of the obvious potential abuses of the platform. You can't generate images depicting hate, harassment, violence, sex, or nudity, and the company also asks you not to create images related to politics or politicians. (OpenAI co-founder Elon Musk, who is known for his dislike of restrictive content policies on social media, resigned from the company's board in 2018.)
Adding a word like "shooting" to a block list prevents a lot of possible image creation. DALL-E also can't be used to make images intended to deceive. There is no prohibition on trying to make images of public figures, but you can't upload photos of people without their permission, and the technology blurs most faces to make it clear that the images have been manipulated.
DALL-E itself has a simple interface: a search bar that invites you to create whatever you want, content policy permitting. You type in what you want to see, as if you were instructing a computer program. There is also a "surprise me" button that pre-populates the bar with a suggested query based on prompts that have worked well in the past; I used this to get ideas for experimenting with different artistic styles, such as a macro 35mm photograph.
Enter a prompt and, about 15 seconds later, DALL-E generates 10 images. (The number was recently reduced so the tool could accommodate more users.) Again and again, I found myself cursing and laughing at how good the results were.
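For readers who would rather see that flow in code than in a search bar, here is a minimal sketch using the images.generate call in OpenAI's developer library. The model name, prompt, and parameters are illustrative stand-ins, not the internals of the alpha I used.

```python
# Minimal text-to-image sketch against OpenAI's Images API (illustrative only).
# Assumes the `openai` Python package and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="dall-e-2",
    prompt="a macro 35mm photograph of a shiba inu dog dressed as a firefighter",
    n=10,              # the alpha returns 10 images per prompt
    size="1024x1024",
)

for i, image in enumerate(result.data):
    print(i, image.url)  # each result is returned as a hosted URL
```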
A shiba inu dog dressed as a firefighter.
Or a dog dressed as a wizard.
These fake dogs are so cute. I want to write books about them. I want them to join me in the metaverse.
Who else can join us? How about a frog wearing a hat?
He is perfect.
Soon I started taking requests in the Sidechannel Discord. One person asked DALL-E to depict the metaverse at night; what it returned struck me as grand and a bit abstract.
I won't try to explain here how DALL-E makes these images, in part because I'm still working to understand it myself. (Diffusion, one of the core technologies involved, is explained in this post.) But the more I learn, the more struck I am by how inventive the technology is.
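If you just want the rough shape of the diffusion idea, here is a toy sketch of the reverse process: start with pure noise and repeatedly "denoise" toward the data. The denoiser below is a hand-written stand-in for the large, text-conditioned neural network a real system trains, so treat it as a cartoon of the loop, not as how DALL-E actually works.

```python
# Toy sketch of reverse diffusion: noise in, structure out (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
targets = np.array([-2.0, 2.0])  # stand-in "data distribution": two modes

def toy_denoiser(x, noise_level):
    """Pretend denoiser: pull each sample toward its nearest mode."""
    nearest = targets[np.argmin(np.abs(x[:, None] - targets), axis=1)]
    return x + (nearest - x) * (1.0 - noise_level)  # stronger pull as noise shrinks

samples = rng.normal(size=1000)          # step 0: pure Gaussian noise
for step in range(50):                   # reverse process: denoise a little each step
    noise_level = 1.0 - step / 50
    samples = toy_denoiser(samples, noise_level)
    samples += rng.normal(scale=0.05 * noise_level, size=samples.shape)  # keep some randomness

print(np.round(samples[:10], 2))         # most samples end up near -2 or 2
```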
Someone else with DALL-E access shared two results in my Discord. Look at this set of results for a bear economist in front of a stock chart.
And its counterpart: a bull economist in front of a graph of a surging stock market.
Notice how DALL-E captures the fright and exasperation of the bear and the aggressiveness of the bull. It arguably doesn't make sense to describe what we're looking at as "creative," and yet these images have the same effect on me as looking at something truly creative.
DALL-E will also attempt to solve a single problem in many different ways. When I asked it to show me a delicious cinnamon bun with eyes, for example, it had to figure out how to portray those eyes. Sometimes it put a pair of plastic-looking eyes on the bun, as I would have. Sometimes it made the eyes out of frosting. And once it made the eyes out of two smaller cinnamon rolls.
Once again, I cursed out loud and started laughing.
DALL-E is the most advanced image-generation tool I have seen, but it isn't the only one. I have also tried a similar tool called Midjourney, which is at a very early stage of development and has yet to be released to the general public. And there is DALL-E Mini, which despite the name has no relation to DALL-E or OpenAI; I suspect its developer will get hit with a cease-and-desist letter before long.
OpenAI told me it hasn't made any decisions about how DALL-E might become available more generally. The goal of the current research period is to learn how people use the technology and what works best.
The range of use cases people have already discovered for DALL-E is surprising. It is being used to create augmented reality filters. A chef in Miami is using it to find new ideas for his cooking. And DALL-E could be used to cheaply generate environments and objects in the metaverse.
It also makes sense to worry about what this kind of automation might do to professional illustrators; automation like this typically does cost jobs. At the same time, DALL-E could be useful in their work: before starting a piece, they could ask it to sketch out a few concepts. In that spirit, I used the tool to suggest alternate Platformer logos.
I think I'll keep the logo I have. But if I were an artist, I would appreciate having alternate suggestions like these to work from.
It's also worth thinking about what tools like this might open up for people who would never have the money to hire an illustrator. As a kid I wrote my own comic books, but my illustration skills never improved. What if I could have asked DALL-E to draw all my heroes for me?
Today, DALL-E doesn't feel like a tool most people will use day to day. But in the coming months and years, we will find more and more creative applications for technology like this: in e-commerce, in social apps, in the home and at work. If the copyright issues get sorted out, it could become one of the most powerful tools we have for changing culture. I don't know whether using artificial intelligence to create images of protected works counts as fair use. But don't you want to see DALL-E's take on "Batman eating a sandwich"?
I also think we will see some harmful uses of tools like this. I generally trust OpenAI to enforce strong policies against the misuse of DALL-E, but similar tools will surely emerge that take a more hands-off approach to moderation. People are already creating pornographic deepfakes to harass their exes, and the underlying technology is only going to get better.
When a new technology arrives, we tend to focus on its happier and more playful uses, only to overlook how it might be misused down the road. I'm thrilled that I have been able to use DALL-E, but I worry about what similar tools could do in the hands of companies that aren't as strict.
What positive uses of this technology will be found at scale? And what will AI-generated imagery do to our sense of reality, to our confidence in what we see?
Whether DALL-E turns out to be a breakthrough in consumer tech, the start of a creative revolution, or something more worrisome, it is adding 1,000 users a week. It's worth discussing its implications now, before the rest of the world does.