
Prompt: Generate a photorealistic image of farmer's market in toronto on a saturday in summer 2006, it's a beautiful late june day, people are shopping and eating sandwiches. in focus should be a young asian girl wearing denim overalls and sipping on a strawberry banana smoothie - rest can be blurred. the photo should be reminiscent of that a digital camera from 2006 would take, with a timestamp like a printed photo would have. aspect ratio should be 3:2
OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there's been one glaring omission -- a really capable image generator.
On Tuesday, OpenAI launched 4o image generation. This image model is significantly better -- albeit slower -- than the DALL-E models previously offered by OpenAI. It handles very difficult requests, such as realistic images and, most impressively, accurate text.
For example, in the livestream demo, OpenAI CEO Sam Altman, joined by researchers Gabriel Goh and Prafulla Dhariwal, prompted 4o to create a photo from a specific POV with a flyer that included lots of text. After loading for a few seconds, it got the cinematic direction right and accurately printed all the text.
It also boasts many other capabilities OpenAI's previous image generator didn't have, such as image referencing, which can be used to render a new version of the image (such as an anime version or a selfie), or as inspiration for creating a completely new work.
Because this tool is meant to fit into creatives' workflows, it can generate images on transparent backgrounds, use specific colors given as HEX codes, and draw on the chatbot's advanced conversational capabilities in its generations. For example, when prompted to include "humor" in the photo during the demo, it included text that met that criterion.
Because the image generator is accessible in ChatGPT, users can also refine images through a multi-turn conversation. This makes tweaking images easier and allows the model to use the context of previous generations to create new ones. Since GPT-4o has access to the web, it can also draw on that context when creating images.
According to the company, GPT-4o's image generation also has strong instruction adherence. It can handle 10-20 different objects, which means you can prompt it to include a large number of distinct objects in one go.
Looser safeguards
Another new aspect of the image generator is that it can now create more risqué content, something Elon Musk's Grok model is known for. During the livestream, Altman shared that you will be able to use GPT-4o's image generation to create offensive content "within reason." In an X post after the livestream, Altman added:
"What we'd like to aim for is that the tool doesn't create offensive stuff unless you want it to, in which case within reason it does. As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society."
The blog post announcing the model noted that it will block requests that violate content policies, including child sexual abuse materials and sexual deepfakes. Another safeguard in place is limiting what can be created when real people are in the context, including "particularly robust safeguards around nudity and graphic violence."
Users can consult the System Card for full safety information on the 4o image generation model.
How to access
The updated image generation features are rolling out today in ChatGPT and Sora. All users, including those on the free plan, will have access to GPT-4o image generation as the default. Users who still want DALL-E can access it through a dedicated DALL-E GPT. Enterprise and Education users will be given access soon, with developer access via the API slated for the coming weeks.
When DALL-E first launched, it lived on its own standalone website and, at the time, felt like the latest and greatest. It has since been moved to reside only in ChatGPT, where it paled in comparison to more advanced image generation models from competitors such as Midjourney, Google, and Adobe. This update helps level the playing field, enabling OpenAI's image generation to compete better with other models.