OpenAI rolls out new image generation model and visual creation tools in ChatGPT

OpenAI has launched a new image generation model and dedicated visual creation tools in ChatGPT, marking a shift away from text-first interaction toward more visual and task-based AI experiences.

OpenAI has launched a new image generation model and a dedicated image creation interface inside ChatGPT, expanding the platform’s capabilities beyond text-based interaction. The update, shared by OpenAI’s CEO of applications, Fidji Simo, reflects a broader product shift toward visual, multimodal, and task-oriented AI experiences, with implications for education, skills development, and creative workflows.

Moving beyond text-first AI interaction

Simo shared details of the update in a LinkedIn post and expanded on the changes in a Substack essay, describing how ChatGPT is evolving away from purely reactive, text-based interaction.

“Humans don’t just think in words,” Simo wrote. “In fact, some of our most compelling ideas often begin as images, sounds, movements, and patterns in our minds.”

She said the new image generation model is paired with a dedicated visual entry point in ChatGPT, designed to function more like a creative studio than a traditional chat interface. According to Simo, the new experience supports faster image creation, improved instruction-following, and more accurate edits that preserve elements such as lighting, composition, and likeness.

“Creating and editing images is a different kind of task and deserves a space built for visuals,” she wrote.

Visual interfaces and learning workflows

Simo framed the update as part of a wider move toward interfaces that support how people learn, research, and make decisions. She said text-only responses are often insufficient for tasks such as exploring new topics, comparing options, or navigating complex information.

“When you’re researching products or restaurants, you don’t just want a report describing options; you want to see photos and side-by-side specs that help you decide,” she wrote.

She added that ChatGPT will increasingly surface visual elements alongside answers, including highlighted people, places, and products that users can tap to explore further without breaking context. Users will also be able to select words or phrases within responses to access deeper explanations.

For education and EdTech use cases, these changes point toward AI systems that support exploration and comprehension without forcing users to restart or reframe queries, particularly in research-led or skills-based learning.

Toward a more generative user interface

Beyond image creation, Simo outlined broader interface changes across ChatGPT, including updates to writing tools, in-line editing, and integrations with third-party applications. She said the goal is to reduce friction between thinking, creating, and taking action.

“It’s exciting to see ChatGPT move from being primarily text-based and conversational, toward a fully generative UI that brings in the right components based on what you want to do,” she wrote.

She described this shift as closing the gap between ideas and execution, particularly when visuals or structured interfaces are better suited than text alone.
