ChatGPT's Nano Banana

Ben's Bites bensbites@substack.com
Received on
Thursday, April 23, 2026 at 13:15
Source
Ben's Bites
Message-ID
20260423131010.3.a8a930b452d07247@mg-d1.substack.com
Cleaning version
v1.0.0 (ok)
Cleaned (Markdown, deterministic clean)
View this post on the web at https://www.bensbites.com/p/chatgpts-nano-banana

Hey folks, Keshav here.
For a few months, it felt like Google had won the image generation space. But OpenAI is back in the game. ChatGPT Images 2.0 [ https://substack.com/redirect/b6799ad3-a0df-47b0-8912-0f372cd1d6c9?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] is miles ahead of anything else. It’s beyond impressive at text: I haven’t seen a single generation with typos, even with hundreds of words per image. See this example I created:
It’s also really good at creating realistic pictures, like this one of Professor Ben.
Oh, sorry, that one’s real. Ben was at Stanford [ https://substack.com/redirect/156daebe-38c2-48de-ac7a-ec9b65f7a3b7?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] this Wednesday, teaching how to build with AI agents.
Image generation is also available in the Codex app as a skill. Use it with thinking models to get the best results—that lets it think and use code/tool calls (like creating a QR from a link, searching logos from the web) and then use them as reference images. It can also create images, reflect on them and improve the generation.
People are creating realistic UI screenshots [ https://substack.com/redirect/ee9e5c8d-2505-4584-9d28-76d1afe4dc5a?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ], multi-page illustrated magazines [ https://substack.com/redirect/19c3b4b6-c077-407e-a79e-e69b68b20bb7?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ], personal style recommendations [ https://substack.com/redirect/8df95fa3-f095-4bf2-842a-e1c8d778ec99?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] and creative QR codes [ https://substack.com/redirect/24607508-760b-4346-909f-e7b3b6096ae3?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] using the new model.
The “generate UI as image” bit is interesting. Maybe there’s finally a solution to GPT-5.4’s lack of design taste. The latest coding models are fairly good at turning screenshots into code, but there are still gaps.
Last weekend, I tested a bunch of tools/models on implementing a design [ https://substack.com/redirect/ad4d1840-204e-4099-be5e-5bc9e336e4ea?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] (for an ads storefront for Ben’s Bites) from a screenshot. I found:
Claude Design > Magicpath AI [ https://substack.com/redirect/f3b2b96e-44c8-4daf-837b-543716ca08c8?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] > Raw models (like Gemini 3.1 Pro/Opus 4.6 in their web apps), when it comes to understanding the concept and making something usable, not just copying the pixel-by-pixel look (ironically, Gemini won that).
When asked to turn designs into a real working app, there was a major drift in how the apps looked. Opus 4.7 [ https://substack.com/redirect/cb05d1fe-081a-4f67-bcfa-534c5bb435a4?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] did better than GPT-5.4 [ https://substack.com/redirect/3d0d07fd-a70c-450e-bfe2-b15c250043ba?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] at visually matching the reference screenshot. GPT-5.4’s code was more functional, though, and its unseen pages (like the admin panel) had a design consistent with the rest of the app.
Also, in many cases, the assets (hero image, icons, background textures) make the UI in a “generated image” stand out. When replicating that UI from a screenshot, you get the barebones UI with the correct buttons and the layout, but without those assets, and the output falls short of expectations.
Ben’s Bites is brought to you by TinyFish [ https://substack.com/redirect/ff6caaad-4557-4931-a595-549926c548bf?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ]
Funny how AI agents can write entire apps but can’t work on the live web. Playwright scripts break, raw fetches eat your context, bot detection blocks you, nothing’s scalable. TinyFish [ https://substack.com/redirect/ff6caaad-4557-4931-a595-549926c548bf?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] gives search, fetch, stealth browser, web agent, all managed in one API. Try it free [ https://substack.com/redirect/ff6caaad-4557-4931-a595-549926c548bf?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ]. Comes with a CLI + Skill [ https://substack.com/redirect/8b68681e-c6ed-46f2-abfe-80f5c8d3c7b3?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ].
Headlines
OpenAI has a new product for Business, Enterprise and Edu users - Workspace Agents [ https://substack.com/redirect/be7149fa-0da7-4be6-b5cb-050ac545a13b?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ]. These are Codex-powered agents inside ChatGPT with a persona, a task and access to external tools (like Linear), and they are accessible from Slack as well. These agents will also replace custom GPTs down the line (finally). Read more [ https://substack.com/redirect/1be13f7a-4ff6-4295-935f-81d9a201f326?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ].
Gemini Deep Research API [ https://substack.com/redirect/7e298153-ffe1-4102-b187-d5415b985f2a?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] now offers two configurations based on 3.1 Pro. It claims the best performance in web research and finding hard facts. Plus, it gets MCP support and can create charts using Nano Banana or HTML.
Cursor and SpaceX are working together [ https://substack.com/redirect/92b4c089-7f5b-4cf4-863b-c79cc6a079eb?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - Cursor will train coding models on SpaceX’s GPUs and likely share them with xAI. SpaceX can, in turn, acquire Cursor later this year for $60B, or pay $10B for the partnership if it doesn’t. On a similar note, Thinking Machines [ https://substack.com/redirect/cf52bb8f-bd34-4ed0-935d-6c8172e34052?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] also just signed a multi-billion-dollar Google Cloud deal.
Give your Droid a computer [ https://substack.com/redirect/cf26c63d-e5ab-4ae9-baf7-49eb5ca77974?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - You can now give your Droid an always-on machine with its own filesystem, credentials, and config for it to keep working on your tasks. This can be in the cloud (managed by Factory), or you can bring your own device.
My feed
Chronicle [ https://substack.com/redirect/698ca1ae-e571-4e4e-8a7a-e161490542d9?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - Cursor for slides. Never build a deck from scratch again. Turn ideas into stunning presentations in minutes.*
ChatGPT for Excel [ https://substack.com/redirect/43f2adb1-2434-4ead-b370-4f552e630171?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] and Google Sheets [ https://substack.com/redirect/c9fcab84-6ac2-4741-b4d1-ca10790758b7?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] are now in beta - build new sheets, fix formulas, explain models, and update workbooks in place. (read more [ https://substack.com/redirect/27518ec9-0d69-4bf1-93cc-a702c82bb18f?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ])
/ultrareview in Claude Code [ https://substack.com/redirect/01dda304-27f7-4f65-b217-5309aab2eb8d?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] (research preview) lets you run bug-hunting agents in the cloud before merging riskier changes like auth, data migrations, or other critical code paths.
OpenAI built an open-source viewer for chat data and Codex session logs - Euphony [ https://substack.com/redirect/6b6da928-749e-4db1-a7c3-d8866a6dcc28?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ].
Sierra is piloting an AI-native interview [ https://substack.com/redirect/475e6bff-5fda-4caf-9c5c-518248b46417?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - debugging/review focused interviews where candidates improve a medium-sized codebase with coding agents.
ml-intern [ https://substack.com/redirect/823cdac6-02b9-47d2-91b0-cf9443ab0054?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] from Hugging Face - open-source research agent that comes up with experiments and runs them.
Clawputer [ https://substack.com/redirect/0901cc89-b530-437b-8fe1-94726219ac51?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - Managed OpenClaw agent inside an always-on sandbox.
Kami [ https://substack.com/redirect/627e5ac8-6705-470c-8b68-861a45e6c4d2?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - design skill for AI-native docs, resumes, portfolios, long docs, and slides.
noscroll [ https://substack.com/redirect/e28988fe-a5af-4990-bcd8-191683ab6814?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - an AI that doomscrolls X for you and texts you just the signal. In my experience, this is easy to claim and hard to get right.
Monologue [ https://substack.com/redirect/06de84d4-0ef8-439d-ae2c-5510a18d0bfe?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] has a new Notes feature for thinking out loud when you don’t know the exact words you want to dictate.
Fin [ https://substack.com/redirect/5cf8757e-e3ee-4e3d-959f-edb5e94e93fa?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] is moving beyond customer support into sales - using the same business context and integrations to qualify leads and book meetings.
Perplexity post-trained a Qwen-based model [ https://substack.com/redirect/012c98d5-5da7-4679-8ee2-6d1faee8f429?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] to handle search and tool calls for cheaper, and it’s already serving a meaningful chunk of traffic.
The next Slack won’t look like Slack [ https://substack.com/redirect/9edce41d-5f9d-4ceb-961c-72b008bbd965?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ], and Ando [ https://substack.com/redirect/df285d21-002e-46c7-97b1-3b2e7e092b58?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] looks like one early attempt at that.
Frontend in 2026 [ https://substack.com/redirect/d303033b-cec4-4926-857d-cc4bb0735593?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] - for and against the frameworks and abstractions dominant today.
Afters
Find me on X [ https://substack.com/redirect/acab09f7-7a4a-4ccd-95df-3af0e0cfd646?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ], LinkedIn [ https://substack.com/redirect/ec9affb0-ef10-4890-8ff7-87712cd91bdb?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ], or YouTube [ https://substack.com/redirect/3cb12d82-92e9-4aef-be29-b19883be6f1d?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ]
Read about me [ https://substack.com/redirect/db3b75ee-777c-435f-abdd-77c536938ad0?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ] and Ben’s Bites
📷 thumbnail by @keshavatearth [ https://substack.com/redirect/5bf9dde1-fa2a-4eb5-a90f-dd77d23aa3c0?j=eyJ1IjoiODlwbDdnIn0.tGTUbRFxaLdUThAGF55-pc7SYA7ESP3gGmBRwUjpB8c ]
* sponsors who make this newsletter possible :)
Wanna partner with us for the next quarter?
Email us at shanice@bensbites.com [ mailto:shanice@bensbites.com ] or k@bensbites.com [ mailto:k@bensbites.com ]
Ben's Bites is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.
Extraction LLM: claude-haiku-4-5 · prompt v1 · 59261070 tokens
# ChatGPT Images 2.0

OpenAI is back in the image generation space with ChatGPT Images 2.0. The model is miles ahead at rendering text within images, with no typos observed even with hundreds of words per image. It is also highly capable at creating realistic pictures.

Image generation is available in the Codex app as a skill. Using it with thinking models produces the best results—allowing it to think and use code/tool calls (like creating a QR from a link, searching logos from the web) and then use them as reference images. It can also create images, reflect on them and improve the generation.

People are creating realistic UI screenshots, multi-page illustrated magazines, personal style recommendations and creative QR codes using the new model.

## Design to Code Testing

Last weekend, a test of various tools and models on implementing a design from a screenshot found:

Claude Design > Magicpath AI > Raw models (like Gemini 3.1 Pro/Opus 4.6 in their web apps), when it comes to understanding the concept and making something usable, not just copying the pixel-by-pixel look (Gemini won pixel-by-pixel accuracy).

When asked to turn designs into a real working app, there was major drift in how the apps looked. Opus 4.7 did better than GPT-5.4 at visually matching the reference screenshot. GPT-5.4's code was more functional, though, and its unseen pages (like the admin panel) had a design consistent with the rest of the app.

In many cases, the assets (hero image, icons, background textures) make the UI in a "generated image" stand out. When replicating that UI from a screenshot, you get the barebones UI with the correct buttons and the layout, but without those assets, and the output falls short of expectations.

# Headlines

**OpenAI Workspace Agents** - OpenAI has a new product for Business, Enterprise and Edu users: Workspace Agents. These are Codex-powered agents inside ChatGPT with a persona, a task and access to external tools (like Linear), and they are accessible from Slack as well. These agents will replace custom GPTs down the line.

**Gemini Deep Research API** - Now offers two configurations based on 3.1 Pro. It claims the best performance in web research and finding hard facts. Plus, it gets MCP support and can create charts using Nano Banana or HTML.

**Cursor and SpaceX Partnership** - Cursor and SpaceX are working together. Cursor will train coding models on SpaceX's GPUs and likely share them with xAI. SpaceX can, in turn, acquire Cursor later this year for $60B, or pay $10B for the partnership if it doesn't. Thinking Machines also just signed a multi-billion-dollar Google Cloud deal.

**Droid Computers** - You can now give your Droid an always-on machine with its own filesystem, credentials, and config for it to keep working on your tasks. This can be in the cloud (managed by Factory), or you can bring your own device.

# My Feed

- **Chronicle** - Cursor for slides
- **ChatGPT for Excel and Google Sheets** - Now in beta - build new sheets, fix formulas, explain models, and update workbooks in place
- **/ultrareview in Claude Code** - In research preview; lets you run bug-hunting agents in the cloud before merging riskier changes like auth, data migrations, or other critical code paths
- **Euphony** - OpenAI built an open-source viewer for chat data and Codex session logs
- **Sierra AI-native Interview** - Piloting debugging/review focused interviews where candidates improve a medium-sized codebase with coding agents
- **ml-intern** - Open-source research agent from Hugging Face to come up with experiments and run them
- **Clawputer** - Managed OpenClaw agent inside an always-on sandbox
- **Kami** - Design skill for AI-native docs, resumes, portfolios, long docs, and slides
- **noscroll** - An AI that doomscrolls X for you and texts you just the signal
- **Monologue** - New Notes feature for thinking out loud when you don't know the exact words you want to dictate
- **Fin** - Moving beyond customer support into sales using business context and integrations to qualify leads and book meetings
- **Perplexity Qwen-based Model** - Perplexity post-trained a Qwen-based model to handle search and tool calls for cheaper, already serving a meaningful chunk of traffic
- **Ando** - An early attempt at what the next Slack might look like
Prompt used (snapshot at the time of extraction; edit via System prompts)
You are Breviat's content extractor. You are given the cleaned Markdown content of a newsletter.

Your mission: produce a CLEAN version of the content by removing everything that is not useful information for the reader. You are a FILTER, not a summarizer.

REMOVE:
- Advertisements, sponsor blocks, mentions such as "sponsored by X", "ad", "presented by"
- Empty intros: welcome formulas, reports on the author's mood, personal anecdotes unrelated to the content
- Marketing calls to action: subscribe to the newsletter, refer a friend, "follow us on Twitter", "join our Discord"
- Signatures, legal notices, postal addresses, "view in browser", "unsubscribe"
- Buttons / CTAs / "click here" / "read more" with no content behind them
- Promotions of other paid products / events / courses by the author or third parties
- Recurring blocks such as "Read of the day" or "Quote of the day" with no informational value of their own

KEEP (in full, without summarizing or rewording):
- All announcements, news, analyses, factual commentary
- Figures, dates, company names, quotes
- Technical explanations
- Links to real sources (official announcements, papers, cited articles)
- The structure (headings, subheadings, lists)

RULES:
- Do not reword. Keep the original wording.
- Do not summarize or condense. If a section is 200 words long and useful, keep 200 words.
- Do not add any content (no headings or transitions of your own).
- Do not fabricate any URL. Keep the original ones, or remove them.
- If the entire newsletter is advertising / promotion / useless content, output empty Markdown (nothing else).

Output: ONLY the cleaned Markdown, with no preamble or commentary on your work.
