Google's "Say What You See": How the AI Game Works

Google Arts & Culture turned the simple act of describing a painting into a strangely addictive game. Here's how "Say What You See" actually works, plus the tricks that get you to a perfect match faster.

There's a particular kind of frustration that comes from staring at a painting you can clearly see and being unable to type the words that conjure it. You know it's a horse. You can see the horse. But Google's AI is showing you a wooden duck, a carousel, a brown sofa: anything but the horse. That gap between what your eyes register and what language can summon is the entire point of "Say What You See," and it's why a free browser game built by Google has quietly become one of the more interesting things to happen to art appreciation online.

What "Say What You See" actually is

"Say What You See" is a game inside Google Arts & Culture, the company's sprawling cultural archive. The premise is deceptively simple. You're shown an image generated by AI, and your job is to describe it using a text prompt. The game then generates a new image from your description and compares it to the original. The more closely your generated image matches the target, the higher your score. You get a handful of rounds, and across them you're effectively learning to write prompts: the instructions you feed an image model to make it produce what you want.

Google's framing is educational rather than competitive. The game grew out of the broader conversation around generative AI and "prompt literacy," the idea that talking to these tools is a learnable skill, not magic. By turning prompting into a guessing game with instant visual feedback, Google made the abstract concrete. You see immediately when your words were too vague, too cluttered, or pointed in the wrong direction entirely.

Where to find it: Open artsandculture.google.com or the Google Arts & Culture app, then look under the "Play" or "Experiments" section. "Say What You See" sits alongside other interactive features like Art Selfie and Art Transfer. It runs in a normal browser: no download, no account strictly required, though signing in saves your history.

How does "Say What You See" work?

Under the hood, the game is a loop between two AI systems. The first generates the target image you're trying to recreate. The second takes your written description and produces its own image, which an underlying model then scores for similarity to the target. The scoring isn't pixel-by-pixel matching. It's semantic, meaning the system checks whether the concepts, composition, colors, and mood line up.

This is why a wordy, hyper-specific description sometimes does worse than a clean one. If the target is a red apple on a white plate, typing "a vivid crimson Granny Smith apple, photorealistic, 8k, sitting on a porcelain dinner plate with subtle shadows and reflections in a softly lit kitchen" can confuse the model. It may fixate on "kitchen" and generate a whole room, or on "Granny Smith" and turn your apple green. The winning move is usually closer to "a red apple on a white plate."
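
As a rough illustration of semantic rather than pixel-level scoring, here is a minimal Python sketch. It assumes the comparison works something like cosine similarity between embedding vectors, which Google has not confirmed for this game. The five-number "embeddings" and the names `cosine_similarity` and `score` are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score(target, guess):
    """Hypothetical 0-100 score: similarity scaled and rounded."""
    return round(100 * cosine_similarity(target, guess))


# Hand-made toy "embeddings" (an assumption, not real model output).
# Dimensions: [red, green, apple, plate, room]
target = [1.0, 0.0, 1.0, 1.0, 0.0]           # a red apple on a white plate
clean_guess = [0.9, 0.0, 1.0, 0.9, 0.0]      # "a red apple on a white plate"
cluttered_guess = [0.2, 0.8, 0.7, 0.3, 1.0]  # green apple, whole kitchen in frame

print("clean:", score(target, clean_guess))
print("cluttered:", score(target, cluttered_guess))
```

With these toy numbers the clean guess scores far higher than the cluttered one, because the extra "green" and "room" dimensions pull the cluttered vector away from the target.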

The three things the game is really testing

Once you internalize those three layers, your scores jump. You stop describing what a thing is and start describing what the image looks like, which turns out to be a different skill.

How to actually win (or at least score well)

After a couple dozen rounds, patterns emerge. Here's what reliably works, based on playing the thing far longer than I'd planned to.

1. Lead with the single clearest noun

Start your prompt with the most unmistakable subject. "A lighthouse." Then build outward. The model weights early words heavily, so burying the subject behind adjectives costs you.

2. Add one composition word, not five

"A lighthouse on a cliff" beats "a lighthouse on a rocky cliff overlooking a turbulent sea at golden hour with seabirds." More words means more chances for the model to diverge from the target. Restraint scores.

3. Name the medium if it's obvious

If the target looks like an oil painting, a pencil sketch, or a photograph, say so. "An oil painting of a lighthouse" nudges the whole aesthetic in the right direction at once, which is often worth more than three object details.

4. Iterate ruthlessly

The game gives you feedback. Use it. If your first attempt produces something too dark, your next prompt should explicitly say "bright" or "daylight." Treat each round as a conversation where you correct course, not a single shot in the dark.

A quick mental model: describe the image to a stranger over the phone who has to draw it. You wouldn't say "8k hyperrealistic"; you'd say "it's a small white house, viewed from the front, with a red door, and it's sunny." That's the register the game rewards.
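
The four tips can be condensed into a small prompt-building helper. This is only a sketch of the advice above, not anything the game exposes: `build_prompt`, its parameters, and the 12-word limit in `is_cluttered` are all hypothetical.

```python
def build_prompt(subject, setting=None, medium=None, correction=None):
    """Assemble a short prompt: subject, one composition phrase, optional medium and fix."""
    prompt = subject.strip()                      # tip 1: the clearest noun is the core
    if setting:
        prompt = f"{prompt} {setting.strip()}"    # tip 2: one composition phrase, not five
    if medium:
        prompt = f"{medium.strip()} of {prompt}"  # tip 3: name the medium if it's obvious
    if correction:
        prompt = f"{prompt}, {correction.strip()}"  # tip 4: correct course after feedback
    return prompt


def is_cluttered(prompt, max_words=12):
    """Flag prompts longer than an arbitrary word limit (the limit is an assumption)."""
    return len(prompt.split()) > max_words


first_try = build_prompt("a lighthouse", setting="on a cliff")
second_try = build_prompt(
    "a lighthouse",
    setting="on a cliff",
    medium="an oil painting",
    correction="bright daylight",  # the first attempt came out too dark
)
print(first_try)
print(second_try)
```

The second call models one round of iteration: the subject and setting stay fixed, and only the medium and a single correction are added.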

Why this game is sneakily good for your eye

Here's the part that surprised me. "Say What You See" is, at its core, a looking exercise dressed up as a tech demo. To describe an image well, you're forced to notice things you'd normally skim past: the direction of the light, the placement of objects within the frame, the dominant color temperature. These are exactly the observation habits that make people better at reading actual art.

I'd argue it sits in the same family as Google's older, more contemplative looking tool, which I've written about in our full guide to "Say What You See". Both train the same muscle: slowing down enough to convert a glance into precise language. That's not a small thing. Most of us look at images at the speed of a scroll, registering "tree, person, sky" and moving on. Naming what you see, really naming it, is the foundation of every serious encounter with a work of art.

It also demystifies why some paintings hit harder than others. When you struggle to get the AI to reproduce a sense of melancholy or grandeur, you start appreciating how a real painter engineered those effects through composition and color. There's a reason we keep returning to the strange stories behind famous paintings: the more closely you look, the more there is to find. This game just gamifies the looking.

Can I take a photo of something and Google tells me what it is?

Yes, but that's a different tool, and the confusion is worth clearing up because people often conflate the two. The feature you want is Google Lens, built into the Google app, Google Photos, and most Android phones (and accessible on iPhone through the Google app). Point your camera at an object, a plant, a landmark, or a painting, and Lens identifies it and surfaces related search results. Inside Google Arts & Culture specifically, there's an "Art Recognizer" that lets you point your phone at a painting in a participating museum and instantly pull up information about it.

So the lineup is: Lens identifies real-world things you photograph; Art Recognizer identifies artworks in museums; and "Say What You See" is the inverse. Instead of the AI telling you what something is, you tell the AI, and it tries to recreate it. Same company, opposite directions.

How to use Google Arts & Culture beyond the game

"Say What You See" is one doorway into a platform that's genuinely enormous and almost absurdly free. Google Arts & Culture partners with thousands of museums and institutions worldwide to digitize collections at extraordinary resolution. Here's how to get the most out of it:

The platform is also a quiet education in how movements and styles evolved. If a particular era hooks you while browsing, it's worth understanding the bigger picture of why art movements still matter even outside the gallery. The labels on those digitized canvases aren't just trivia; they're a map of how ideas traveled.

How do I find my lookalike on Google Arts & Culture?

This is the feature that went viral years before "Say What You See" existed, and people still ask about it constantly. It's called Art Selfie. Open the Google Arts & Culture app (it works best on mobile), tap the Art Selfie or camera option, take a selfie, and the app compares your face against its database of portraits, returning the artworks you most resemble along with a match percentage.

A couple of practical notes. The matching can be hit-or-miss and occasionally unflattering: you may be told you're the spitting image of a stern 17th-century merchant. Availability has also varied by region over the years due to privacy regulations, so if you don't see it, that may be why. Google states that selfie data isn't used for anything beyond the match, but as with any face-scanning feature, read the prompt before you tap. There's a newer version, Art Selfie 2, that builds you into stylized scenes rather than just matching a portrait.

The bigger point: AI made us practice an old skill

What I find genuinely interesting about "Say What You See" is how a cutting-edge AI tool ended up reinforcing one of the oldest disciplines in art education: sustained, deliberate looking. For centuries, the standard exercise for any student was to sit in front of a work and describe it, or copy it, until they understood how it was built. The game does the same thing in thirty-second bursts, with a dopamine hit when your score climbs.

It's part of a wider moment where AI and visual culture keep colliding in ways that are sometimes alarming and sometimes delightful. The independent art site Colossal has tracked the strange, generative edges of this collision with more curiosity than panic in its ongoing coverage of art and emerging technology, and that's roughly the spirit to bring to "Say What You See" too: playful, skeptical, attentive.

And here's the genuinely useful takeaway. Getting better at describing images doesn't just improve your game score. It changes how you move through the world's images: the ads, the feeds, the paintings, the photographs. You start noticing why something works. That observational habit feeds directly into making things yourself, which is the argument I keep coming back to: that you're more of an artist than you think. Learning to say what you see is the first half of learning to make what you imagine.

So play the game. Lose a few rounds badly. Then notice, the next time you walk into a gallery or scroll past a striking photo, how much more you're suddenly able to put into words. That transfer, from a Google experiment to your own eye, is the whole quiet payoff.