AI Brand Visibility: AI Plays Favorites, It Won't Trash You

Ryan Turner · Head of Innovation · June 10, 2026

Most teams worried about AI and their brand are bracing for the wrong thing. They expect the assistant to say something damaging. In our testing, it almost never does. ChatGPT, Gemini, and Copilot were polite about every brand we put in front of them.

The real exposure is quieter. AI picks favorites. The favorite shifts depending on which assistant a person opens and which country they sit in. And in regulated categories, the assistant sometimes drops your brand from the conversation entirely, which is worse than any bad review.

To pressure-test that, we deliberately chose three brands in three categories built to poke an assistant's safety guardrails: DraftKings (sports betting), Bacardi (alcohol), and Texas politician James Talarico (politics). Gambling, alcohol, and politics are exactly where assistants get cautious. Then we asked the same questions from different countries.

This post is published by Massive Computing, the company whose localized AI chat tool ran the queries. The takeaways matter more than our raw numbers, so we lead with those and link the full data at the end.

Key Takeaways

Sentiment is not the needle-mover. Every assistant described every brand positively or neutrally. A "bad AI reputation" barely happened, so a green sentiment score tells you almost nothing.
The favorite is the needle-mover, and it's unstable. Who gets ranked #1 changed by up to two places depending only on which assistant we asked, and it shifted by country too.
In regulated categories, you can vanish. Gemini fully refused gambling questions from UK and German locations while answering freely elsewhere. The worst case is silence, not criticism.

AI almost never badmouths your brand

In our tests, every assistant described every brand positively or neutrally, with no exceptions across the three subjects. DraftKings came back "legitimate, licensed, top-tier." Bacardi was "reliable, the world's most-awarded rum." James Talarico was "principled" and "an effective communicator." Nobody got trashed.

So "is AI saying something bad about us?" is the comfortable question, and the wrong one. Run sentiment monitoring across the major assistants and it will almost always come back green, which feels reassuring and measures nothing useful.

That matters because people act on these answers. In 2025, Bain & Company found 80% of consumers rely on AI-written summaries for at least 40% of their searches, and 42% ask AI for shopping recommendations (Bain & Company, 2025). The answer is a referral now. A clean sentiment score hides the only thing that decides the referral: were you actually the one recommended?

It picks favorites, and the favorite depends on who you ask

The decision an assistant makes about your brand isn't whether to praise it. It's where to rank it, and that ranking moved by up to two places depending only on which assistant we asked. Same brand, same week, different verdict.

The clearest case was the politician. ChatGPT and Copilot ranked James Talarico #4 of 4 among rising Democrats, framing him as the least nationally proven. Gemini ranked him #2 to #3, treating him as a genuine contender. A reputation manager checking only ChatGPT would file him as an also-ran. Checking only Gemini, a rising star.

Source: Massive localized AI study, 2026. Full data in the report linked below.

The same split showed up in product categories. Copilot was the only assistant that ranked DraftKings #1, putting it ahead of FanDuel; ChatGPT and Gemini left it at #2. Bacardi was the one stable runner-up, ranked #2 behind Havana Club in every answer that produced a ranking, in every market. Whichever assistant your customer happens to open is acting as a hidden editor you don't control.

This isn't noise you can average away. SparkToro's 2026 research found under a 1-in-100 chance that an AI returns the same brand list across two runs (SparkToro, 2026), and a 2025 University of Toronto study found only 15% to 33% citation overlap between Google and ChatGPT (arXiv 2509.08919, 2025). The engines read different slices of the web, so they pick different favorites.
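To make the "up to two places" figure concrete, here is a minimal sketch of how a rank spread across assistants could be computed. The ranks are the ones quoted in this section; where a range was reported (Gemini placed Talarico at #2 to #3), the sketch assumes the best rank, and the function name `rank_spread` is ours, not part of any tool.

```python
# Ranks as quoted in this post. Gemini's "#2 to #3" for Talarico is
# entered as 2 (best case), which is an assumption made for illustration.
ranks = {
    "James Talarico": {"ChatGPT": 4, "Copilot": 4, "Gemini": 2},
    "DraftKings": {"ChatGPT": 2, "Copilot": 1, "Gemini": 2},
}


def rank_spread(by_assistant):
    """Return the gap between the worst and best rank across assistants."""
    values = list(by_assistant.values())
    return max(values) - min(values)


spreads = {brand: rank_spread(r) for brand, r in ranks.items()}
for brand, spread in spreads.items():
    print(f"{brand}: spread of {spread} place(s)")
```

A spread of zero means the assistants agree. Anything above zero means a single-assistant check is reporting one editor's opinion, not the market's.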

In regulated categories, you can disappear entirely

The most extreme outcome we found wasn't a bad answer. It was no answer. From UK and German locations, Gemini fully refused both gambling questions on every run ("my safety system flagged this request"), while the same Gemini answered enthusiastically from the US, Brazil, and Japan. DraftKings' visibility in those two markets isn't low. It's zero.

The severity tracked how regulated the category is. Gambling drew a full refusal. Alcohol drew a partial one: Gemini would say whether Bacardi was good but refused to rank alcohol brands. Politics drew no refusal at all, which surprised us, since we expected the politician to trip the most filters.

Source: Massive localized AI study, 2026. Full data in the report linked below.

This is the location effect that actually counts. Crossing a border rarely changed an assistant's opinion of a brand. What it changed was whether the brand appeared at all, governed by the local regulator more than the local market. It isn't unique to one tool, either. In 2026, Investigate Europe tested seven chatbots and found AI assistants surfaced unlicensed gambling sites in roughly 75% of replies when asked to bypass national self-exclusion schemes (Investigate Europe, 2026). Refusing the licensed brand in one country while surfacing unlicensed ones elsewhere is the kind of inconsistency you only catch by testing from inside each market.
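One way to surface this kind of blind spot is to track, per assistant and per country, the share of runs in which the brand appeared at all. The sketch below assumes a simple record shape of our own choosing, and the records are illustrative placeholders that mirror the pattern described above, not the study's raw data.

```python
from collections import defaultdict

# Illustrative placeholder records, NOT the study's raw data. Each tuple is
# (assistant, country, brand_mentioned) for one run of one gambling prompt.
runs = [
    ("Gemini", "US", True), ("Gemini", "US", True), ("Gemini", "US", True),
    ("Gemini", "UK", False), ("Gemini", "UK", False), ("Gemini", "UK", False),
    ("Gemini", "DE", False), ("Gemini", "DE", False), ("Gemini", "DE", False),
]


def visibility_by_market(records):
    """Return the share of runs per (assistant, country) that mentioned the brand."""
    seen = defaultdict(int)
    total = defaultdict(int)
    for assistant, country, mentioned in records:
        key = (assistant, country)
        total[key] += 1
        seen[key] += int(mentioned)
    return {key: seen[key] / total[key] for key in total}


visibility = visibility_by_market(runs)
# Markets where the brand never appeared in any run.
blind_spots = sorted(key for key, share in visibility.items() if share == 0.0)
print(blind_spots)
```

A market with a share of exactly zero across repeated runs is a different finding from a market with a low share, so it is worth reporting the two separately.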

Why most brand monitoring misses all of this

A single-assistant, single-country, single-run check catches none of the three effects above, because it returns a friendly sentiment score and one stable-looking ranking, and both are misleading. Here's the discipline that actually surfaces the risk.

Test the assistants your audience uses, not the one you use. Copilot ranked DraftKings #1 and ChatGPT never did. Track only ChatGPT and you'd never see your best result, or your worst.
Test from inside each market. The Gemini gambling refusal is invisible from a US connection. You have to ask as a user in the UK or Germany to see the silence.
Repeat every query and report a percentage. With a sub-1-in-100 chance of an identical list twice, one pull is noise. Track share of voice over time.
Separate a refusal from an outage. A policy refusal is a finding about your market coverage. An upstream error is missing data. Conflating them invents a problem or hides one.
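The last two points can be sketched together: repeat each query, then report percentages while keeping refusals and errors in separate buckets. This is a minimal illustration under our own assumptions. The status labels (`answered`, `refused`, `error`), the `Run` record, and the sample data are hypothetical, and how you detect a refusal in a real response is left to your own tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    status: str                      # "answered", "refused", or "error" (assumed labels)
    top_brand: Optional[str] = None  # brand ranked #1, only set when answered


def summarize(runs, brand):
    """Report share of the top spot and refusal rate, excluding errors."""
    answered = [r for r in runs if r.status == "answered"]
    refused = [r for r in runs if r.status == "refused"]
    errors = [r for r in runs if r.status == "error"]
    # A refusal is a real outcome for the user, so it stays in the denominator.
    # An upstream error is missing data, so it is dropped and counted separately.
    valid = len(answered) + len(refused)
    wins = sum(1 for r in answered if r.top_brand == brand)
    return {
        "share_of_top_spot": wins / valid if valid else None,
        "refusal_rate": len(refused) / valid if valid else None,
        "errors_excluded": len(errors),
    }


# Hypothetical sample: five runs of one prompt from one market.
sample = [
    Run("answered", "DraftKings"),
    Run("answered", "FanDuel"),
    Run("answered", "FanDuel"),
    Run("refused"),
    Run("error"),
]
summary = summarize(sample, "DraftKings")
print(summary)
```

Keeping refusals in the denominator reflects the argument above that silence counts as zero visibility. If you would rather measure ranking only among answered runs, divide by the answered count instead and report the refusal rate alongside it.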

Try it on your own brand

You can run the same-prompt-different-country test yourself, free and without a login. The Massive AI GEO playground asks ChatGPT the same question from the US, Brazil, and Japan, side by side. Same prompt, three countries, different answers. Drop in your brand against its competitors and watch the ordering move.

The playground is the demo. The engine under it is the product. Massive's Web Render AI chat endpoint returns live model completions from real consumer devices in 195+ countries, with the sources each model used, so you can build your own AEO or brand-visibility monitoring on top of it. Geo coverage, device origin, and source parsing are solved upstream; you keep your own scoring, dashboards, and brand. Sign up for an API key and point your tool at the endpoint.

Want the receipts? The full report has all 270 localized queries, cell by cell, across three assistants and five countries.
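For readers who want to see where 270 comes from, the study design listed in the Sources (3 subjects x 3 assistants x 5 countries x 2 prompts x 3 runs) can be enumerated as a grid. The sketch below is only an illustration of that arithmetic. The prompt labels `evaluate` and `rank` are our shorthand for the two question types, not the study's actual prompt text.

```python
from itertools import product

subjects = ["DraftKings", "Bacardi", "James Talarico"]
assistants = ["ChatGPT", "Gemini", "Copilot"]
countries = ["US", "Brazil", "Japan", "UK", "Germany"]
prompts = ["evaluate", "rank"]  # assumed labels for the two question types
runs = [1, 2, 3]

# One tuple per query: (subject, assistant, country, prompt, run).
grid = list(product(subjects, assistants, countries, prompts, runs))
print(len(grid))  # 270
```

The same grid shape works for your own monitoring: swap in your brand and competitors, the assistants your audience uses, and the markets you sell into.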

The bottom line

Stop asking whether AI is nice to your brand. It almost always is, and the answer is a comfortable distraction. Ask the questions that move revenue instead: who does the assistant pick as #1, does that favorite change by assistant and by country, and are there markets where you don't show up at all?

Sentiment is the green light that hides the problem. Favoritism and silence are the problem. Try the playground on your own brand, then read the full report to see how far the favorites move.

Ryan Turner writes about live web access for AI systems at Massive Computing, covering anti-bot infrastructure, geo-accurate retrieval, and the data behind AI search. The localized study in this post was run on Massive's Web Render AI chat endpoint, which returns model completions from real consumer devices in 195+ countries.

Sources

Bain & Company, "How Customers Are Using AI Search (2025 Research)," retrieved 2026-06-10, https://www.bain.com/insights/how-customers-are-using-ai-search/
SparkToro, "New Research: AIs are highly inconsistent when recommending brands or products," retrieved 2026-06-10, https://sparktoro.com/blog/new-research-ais-are-highly-inconsistent-when-recommending-brands-or-products-marketers-should-take-care-when-tracking-ai-visibility/
Chen, Wang, Chen, Koudas (University of Toronto), "Generative Engine Optimization: How to Dominate AI Search," arXiv 2509.08919, retrieved 2026-06-10, https://arxiv.org/abs/2509.08919
Investigate Europe, "AI chatbots lure vulnerable gamblers to unlicensed betting websites," retrieved 2026-06-10, https://www.investigate-europe.eu/posts/ai-chatbots-lure-vulnerable-gamblers-unlicensed-betting-websites
Massive Computing, localized AI brand-visibility study (270 queries: 3 subjects x 3 assistants x 5 countries x 2 prompts x 3 runs), conducted 2026-06-08. Full data: https://www.joinmassive.com/sample-reports/geographic_ai_report.html

Frequently Asked Questions

Does AI say negative things about brands?

Rarely. In our localized study across ChatGPT, Gemini, and Copilot, every assistant described every brand positively or neutrally. The real risk is not a bad review but competitive ranking (who gets called #1) and outright omission in regulated categories, neither of which a sentiment score captures.

What actually varies between ChatGPT, Gemini, and Copilot?

The ranking. We saw the same brand move up to two places depending only on the assistant, and a 2025 University of Toronto study found just 15% to 33% citation overlap between Google and ChatGPT (arXiv 2509.08919, 2025). The engines read different sources, so they recommend different favorites from identical questions.

Why would AI refuse to mention my brand in some countries?

Safety policies are applied by region. In our study, Gemini fully refused gambling questions from UK and German locations on every run but answered freely from the US, Brazil, and Japan. The refusal tracked local regulation, so the brand had zero visibility in two markets and full visibility in three.

How should I monitor AI brand visibility?

Test every assistant your audience uses, from inside every market that matters, with repeated runs. SparkToro's 2026 research found under a 1-in-100 chance an AI returns the same brand list across two runs (SparkToro, 2026), so report share of voice over time, not a single snapshot.

Is checking AI visibility from one country enough?

No. Crossing a border rarely changes an assistant's opinion, but it can change whether your brand appears at all, because regional safety policy and default language shift at the border. You need to test from inside each market you sell into.