The Reddit paradox: What we learned from an industry breakdown of the most-cited AI sources

We analyzed citation data across industry categories. The takeaway? Don't chase Reddit blindly—focus on the sources that actually matter to your business.

When generative AI began upending traditional search, marketers were hungry for fixes (they still are). And Reddit somehow became the poster child for AI search shortcuts.

Conventional wisdom seems to be that if you want your brand to show up when someone asks an LLM a question, you need to show up on Reddit first.

The thinking goes like this: Cited sources influence AI responses. If you get your brand mentioned in the most-cited sources, you get your brand mentioned in more LLM answers.

So far, so good.

Organizations started releasing studies showing that sites like Reddit are among the most-cited sources by LLMs. Cue marketing teams doing everything they can to get their brands mentioned in Subreddits (while trying not to get downvoted or banned by Redditors).

But those studies obscure the real story: While Reddit matters on the whole, it may not matter as much to you and your brand.

Reddit’s strength is largely explained by composition: It appears across many industries, which makes it look dominant in aggregate. But when you control for composition and look within industries, vertical specialists often show stronger, clearer overperformance.

You can dig into our full findings below. But if you take away one thing from this research, let it be this:

If you want to improve your AI search performance, don’t just obsess over Reddit—find out which citation sources truly matter for your business.

Reddit’s influence shrinks when you zoom in

Reddit appears to play an outsized role in AI answers because most data studies focus on overall citation share, which almost always means a weighted average across segments (e.g., industries, platforms, and prompt types).

Topline numbers reflect both performance within each segment and how much demand each segment represents (the “composition”).

I’ve written about this before in the context of interpreting panel data, but the key reminder is: If the composition shifts, the overall metric can move even when nothing changes inside a segment.

It’s a composition effect that’s closely related to the broader idea behind Simpson’s paradox: Aggregate trends can be misleading if you ignore how groups are weighted.
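
To make the composition effect concrete, here's a minimal sketch in Python. Every number in it is invented for illustration (none comes from our dataset), and the names `within_share`, `mix_a`, and `mix_b` are hypothetical. It holds a generalist domain's share inside each industry fixed and changes only the prompt mix.

```python
# Hypothetical within-industry citation shares for one generalist domain.
# These values are made up for illustration; they are not study results.
within_share = {"media": 0.12, "finance": 0.02, "travel": 0.04}


def aggregate_share(within, weights):
    """Topline share = weighted average of within-segment shares."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("segment weights must sum to 1")
    return sum(within[seg] * weights[seg] for seg in within)


# Two hypothetical prompt mixes. Nothing changes inside any industry;
# only the composition (how much each industry counts) is different.
mix_a = {"media": 0.50, "finance": 0.25, "travel": 0.25}
mix_b = {"media": 0.20, "finance": 0.40, "travel": 0.40}

share_a = aggregate_share(within_share, mix_a)
share_b = aggregate_share(within_share, mix_b)
print(f"mix A: {share_a:.3f}, mix B: {share_b:.3f}")
# prints: mix A: 0.075, mix B: 0.048
```

Under these made-up inputs, the topline number drops by more than a third even though the domain performs exactly the same within every industry.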

There’s no denying that Reddit is a strong cross-industry generalist. Similar to sites like Wikipedia, YouTube, and Quora, it features content related to essentially every industry.

The rub: Most marketers work in one industry, not all of them.

When you start to slice and dice the data by specific industry categories, you see the importance of Reddit shrink in favor of sources that primarily serve one industry.

The importance (or lack thereof) of Reddit is even clearer when you drill down by model.

Reddit is over-represented across Google’s non-Gemini LLM surfaces (i.e., AI Overviews and AI Mode) relative to full-fledged AI assistants like ChatGPT and Gemini.

Once you get a segmented view of the data, Reddit looks less unique and behaves more like any other large citation source for a specific industry.

That’s not to say that it doesn’t matter greatly for certain industry segments. For example, Reddit dominates in media & entertainment.

But true vertical specialists consistently punch above their weight within their industries (e.g., NerdWallet for finance and Tripadvisor for travel).

This aligns with an experiment we ran earlier this year to see which websites are most frequently cited in AI responses to more bottom-of-funnel prompts. In that instance, Reddit appeared often as a cited domain across all categories but wasn’t a top-three source for any of them.

So the question then becomes: How important is Reddit to my industry?

And, if the answer is, “Not quite as important as you think,” which citation sources should I focus on?

These are the citation sources that dominate your space

We focused our research on 10 industries: finance, travel & transportation, hospitality & food, technology, retail, media & entertainment, healthcare providers & services, education, automotive, and personal & home services.

We found that when you stratify the data by industry, you can more clearly see where Reddit actually moves the needle and where it doesn’t.

One thing that’s worth mentioning before diving into the data below: No matter what domain is the most cited source for the industry you operate in, use your best judgement before deciding to prioritize it.

Our data is based on a taxonomy of industry categories. Each category can include many diverse types of businesses. For example, finance could encompass any number of products and services, from credit cards to life insurance to cryptocurrency.

To find out which sources matter to your business, you need granular visibility into which sources are cited for the prompts you care about most.

With that said, here are our findings (results are based on an average of four AI platforms: ChatGPT, Google AI Overviews, Google AI Mode, and Gemini):

Finance

These are the top 10 most-cited domains for finance:

NerdWallet takes the top spot while Reddit comes in at the bottom of the list.

Bonus: The most-cited Subreddit for finance is: r/Insurance.

Travel & transportation

These are the top 10 most-cited domains for travel & transportation:

Tripadvisor is far and away the No. 1 citation source while Reddit makes the bottom of the top five.

Bonus: The most-cited Subreddit for travel & transportation is: r/TravelHacks.

Hospitality & food

These are the top 10 most-cited domains for hospitality & food:

Tripadvisor again dominates the competition while Reddit slips into sixth place in this industry category.

Bonus: The most-cited Subreddit for hospitality & food is: r/travel.

Technology

These are the top 10 most-cited domains for technology:

Generalist sources see higher performance in technology, with Wikipedia taking the top spot and Reddit coming in at No. 3.

Bonus: The most-cited Subreddit for technology is: r/SaaS.

Retail

These are the top 10 most-cited domains for retail:

The leader by a long shot is good old-fashioned Google (this could include both individual brand websites and Google Shopping). That said, Reddit is high up in retail, snagging the No. 2 spot.

Bonus: The most-cited Subreddit for retail is: r/femalefashionadvice.

Media & entertainment

These are the top 10 most-cited domains for media & entertainment:

Reddit is the clear leader in this industry category.

Bonus: The most-cited Subreddit for media & entertainment is: r/SonicTheHedgehog.

Healthcare providers & services

These are the top 10 most-cited domains for healthcare providers & services:

The National Institutes of Health comes out on top here while Reddit doesn’t make the top 10.

Bonus: The most-cited Subreddit for healthcare providers & services is: r/Frugal.

Education

These are the top 10 most-cited domains for education:

Research.com is the clear winner, but Reddit takes the No. 2 spot in this industry category.

Bonus: The most-cited Subreddit for education is: r/StudentNurse.

Automotive

These are the top 10 most-cited domains for automotive:

Reddit just edges out YouTube for the top spot here.

Bonus: The most-cited Subreddit for automotive is: r/whatcarshouldIbuy.

Personal & home services

These are the top 10 most-cited domains for personal & home services:

Yelp grabs first place while Reddit rounds out the top five in this industry.

Bonus: The most-cited Subreddit for personal & home services is: r/hvacadvice.

Here are the key takeaways for marketing teams

Don’t benchmark Reddit in isolation

If you only look at aggregate rankings, you can confuse “shows up everywhere” with “wins in your category.”

The more actionable view is industry-specific share (and then platform-specific, where relevant).

Vertical specialists are often the real opportunity

In many industries, the domains that matter most are the ones built for that industry's specific buyer journey (e.g., finance or travel).

Those are the domains that tend to overperform within an industry.

Commercial citations are more diverse than you might expect

When it comes to industry and commercial prompts, cited-domain distribution is highly diverse, with many niche and industry-specific sites appearing.

That diversity is a feature of the prompt mix: Questions more closely associated with the middle and bottom of the funnel often require specialized sources.

Tailor your citation strategy to you

Don’t blindly trust what others say—prove it.

Map out the prompts that matter to your business and identify which citation sources are truly important.

Build a citation strategy that actually fits your business

Sources like Reddit matter, but they shouldn’t give you tunnel vision.

Here’s what you should do instead:

1. Track citation results for business-relevant prompts

The first step is understanding which sites are cited for brand-relevant topics (and whether you or your competitors are mentioned).

Scrunch makes it easy to see which sources are cited for the prompts you care about—whether it’s your brand, a competitor, or a third party.
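
If you want a feel for what this kind of tracking involves, here's a rough sketch in Python. It is not how Scrunch works under the hood. The record layout (`prompt`, `citations`), the function names, and the sample URLs are all hypothetical, and the host parsing is deliberately naive.

```python
from collections import Counter
from urllib.parse import urlparse


def cited_domain(url):
    """Return the host of a citation URL, minus a leading 'www.'.

    Naive on purpose: other subdomains (e.g., 'old.') are NOT merged.
    """
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def domain_counts(responses):
    """Count how many responses cite each domain (at most once per response)."""
    counts = Counter()
    for resp in responses:
        counts.update({cited_domain(u) for u in resp["citations"]})
    return counts


# Hypothetical sample: each record is one AI response to a tracked prompt.
responses = [
    {"prompt": "best travel credit card",
     "citations": ["https://www.nerdwallet.com/a", "https://www.reddit.com/r/x/1"]},
    {"prompt": "best travel credit card",
     "citations": ["https://www.nerdwallet.com/b", "https://www.nerdwallet.com/c"]},
    {"prompt": "cheapest term life insurance",
     "citations": ["https://example.com/guide"]},
]

counts = domain_counts(responses)
for domain, n in counts.most_common():
    print(domain, n)
```

Counting each domain once per response keeps a single answer that links the same site three times from inflating that site's tally.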

2. Prioritize citation strategy based on influence

Whether your goal is getting mentioned in a cited source or beating it completely in the eyes of AI, it pays to prioritize your efforts based on the sources that truly dominate—the ones that are cited consistently over time.

Scrunch makes this easy, too. Our Influence Score tells you exactly which citation sources to tackle first by multiplying the number of unique prompts by the percentage of responses that have cited the source.
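
As a back-of-the-envelope illustration of that multiplication, here's a sketch in Python. It rests on two assumptions that the description above doesn't spell out: "unique prompts" means the distinct prompts for which the source was cited at least once, and the percentage is taken over all tracked responses. The function name, data layout, and sample values are hypothetical, and the product's actual calculation may differ.

```python
def influence_scores(responses):
    """Score each domain as (unique prompts citing it) x (share of responses citing it).

    responses: list of (prompt, set_of_cited_domains) pairs, one per AI response.
    This is an assumed reading of the formula, not the product's implementation.
    """
    total = len(responses)
    prompts_by_domain = {}
    hits_by_domain = {}
    for prompt, domains in responses:
        for d in domains:
            prompts_by_domain.setdefault(d, set()).add(prompt)
            hits_by_domain[d] = hits_by_domain.get(d, 0) + 1
    return {
        d: len(prompts_by_domain[d]) * hits_by_domain[d] / total
        for d in hits_by_domain
    }


# Hypothetical sample: four responses across three prompts.
sample = [
    ("p1", {"a.example", "b.example"}),
    ("p1", {"a.example"}),
    ("p2", {"a.example"}),
    ("p3", {"b.example"}),
]

scores = influence_scores(sample)
# a.example: 2 unique prompts x 3/4 of responses = 1.5
# b.example: 2 unique prompts x 2/4 of responses = 1.0
for domain in sorted(scores, key=scores.get, reverse=True):
    print(domain, scores[domain])
```

Both sources appear for two prompts here, but the one cited more consistently ranks first, which is the behavior you want from a prioritization metric.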

3. Partner or publish to close citation gaps

Once you have a prioritized list in hand, you can start working to secure placements on third-party sites that LLMs trust or creating and optimizing content to overtake dominant sources. In both cases, your focus should be on the exact sources that are moving the needle today.

And you don’t have to go it alone.

Scrunch partners with companies like Noble and Stacker to help you scale placements across third-party sites and generate organic placements via owned content.

Meanwhile, Scrunch's Deep AI Audit, Content Optimizer, and Agent Experience Platform (AXP) features make it easy to fine-tune content for AI consumption and deliver it directly to LLMs to drive better citation results.

Follow relevance, not just reach. Your AI search strategy will thank you.

See your brand’s real citation opportunities with Scrunch

Find out which citation sources matter most for your brand. Start a 7-day free trial or get in touch to see how Scrunch can help you optimize your citation strategy.


A quick note on our methodology

This analysis combines two ingredients:

  1. Observed citations from industry and commercial prompts: We looked at which domains were cited across tens of thousands of industry and commercial prompts (aka the kinds of questions people ask when they’re researching products and services or making purchase decisions) between December 1 and December 31, 2025.
  2. Demand-derived weights: We used millions of AI interactions from panel data and other sources over that time period to estimate how heavily different AI platforms and industries are represented, then reweighted our citation observations accordingly (a standard post-stratification approach). For example, if our demand data estimates that 10% of prompts are about media & entertainment, but only 5% of the industry and commercial prompts we observed are related to media & entertainment, we reweight so that citations for media & entertainment-related prompts carry double the weight.
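
The reweighting in the 10% versus 5% example can be sketched in a few lines of Python. This is a simplified illustration, not our production pipeline: the two-segment split, the function names, and the toy observations are assumptions made up for the example.

```python
def poststrat_weights(target_share, sample_share):
    """Weight per segment = estimated demand share / share in the observed sample."""
    return {seg: target_share[seg] / sample_share[seg] for seg in target_share}


def weighted_share(rows, weights, domain):
    """Weighted citation share of `domain`; rows are (segment, cited_domain) pairs."""
    num = sum(weights[seg] for seg, d in rows if d == domain)
    den = sum(weights[seg] for seg, _ in rows)
    return num / den


# Shares from the example above, with everything else lumped into "other".
target = {"media": 0.10, "other": 0.90}   # estimated real-world demand
sample = {"media": 0.05, "other": 0.95}   # mix in the observed prompts
weights = poststrat_weights(target, sample)
print(weights["media"])  # media & entertainment observations count double

# Toy observations (hypothetical): 1 media citation out of 20, i.e. 5% unweighted.
rows = [("media", "reddit.example")] + [("other", "x.example")] * 19
print(round(weighted_share(rows, weights, "reddit.example"), 3))
```

In this toy sample the media & entertainment citation is 5% of the raw observations, and the reweighted share comes out at 10%, matching the demand estimate.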

Some things to keep in mind:

  • Since we focused on industry and commercial prompts (i.e., the questions brands are most likely to care about and want to show up for) versus more “general knowledge” prompts, broad reference sites are expected to be less prominent in the results.
  • This is a single window of time (December 2025). Patterns may shift with seasonality, news cycles, model updates, and prompt mix.
  • Our industry categorization is based on prompt-topic mapping using internal models and is inherently an approximation of real-world category boundaries.