AI Automation · June 28, 2026

Fake AI Bot Traffic Is Lying to Your Shopify Store

81.8% of "AI assistant" traffic is fake bot crawls. Here's what Shopify brands must do right now to protect their analytics and revenue data.

Mark Cijo

Founder, GOSH Digital

Your analytics are lying to you. Not because your setup is broken — because AI crawlers are flooding your traffic reports, inflating numbers that look like growth but are actually noise.

A recent deep-dive by Duane Forrester at Search Engine Journal found that 81.8% of so-called "AI assistant" traffic on a new site was fake. Bot crawls masquerading as legitimate visits. And Googlebot? The verified number was even worse. ClaudeBot — Anthropic's crawler — outpaced actual Googlebot traffic. That's not a quirk. That's a structural shift in how AI systems are consuming the web, and your Shopify store is sitting directly in the crossfire.

Why This Actually Matters for Shopify Brands

Most DTC founders I talk to are making decisions off their analytics dashboard every single day. Which products are getting attention. Which pages are driving conversions. Where traffic is coming from. If a meaningful chunk of that traffic is crawlers — not humans — you're optimizing for ghosts.

Here's the real damage. Inflated session counts suppress your true conversion rate. If bots are hitting your product pages, your CVR looks lower than it actually is. You panic. You rebuild the page, rework the copy, swap the images. When the problem was never the page — it was the data.

I've seen brands cut ad spend because their "organic traffic" looked strong, only to realize most of it was AI crawlers indexing content. Real customers weren't showing up. Revenue stalled while the analytics looked healthy.

This isn't a fringe problem anymore. It's the new normal for any site with decent content or a strong ecommerce SEO presence.

The Scale of the Crawl Problem Right Now

Let me give you some context on what's actually happening out there.

AI companies are training and maintaining models constantly. They need fresh data. They send crawlers — ClaudeBot (Anthropic), GPTBot (OpenAI), Google-Extended, PerplexityBot, and dozens of smaller ones — to scrape the web. These bots hit your pages, parse your content, and leave. No purchase. No email signup. No revenue. Just a session count that makes your traffic look better than it is.

On new sites especially, bot traffic can represent the majority of total visits. That's what the SEJ data confirmed. And it's not just content sites — Shopify stores with active blogs, detailed product descriptions, or solid content marketing strategies are prime crawl targets.

The verified Googlebot number being worse than the AI assistant figure is the part that should make you stop and reread. Even "legitimate" crawler traffic from Google is frequently unverified in standard analytics setups. You think it's a real user. It isn't.
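If you do have request IPs to work with, one way to check a "Googlebot" claim is the reverse-then-forward DNS lookup that Google documents for verifying its crawlers. The sketch below is a minimal version under some assumptions: the function name and the stub resolvers are hypothetical, the sample IPs and hostnames are illustrative only, and the default lookups are IPv4-only and need network access.

```python
import socket

# Hostname suffixes Google documents for its crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` passes a reverse-then-forward DNS check.

    The lookups are injectable so the logic can be exercised offline;
    by default they use the standard library resolver (IPv4 only).
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the same IP, otherwise the
        # reverse record could simply be spoofed.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror (failed lookups).
        return False


# Offline demo with hypothetical stub resolvers (illustrative values).
stub_reverse = {
    "66.249.66.1": "crawl-66-249-66-1.googlebot.com",
    "203.0.113.9": "host.example.net",
    "198.51.100.4": "fake.googlebot.com",
}
stub_forward = {
    "crawl-66-249-66-1.googlebot.com": ["66.249.66.1"],
    "fake.googlebot.com": ["192.0.2.55"],
}

genuine = is_verified_googlebot("66.249.66.1", stub_reverse.__getitem__, stub_forward.__getitem__)
wrong_domain = is_verified_googlebot("203.0.113.9", stub_reverse.__getitem__, stub_forward.__getitem__)
spoofed = is_verified_googlebot("198.51.100.4", stub_reverse.__getitem__, stub_forward.__getitem__)
print(genuine, wrong_domain, spoofed)
```

A user agent string alone proves nothing, since anyone can send "Googlebot" in a header. The DNS round trip is what separates the real crawler from an impostor.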

How This Breaks Your Decision-Making

Let's be specific about where fake traffic corrupts your analysis.

Conversion Rate Inflation and Deflation

If bots are counted in your sessions, your denominator is wrong. A store doing 10,000 "sessions" with 200 purchases has a 2% CVR. But if 3,000 of those sessions are bots, your real CVR is 2.86%. That gap matters when you're benchmarking, when you're pitching investors, and especially when you're deciding whether your product page above the fold needs a redesign.
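The arithmetic is simple enough to script and rerun with your own numbers. This is a minimal sketch using the figures from the example above; the function and variable names are mine, not from any analytics tool.

```python
def conversion_rate(purchases: int, sessions: int) -> float:
    """Return the conversion rate as a percentage."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return 100.0 * purchases / sessions


reported_sessions = 10_000  # what the dashboard shows
bot_sessions = 3_000        # sessions identified as crawlers
purchases = 200

reported_cvr = conversion_rate(purchases, reported_sessions)
human_cvr = conversion_rate(purchases, reported_sessions - bot_sessions)

print(f"Reported CVR:   {reported_cvr:.2f}%")   # 2.00%
print(f"Human-only CVR: {human_cvr:.2f}%")      # 2.86%
```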

Paid Media ROAS Distortion

If you're running Meta or Google ads and your attribution model pulls from GA4 sessions, bot traffic that hits a landing page post-click can skew assisted conversion data. We've audited accounts where branded search traffic looked enormous, and a chunk of it was crawlers hitting pages after following internal links. Your ecommerce Google Ads setup only works if the underlying session data is clean.

Email Segmentation Errors

Klaviyo fires events based on page views. If you're using web tracking to build segments — "viewed X product but didn't buy" — bot crawls can trigger those events and dump fake profiles into your flows. Suddenly you're emailing ghosts, your open rates tank, and your Klaviyo deliverability takes a hit because you're mailing addresses that don't exist or don't engage.

What Real Traffic Verification Looks Like

Here's what we do for clients at GOSH Digital to separate signal from noise.

Step 1: Verify Your Bot Filtering in GA4

GA4 excludes known bots and spiders automatically, but the filter is not comprehensive. It relies on Google's own research and the IAB/ABC International Spiders and Bots list. That list doesn't include most AI crawlers. ClaudeBot, GPTBot, PerplexityBot: none of these are on that list. You need to handle them manually.

Unlike Universal Analytics, GA4 has no "exclude known bots and spiders" checkbox to confirm. The exclusion is always on, and GA4 doesn't show you how much traffic it removed. That's table stakes. Then go further.

Step 2: Build a Custom Bot Exclusion Regex in GA4

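GA4 doesn't expose the raw user agent as a built-in dimension, so this takes one extra step. Either capture the user agent yourself (for example, with a Google Tag Manager variable sent as a custom dimension) or stop your tags from firing when it matches. In both cases, build a regex that matches known AI crawler user agents and keep those hits out of your reporting. The major ones right now: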

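| Crawler | User Agent String |
| --- | --- |
| GPTBot | GPTBot |
| ClaudeBot | ClaudeBot |
| Google-Extended | Google-Extended |
| PerplexityBot | PerplexityBot |
| CCBot (Common Crawl) | CCBot |
| Amazonbot | Amazonbot |

Note that Google-Extended is a robots.txt control token, not a string that appears in request headers. Google crawls with its existing user agents, so you won't see "Google-Extended" in analytics or logs.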

This doesn't block them from crawling — it just stops them from polluting your analytics. That's the right approach for SEO reasons: you don't necessarily want to block AI crawlers at the server level if you want your content showing up in AI answers.
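As a sketch of the matching logic, here is the table turned into a single case-insensitive pattern. The function name and the sample user agent strings are illustrative assumptions; real crawler strings vary by version, so test against what your own logs show.

```python
import re

# Tokens from the table above. Google-Extended is a robots.txt token and
# is unlikely to appear in request headers; it is kept here only so the
# pattern mirrors the table.
AI_CRAWLER_PATTERN = re.compile(
    r"GPTBot|ClaudeBot|Google-Extended|PerplexityBot|CCBot|Amazonbot",
    re.IGNORECASE,
)


def is_ai_crawler(user_agent: str) -> bool:
    """Return True if the user agent contains a known AI crawler token."""
    return bool(AI_CRAWLER_PATTERN.search(user_agent or ""))


# Illustrative user agent strings, not copied from live traffic.
crawler_ua = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0)"
shopper_ua = "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) Safari/604.1"

print(is_ai_crawler(crawler_ua))  # True
print(is_ai_crawler(shopper_ua))  # False
```

The same alternation can be pasted into any tool that accepts a regex, which keeps your exclusion list in one place as new crawlers appear.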

Step 3: Audit Your Server Logs

This is what the SEJ piece got right. Standard analytics will never give you the full picture. Your server logs will. Pull the raw log data from your hosting or Shopify Plus infrastructure and look at the actual user agent strings hitting your store. What percentage of requests are identifiable bots? What pages are they hammering?

If you're on standard Shopify (not Plus), you won't have direct server log access. Use a CDN like Cloudflare — even the free tier — to get real bot traffic data. Cloudflare's analytics show verified human traffic vs. automated traffic. The delta is almost always eye-opening.
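If you can get raw request logs or a comparable export, the two questions above (what share is bots, which pages they hammer) take only a few lines to answer. This sketch assumes Combined Log Format, where the user agent is the last quoted field; the sample lines are made up, and your export's layout may differ.

```python
import re
from collections import Counter

# Combined Log Format: "METHOD path protocol" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)
BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "Amazonbot")


def summarize(lines):
    """Return (bot share of requests, hits per bot, bot hits per path)."""
    total = 0
    hits_by_bot = Counter()
    hits_by_path = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if match is None:
            continue  # skip lines that are not in the expected format
        total += 1
        ua = match.group("ua").lower()
        token = next((t for t in BOT_TOKENS if t.lower() in ua), None)
        if token is not None:
            hits_by_bot[token] += 1
            hits_by_path[match.group("path")] += 1
    bot_share = sum(hits_by_bot.values()) / total if total else 0.0
    return bot_share, hits_by_bot, hits_by_path


# Made-up sample lines for illustration.
SAMPLE_LOG = [
    '203.0.113.5 - - [28/Jun/2026:10:00:00 +0000] "GET /products/whey HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.6 - - [28/Jun/2026:10:00:02 +0000] "GET /products/whey HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [28/Jun/2026:10:00:05 +0000] "GET /products/whey HTTP/1.1" 200 5120 "https://example.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) Safari/604.1"',
    '198.51.100.8 - - [28/Jun/2026:10:00:09 +0000] "GET /cart HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0 Safari/537.36"',
]

bot_share, hits_by_bot, hits_by_path = summarize(SAMPLE_LOG)
print(f"Bot share of requests: {bot_share:.0%}")
print(hits_by_bot.most_common())
print(hits_by_path.most_common(3))
```

This only counts bots that identify themselves. Crawlers that spoof a browser user agent need IP or behavior checks, which is where a CDN's bot scoring helps.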

Step 4: Cross-Reference with Shopify Analytics

Shopify's native analytics count sessions differently from GA4. Shopify strips known bots more aggressively at the platform level. If your Shopify session count is significantly lower than GA4, you've found your bot traffic gap. For one client in the supplements space, GA4 showed 47,000 monthly sessions. Shopify showed 31,000. That's 34% of traffic that was noise — and it was reshaping every conversion rate calculation they made.
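To make that comparison repeatable month over month, the gap can be computed the same way each time. A minimal sketch using the client figures above; the function name is an assumption, and the two platforms define sessions differently, so treat the result as a rough signal, not an exact bot count.

```python
def session_gap(ga4_sessions: int, shopify_sessions: int) -> float:
    """Share of GA4 sessions that Shopify's own count does not account for."""
    if ga4_sessions <= 0:
        raise ValueError("ga4_sessions must be positive")
    return (ga4_sessions - shopify_sessions) / ga4_sessions


gap = session_gap(ga4_sessions=47_000, shopify_sessions=31_000)
print(f"Unexplained share of GA4 sessions: {gap:.1%}")  # 34.0%
```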

Should You Block AI Crawlers Entirely?

I take a clear position here: no. Not wholesale.

Here's why. AI-driven discovery is becoming a real acquisition channel. If you're not thinking about how your products show up in ChatGPT, Claude, or Perplexity responses, you're ignoring a channel that's growing fast. The AI agents reshaping SEO and organic acquisition aren't going away — they're accelerating.

Blocking GPTBot and ClaudeBot might clean your analytics. But it also removes your chance of appearing in AI-generated answers when someone asks "what's the best protein powder for endurance athletes" or "where can I buy handmade leather wallets."

What you should block — or rate-limit — are the scrapers that hit you dozens of times per hour with no conceivable indexing purpose. That's different from letting legitimate AI crawlers index your content on a reasonable schedule. Protecting your Shopify cart and checkout from AI bots is a different problem from managing crawler traffic to content pages — don't conflate the two.
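One way to find rate-limiting candidates is to count requests per client per clock hour and flag anything over a ceiling. The sketch below assumes you already have (client, timestamp) pairs from logs or a CDN export. The threshold of 24 per hour and the client names are hypothetical placeholders, not recommendations.

```python
from collections import Counter
from datetime import datetime, timedelta


def flag_heavy_clients(requests, max_per_hour=24):
    """Return clients that exceed max_per_hour in any single clock hour.

    `requests` is an iterable of (client_id, datetime) pairs, where
    client_id might be a user agent token or an IP address.
    """
    per_hour = Counter()
    for client_id, seen_at in requests:
        hour = seen_at.replace(minute=0, second=0, microsecond=0)
        per_hour[(client_id, hour)] += 1
    return sorted({client for (client, _), n in per_hour.items() if n > max_per_hour})


# Synthetic traffic: one scraper hitting every 30 seconds for half an hour,
# one crawler making three requests across the same hour.
start = datetime(2026, 6, 28, 10, 0, 0)
sample = [("ExampleScraperBot", start + timedelta(seconds=30 * i)) for i in range(60)]
sample += [("ClaudeBot", start + timedelta(minutes=20 * i)) for i in range(3)]

heavy = flag_heavy_clients(sample)
print(heavy)  # ['ExampleScraperBot']
```

Fixed clock-hour buckets are the simplest option. A sliding window would catch bursts that straddle the top of the hour, at the cost of more bookkeeping.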

The right robots.txt strategy is selective. Allow GPTBot. Allow ClaudeBot. Keep every crawler out of your cart and checkout endpoints. Credential-stuffing bots and aggressive scrapers ignore robots.txt entirely, so handle those with firewall rules or rate limiting instead. Those are functionally different threats.
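Before publishing a selective policy, it helps to check that it behaves the way you expect. This sketch feeds a hypothetical robots.txt to Python's standard `urllib.robotparser`. The rules and the `ExampleScraperBot` name are illustrative assumptions, and on Shopify the live file is generated from the `robots.txt.liquid` theme template, so adapt the rules there.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: AI crawlers may read content pages but not cart or
# checkout; one named scraper is asked to stay out entirely.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /cart
Disallow: /checkout

User-agent: ExampleScraperBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

checks = [
    ("GPTBot", "/blogs/news/protein-guide"),
    ("ClaudeBot", "/checkout"),
    ("ExampleScraperBot", "/products/leather-wallet"),
]
results = {(agent, path): parser.can_fetch(agent, path) for agent, path in checks}
for (agent, path), allowed in results.items():
    print(f"{agent:18} {path:28} {'allow' if allowed else 'disallow'}")
```

Keep in mind that robots.txt is a request, not an enforcement mechanism. It shapes the behavior of crawlers that choose to honor it, and nothing more.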

Fixing Your Reporting Stack Right Now

Here's the priority order I'd give any Shopify brand dealing with this.

First, get Cloudflare in front of your store. Even on the free tier, the traffic insights alone are worth it. You'll see human vs. bot traffic, challenged requests, and which user agents are hitting your store most aggressively.

Second, create a clean filtered report in GA4. GA4 has no views, so leave the property's raw data untouched (you want it preserved) and build a saved Exploration or report filter that excludes traffic you've identified as known AI crawlers. Make all your day-to-day decisions from the filtered report.

Third, audit your Klaviyo web tracking events. Check your "Active on Site" metric against your real session counts. If there's a major discrepancy, you likely have bot-triggered events inflating your web tracking. This feeds into your Klaviyo browse abandonment flows — if bots are triggering "viewed product" events, you're sending flows to non-humans.

Fourth, reconcile your KPIs. Every metric you report — CVR, traffic, channel attribution — should be recalculated against verified human sessions. Your ecommerce analytics baseline needs to be reset.

The Bigger Picture: What AI Traffic Means for DTC in 2026

This isn't just a data hygiene issue. It's a signal about where the internet is going.

AI systems are consuming web content at a scale that would have been unimaginable three years ago. The web is increasingly read by machines before humans even see it. That changes how you think about content, about your ecommerce customer data strategy, and about what "traffic" even means.

The brands that win over the next two years won't just optimize for Google rankings. They'll optimize for AI comprehension — structured data, clear product information, authoritative content that AI models trust enough to cite. We're already building this into our SEO work for clients, and the results track directly with how Google AI Overviews have shifted organic behavior over the last 18 months.

Clean data is the foundation. You can't make smart decisions on bad numbers. And right now, if you haven't audited your traffic for AI crawler inflation, your numbers are bad.

What to Do This Week

Don't let this become something you'll "get to eventually." Here's your three-day action plan.

Day one: Install or check Cloudflare on your store. Pull the bot traffic report. See the gap.

Day two: Audit GA4. Known-bot exclusion is automatic there, so focus on what it misses. Build a filtered Exploration segment excluding the major AI crawler user agents listed in the table above.

Day three: Cross-reference your top-line metrics — CVR, sessions, channel mix — against the cleaned data. Identify which decisions you've made in the last 90 days were based on inflated traffic. Adjust your benchmarks.

If you're spending serious money on paid media and making creative or budget decisions based on analytics that include 30-40% bot traffic, you're flying blind. Measuring ad creative performance for DTC only works when the session and conversion data underneath it is real.

Get your data clean. Everything else follows from that.


If you want a proper audit of your Shopify analytics stack — including bot traffic separation, GA4 configuration, and Klaviyo event hygiene — that's exactly the kind of work we do at GOSH Digital. Reach out and we'll tell you within 48 hours whether your numbers are trustworthy.

Written by Mark Cijo

Founder of GOSH Digital. Klaviyo Gold Partner. Helping eCommerce brands grow revenue through data-driven marketing.
