Most ecommerce teams track Google rankings religiously but have zero visibility into whether AI shopping agents can find, parse, and recommend their products. A Digital Applied study analyzing 23,000+ LLM citations found that 92% of brands are invisible in AI search results. The problem is not just optimization. It is measurement. Without a structured scorecard, you cannot fix what you are not tracking.
This guide defines 8 specific metrics that determine your store’s AI discoverability health. For each metric, you get a free tool to measure it, a benchmark to aim for, and a fix when you fall short. Run the full scorecard once, then track weekly. The entire audit takes under two hours the first time and under 30 minutes on follow-ups.
The 8-Metric AI Discoverability Scorecard
| # | Metric | What It Measures | Free Tool | Passing Score |
|---|---|---|---|---|
| 1 | AI Crawler Access | Can agents reach your pages | robots.txt check + log grep | 100% of product URLs accessible |
| 2 | Content Extractability | Do agents see text or blank pages | Jina Reader API | Key content present in extract |
| 3 | Schema Coverage | Product markup on catalog pages | Schema.org Validator | 100% of products have valid JSON-LD |
| 4 | Feed Health | Product data accuracy and freshness | Google Merchant Center Diagnostics | 0 critical errors |
| 5 | Citation Rate | How often agents recommend your store | Manual query tracking | 30%+ for branded queries |
| 6 | AI Referral Traffic | Visitors arriving from AI platforms | Google Analytics 4 (Traffic Acquisition report) | Growing month over month |
| 7 | Cross-Platform Visibility | Consistent presence across ChatGPT, Gemini, Perplexity | Cross-platform query test | Visible on 2+ platforms |
| 8 | Answer Readiness | First-paragraph answer quality | Content preview + AI self-test | Direct answer in opening paragraph |
Each metric below includes the exact steps to test it, what the output means, and what to fix when you score below the benchmark.
Metric 1: AI Crawler Access
What It Means
AI shopping agents from OpenAI, Google, Anthropic, and Perplexity must be able to crawl your product pages before they can recommend anything. A misconfigured robots.txt, aggressive rate limiting, or bot-blocking CDN rules can silently block all AI agents from your store.
How to Test
Tool: robots.txt Tester + curl
Step 1: Check your robots.txt for AI crawler directives.
curl -s https://yourstore.com/robots.txt
Look for these user agents: ChatGPT-User, Google-Extended, CCBot (Common Crawl), ClaudeBot, PerplexityBot, Bytespider (ByteDance). If any are set to Disallow: /, that agent cannot access your store.
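If you would rather not read the directives by eye, the check can be sketched with Python's standard-library robots.txt parser. This is a minimal sketch, not a definitive audit: the robots.txt content below is a made-up sample (paste your own curl output in its place), and the function name `blocked_agents` is illustrative.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in Step 1.
AI_AGENTS = ["ChatGPT-User", "Google-Extended", "CCBot",
             "ClaudeBot", "PerplexityBot", "Bytespider"]

def blocked_agents(robots_txt: str, url: str) -> list[str]:
    """Return the AI agents that robots.txt forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]

# Hypothetical robots.txt content, standing in for your curl output.
SAMPLE = """\
User-agent: *
Allow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /checkout
"""

print(blocked_agents(SAMPLE, "https://yourstore.com/products/example"))
# prints ['ClaudeBot']
```

An empty list means no listed agent is disallowed for that URL. It says nothing about CDN or firewall blocks, which is what Step 2 tests.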
Step 2: Test actual page access with curl.
curl -s -o /dev/null -w "%{http_code}" -A "ChatGPT-User" https://yourstore.com/products/example
A 200 response means the page is accessible. A 403 or 503 means something is blocking the request.
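Step 3: Grep your server logs for AI crawler activity (if you have access). Google-Extended is a robots.txt control token rather than a request user agent, so it will not show up in logs and is left out of the pattern.
grep -i "ChatGPT-User\|ClaudeBot\|PerplexityBot\|CCBot" /var/log/nginx/access.log | tail -50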
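Raw grep output gets hard to scan once crawlers visit often. The sketch below tallies hits per crawler and HTTP status instead, assuming the common "combined" access log layout where the status code follows the quoted request line. The sample log lines and the function name `count_ai_hits` are invented for illustration.

```python
import re
from collections import Counter

# Substrings to look for in each log line (case-insensitive).
# Google-Extended is omitted: it is a robots.txt token, not a request user agent.
AI_CRAWLERS = ["ChatGPT-User", "ClaudeBot", "PerplexityBot", "CCBot"]
PATTERN = re.compile("|".join(re.escape(name) for name in AI_CRAWLERS), re.IGNORECASE)

def count_ai_hits(log_lines):
    """Count requests per (crawler, HTTP status) in combined-format log lines."""
    canonical = {name.lower(): name for name in AI_CRAWLERS}
    hits = Counter()
    for line in log_lines:
        agent = PATTERN.search(line)
        status = re.search(r'" (\d{3}) ', line)  # status follows the quoted request
        if agent and status:
            hits[(canonical[agent.group(0).lower()], status.group(1))] += 1
    return hits

# Made-up sample lines; in practice, iterate over open("/var/log/nginx/access.log").
SAMPLE_LOG = [
    '203.0.113.7 - - [04/May/2026:10:00:00 +0000] "GET /products/example HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.8 - - [04/May/2026:10:00:05 +0000] "GET /products/example HTTP/1.1" 403 310 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.2 - - [04/May/2026:10:00:09 +0000] "GET /cart HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

for (crawler, status), n in sorted(count_ai_hits(SAMPLE_LOG).items()):
    print(crawler, status, n)
```

Any crawler whose rows are mostly 403 or 503 is being blocked somewhere between your CDN and your origin.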
Passing Score
All AI crawler user agents get 200 responses on product pages. Your robots.txt either allows all crawlers or explicitly allows the AI agents listed above.
What to Fix
If blocked: Remove Disallow rules for AI user agents. If your CDN (Cloudflare, Fastly) has bot protection set to “challenge” or “block” for automated traffic, add the specific AI user agents to an allowlist. Shopify stores: check that your theme does not include a custom robots.txt.liquid template that overrides Shopify’s default rules (which do not block product pages).
For a deeper analysis of crawler access patterns, see the AI crawler log analysis guide which covers building a full monitoring dashboard from your server logs.
Metric 2: Content Extractability
What It Means
Even when AI crawlers can reach your pages, they might extract nothing useful. JavaScript-only rendering, text trapped inside images, and content hidden behind interactive widgets all produce empty extracts. If Jina Reader cannot pull your product data from a page, neither can ChatGPT.
How to Test
Tool: Jina Reader API (free)
curl -s "https://r.jina.ai/https://yourstore.com/products/example" | head -100
Jina Reader returns a clean text extraction similar to what many AI agents work from. Look for:
- Product name in the first few lines
- Price visible as a number
- Product description with actual detail (not “loading…” or empty)
- Variant information (sizes, colors)
- Any JSON-LD structured data at the end of the extract
Passing Score
Product name, price, description, and at least one product image URL are clearly present in the Jina extract. No “loading” placeholders or empty sections.
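When you check more than a handful of pages, a rough automated pass helps. The sketch below scans a saved extract for the signals in the passing score. The regular expressions and the placeholder list are heuristics chosen for illustration, not part of Jina Reader, and whether the description has real detail still needs a human read.

```python
import re

# Strings that suggest the page never finished rendering (illustrative list).
PLACEHOLDERS = ("loading", "please enable javascript")

def check_extract(text: str, product_name: str) -> dict[str, bool]:
    """Report which passing-score signals appear in a plain-text page extract."""
    lowered = text.lower()
    return {
        "name": product_name.lower() in lowered,
        # A currency symbol or code next to a number, e.g. "$29.99" or "29.99 USD".
        "price": bool(re.search(
            r"[$€£]\s?\d+(?:[.,]\d{2})?|\d+(?:[.,]\d{2})?\s?(?:USD|EUR|GBP)", text)),
        # A URL ending in a common image extension.
        "image": bool(re.search(r"https?://\S+\.(?:jpg|jpeg|png|webp)", lowered)),
        "no_placeholder": not any(p in lowered for p in PLACEHOLDERS),
    }

# Hypothetical extracts standing in for real Jina Reader output.
GOOD = ("Merino Base Layer\nPrice: $29.99\n"
        "![Image 1](https://cdn.example.com/base-layer.jpg)\n"
        "A 100% merino wool base layer for cold-weather hiking.")
BAD = "Merino Base Layer\nLoading..."

print(check_extract(GOOD, "Merino Base Layer"))
print(check_extract(BAD, "Merino Base Layer"))
```

A page where every value is True clears the automated part of this metric. Any False value points at the element to look for in the raw extract.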
What to Fix
If content is missing: Your page likely relies on client-side JavaScript rendering. Solutions include:
- Server-side rendering (SSR): Render product data in the initial HTML response, not via JavaScript.
- Pre-rendering: Use a service like Prerender.io or Rendertron to serve static HTML to crawlers.
- Embedded JSON-LD: Even if your visual content is JS-rendered, embed a complete Product JSON-LD block in the raw HTML. AI agents parse JSON-LD separately from visible text.
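For the embedded JSON-LD option, the block can be generated server-side from the product record you already have. A minimal sketch, assuming a simple product with one offer; the function name and all sample values are hypothetical, and your platform may already provide a helper for this.

```python
import json

def product_jsonld_tag(name, description, image_url, sku, price, currency, in_stock):
    """Build a <script> tag holding a schema.org Product block for the raw HTML."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "image": image_url,
        "sku": sku,
        "offers": {
            "@type": "Offer",
            "price": price,  # a number, not a string
            "priceCurrency": currency,
            "availability": ("https://schema.org/InStock" if in_stock
                             else "https://schema.org/OutOfStock"),
        },
    }
    # Escape "</" so text inside the JSON cannot close the script element early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'

print(product_jsonld_tag(
    "Merino Base Layer", "100% merino wool base layer.",
    "https://cdn.example.com/base-layer.jpg", "MBL-001", 29.99, "USD", True))
```

Render the returned tag into the initial HTML response so it is present even when the visible page is built by JavaScript.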
For a full walkthrough of content preview tools, see the AI content preview tools guide which covers Jina Reader, reader mode testing, and structured data extraction in detail.
Metric 3: Schema Coverage
What It Means
Product schema (JSON-LD) is the structured data layer that tells AI agents exactly what your product is, what it costs, and whether it is in stock. Without valid schema, AI agents must guess, and they often guess wrong or skip your product entirely.
How to Test
Tool: Schema.org Validator (validator.schema.org)
Step 1: Open validator.schema.org.
Step 2: Enter your product page URL and click “Run Test.”
Step 3: Check for:
- A Product type with name, offers, image, and description fields
- offers.price as a number (not a string)
- offers.availability as a valid https://schema.org/ URL
- sku or gtin13/mpn identifier present
- aggregateRating if you have reviews
Step 4: Repeat for 5-10 product pages across your catalog, including:
- A top-selling product
- A newly launched product
- A variant-heavy product (multiple sizes/colors)
- A product on sale
Passing Score
100% of tested product pages have valid Product JSON-LD with at minimum: name, offers (price + availability), image, and a product identifier (GTIN, SKU, or MPN). Zero validation errors.
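The minimum above can also be checked in code once you have the JSON-LD parsed into a dictionary. This sketch covers only this scorecard's minimum fields; it is not a substitute for the Schema.org Validator, and the function name `schema_problems` is illustrative.

```python
IDENTIFIER_KEYS = ("sku", "mpn", "gtin", "gtin8", "gtin12", "gtin13", "gtin14")

def schema_problems(product: dict) -> list[str]:
    """List the ways a parsed Product JSON-LD object misses the passing score."""
    problems = []
    if product.get("@type") != "Product":
        problems.append("@type must be exactly 'Product'")
    for field in ("name", "image"):
        if not product.get(field):
            problems.append(f"missing {field}")
    if not any(product.get(key) for key in IDENTIFIER_KEYS):
        problems.append("missing product identifier (GTIN, SKU, or MPN)")
    offers = product.get("offers") or {}
    if isinstance(offers, list):  # schema.org allows several offers; check the first
        offers = offers[0] if offers else {}
    if not isinstance(offers, dict):
        offers = {}
    price = offers.get("price")
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        problems.append("offers.price missing or not a number")
    if not str(offers.get("availability", "")).startswith("https://schema.org/"):
        problems.append("offers.availability is not a https://schema.org/ URL")
    return problems

# Hypothetical examples: one passing block, one with common mistakes.
GOOD = {"@type": "Product", "name": "Merino Base Layer", "sku": "MBL-001",
        "image": "https://cdn.example.com/base-layer.jpg",
        "offers": {"price": 29.99, "availability": "https://schema.org/InStock"}}
BAD = {"@type": "product", "name": "Merino Base Layer",
       "offers": {"price": "29.99", "availability": "InStock"}}

print(schema_problems(GOOD))
print(schema_problems(BAD))
```

An empty list means the page meets the minimum for this metric; each string in a non-empty list names one fix.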
What to Fix
Missing schema: Shopify auto-generates Product JSON-LD for most themes, but custom themes or headless builds may not. WooCommerce requires a plugin (Yoast SEO or Rank Math both add Product schema). Custom platforms need manual JSON-LD injection.
Invalid schema: Common fixes include changing price from a string ("29.99") to a number (29.99), using the correct availability URL format (https://schema.org/InStock not just "InStock"), and ensuring the @type is exactly Product (not product or Products).
The product schema markup guide covers the complete schema specification for AI agent compatibility, including variant handling and aggregate rating markup.
Metric 4: Feed Health
What It Means
Your product feed (Google Merchant Center RSS/XML, Shopify product JSON, or custom feed) is the structured data pipeline that feeds directly into Google Shopping Graph and other AI shopping systems. Feed errors are the single biggest technical cause of AI invisibility.
According to Pragma’s 2026 study, 41% of ecommerce product feeds contain at least one critical error. The most common: missing GTIN (28%), incorrect availability signals (19%), and malformed structured data (17%).
How to Test
Tool: Google Merchant Center Feed Diagnostic + Shopify Feed Validator
For Google Merchant Center feeds:
- Log into Merchant Center
- Navigate to Products > Diagnostics
- Check for errors and warnings
- Focus on critical errors first: missing GTIN, price mismatch, image quality issues
For Shopify stores, validate your product feed output:
curl -s "https://yourstore.com/products.json?limit=10" | python3 -m json.tool | head -100
Check that each product has: title, body_html (description), variants with price, images, and vendor.
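The same field check can be scripted against the saved JSON. A minimal sketch, assuming the usual Shopify layout of a top-level `products` array in which the description is exposed as `body_html`; the sample payload and the function name `feed_gaps` are made up.

```python
import json

REQUIRED = ("title", "body_html", "vendor")

def feed_gaps(products_json: str) -> dict[str, list[str]]:
    """Map each product handle to the fields it is missing."""
    gaps = {}
    for product in json.loads(products_json)["products"]:
        missing = [field for field in REQUIRED if not product.get(field)]
        variants = product.get("variants") or []
        if not variants or any(not v.get("price") for v in variants):
            missing.append("variants.price")
        if not product.get("images"):
            missing.append("images")
        if missing:
            gaps[product.get("handle", "?")] = missing
    return gaps

# Hypothetical payload standing in for the curl output.
SAMPLE = json.dumps({"products": [
    {"handle": "merino-base-layer", "title": "Merino Base Layer",
     "body_html": "<p>100% merino wool.</p>", "vendor": "Example Co",
     "variants": [{"price": "29.99"}],
     "images": [{"src": "https://cdn.example.com/a.jpg"}]},
    {"handle": "plain-tee", "title": "Plain Tee", "body_html": "",
     "vendor": "Example Co", "variants": [{"price": "12.00"}], "images": []},
]})

print(feed_gaps(SAMPLE))
# prints {'plain-tee': ['body_html', 'images']}
```

An empty dictionary means every product in the sampled page of the feed carries the fields listed above.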
For XML/RSS feeds:
curl -s "https://yourstore.com/feed" | xmllint --noout -
Passing Score
Zero critical errors in your feed. All products have titles, prices, images, availability status, and at least one product identifier (GTIN, MPN, or brand+MPN combination).
What to Fix
Missing GTINs: Add them in your product management system. If you sell custom/private-label products without GTINs, submit a GTIN exemption in Merchant Center.
Stale data: Ensure your feed refreshes at least daily. Shopify auto-refreshes, but custom feeds need a cron job or webhook-triggered update.
Image quality: AI agents prefer images with clear product visibility. Avoid lifestyle-only images as primary feed images. Include at least one clean product-on-white photo.
The product feed validator guide provides a complete feed testing workflow with error interpretation for each validator.
Metric 5: Citation Rate
What It Means
Citation rate measures how often AI agents actually name your store or products in their recommendations. This is the ultimate output metric. Everything else on the scorecard is an input. Citation rate is the result.
How to Test
Tool: Manual query tracking (spreadsheet)
There is no reliable automated citation tracker yet. Testing requires querying AI platforms directly and recording results.
Step 1: Create a spreadsheet with these columns:
| Query | Platform | Your Store Cited? | Competitor Cited | Position | Date |
|---|---|---|---|---|---|
Step 2: Run these query types on ChatGPT, Google Gemini, and Perplexity:
- Branded: “What do you know about [your brand]?”
- Product category: “Best [your product category] for [use case]”
- Comparison: “[Your product] vs [competitor product]”
- Purchase intent: “Where can I buy [product type] online?”
Step 3: Record whether your store appears, which competitors appear, and whether the AI attributes specific products to you.
Step 4: Repeat weekly. Track trends over time.
Passing Score
Your store appears in at least 30% of branded queries and at least 10% of category/purchase-intent queries. If you are a smaller brand, focus on branded and niche category queries first.
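Once the spreadsheet has a few weeks of rows, the rate per query type is simple arithmetic. The sketch below computes it and compares it with the benchmarks above. The row layout, the query-type labels, and the sample data are assumptions for illustration.

```python
from collections import defaultdict

# Benchmarks from the passing score: 30% branded, 10% category/purchase intent.
TARGETS = {"branded": 0.30, "category": 0.10, "purchase": 0.10}

def citation_rates(rows):
    """rows: (query_type, platform, cited) tuples copied from the spreadsheet."""
    cited, total = defaultdict(int), defaultdict(int)
    for query_type, _platform, was_cited in rows:
        total[query_type] += 1
        cited[query_type] += int(was_cited)
    return {qt: cited[qt] / total[qt] for qt in total}

def passes(rates):
    """True per query type when the observed rate meets its target."""
    return {qt: rates[qt] >= TARGETS[qt] for qt in rates if qt in TARGETS}

# Made-up week of results: one branded and one category query on three platforms.
ROWS = [
    ("branded", "ChatGPT", True), ("branded", "Gemini", False),
    ("branded", "Perplexity", False),
    ("category", "ChatGPT", False), ("category", "Gemini", False),
    ("category", "Perplexity", False),
]

rates = citation_rates(ROWS)
print(rates)
print(passes(rates))
```

With only a handful of queries per week the percentages swing widely, so judge the trend across several weeks rather than a single run.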
What to Fix
Low citation rate despite good technical scores: The issue is likely content quality. AI agents cite stores with detailed, opinion-rich product descriptions, comparison data, and genuine expertise. A Digital Applied study found that opinion-rich prose generated a 47% citation lift compared to schema-only optimization (3.1% lift).
Write product descriptions that answer specific questions: “This 100% merino wool base layer is ideal for temperatures between -5 and 10 degrees Celsius” is citable. “Premium quality base layer for outdoor enthusiasts” is not.
For automated tracking approaches, the AI answer monitoring tools guide covers the emerging tools that can partially automate this process.
Metric 6: AI Referral Traffic
What It Means
Citations matter, but traffic proves that real humans are clicking through from AI answers. Tracking AI referral traffic in analytics tells you whether AI recommendations are driving actual store visits.
How to Test
Tool: Google Analytics 4
Step 1: In GA4, go to Reports > Acquisition > Traffic Acquisition.
Step 2: Look for these source/medium combinations:
- chatgpt.com / referral
- perplexity.ai / referral
- gemini.google.com / referral
- copilot.microsoft.com / referral
- ai / organic (catch-all for some AI-driven traffic)
Step 3: Filter by landing page to see which product pages AI traffic reaches.
Step 4: Compare month-over-month. Even small absolute numbers matter if the trend is upward.
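The month-over-month comparison can be sketched as follows, assuming you have exported sessions by source from GA4 for two months. The source names are the referral domains listed in Step 2; the session counts are invented.

```python
# Referral domains treated as AI platforms (from Step 2).
AI_SOURCES = {"chatgpt.com", "perplexity.ai", "gemini.google.com", "copilot.microsoft.com"}

def ai_sessions(sessions_by_source: dict[str, int]) -> int:
    """Total sessions whose source is one of the AI platforms."""
    return sum(n for source, n in sessions_by_source.items() if source in AI_SOURCES)

def month_over_month(previous: dict[str, int], current: dict[str, int]):
    """Return (previous_total, current_total, growth); growth is None when previous is 0."""
    prev_total, cur_total = ai_sessions(previous), ai_sessions(current)
    growth = None if prev_total == 0 else (cur_total - prev_total) / prev_total
    return prev_total, cur_total, growth

# Hypothetical exports for two consecutive months.
APRIL = {"chatgpt.com": 40, "perplexity.ai": 10, "google": 5000}
MAY = {"chatgpt.com": 55, "perplexity.ai": 15, "gemini.google.com": 5, "google": 5100}

print(month_over_month(APRIL, MAY))
# prints (50, 75, 0.5)
```

A growth value of 0.5 means AI sessions rose 50% over the previous month. None means there was no AI traffic to compare against yet.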
Passing Score
AI referral traffic is growing month over month. Even 50-100 sessions/month from AI sources is meaningful in 2026, as this channel is still early and growing rapidly. BrightEdge reported that AI search result volumes grew 850% between mid-2024 and early 2025, and that growth continues.
What to Fix
Zero AI traffic: Run metrics 1-4 first. If crawlers cannot access your pages or content is not extractable, you will get zero referrals regardless of citation rate.
Traffic but no conversions: Check that your landing pages load fast on mobile and that the product the AI recommended is prominently featured. AI traffic often lands on specific product pages, not the homepage.
For understanding the quality gap between AI traffic and traditional search traffic, the AI referral traffic quality study compares conversion rates across ChatGPT, Perplexity, and Google.
Metric 7: Cross-Platform Visibility
What It Means
Being visible on ChatGPT but invisible on Gemini and Perplexity means you are missing a growing portion of AI-driven shopping. Each platform has its own content sources, crawl patterns, and ranking logic. Cross-platform visibility measures your consistency.
How to Test
Tool: Cross-platform query matrix
Run the same 5 product-related queries on all three major AI platforms and record presence/absence:
| Query | ChatGPT | Gemini | Perplexity |
|---|---|---|---|
| Branded query | ? | ? | ? |
| Category query | ? | ? | ? |
| Comparison query | ? | ? | ? |
| Purchase query | ? | ? | ? |
| Local query | ? | ? | ? |
Score each cell: 1 = cited, 0 = not cited. Divide total by 15 (5 queries x 3 platforms).
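The matrix arithmetic is small enough to do by hand, but a short sketch keeps the weekly numbers consistent. The sample matrix is made up, and the function names are illustrative.

```python
PLATFORMS = ("ChatGPT", "Gemini", "Perplexity")

def visibility_score(matrix: dict[str, dict[str, int]]) -> float:
    """matrix[query][platform] is 1 if cited, else 0. Returns cited cells / all cells."""
    cells = [matrix[query].get(platform, 0) for query in matrix for platform in PLATFORMS]
    return sum(cells) / len(cells)

def platforms_citing(matrix: dict[str, dict[str, int]], query: str) -> int:
    """Number of platforms that cited the store for one query."""
    return sum(matrix[query].get(platform, 0) for platform in PLATFORMS)

# Hypothetical results for the five query types in the table.
MATRIX = {
    "branded":    {"ChatGPT": 1, "Gemini": 1, "Perplexity": 0},
    "category":   {"ChatGPT": 1, "Gemini": 0, "Perplexity": 0},
    "comparison": {"ChatGPT": 0, "Gemini": 0, "Perplexity": 1},
    "purchase":   {"ChatGPT": 0, "Gemini": 0, "Perplexity": 0},
    "local":      {"ChatGPT": 0, "Gemini": 0, "Perplexity": 0},
}

print(round(visibility_score(MATRIX), 2))     # 4 cited cells out of 15
print(platforms_citing(MATRIX, "branded"))    # platforms citing the branded query
```

In this sample the store is cited in 4 of 15 cells, on two platforms for the branded query and one for the category query.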
Passing Score
Visible on at least 2 out of 3 platforms for branded queries. Visible on at least 1 platform for category queries.
What to Fix
Missing from ChatGPT specifically: ChatGPT relies heavily on Bing’s web index and its own browsing tool. Ensure Bing is indexing your product pages (check Bing Webmaster Tools). Also check that your content is extractable via Jina Reader, since ChatGPT’s browsing tool uses similar extraction.
Missing from Gemini specifically: Google Gemini pulls from Google’s search index and Shopping Graph. Ensure your Google Merchant Center feed is active and healthy (Metric 4). Check that your product pages rank in traditional Google results, since Gemini often cites sources that already rank well.
Missing from Perplexity specifically: Perplexity aggregates from multiple sources including web crawl, academic databases, and social signals. A complete llms.txt file and active presence on review platforms help with Perplexity visibility.
The cross-platform AI visibility gap analysis breaks down the specific data sources and ranking factors for each AI platform.
Metric 8: Answer Readiness
What It Means
Answer readiness measures whether your product page content is structured so that AI agents can cite your first paragraph directly as an answer. This is the GEO (Generative Engine Optimization) equivalent of the “featured snippet” in traditional SEO.
How to Test
Tool: Self-test with ChatGPT
Step 1: Copy the first paragraph of your product page description.
Step 2: Ask ChatGPT: “Based on this text, what is this product and who is it for?”
[Paste your first paragraph]
Step 3: Evaluate the response:
- Does ChatGPT correctly identify the product category?
- Does it mention the key differentiators you wrote about?
- Could this response be used as a citation in a shopping recommendation?
If ChatGPT cannot extract a clear answer from your opening paragraph, neither can any other AI agent when deciding whether to recommend your product.
Passing Score
Your opening paragraph contains a direct, specific statement about what the product is, who it is for, and what makes it different. No generic filler. No “Welcome to our store” or “Discover our collection.”
What to Fix
Weak opening paragraphs: Rewrite using the answer-first format. Instead of:
“Experience the ultimate in comfort with our premium merino wool socks.”
Write:
“These merino wool hiking socks wick moisture in temperatures from -5 to 25 degrees Celsius and last 200+ washes without pilling, based on third-party testing by Bureau Veritas.”
The second version is citable. It contains specific claims with verifiable data. The first version is generic noise that AI agents skip.
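Before running the ChatGPT self-test on every page, a crude pre-screen can flag the weakest openings. The sketch below is only a heuristic: the filler phrase list is an assumption built from the examples in this section, and a digit in the text is a rough stand-in for "specific claim", not proof of one.

```python
import re

# Illustrative filler phrases; extend with the ones your own copy overuses.
FILLER = ("welcome to our store", "discover our collection", "experience the ultimate")

def opening_flags(paragraph: str) -> dict[str, bool]:
    """Flag generic filler and the absence of any concrete number."""
    lowered = paragraph.lower()
    return {
        "has_filler": any(phrase in lowered for phrase in FILLER),
        "has_number": bool(re.search(r"\d", paragraph)),
    }

WEAK = "Experience the ultimate in comfort with our premium merino wool socks."
STRONG = ("These merino wool hiking socks wick moisture in temperatures from -5 to 25 "
          "degrees Celsius and last 200+ washes without pilling, based on third-party "
          "testing by Bureau Veritas.")

print(opening_flags(WEAK))    # prints {'has_filler': True, 'has_number': False}
print(opening_flags(STRONG))  # prints {'has_filler': False, 'has_number': True}
```

Pages flagged with filler and no numbers are the first candidates for a rewrite. Pages that pass this screen still need the self-test above.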
For the complete methodology on writing answer-first product content, the answer-first content guide covers the writing framework with before-and-after examples across multiple product categories.
Running Your Weekly Scorecard
The Workflow
Run this checklist every Monday. It takes 30 minutes after the initial setup.
Week 1 (full audit, 2 hours):
- Test all 8 metrics
- Record baseline scores in a spreadsheet
- Fix any critical failures (crawl access, schema errors, feed errors)
- Set up tracking for metrics 5-7
Week 2+ (weekly check, 30 minutes):
- Verify crawl access (curl one product URL per AI agent)
- Check Jina extract on one product page
- Run 5 citation queries across 3 platforms
- Check GA4 for AI referral traffic trend
- Record all scores, note changes from previous week
Scoring System
Assign points per metric:
| Score | Meaning |
|---|---|
| 2 | Passes benchmark |
| 1 | Partial pass (some products pass, some do not) |
| 0 | Fails benchmark |
Maximum score: 16 (8 metrics x 2 points). Minimum acceptable: 10.
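The total and the pass check can be computed as in the sketch below. The metric labels follow the tracking template; the sample week is invented.

```python
METRICS = ("Crawl Access", "Content Extract", "Schema Coverage", "Feed Health",
           "Citation Rate", "AI Traffic", "Cross-Platform", "Answer Readiness")
MINIMUM_ACCEPTABLE = 10  # out of 16

def total_score(scores: dict[str, int]) -> tuple[int, bool]:
    """Sum the eight 0/1/2 scores and report whether the total meets the minimum."""
    if set(scores) != set(METRICS):
        raise ValueError("expected exactly one score per metric")
    if any(value not in (0, 1, 2) for value in scores.values()):
        raise ValueError("each score must be 0, 1, or 2")
    total = sum(scores.values())
    return total, total >= MINIMUM_ACCEPTABLE

# Hypothetical week: solid technical metrics, weak citation rate.
WEEK = dict(zip(METRICS, (2, 2, 1, 2, 0, 1, 1, 1)))

print(total_score(WEEK))
# prints (10, True)
```

Here the store scores 10 of 16, which just meets the minimum even though Citation Rate scored 0.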
Tracking Template
Week of: YYYY-MM-DD
Metric 1 (Crawl Access): __/2
Metric 2 (Content Extract): __/2
Metric 3 (Schema Coverage): __/2
Metric 4 (Feed Health): __/2
Metric 5 (Citation Rate): __/2
Metric 6 (AI Traffic): __/2
Metric 7 (Cross-Platform): __/2
Metric 8 (Answer Readiness): __/2
TOTAL: __/16
Notes:
Common Scorecard Patterns
Pattern 1: High Technical, Low Citation
Scores 7-8 out of 8 on metrics 1-4 but only 1-2 out of 8 on metrics 5-8. This is the most common pattern. Your technical foundation is solid but your content is not citable. Fix: Focus on metrics 5 and 8. Rewrite product descriptions with specific, verifiable claims. Add comparison data. Write answer-first content.
Pattern 2: Low Technical, High Citation
Scores 1-3 out of 8 on metrics 1-4 but 4-6 out of 8 on metrics 5-8. This happens when you have strong brand recognition but poor technical infrastructure. AI agents cite you from training data but cannot crawl your current catalog. Fix: Address metrics 1-3 first. You are leaving easy visibility on the table.
Pattern 3: Consistently Low Across All Metrics
Scores below 6 total. Your store is essentially invisible to AI agents. Fix: Start with metric 1 (crawl access) and work through sequentially. Each metric builds on the previous one. Most stores can reach 10+ within 2-3 weeks of focused effort.
Tool Summary
| Tool | Metrics It Serves | Cost | Setup Time |
|---|---|---|---|
| curl (robots.txt + UA test) | 1 | Free | 5 min |
| Jina Reader API | 2 | Free (rate-limited) | 2 min |
| Schema.org Validator | 3 | Free | 5 min |
| Google Merchant Center Diagnostics | 4 | Free | 10 min |
| Manual query tracking (spreadsheet) | 5, 7 | Free | 20 min |
| Google Analytics 4 | 6 | Free | 10 min |
| ChatGPT self-test | 8 | Free tier available | 10 min |
Total setup time: approximately 60 minutes. Weekly maintenance: 30 minutes.
All tools are free. None require paid subscriptions for the level of testing described here.
Why This Scorecard Matters
The ecommerce AI discoverability landscape in 2026 is where Google SEO was in 2005: early, growing fast, and dominated by stores that measure and optimize systematically. Stores that track these 8 metrics weekly will build compounding visibility advantages as AI shopping adoption accelerates.
The data supports the urgency. BrightEdge’s research shows AI search result volumes grew 850% between mid-2024 and early 2025. Digital Applied’s analysis of 23,000+ citations found 92% of brands are invisible. And a Rand Fishkin study cited in multiple industry reports found that 60% of AI citations go to sources outside the top 10 Google results. This means traditional SEO rankings do not predict AI visibility. You need a separate measurement framework.
Shopti.ai helps ecommerce stores diagnose and fix gaps across all 8 of these metrics. The scorecard framework above works as a self-service audit. For stores that want ongoing monitoring and automated fixes, shopti.ai provides the infrastructure to track these metrics continuously without manual spreadsheet work.
Check your store’s agent discoverability score for free at shopti.ai.
FAQ
How often should I run the AI discoverability scorecard?
Run the full 8-metric audit once to set a baseline, then run the shorter weekly check. Technical metrics (crawl access, schema, feed health) rarely change day to day but can break after theme updates, plugin changes, or platform migrations. Citation rate and AI traffic should be tracked weekly to spot trends. Monthly is too slow. Daily is overkill for most stores.
Do I need paid tools to track AI discoverability?
No. All 8 metrics in this scorecard can be measured with free tools: curl, Jina Reader, Schema.org Validator, Google Merchant Center Diagnostics, manual AI queries, Google Analytics 4, and ChatGPT’s free tier. Paid tools like Semrush or Ahrefs can supplement citation tracking but are not required for the core scorecard.
What score should my ecommerce store aim for on this scorecard?
Aim for at least 10 out of 16 total points. Stores scoring 12 or above are well-positioned for AI visibility. Stores below 8 have significant gaps that need immediate attention. The most critical metrics to fix first are crawl access (metric 1) and schema coverage (metric 3), since everything else depends on those foundations.
Why is my store visible on ChatGPT but not on Gemini or Perplexity?
Each AI platform uses different data sources. ChatGPT relies on Bing’s index and its own browsing tool. Gemini pulls from Google’s index and Shopping Graph. Perplexity aggregates from web crawl, academic databases, and social signals. Visibility on one platform does not guarantee visibility on others. Use metric 7 (cross-platform visibility) to identify which platforms need attention.
How long does it take to improve AI discoverability after fixing issues?
Technical fixes (crawl access, schema, feed errors) typically show results within 1-2 weeks as AI crawlers re-index your pages. Content improvements (answer readiness, product descriptions) take 2-4 weeks to affect citation rates. Cross-platform visibility improvements can take 4-8 weeks depending on the platform’s crawl cycle and update frequency.
Sources
Digital Applied, “LLM Citation Analysis: 23,000+ Citations Across Major AI Platforms,” May 2026. Found that 92% of brands are invisible in AI search results and opinion-rich prose outperforms schema-only optimization by 47% vs 3.1%.
Pragma, “Product Feed Quality in Ecommerce: 2026 Benchmark Report.” Found that 41% of ecommerce feeds contain critical errors: missing GTIN (28%), incorrect availability (19%), malformed structured data (17%).
BrightEdge, “AI Search Growth Report 2025-2026.” Documented 850% growth in AI search result volumes between mid-2024 and early 2025, with continued acceleration into 2026.
Rand Fishkin / SparkToro, “AI Citation Sources Analysis 2026.” Found that 60% of AI citations reference sources outside the top 10 traditional Google results, indicating weak correlation between SEO rankings and AI visibility.