D1–D6 Agent-Trust Domain Metrics: framework reference
Six dimensions for scoring how visible and trustworthy a domain is to AI agents — evaluated at the infrastructure layer, before content is processed.
Agents evaluate domains before reading content. The D1–D6 framework scores that prior-layer evaluation: what agents can resolve from the domain name, from machine-readable declarations, and from crawler history. A domain that scores well on D1–D6 has established namespace clarity — the prior condition for presence in agent-mediated systems.
Each dimension is independently assessable. A domain can score high on D3 (structural infrastructure present) and low on D4 (no AI crawlers have visited). High scores on all six dimensions indicate a domain that is well-declared, structurally sound, and actively present in agent retrieval pipelines.
The Six Dimensions — v0.1
D1: Can an agent resolve the domain's purpose from the domain name alone, without fetching any files or reading any content? This is the first signal agents process.
- High: Name directly encodes the domain's scope: semanticnamespace.org signals semantic namespace reference; semantic-domains.com signals domain assessment. No inference required.
- Medium: Name is plausibly related to the actual scope but requires context to confirm, such as a compound name where one term is generic, or a brand name with adjacent-domain recognition.
- Low: Name is a brand, acronym, or generic term with no semantic signal. Agents must infer purpose entirely from declarations and content rather than the name itself.
- How to assess: Cover the site content. Read only the domain name. Could a well-informed agent categorize the domain's primary subject without additional signals?
D2: Are the machine-readable metadata signals (JSON-LD schema, Open Graph Protocol tags, meta descriptions) consistent with the domain name and with each other?
- High: JSON-LD @type is specific and appropriate (WebSite, Organization, TechArticle); description is role-specific and non-generic; OGP tags consistent with JSON-LD; meta description non-boilerplate.
- Medium: JSON-LD present but uses only generic WebSite type with minimal description; one tag contradicts another; meta description is templated.
- Low: No JSON-LD; schema contradicts domain name or visible content; multiple inconsistent signals across tag types.
- How to assess: Fetch /. Parse JSON-LD, OGP, and meta description. Check consistency across all three. Verify description is specific enough to distinguish this domain from a similar one.
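The D2 assessment step can be sketched in Python using only the standard library. This is a minimal sketch, not a reference implementation: the names `MetadataParser`, `extract_signals`, `token_overlap`, and `description_consistency` are hypothetical, and word-set overlap is an assumed, crude stand-in for "consistent", which the framework does not define numerically.

```python
import json
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects JSON-LD objects, Open Graph tags, and the meta description."""

    def __init__(self):
        super().__init__()
        self.json_ld = []            # parsed JSON-LD objects (dicts only)
        self.og = {}                 # e.g. {"og:description": "..."}
        self.meta_description = None
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "meta":
            prop = attrs.get("property") or ""
            if prop.startswith("og:"):
                self.og[prop] = attrs.get("content") or ""
            elif attrs.get("name") == "description":
                self.meta_description = attrs.get("content") or ""

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # malformed JSON-LD is treated as absent
            if isinstance(parsed, dict):
                self.json_ld.append(parsed)


def extract_signals(html):
    """Return the three D2 signals found in the HTML of '/'."""
    parser = MetadataParser()
    parser.feed(html)
    first = parser.json_ld[0] if parser.json_ld else {}
    return {
        "ld_type": first.get("@type"),
        "ld_description": first.get("description"),
        "og_description": parser.og.get("og:description"),
        "meta_description": parser.meta_description,
    }


def token_overlap(a, b):
    """Jaccard overlap of lowercase word sets (assumed consistency proxy)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0


def description_consistency(signals):
    """Lowest pairwise overlap among the descriptions that are present."""
    keys = ("ld_description", "og_description", "meta_description")
    texts = [signals[k] for k in keys if signals[k]]
    pairs = [(a, b) for i, a in enumerate(texts) for b in texts[i + 1:]]
    return min((token_overlap(a, b) for a, b in pairs), default=0.0)
```

A low overlap only flags a page for manual review. Whether the descriptions are role-specific and non-generic remains a judgment call.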
D3: Does the domain publish the structural files that agents expect to find before indexing content? This is the compliance checklist for agent-readable infrastructure.
- Full: All five present: robots.txt with named AI crawler UA directives, sitemap.xml covering all content pages, AGENTS.md with role and scope, /.well-known/agent.json with machine-readable identity, and /.well-known/namespace-cluster.json if the domain is a cluster member.
- Partial: Core files present (robots.txt, sitemap.xml) but agent-specific declarations absent (AGENTS.md, agent.json).
- Low: Only robots.txt present, or robots.txt blocks AI crawlers, or no agent-readable infrastructure at all.
- How to assess: Fetch each file directly. For robots.txt, verify named AI UA directives (ClaudeBot, GPTBot, Google-Extended, Meta-ExternalAgent). For sitemap.xml, verify it lists all deployed content pages. For agent.json, verify all required fields are present and non-empty.
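The robots.txt check and the Full / Partial / Low levels can be sketched as follows. The function names are hypothetical, and two choices are assumptions not stated in the checklist: "blocks AI crawlers" is read as a `Disallow: /` rule for any named AI user agent, and a domain with core files but only some agent-specific files is scored Partial.

```python
NAMED_AI_AGENTS = ("ClaudeBot", "GPTBot", "Google-Extended", "Meta-ExternalAgent")


def named_ai_directives(robots_txt):
    """Map each named AI user agent to the allow/disallow rules declared for it."""
    rules = {}
    current = []       # agents named by the current group of User-agent lines
    in_rules = False   # True once a rule line has been seen for the group
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a new group starts here
                current, in_rules = [], False
            current.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in NAMED_AI_AGENTS:
                if agent.lower() in current:
                    rules.setdefault(agent, []).append((field, value))
    return rules


def d3_level(present, robots_txt="", cluster_member=False):
    """present: dict mapping file path -> True if the file was fetched successfully."""
    rules = named_ai_directives(robots_txt)
    blocked = any(("disallow", "/") in r for r in rules.values())
    core = present.get("/robots.txt") and present.get("/sitemap.xml")
    if not core or blocked:
        return "Low"
    agent_files = ["/AGENTS.md", "/.well-known/agent.json"]
    if cluster_member:
        agent_files.append("/.well-known/namespace-cluster.json")
    if rules and all(present.get(path) for path in agent_files):
        return "Full"
    return "Partial"
```

The sketch only checks that files exist and that robots.txt names AI user agents. Verifying sitemap coverage and agent.json fields still requires inspecting the file contents.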
D4: Has at least one named AI crawler visited the domain within the active window (90 days)? A domain with no AI crawler visits is absent from agent retrieval pipelines regardless of its structural quality.
- Active: At least one named AI crawler (ClaudeBot, GPTBot, Google-Extended, Meta-ExternalAgent, Applebot, PerplexityBot) has visited within 90 days. Multiple distinct crawlers visiting is a stronger signal.
- Stale: AI crawler visits present in logs but older than 90 days. Domain was indexed but has not been recrawled — content updates may not be reflected in agent knowledge.
- Dark: No AI crawler visits in logs, or no log data available. Domain is structurally sound but absent from active agent retrieval pipelines.
- How to assess: See Crawler-Liveness Instrumentation section below.
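The Active / Stale / Dark classification reduces to a date comparison. The sketch below assumes the visit dates have already been extracted from logs and treats a visit exactly 90 days old as still inside the window; `d4_status` is a hypothetical name.

```python
from datetime import date, timedelta

ACTIVE_WINDOW = timedelta(days=90)


def d4_status(visit_dates, today):
    """Classify D4 from the dates of all named AI crawler visits."""
    if not visit_dates:
        return "Dark"                      # no visits, or no log data
    age = today - max(visit_dates)         # time since the most recent visit
    return "Active" if age <= ACTIVE_WINDOW else "Stale"


today = date(2025, 6, 1)
print(d4_status([today - timedelta(days=10)], today))    # Active
print(d4_status([today - timedelta(days=120)], today))   # Stale
print(d4_status([], today))                              # Dark
```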
D5: Does the domain demonstrate technical trust signals at the protocol layer (DNSSEC, TLS chain integrity, and anti-cloaking consistency)?
- DNSSEC: Domain has DNSSEC enabled and resolving cleanly. Verifiable via dig +dnssec [domain] A or DNSSEC validators. DNSSEC prevents DNS spoofing that could redirect agents to malicious content.
- TLS chain integrity: Valid certificate from a recognized CA, full chain present, no mixed-content warnings. Verifiable via TLS inspection tools. A broken TLS chain causes agent HTTP clients to refuse connection.
- Anti-cloaking consistency: The same content is served to named AI crawler UA strings as to standard browser UA strings. Agents that detect content cloaking (different content served to bots vs. humans) will downgrade trust scores or refuse to index.
- How to assess: DNSSEC: run dig +dnssec [domain] A against a validating resolver and check for the AD flag. TLS: fetch with curl -v and inspect the certificate chain. Anti-cloaking: fetch with a ClaudeBot UA string and with a standard browser UA, then compare content.
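Two of the D5 checks can be sketched in Python. `has_ad_flag` parses the header of saved `dig +dnssec` output, and `similarity` compares two response bodies after whitespace normalization. All function names are hypothetical, the exact UA strings passed to `fetch_as` are left to the auditor, and any threshold for "same content" (for example 0.98) is an assumption rather than part of the framework.

```python
import difflib
import re
import urllib.request


def has_ad_flag(dig_output):
    """True if the dig header's flags line includes 'ad' (authenticated data)."""
    match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
    return bool(match) and "ad" in match.group(1).split()


def fetch_as(url, user_agent, timeout=10):
    """Fetch a URL while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def similarity(body_a, body_b):
    """Ratio in [0, 1]; 1.0 means identical after collapsing whitespace."""
    def normalize(text):
        return " ".join(text.split())
    return difflib.SequenceMatcher(None, normalize(body_a), normalize(body_b)).ratio()
```

A ratio rather than strict equality is used because pages often embed timestamps or request IDs that differ between fetches. A ratio well below 1.0 is a prompt to diff the two bodies by hand, not proof of cloaking.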
D6: Does the domain have a stable governance history (domain age, secure registrar, no recent ownership changes) that supports long-term agent trust?
- Domain age: Registered more than 24 months ago with no ownership transfers. Older domains with stable history receive higher baseline trust in agent systems that weight governance signals.
- Registrar security tier: Registered through a high-security registrar (Cloudflare Registrar, Namecheap, Google Domains, whose registrations have since moved to Squarespace) with registry lock enabled. Low-tier registrars with poor abuse response are treated as risk signals.
- Ownership continuity: WHOIS history shows no recent ownership transfers. A domain that recently changed hands may carry a different trust history than its current operator intends.
- How to assess: WHOIS lookup for creation date, registrar, and transfer history. Check for registry lock status. Compare WHOIS registration date against the domain's oldest archived content.
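The 24-month age threshold can be checked with calendar arithmetic once the creation date has been read from WHOIS. This is a minimal sketch: `months_ago` and `older_than` are hypothetical names, and parsing the WHOIS record itself is left out because field names vary by registry.

```python
import calendar
from datetime import date


def months_ago(today, months):
    """Date `months` calendar months before `today`, clamped to month end."""
    index = today.year * 12 + (today.month - 1) - months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(today.day, last_day))


def older_than(created, today, months=24):
    """True if the domain was registered more than `months` months ago."""
    return created < months_ago(today, months)


print(older_than(date(2022, 1, 15), date(2024, 1, 16)))  # True
print(older_than(date(2022, 6, 1), date(2024, 1, 16)))   # False
```

Age alone does not cover D6. Transfer history and registry lock status still need to be read from the WHOIS record and the registrar.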
Assessment Methodology
A D1–D6 audit is a structured point-in-time evaluation. It does not require special tooling — each dimension is assessable with standard web tools. The recommended sequence:
- Establish the domain name only. Score D1 before fetching anything. Once you fetch content, the D1 signal is contaminated by contextual knowledge.
- Fetch / and parse JSON-LD, OGP, and meta tags independently. Score D2 from these signals alone before reading body copy.
- Fetch /robots.txt, /sitemap.xml, /AGENTS.md, /.well-known/agent.json, and /.well-known/namespace-cluster.json if present. Score D3 against the checklist above.
- Access server access logs or analytics with UA filtering. Filter for named AI crawler UA strings within the past 90 days. Score D4 by presence and recency.
- Run DNSSEC validation, TLS chain inspection, and a cloaking consistency check. Score D5 against all three signals.
- Run WHOIS. Score D6 on domain age, registrar tier, and transfer history.
- Aggregate. A domain with PASS on all six dimensions has established namespace clarity. PARTIAL on any single dimension is a known gap, not a failure. FAIL on D3 or D4 typically indicates the domain is effectively absent from agent pipelines.
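The aggregation step can be sketched as a small function over per-dimension results. The text does not specify how each dimension's own levels (High / Medium / Low, Full / Partial / Low, Active / Stale / Dark) map to PASS / PARTIAL / FAIL, so this sketch assumes the auditor has already made that mapping. The function name `aggregate` and the result keys are hypothetical.

```python
DIMENSIONS = ("D1", "D2", "D3", "D4", "D5", "D6")
VALID = {"PASS", "PARTIAL", "FAIL"}


def aggregate(scores):
    """scores: dict such as {"D1": "PASS", ..., "D6": "PARTIAL"}."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    invalid = {d: scores[d] for d in DIMENSIONS if scores[d] not in VALID}
    if invalid:
        raise ValueError(f"unknown results: {invalid}")
    gaps = [d for d in DIMENSIONS if scores[d] == "PARTIAL"]
    fails = [d for d in DIMENSIONS if scores[d] == "FAIL"]
    return {
        "namespace_clarity": not gaps and not fails,   # PASS on all six
        "known_gaps": gaps,                            # PARTIAL: gap, not failure
        "failures": fails,
        "likely_absent_from_pipelines": any(d in fails for d in ("D3", "D4")),
    }
```

For example, a domain that passes everything except a FAIL on D4 returns `namespace_clarity` as False and `likely_absent_from_pipelines` as True.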
Crawler-Liveness Instrumentation
D4 scoring requires access to crawler liveness data. Several methods are available depending on your infrastructure:
- Server access logs (most reliable): Filter access logs for named AI crawler UA strings. ClaudeBot UA: ClaudeBot/1.0. GPTBot UA: GPTBot/1.1. Meta-ExternalAgent UA: meta-externalagent. Google-Extended is a robots.txt control token rather than a separate HTTP user agent, so it does not appear in access logs; Google fetches with its existing crawler user agents. Filter by date range and count unique visit dates. At least one hit within 90 days = Active on D4.
- Cloudflare Analytics (if hosted on Cloudflare Pages or Workers): Cloudflare's analytics dashboard segments traffic by bot category. Filter to "Verified Bots" and inspect the bot name breakdown for AI crawlers. Note: Cloudflare analytics are sampled, so absence in analytics does not confirm absence in server logs.
- Edge function logging: If you control server-side code, log the User-Agent header for each request and write AI crawler hits to a persistent store. This is the most accurate instrumentation method and enables per-crawler liveness tracking with exact timestamps.
- Robots.txt structured as a canary: Publishing explicit AI crawler directives in robots.txt (even permissive ones) causes compliant crawlers to fetch robots.txt before indexing. The fetch is logged. A robots.txt fetch from a named AI UA is a confirmed liveness signal for that crawler, even if no content pages were indexed.
Liveness data has a 90-day freshness window by default. A domain that was active 120 days ago but has had no AI crawler visits since is scored as Stale on D4. Recrawl can be requested via sitemap resubmission, content updates, or backlink acquisition that signals to crawlers the domain has new content worth indexing.
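The access-log method can be sketched as a filter that counts unique visit dates per crawler. This sketch assumes the common "combined" log format used by Apache and nginx; the regular expression, the token list, and the name `crawler_visit_dates` are illustrative assumptions to adapt to your own log layout.

```python
import re
from collections import defaultdict
from datetime import datetime

# Substrings matched case-insensitively against the User-Agent field.
# Google-Extended is omitted: it is a robots.txt token, not an HTTP user agent.
AI_CRAWLER_TOKENS = ("ClaudeBot", "GPTBot", "meta-externalagent",
                     "Applebot", "PerplexityBot")

# [timestamp] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def crawler_visit_dates(log_lines):
    """Return {crawler token: set of dates on which it made a request}."""
    visits = defaultdict(set)
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue                       # not in combined format; skip
        ua = match.group("ua").lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in ua:
                stamp = datetime.strptime(match.group("ts"),
                                          "%d/%b/%Y:%H:%M:%S %z")
                visits[token].add(stamp.date())
    return dict(visits)
```

User-Agent headers can be spoofed, so a match here shows that a request claimed to be from a crawler, not that it was. Where a crawler operator publishes IP ranges, cross-checking the source address gives a stronger signal.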
Cluster Nodes
- Definition (semanticnamespace.org): Conceptual foundation, covering what semantic namespace is and why it matters
- Assessment (semantic-domains.com): D1–D6 Agent-Trust Domain Metrics, covering how to score namespace clarity (this site)
- Protocol (agenticnamespace.org): Implementation specifications, covering how to declare a namespace position