ChatGPT Follows Your Internal Links

Most SEO thinking about AI visibility focuses on the entry point: how does ChatGPT or Perplexity find a page in the first place? The assumption is that AI systems behave like traditional search engines: they query an index, retrieve a list of URLs, and read the pages at those URLs.

New research suggests this is an incomplete picture. And the gap has significant implications for how you structure your site.

What the research found

Analysis of ChatGPT Deep Research behaviour via WebSocket traffic, examining how OAI-SearchBot actually navigates the web during a research session, produced a finding that caught the SEO community's attention: ChatGPT follows internal links.

It does not only read the pages it discovers through Bing search. Once it lands on a page, it extracts all HTML links, evaluates them in the context of the page, and follows the ones relevant to its research task. Navigation links, footer links, and editorial in-content links all become part of the navigation layer.

31: total page opens in one analysed session

9: pages discovered via internal links, not search

29%: share of pages found without a new search query

In the analysed session, 9 of 31 page opens came via internal links rather than fresh search queries. That's 29% of pages discovered without ChatGPT needing to go back to Bing. The AI navigated from page to page using the site's own link structure, just as a thorough human researcher would.

The key finding

Each page open in ChatGPT Deep Research contains a field called clicked_from_url. When empty, the page was found via Bing search. When filled, the page was reached by following an internal link from another page on the same site. This makes AI navigation behaviour observable and measurable.
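As a rough illustration of how that field makes the behaviour measurable, the sketch below counts page opens by discovery route. Only the `clicked_from_url` field name comes from the research; the record shape, the `url` key, and the `summarise_page_opens` helper are assumptions made for this example, and the sample data is synthetic, built to mirror the 9-of-31 split from the analysed session.

```python
def summarise_page_opens(page_opens):
    """Split page opens into search-discovered and link-discovered.

    Assumes each record is a dict with a 'clicked_from_url' key that is
    empty (or missing) when the page came from a search result.
    """
    via_link = [p for p in page_opens if p.get("clicked_from_url")]
    total = len(page_opens)
    return {
        "total": total,
        "via_link": len(via_link),
        "via_search": total - len(via_link),
        "link_share": len(via_link) / total if total else 0.0,
    }


# Synthetic records: 22 opens with no referrer, 9 opened from another page.
sample = (
    [{"url": f"https://example.com/s{i}", "clicked_from_url": ""} for i in range(22)]
    + [{"url": f"https://example.com/l{i}", "clicked_from_url": "https://example.com/"}
       for i in range(9)]
)

summary = summarise_page_opens(sample)
print(f"{summary['via_link']} of {summary['total']} = {summary['link_share']:.0%}")
# prints "9 of 31 = 29%"
```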

Which links ChatGPT follows

The research shows that all HTML links are extracted when a page is opened, not just editorial in-content links. This has implications for every link on your site.

Actively followed

Navigation links

Main nav links to your key pages are visible and can be followed. A well-structured nav that links to your most important content helps AI systems understand your site's scope.

Actively followed

Editorial in-content links

Links within article and page body copy. These are the most valuable because they're contextually relevant: the AI can understand why one page links to another.

Actively followed

Footer links

Footer links are extracted and followed. This makes the footer more important than many SEOs currently treat it: it is part of the AI navigation layer.

Blocked

robots.txt restricted paths

The research explicitly confirms robots.txt is respected. Paths blocked for OAI-SearchBot return a "Fetch denied by robots.txt" signal. Many sites accidentally block AI crawlers, so check yours.
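To see which of your own links fall into the navigation, in-content, and footer groups described above, you can extract them from a page's HTML. The sketch below uses Python's standard `html.parser` and assumes the page marks its regions with semantic `<nav>` and `<footer>` elements. The `LinkCollector` class and the sample markup are illustrative only, and this is not a description of how ChatGPT itself classifies links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect hrefs and label each by its enclosing <nav> or <footer>."""

    def __init__(self):
        super().__init__()
        self.region_stack = []  # currently open <nav>/<footer> elements
        self.links = []         # (region, href) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("nav", "footer"):
            self.region_stack.append(tag)
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Anything outside nav/footer is treated as in-content.
                region = self.region_stack[-1] if self.region_stack else "content"
                self.links.append((region, href))

    def handle_endtag(self, tag):
        if self.region_stack and self.region_stack[-1] == tag:
            self.region_stack.pop()


# Hypothetical page markup.
html = """
<nav><a href="/services">Services</a></nav>
<main><p>See our <a href="/case-study">case study</a>.</p></main>
<footer><a href="/contact">Contact</a></footer>
"""

collector = LinkCollector()
collector.feed(html)
for region, href in collector.links:
    print(region, href)
```

Pages built with `<div>` wrappers instead of semantic elements would need a different rule, such as matching on class names.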

Why this changes site architecture thinking

Traditional SEO site architecture thinking focuses on crawl efficiency for Googlebot: how quickly can the crawler reach every important page, and how clearly does the link structure communicate page importance?

AI navigation introduces a different question: if an AI system lands on your homepage, or your most-linked article, where does it go next? Does your internal link structure guide it to the content that establishes your authority on the topics you want to be cited for?

Isolated content doesn't get discovered. A page with no internal links pointing to it cannot be reached by link-following, and a page with no internal links pointing from it gives the AI nowhere to go next. Either way it sits outside the navigation layer, even if Bing finds it.

Link context matters. The AI evaluates which links to follow based on the context of the page it's on. A link to your case study placed within an article about the same topic is more likely to be followed than a generic "see more" link in a sidebar.

Orphaned content is a dead end for AI. Pages that rank in Google because of external backlinks but have no internal link structure connecting them to related content on your site may be found once but never explored further.

Footer links are part of the architecture. Your footer appears on every page of your site. If it links to your most important service pages and content hubs, every page becomes an entry point to that content for an AI crawler.

The robots.txt finding

The research also explicitly confirmed that ChatGPT Deep Research respects robots.txt restrictions for OAI-SearchBot. When a path is blocked, the system returns a clear signal: "Fetch denied by robots.txt (OAI-SearchBot)".

This matters because many sites have robots.txt configurations that were set years ago for legitimate reasons (blocking duplicate content, preventing crawl of admin areas, or limiting crawl of thin pages) but that inadvertently block AI crawlers as well.

A blanket Disallow: / for all bots, or an overly broad wildcard rule, will block ChatGPT from accessing your entire site regardless of how well it ranks in Bing. This is one of the most common and most impactful AI visibility issues we find when running the Robots.txt Checker on client sites.
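A minimal way to test this yourself is Python's standard `urllib.robotparser`. The sketch below compares a legacy blanket rule with a version that adds a specific group for OAI-SearchBot. Both rule sets, the `check_access` helper, and the example.com URL are assumptions made for illustration, and the standard-library parser only approximates how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = ["OAI-SearchBot", "GPTBot", "ClaudeBot", "PerplexityBot"]


def check_access(robots_txt, url, agents=AI_AGENTS):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical legacy file: one blanket rule for every bot.
legacy_rules = """\
User-agent: *
Disallow: /
"""

# Hypothetical fix: a dedicated group for OAI-SearchBot ahead of the blanket rule.
fixed_rules = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /
"""

url = "https://example.com/blog/post"
print(check_access(legacy_rules, url))  # every agent is blocked
print(check_access(fixed_rules, url))   # only OAI-SearchBot is allowed
```

To check a live site, `RobotFileParser.set_url()` followed by `read()` fetches the real file instead of a string.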

Check yours now

Run your domain through the free Robots.txt Checker on our tools page. It checks whether GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot can access your site, and explains what each rule means and what to fix.

What to do about it

The practical actions fall into three categories:

1. Audit your robots.txt immediately

Check that OAI-SearchBot, GPTBot, ClaudeBot, and PerplexityBot are not blocked. If you have a legacy rule that blocks all bots, update it. If you have specific paths blocked, confirm those paths don't include your main content areas.

2. Review your internal link structure for AI navigation

Identify your most authoritative content on each topic you want to be cited for. Then trace the internal link paths from your homepage and most-visited pages to that content. If the path is broken or nonexistent, fix it. The AI needs to be able to navigate from where it lands to where your best content lives.
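One way to trace those paths is a breadth-first search over your internal link graph, which gives the fewest link hops from a starting page to every page reachable from it. The sketch below assumes you already have the graph as a dictionary, for example from a crawl export. The `click_depths` helper and the sample site are hypothetical.

```python
from collections import deque


def click_depths(link_graph, start):
    """Breadth-first search: fewest link hops from start to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: each page maps to the internal links it contains.
site = {
    "/": ["/services", "/blog"],
    "/services": ["/contact"],
    "/blog": ["/blog/ai-links"],
    "/blog/ai-links": ["/services"],
    "/case-study": ["/services"],  # links out, but nothing links to it
}

depths = click_depths(site, "/")
all_pages = set(site) | {t for targets in site.values() for t in targets}
unreachable = sorted(all_pages - set(depths))

print(depths)
print(unreachable)  # prints ['/case-study']
```

Pages in `unreachable` cannot be reached from the homepage by following links. Running the same search from your most-visited pages shows whether they offer a path to the content you want cited.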

3. Use in-content links deliberately

Within every article and service page, link to the most relevant other content on your site, not as a generic "related articles" block, but as contextually relevant references within the body copy. These are the links most likely to be followed because their context is clear.

The finding that 29% of pages in a single research session were discovered via internal links is not a marginal result. If that session is representative, then for a thorough AI research task touching your domain, close to three in ten of the pages that get read will be ones the AI navigated to rather than searched for. The quality of your internal link structure directly determines which pages those are.


About the author

Douglas Lord

Digital Authority & AI Visibility Strategist · Founder of Digital Dominator · Creator of PTODA

Doug Lord is a Digital Authority & AI Visibility Strategist and founder of Digital Dominator. He created the Periodic Table of Digital Authority™ (PTODA), an independent research framework for measuring digital authority, AI visibility and crawler accessibility, and is co-founder of OG01, where he serves as COO and CPO.
