Master the essential technical files so Google understands and indexes your site correctly: robots.txt, XML sitemap and structured data.
Before appearing in search results, your site must be discovered, crawled and indexed by Google. The robots.txt and sitemap.xml files guide Google's bots, while structured data helps them understand your content.
A misconfiguration of these elements can block indexing of your important pages or waste your crawl budget on low-value pages.
The robots.txt file is a plain-text file placed at your site's root that tells crawlers (Googlebot, Bingbot, etc.) which URLs they may or may not crawl. Note that it controls crawling, not indexing: a disallowed URL can still appear in search results if other sites link to it.
Robots.txt example:
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /private/
Sitemap: https://www.your-site.com/sitemap.xml
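You can sanity-check rules like these with Python's standard-library parser. One caveat, shown below: RobotFileParser applies rules in file order (first match wins), unlike Google, which applies the most specific rule, so the Disallow lines are listed before "Allow: /" here. The "MyBot" user agent is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, with Disallow lines first because
# Python's parser stops at the first matching rule.
rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Disallow: /private/",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# A regular page is crawlable; anything under /admin/ is not.
print(parser.can_fetch("MyBot", "https://www.your-site.com/blog/post"))    # True
print(parser.can_fetch("MyBot", "https://www.your-site.com/admin/login"))  # False
```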
The XML sitemap is a structured list of all pages on your site that you want indexed. It helps Google discover your pages faster and understand your site structure.
Sitemap structure:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://www.your-site.com/</loc>
<lastmod>2024-01-15</lastmod>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
</urlset>
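For sites with more than a handful of pages, sitemaps are usually generated rather than written by hand. Here is a minimal sketch using only the standard library; the URL list and dates are placeholders, and a real generator would pull them from your CMS or database.

```python
import xml.etree.ElementTree as ET

# Official sitemap namespace, as used in the example above.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build a sitemap XML string from (loc, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode", xml_declaration=True)

sitemap_xml = build_sitemap([
    ("https://www.your-site.com/", "2024-01-15"),
    ("https://www.your-site.com/blog/", "2024-01-10"),
])
print(sitemap_xml)
```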
Schema.org is a structured data vocabulary that helps search engines understand your page content. It enables rich snippets in Google results.
Example of a rich snippet in search results (here, for a recipe): ⏱ 45 min • 🍽 8 servings • 285 kcal
JSON-LD example for an article:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Article Title",
"author": {"@type": "Person", "name": "Author"},
"datePublished": "2024-01-15"
}
</script>
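The first step any structured-data validator performs is extracting the JSON-LD blocks from the HTML and checking that they parse. A rough sketch of that step with the standard library, reusing the Article example above:

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect parsed JSON-LD blocks found in <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self.in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        # Parse the script body; a malformed block raises json.JSONDecodeError.
        if self.in_jsonld and data.strip():
            self.blocks.append(json.loads(data))

html = """<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Article Title", "datePublished": "2024-01-15"}
</script>"""

extractor = JSONLDExtractor()
extractor.feed(html)
print(extractor.blocks[0]["@type"])  # Article
```

A full validator would additionally check the extracted object against the Schema.org vocabulary; this sketch only verifies the markup is present and syntactically valid.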
The robots meta tag lets you control indexing and link following at the page level, offering finer control than robots.txt. For a noindex directive to take effect, Googlebot must be able to crawl the page and see the tag, so do not also block that page in robots.txt.
Common examples:
<meta name="robots" content="index, follow"> (the default: index the page and follow its links)
<meta name="robots" content="noindex, follow"> (keep the page out of the index, but follow its links)
<meta name="robots" content="noindex, nofollow"> (exclude the page and ignore its links)
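Checking a page for a noindex directive can be automated. A minimal sketch with the standard library, assuming the directive appears in a robots meta tag as in the examples above (real crawlers also honor the X-Robots-Tag HTTP header):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            self.directives += [d.strip().lower() for d in a.get("content", "").split(",")]

def is_indexable(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    # No robots meta tag at all also means indexable by default.
    return "noindex" not in parser.directives

print(is_indexable('<meta name="robots" content="index, follow">'))    # True
print(is_indexable('<meta name="robots" content="noindex, follow">'))  # False
```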
Our tool automatically checks the presence and configuration of all these technical elements essential to your site's indexing.