Sitemap: What It Is and How to Create One for SEO

David Kaufmann
SEO Tutorials
14 min read

Sitemaps are among the SEO elements that consultants most often neglect. Many bloggers label them "not necessary," but since Google regularly updates its information about them, we should at least keep them in mind and keep them optimized.

A sitemap is not something a website needs in order to be crawled, indexed, or classified, but it helps keep everything much more organized. One way to tell whether a website is following an SEO strategy, or rather a good SEO strategy, is to look at its sitemap. Believe me, that small analysis tells you a lot about who is behind it.

But do we really know what sitemaps are? Below we review the concept, the available types, their functions and importance, how to create one, their advantages, and some tips. A sitemap is an imperative tool for any website, especially one with a very large number of pages and a complex URL structure.

What Is a Sitemap

The definition of a sitemap varies with its type, function, and purpose. In general, though, a sitemap is an organizational plan of a website: its URLs and internal pages, its sections, and the data it stores.

To simplify, it is the index of the website, much like the index we find when opening a book. Is the index necessary to read the book? No, but if we see that it is poorly written, out of order, or pointing to pages that don't exist, what first impression would we have? How could we quickly and conveniently reach a specific part of the book? With some differences, the sitemap closely resembles it.

This concept applies to sitemaps in general and shifts depending on the kind of sitemap being discussed or used on a platform; there are several kinds, which we address in the corresponding section. A sitemap can also be a graphical representation of a site and how it is organized, included on the platform itself to improve navigation and ease of use.

Sitemaps therefore have an organizational, a technical, and a usability side: they facilitate access to the platform for both users and search engines, and they are also an important web development tool.

Sitemap example

Function of Sitemaps

Today, having a sitemap is an advantage, particularly when the platform is complex, with a large number of URLs and sections. It is an important tool for technical reasons, for ease of use, for organization, and also for traffic generation.

A sitemap helps to understand a website and its structure, whether it is a simple project, with a home page, contacts, sections, or very complex platforms such as ecommerce sites with millions of products, subsections, blog, tags, etc.

We have already seen the organizational sense of this element, but it has even more value as an SEO factor. Providing Google with the clear structure of our website, prioritizing the most important URLs, reducing those we are not interested in, etc., greatly helps Google "understand" in a faster and clearer way, and therefore greatly helps with the indexing of the website.

SEO Advantages of Having a Well-Implemented Sitemap

Focusing on the purely SEO aspects, here is a summary of the main advantages of having an updated and optimized sitemap:

  • Improves site indexing, as we mentioned before. Providing Google with the order and importance of our URLs will help with better indexing.

  • Helps us detect errors quickly. Once a sitemap has been created, it is advisable to upload it to the root of your website, and notify Google via Search Console. Google will crawl that sitemap, informing you if it found any problem in any of the URLs listed, so we can see how Google understands those URLs and improve the ones that need it.

  • Helps organize the website. Within our website, for example, products will not have the same importance as blog articles, the privacy policy, etc. Making a sitemap is a very good way to do a self-analysis and prioritize our SEO goals based on what is reflected in it.

  • It is a way to force us to continuously review the state of the website. A sitemap should be a living element. This means that we will have to clean it up continuously, since having it poorly optimized not only wouldn't help but could hurt. In this way, we are "forcing" ourselves to have greater control over our website.

What a Sitemap Should Look Like

Below, we are going to see some fundamental points you should keep in mind when creating your sitemap:

  • Do not add URLs to the sitemap with a response code (status code) other than 200.

  • Do not add URLs blocked in robots.txt to the sitemap.

  • Do not add URLs with a noindex tag to the sitemap.

  • Do not add non-canonical URLs to the sitemap (that is, do not add URLs that are canonicalized to another URL; list the canonical URL directly instead).

  • Avoid adding pages without SEO value to the sitemap (here common sense prevails; if we see that there is a large group, for example, of pages without SEO value such as PDFs, it may be advisable not to include them).
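The checks above can be expressed as a simple filter. The following is a minimal sketch, assuming we already have crawl data for each URL; the `Page` record and its field names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical crawl record for one URL (field names are illustrative)."""
    url: str
    status: int                      # HTTP status code returned by the URL
    noindex: bool = False            # True if the page carries a noindex tag
    blocked: bool = False            # True if robots.txt blocks the URL
    canonical: Optional[str] = None  # canonical target, None if not declared


def sitemap_candidates(pages):
    """Return the URLs that pass the checks listed above."""
    keep = []
    for page in pages:
        if page.status != 200:
            continue  # only URLs answering 200 belong in the sitemap
        if page.blocked or page.noindex:
            continue  # blocked or noindex URLs send contradictory signals
        if page.canonical is not None and page.canonical != page.url:
            continue  # canonicalized elsewhere: list the canonical URL instead
        keep.append(page.url)
    return keep


pages = [
    Page("https://example.com/", 200),
    Page("https://example.com/old", 301),
    Page("https://example.com/private", 200, noindex=True),
    Page("https://example.com/cart", 200, blocked=True),
    Page("https://example.com/shoes?color=red", 200,
         canonical="https://example.com/shoes"),
    Page("https://example.com/shoes", 200,
         canonical="https://example.com/shoes"),
]
print(sitemap_candidates(pages))
```

The last point in the list (pages without SEO value) is a judgment call, so it is deliberately left out of the filter.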

Typical Sitemap Errors

Whether through oversight or external factors, sitemaps often end up containing errors. Luckily, we have Search Console at our disposal, which warns us of every error it detects so that we can fix them.

Sitemap errors

Below, you can see the most common errors we usually find:

  • "The submitted URL contains the noindex tag": if we submit a URL in the sitemap that has the noindex tag, we are giving confusing signals to the search engine. On one hand we tell it not to index it, and on the other hand we do. That's why it's important to maintain consistency.

  • "The sitemap includes URLs that the robots.txt file has blocked": in the same way as with noindex, if we block a page or a page pattern via robots.txt and then submit it in the sitemap, we will again be confusing search engines and, above all, wasting crawl budget.
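The robots.txt conflict can be caught before submitting the sitemap. Below is a rough sketch using Python's standard `urllib.robotparser`; the robots.txt rules and the URLs are made up for the example, and this parser does plain prefix matching, so it may not reproduce every rule Google supports (wildcards, for instance).

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content, used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /internal-search/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the sitemap URLs that the given robots.txt rules block."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


sitemap_urls = [
    "https://example.com/",
    "https://example.com/cart/checkout",
    "https://example.com/blog/sitemaps",
    "https://example.com/internal-search/shoes",
]
conflicts = blocked_urls(ROBOTS_TXT, sitemap_urls)
print(conflicts)
```

Any URL this returns should either be removed from the sitemap or unblocked in robots.txt, depending on which signal is the intended one.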

Most Used Types of Sitemaps

Although the XML sitemap is the most common and relevant, the truth is that it is not the only one that exists, and there are several available with different functions and approaches. These are:

  • XML Sitemap: the XML sitemap is specifically designed to facilitate the indexing of a website's URLs, showing the search engine that they are available to be crawled and included in search results. This type of sitemap is essential for large sites that would otherwise have crawling problems.

  • HTML Sitemap: this type of sitemap shows the hierarchical order of the platform, with sections ranging from first category or main page, to second and third level with sections and subsections. This type of sitemap is available to the user and in fact facilitates their browsing experience.

  • ROR Sitemap: the ROR sitemap can be considered as a variant of the XML, but with a much more robust nature, as it has descriptions of the URLs, sections, among others, and supports multiple formats, which is ideal for sites with product and service pages.

  • Video Sitemap: when a site has extensive video content, it is advisable to include a video sitemap listing all URLs that contain this type of file, including name, thumbnail, description, and links to landing pages. It makes it easier for search engines to crawl and find files in formats such as .mpg, .avi, and .mkv, among others.

  • News Sitemap: news sitemaps aim to create an organizational scheme that allows developers to handle the news and information that is placed on platforms like Google News, providing information on place, name, and content of the news and even keywords.

  • Image Sitemap: a sitemap specific to images and their content. It is especially useful on sites where images carry significant weight, such as an ecommerce site selling visual products. In this way we improve our chances of appearing in Google Images search results.
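To make the most common type more concrete, here is a minimal sketch of how an XML sitemap could be generated with Python's standard library. The URLs and dates are placeholders, and only the `loc` and `lastmod` elements of the sitemap protocol are used; real generators usually add more.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap from (url, lastmod) pairs; lastmod may be None."""
    # Register the sitemap namespace as the default so tags have no prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    # Returns UTF-8 encoded bytes, including the XML declaration.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap_xml = build_sitemap([
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/blog/what-is-a-sitemap", None),
])
print(sitemap_xml.decode("utf-8"))
```

The result would typically be saved as `sitemap.xml` in the root of the website.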

When to Use Sitemaps

It is usually recommended that every site have a sitemap because of its advantages, but in some specific situations using one becomes almost mandatory:

  • When a website is very large: when a platform has a great many sections and URLs, a sitemap is essential because it facilitates navigation and the crawling of each page, which can be difficult for the crawlers or spiders of search engines like Google when there are many pages.

  • When a site is new: a new site should have a sitemap because, in its early stage, it has no links pointing to it, which hinders crawling. The sitemap makes things easier for Google and other search engines.

  • When there are a large number of isolated URLs: related to the previous reason, a sitemap is recommended when a site has many isolated addresses that are not linked from other pages, since it makes crawling easier for search engine bots.
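The "isolated URLs" case can be detected from crawl data. The sketch below assumes we have the full list of site URLs and a map of internal links (both hypothetical here), and returns the pages that no other page links to.

```python
def find_orphans(all_urls, internal_links):
    """Return URLs that receive no internal link from any other page.

    internal_links maps a source URL to the list of URLs it links to.
    """
    linked = set()
    for source, targets in internal_links.items():
        for target in targets:
            if target != source:  # a page linking to itself does not count
                linked.add(target)
    return sorted(url for url in all_urls if url not in linked)


# Hypothetical site structure, used only for illustration.
all_urls = ["/", "/blog/", "/blog/post-1", "/landing/promo", "/legacy/guide"]
internal_links = {
    "/": ["/blog/"],
    "/blog/": ["/", "/blog/post-1"],
    "/blog/post-1": ["/"],
}
orphans = find_orphans(all_urls, internal_links)
print(orphans)
```

Pages reported this way are the ones that benefit most from being listed in the sitemap, although adding internal links to them is usually worth considering too.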

Create a Sitemap in WordPress

To almost anyone seeing a sitemap in an image, it would seem very complicated because of all its connections and hierarchies, apart from the web development knowledge it appears to require. Fortunately, there are simpler ways to create a sitemap, through a CMS (content management system).

WordPress is undoubtedly the most used CMS worldwide for creating and managing websites, and a plugin lets it generate sitemaps automatically. The plugin we use here is Rank Math, although there are many other WordPress plugins on the market, such as Yoast SEO, that generate this element for you. We name Rank Math because it is one of the best known and completely free. Once it is installed in WordPress, the steps to generate the sitemap are:

  • Access WordPress using the credentials and in the dashboard enter the "Rank Math" option.

  • Once in "Rank Math", some options related to the plugin will appear, where you must choose "Dashboard" and then select the button that activates the "Sitemaps".

  • Changes are saved and the XML sitemap has been created.

  • To view the map, click on the link that appears at the top of the page.

  • The sitemap is updated automatically after adding new addresses and sections, without having to do anything else.

Rank Math sitemaps

Of course, WordPress with the Rank Math plugin is not the only way to generate sitemaps, since each CMS has its own tools for it. For example, Shopify-based ecommerce sites have the advantage that the platform itself generates the XML sitemap, including products, pages, posts, images, and collections, among others, which is indispensable for this type of site.

There are also online tools that create XML sitemaps, the most usual type, independently of any CMS, such as XML Sitemaps and SEOptimer Sitemap Generator.

XML Sitemaps

Generating a sitemap with XML Sitemaps is simple, since the process is completely automatic. To start, visit the website with your preferred browser and enter the URL of the site in the input bar.

After you click "Start", the platform crawls the entire site and generates the .xml file for download. If the site has more than 500 internal URLs, the paid version is required, so the free version is best suited to small sites.

SEOptimer Sitemap Generator

SEOptimer Sitemap Generator is another useful tool to generate sitemaps, only having to enter the URL of the platform and some additional information such as the frequency at which the site changes, approximate number of URLs, last modification date, among others.

When you enter the website and input all the information mentioned above, the platform will generate an .xml file that can be downloaded and shows the sitemap with the addresses. We remind you that it is HIGHLY recommended, once you have the sitemap file at hand generated by any tool, to register it in Google Search Console, as it will facilitate its reading by Google.

Sitemap Limitations

When creating sitemaps, there are a series of limitations that we must take into account so as not to implement them incorrectly:

  • All URLs contained in a sitemap must belong to the same domain. We cannot add subdomains within a sitemap; for that we would have to make a separate one.

  • Sitemap files must be UTF-8 encoded.

  • A single sitemap can include at most 50,000 URLs.

  • The maximum size of a single sitemap file is 50 MB (uncompressed).
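As a small illustration of the last two limits, the sketch below splits a long URL list into groups of at most 50,000 and checks a generated file against the 50 MB ceiling. The function names and the example URLs are made up for this sketch.

```python
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB, measured on the uncompressed file


def split_urls(urls, max_urls=MAX_URLS):
    """Split a URL list into chunks that each fit in one sitemap."""
    urls = list(urls)
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


def within_size_limit(xml_bytes, max_bytes=MAX_BYTES):
    """Check that an encoded sitemap file does not exceed the size limit."""
    return len(xml_bytes) <= max_bytes


urls = [f"https://example.com/product/{n}" for n in range(120_000)]
chunks = split_urls(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would then become its own sitemap file.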

Sitemap Index

When a project runs into any of the limits mentioned above, we can use a sitemap index. That is, we create a set of separate sitemaps and tie them all together through the index. We could say it's a sitemap of sitemaps. Here we leave you the official Google information in this regard.

Sitemap index

NOTE: As you can see in the example above, one of our clients (Chess.com) uses a sitemap index, and it works very well. All sitemaps are classified by category and then by language so that every page can be tracked and categorized correctly.
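A sitemap index follows the same XML pattern as a regular sitemap. Here is a minimal sketch built with Python's standard library; the child sitemap file names (split by category and language) are invented for the example and do not reflect any real site's structure.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that points to each child sitemap file."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical child sitemaps, grouped by category and then by language.
index_xml = build_sitemap_index([
    "https://example.com/sitemaps/articles-en.xml",
    "https://example.com/sitemaps/articles-es.xml",
    "https://example.com/sitemaps/products-en.xml",
])
print(index_xml.decode("utf-8"))
```

Only the index file then needs to be submitted in Search Console.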

Sitemaps with Geolocated Versions

If we have different versions of a URL for different zones, as is the case with multilingual websites, we can choose between creating one sitemap per language or using a single sitemap in which all translations are added. If you have doubts about the process, you can always see Google's guidelines on this matter for this specific case.

multilingual sitemap
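For the single-sitemap option, each URL entry lists all of its language versions, itself included, through `xhtml:link` elements with `hreflang` attributes. The sketch below shows one way to generate that structure; the URLs and language codes are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_multilingual_sitemap(pages):
    """pages: list of dicts mapping an hreflang code to that version's URL."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for versions in pages:
        # One <url> entry per language version of the same content...
        for url in versions.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # ...each listing every version, including itself.
            for lang, href in versions.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": href},
                )
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


xml_bytes = build_multilingual_sitemap([
    {
        "en": "https://example.com/en/pricing",
        "es": "https://example.com/es/precios",
    },
])
print(xml_bytes.decode("utf-8"))
```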

Bonus: Some Extra Uses for Sitemaps

To finish, we leave you some "special" uses for sitemaps that you may find useful:

  • Speed up the deindexation of pages. Yes, you read that right: we can temporarily create a sitemap with all the URLs we want to deindex, add the noindex, nofollow directive to those pages, and upload the sitemap to Search Console. This pushes Google to crawl them sooner and therefore read the noindex, which accelerates the deindexation of those URLs.

  • Speed up the removal of pages. Along the same lines as the previous point, but returning a 410 status code (Gone) for the URLs that we want Google to permanently remove from its index. Uploading a sitemap that contains exclusively these URLs also favors this process. Don't forget to revert it once they are removed.

  • Spy on the competition. By extracting all the URLs from their sitemap and detecting which they prioritize, which they don't, which have errors, etc. For this I will share a tool we found on the internet in the form of a very convenient Google Sheet:

https://docs.google.com/spreadsheets/d/1jKP30CAJEL-rQ8PUnkNfJOiBfDN1XWNauTEUxBU1-w8/copy

You just have to make a copy and replace this value with the website whose sitemap URLs you want to extract:

extract sitemap urls
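If you prefer a script to a spreadsheet, the same extraction can be sketched in a few lines of Python. This is not how the Google Sheet above works internally; it is simply an alternative illustration, and the sample XML is inlined so the example runs without network access.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Inline sample standing in for a downloaded sitemap file.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/blog/what-is-a-sitemap</loc></url>
</urlset>
"""


def extract_locs(xml_bytes):
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text]


extracted = extract_locs(SAMPLE)
for url in extracted:
    print(url)
```

For a live site, the bytes could be fetched first (for example with `urllib.request.urlopen`) and passed to `extract_locs`; when the file is a sitemap index, the returned values are child sitemaps that need the same treatment.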

We hope this complete Sitemap Guide helps you create effective, optimized sitemaps for your web projects.

Author: David Kaufmann


I've spent the last 10+ years completely obsessed with SEO — and honestly, I wouldn't have it any other way.

My career hit a new level when I worked as a senior SEO specialist for Chess.com — one of the top 100 most visited websites on the entire internet. Operating at that scale, across millions of pages, dozens of languages, and one of the most competitive SERPs out there, taught me things no course or certification ever could. That experience changed my perspective on what great SEO really looks like — and it became the foundation for everything I've built since.

From that experience, I founded SEO Alive — an agency for brands that are serious about organic growth. We're not here to sell dashboards and monthly reports. We're here to build strategies that actually move the needle, combining the best of classical SEO with the exciting new world of Generative Engine Optimization (GEO) — making sure your brand shows up not just in Google's blue links, but inside the AI-generated answers that ChatGPT, Perplexity, and Google AI Overviews are delivering to millions of people every single day.

And because I couldn't find a tool that handled both of those worlds properly, I built one myself — SEOcrawl, an enterprise SEO intelligence platform that brings together rankings, technical audits, backlink monitoring, crawl health, and AI brand visibility tracking all in one place. It's the platform I always wished existed.
