
XML Sitemaps for Next.js: How to Generate, Submit, and Validate After Migration

An XML sitemap is not a guaranteed path to indexing — but a broken or incomplete sitemap is a reliable path to indexing problems. After a migration from WordPress to Next.js, your sitemap situation changes entirely. The old WordPress sitemap is gone. A new one needs to be generated, validated, and submitted on launch day. If that does not happen — or if the new sitemap is technically correct but contains the wrong URLs — Google's crawlers may spend weeks working from an outdated map of your site.

Most migration teams treat the sitemap as an afterthought: something to generate quickly from a plugin or scaffold, submit to Search Console, and move on from. The problem is that a quickly assembled sitemap often contains errors that actively mislead Googlebot. URLs that redirect instead of returning 200. Pages marked with noindex that should be indexed. lastModified dates set to the build time rather than the actual last-edit date. These are quiet failures that delay the full indexing of your migrated site.

This guide covers how to generate a correct sitemap in Next.js App Router, what should and should not be included, how to submit it on launch day, and how to validate that it accurately represents your site after the migration is complete.

Quick Checklist

  • Generate a sitemap that includes every indexable page (legal pages can stay in, at a low priority such as 0.3)
  • Use lastModified dates based on actual content changes (not new Date() on every build)
  • Exclude noindex pages, redirects, and 404 pages from the sitemap
  • Submit the new sitemap to Google Search Console on launch day
  • Remove the old sitemap from Search Console (if it was a different URL)
  • Validate by crawling: every URL in the sitemap should return 200 with correct canonical

Why Your Sitemap Matters More During (and After) a Migration

Under normal conditions, Google discovers pages on an established site primarily through internal links and external backlinks. The sitemap serves as a supplementary discovery mechanism — useful for new pages and pages with few internal links, but not the primary crawl driver.

During a migration, the calculus changes. Google's crawlers arrive at your new domain (or your existing domain on its new tech stack) without a complete understanding of your new URL structure. If your URLs changed during the migration — which is common when moving from WordPress's ?p=123 permalink format to clean paths like /blog/post-title/ — Google needs a signal to understand which new URLs correspond to which old URLs. Your sitemap, combined with your redirect structure, provides that signal.

A correct sitemap after a migration speeds up the reindexing process for your new URL structure. It tells Google exactly which pages you want indexed and when each was last meaningfully updated. The priority and changeFrequency fields are weaker signals: Google has stated that it ignores both, so set them sensibly for other consumers but do not rely on them to steer Googlebot. An incorrect sitemap creates confusion: Google may waste crawl budget on URLs that return redirects, spend time processing pages you did not intend to index, or miss newly added pages entirely.

The full migration checklist covers the broader sequence of migration tasks, but sitemap generation and submission deserves dedicated attention because the details matter.

How to Generate Sitemaps in Next.js (App Router)

The Next.js App Router provides a built-in sitemap API using the MetadataRoute.Sitemap type. This approach generates your sitemap dynamically at build time (for static generation) or at request time (for server rendering), and it integrates cleanly with the rest of your Next.js metadata configuration.

Static Sitemaps With MetadataRoute.Sitemap

For sites with a manageable number of static pages, the simplest approach is a sitemap.ts file at the root of your app/ directory. The file exports a default function that returns an array of sitemap entries conforming to the MetadataRoute.Sitemap type:

// app/sitemap.ts
import type { MetadataRoute } from 'next'

export default function sitemap(): MetadataRoute.Sitemap {
  return [
    {
      url: 'https://yourdomain.com',
      lastModified: new Date('2026-04-01'),
      changeFrequency: 'monthly',
      priority: 1,
    },
    {
      url: 'https://yourdomain.com/services/migration',
      lastModified: new Date('2026-03-15'),
      changeFrequency: 'monthly',
      priority: 0.9,
    },
    // additional entries
  ]
}

Next.js will serve this at /sitemap.xml automatically. The lastModified field should reflect the date the page content was last meaningfully changed, not new Date(), which would set every URL's last-modified date to the current build time. Dates that change on every build tell Google that every page on your site is updated constantly. Google has said it uses lastmod only when the value is consistently and verifiably accurate, so inflated dates can lead to the field being ignored, which makes it useless as a recrawl signal.
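One way to keep these dates honest is to record them next to the page list and map that list into sitemap entries. The following is a minimal sketch rather than a prescribed pattern: the StaticPage type, the STATIC_PAGES table, and the staticEntries helper are hypothetical names, and SitemapEntry is a local stand-in for one element of MetadataRoute.Sitemap so the snippet runs without Next.js installed.

```typescript
// Local stand-in for one element of MetadataRoute.Sitemap (assumption:
// only the fields used in this article are modeled here).
type SitemapEntry = {
  url: string
  lastModified?: Date
  changeFrequency?: 'daily' | 'weekly' | 'monthly' | 'yearly'
  priority?: number
}

// Hypothetical hand-maintained table. Update `lastEdited` only when the
// page content meaningfully changes, never automatically at build time.
type StaticPage = { path: string; lastEdited: string; priority: number }

const STATIC_PAGES: StaticPage[] = [
  { path: '/', lastEdited: '2026-04-01', priority: 1 },
  { path: '/services/migration', lastEdited: '2026-03-15', priority: 0.9 },
]

function staticEntries(baseUrl: string, pages: StaticPage[]): SitemapEntry[] {
  return pages.map((page): SitemapEntry => ({
    // The homepage is emitted as the bare base URL, matching the listing above.
    url: page.path === '/' ? baseUrl : `${baseUrl}${page.path}`,
    lastModified: new Date(page.lastEdited),
    changeFrequency: 'monthly',
    priority: page.priority,
  }))
}
```

In app/sitemap.ts, the default export would then return staticEntries('https://yourdomain.com', STATIC_PAGES). The dates change only when someone edits the table, which is the behavior the lastModified field is meant to describe.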

See the Next.js sitemap documentation for the full type definition and available fields.

Dynamic Sitemaps From a Headless CMS

For sites where content is managed in a headless CMS (Contentful, Sanity, Prismic, or similar), the sitemap function needs to be async and fetch the current set of published content at build or request time:

// app/sitemap.ts
import type { MetadataRoute } from 'next'
import { getAllPosts } from '@/lib/cms'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await getAllPosts()

  const postEntries: MetadataRoute.Sitemap = posts.map((post) => ({
    url: `https://yourdomain.com/blog/${post.slug}`,
    lastModified: new Date(post.updatedAt),
    changeFrequency: 'weekly',
    priority: 0.7,
  }))

  const staticPages: MetadataRoute.Sitemap = [
    {
      url: 'https://yourdomain.com',
      lastModified: new Date('2026-04-01'),
      changeFrequency: 'monthly',
      priority: 1,
    },
  ]

  return [...staticPages, ...postEntries]
}

The critical detail here is pulling updatedAt from your CMS rather than using the current date or the post's published date. Most headless CMS platforms track when content entries were last modified — use that field. Google uses lastModified as one input for deciding how frequently to recrawl a URL, so accurate dates make the crawl schedule more efficient.
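CMS data is rarely as clean as the example assumes, so it can help to guard the mapping step. The sketch below assumes a hypothetical CmsPost shape with a status field and an optional updatedAt string (real field names differ between platforms), skips anything that is not published, and omits lastModified instead of emitting an invalid date. SitemapEntry is again a local stand-in for the Next.js type.

```typescript
// Local stand-in for one element of MetadataRoute.Sitemap.
type SitemapEntry = {
  url: string
  lastModified?: Date
  changeFrequency?: 'weekly' | 'monthly'
  priority?: number
}

// Hypothetical CMS record; adjust the field names to your platform.
type CmsPost = {
  slug: string
  status: 'published' | 'draft'
  updatedAt?: string
}

// Returns a Date only when the input parses to a real timestamp.
function parseDate(value?: string): Date | undefined {
  if (!value) return undefined
  const date = new Date(value)
  return isNaN(date.getTime()) ? undefined : date
}

function postEntries(baseUrl: string, posts: CmsPost[]): SitemapEntry[] {
  return posts
    .filter((post) => post.status === 'published') // drafts never reach the sitemap
    .map((post): SitemapEntry => ({
      url: `${baseUrl}/blog/${post.slug}`,
      // Left undefined when the CMS value is missing or unparseable.
      lastModified: parseDate(post.updatedAt),
      changeFrequency: 'weekly',
      priority: 0.7,
    }))
}
```

Omitting the field is a deliberate choice here: a missing lastModified says nothing, while a wrong one undermines the accuracy of every other date in the file.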

Handling Large Sites (Sitemap Index Files)

The sitemap protocol, which Google follows, limits individual sitemap files to 50,000 URLs and 50MB uncompressed. Most marketing and content sites fall well under this limit. If you are migrating a large site (an e-commerce catalog, a publishing site with thousands of posts, or a site with many location pages), you will need a sitemap index file that references multiple individual sitemaps.

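Next.js supports sitemap splitting through the generateSitemaps function. Define segments of your content (by category, by date range, by content type) and generate a separate sitemap file for each segment. The individual files are served at /sitemap/0.xml, /sitemap/1.xml, and so on (older Next.js releases used a different path pattern, so confirm the URLs for your version). Do not assume a sitemap index is created for you: the Next.js documentation for generateSitemaps describes the per-id files but not an automatically generated index, so check whether /sitemap.xml resolves after the split and, if it does not, serve an index yourself or submit the child sitemaps individually.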
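The splitting arithmetic can be sketched as two small helpers. This is an illustration under assumptions, not the framework's own implementation: sitemapIds and itemsForSitemap are hypothetical names, and the wiring shown in the trailing comment relies on CMS helpers that are also hypothetical.

```typescript
const URLS_PER_SITEMAP = 50000 // protocol limit per sitemap file

// One id per chunk of URLs; this is the shape generateSitemaps() returns.
function sitemapIds(totalUrls: number, perFile: number = URLS_PER_SITEMAP): { id: number }[] {
  const count = Math.ceil(totalUrls / perFile)
  const ids: { id: number }[] = []
  for (let id = 0; id < count; id++) ids.push({ id })
  return ids
}

// The slice of items that belongs in the sitemap with the given id.
function itemsForSitemap<T>(items: T[], id: number, perFile: number = URLS_PER_SITEMAP): T[] {
  return items.slice(id * perFile, (id + 1) * perFile)
}

// Wiring inside app/sitemap.ts (sketch only; check the documentation for your
// Next.js version, since the type of the `id` argument has changed between releases):
//
//   export async function generateSitemaps() {
//     return sitemapIds(await countAllProducts())     // hypothetical CMS helper
//   }
//   export default async function sitemap({ id }: { id: number }) {
//     const all = await getAllProducts()              // hypothetical CMS helper
//     return itemsForSitemap(all, id).map(toEntry)    // hypothetical mapper
//   }
```

For very large catalogs you would normally push the offset and limit down into the CMS query instead of fetching everything and slicing in memory.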

For headless WordPress deployments — where WordPress manages the content but Next.js renders the front end — see our headless WordPress SEO guide for additional sitemap considerations specific to that architecture.

What Your Sitemap Should Include (and What It Shouldn't)

The sitemap's purpose is to tell Google which pages you want indexed. That means it should contain exactly the pages where a 200 status code, a canonical tag pointing to itself, and no noindex directive all apply simultaneously.

Include:

  • All public-facing service and product pages
  • All blog posts and editorial content that is fully published (not draft)
  • Category and tag pages, if they have meaningful content (more than a list of post titles)
  • The homepage
  • Location pages, if they contain unique content per location

Exclude:

  • Pages marked with <meta name="robots" content="noindex"> — including these in the sitemap sends a contradictory signal
  • Pages that return a 301 or 302 redirect — include only the final destination URL, not the redirecting URL
  • Pages that are intentionally kept out of search results (login pages, account pages, checkout flows)
  • Duplicate content pages (pagination pages beyond page 1 if they have thin content, filtered views with duplicate content)

Legal and policy pages are a separate case. They are fine to include, but do not assign them priority 0.8 or 0.9. A priority of 0.3 to 0.4 is appropriate, since these pages do not generate organic search value.
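The inclusion rule (200 status, self-referencing canonical, no noindex) can be expressed as a single predicate applied before the sitemap array is returned. A minimal sketch, assuming you can assemble a record like the hypothetical PageRecord below from your CMS data and route configuration:

```typescript
// Hypothetical page record; the field names are illustrative.
type PageRecord = {
  url: string
  statusCode: number   // status the URL returns in production
  canonicalUrl: string // value of the page's rel="canonical"
  noindex: boolean     // true if a robots meta tag or header says noindex
  isPreview: boolean   // draft or preview URL generated by the CMS
}

// A page belongs in the sitemap only when every condition holds at once.
function isSitemapEligible(page: PageRecord): boolean {
  return (
    page.statusCode === 200 &&
    page.canonicalUrl === page.url &&
    !page.noindex &&
    !page.isPreview
  )
}

function sitemapUrls(pages: PageRecord[]): string[] {
  return pages.filter(isSitemapEligible).map((page) => page.url)
}
```

Keeping the rule in one function means the sitemap and any pre-launch checks can share the same definition of "indexable".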

One common mistake is including staging or preview URLs in a production sitemap. If your CMS generates preview URLs for unpublished content, make sure those URLs are filtered out before the sitemap function returns its array. Similarly, if your site has an environment variable for the base URL, use it consistently — do not hard-code localhost or a staging domain into a function that runs in production.
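A small guard can catch a wrong base URL before it ships. The sketch below is one possible approach under assumptions: resolveBaseUrl and absoluteUrl are hypothetical helpers, the "staging." host prefix is an illustrative convention, and the environment variable named in the closing comment is an assumed name, not a Next.js built-in.

```typescript
// Resolve the sitemap's base URL from configuration and refuse values
// that look like a development or staging host when building for production.
function resolveBaseUrl(raw: string | undefined, isProduction: boolean): string {
  if (!raw) throw new Error('Site URL is not configured')
  const parsed = new URL(raw) // throws on malformed input
  const host = parsed.hostname
  const looksNonProduction =
    host === 'localhost' || host === '127.0.0.1' || /^staging\./.test(host)
  if (isProduction && looksNonProduction) {
    throw new Error(`Refusing to build a production sitemap for ${host}`)
  }
  return parsed.origin // scheme and host (plus port), no trailing slash
}

// Join a path onto the base URL with exactly one slash between them.
function absoluteUrl(baseUrl: string, path: string): string {
  if (path === '/' || path === '') return baseUrl
  return baseUrl + path.replace(/^\/*/, '/')
}

// Possible call site (variable name is an assumption):
//   resolveBaseUrl(process.env.NEXT_PUBLIC_SITE_URL, process.env.NODE_ENV === 'production')
```

Failing the build is intentional: a sitemap full of localhost URLs is harder to notice after launch than a red deploy.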

Submitting to Google Search Console on Launch Day

Sitemap submission should happen within the first few hours of launch — not after you have settled in and verified other things. The earlier Google has the new sitemap, the earlier it can begin processing the new URL structure.

In Google Search Console, navigate to Sitemaps in the left menu and enter the full URL of your sitemap: https://yourdomain.com/sitemap.xml. Click Submit. Search Console will immediately attempt to fetch and validate the sitemap and will show you the number of URLs discovered.

If your sitemap is a sitemap index, submit the index URL — Search Console will crawl and process all referenced child sitemaps automatically.

Also check whether your previous sitemap from WordPress is still registered in Search Console. WordPress sitemaps were often served at URLs like /sitemap_index.xml or /wp-sitemap.xml. If that URL now returns a 404 or redirects, remove the old sitemap entry from Search Console to avoid confusion. Submit only the new sitemap URL.

One detail that trips up many teams: if you have been using a plugin like Yoast SEO on WordPress, its sitemap URL was probably different from the Next.js default /sitemap.xml. Make sure you know the old sitemap URL, remove it from Search Console, and submit the new one.

After submission, check the Sitemaps report in Search Console over the next 24–72 hours for any errors. Common post-launch errors include "URL not found (404)" (a URL in the sitemap is not being served correctly), "Submitted URL blocked by robots.txt" (your robots.txt is inadvertently blocking some page paths), and "Submitted URL has crawl issue" (the URL returns a server error intermittently). Address any errors immediately.

Validating Sitemap Accuracy Post-Launch

Search Console will tell you how many URLs it discovered in your sitemap, but it will not tell you whether those URLs are all returning 200 status codes, have correct canonical tags, or are free of noindex directives. For that, you need a crawl-based validation.

After launch, run Screaming Frog against your sitemap specifically: open Screaming Frog, switch to Mode → List, then choose Upload → Download XML Sitemap and enter the sitemap URL. Screaming Frog will fetch the sitemap, extract every URL, and crawl each one individually. The resulting report will show you the status code, canonical URL, and robots meta tag for every URL in your sitemap.

What you are looking for:

  • Status code 200 for every URL. Any URL returning a 301, 302, or 404 should be removed from the sitemap or fixed at the source.
  • Self-referencing canonical tags. If a URL in your sitemap has a canonical pointing to a different URL, Google will usually treat the canonical target as the preferred version and may not index the sitemap URL. This is often a sign of a duplicate content issue or a misconfigured canonical implementation.
  • No noindex directives. A URL in your sitemap with a noindex robots meta tag sends contradictory signals. Google will likely follow the noindex directive, so including the URL in the sitemap wastes crawl budget without benefit.

For each issue found, either fix the underlying problem (update the canonical, remove the noindex, fix the redirect) or remove the URL from the sitemap until it is corrected.
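If you would rather script the same three checks, the core logic fits in a few pure functions. This is a rough sketch and not a replacement for a real crawler: the regular expressions assume conventional attribute order (rel before href, name before content), robots directives sent via the X-Robots-Tag header are not covered, and all function names are hypothetical.

```typescript
// Pull every <loc> value out of a sitemap XML string.
function extractLocs(xml: string): string[] {
  const locs: string[] = []
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g
  let match: RegExpExecArray | null
  while ((match = pattern.exec(xml)) !== null) {
    locs.push(match[1].replace(/&amp;/g, '&')) // undo the XML escaping of "&"
  }
  return locs
}

// Canonical href from a page's HTML, or null if none is found.
function extractCanonical(html: string): string | null {
  const match = /<link[^>]*rel=["']canonical["'][^>]*href=["']([^"']+)["']/i.exec(html)
  return match ? match[1] : null
}

// True if a robots meta tag contains "noindex".
function hasNoindex(html: string): boolean {
  return /<meta[^>]*name=["']robots["'][^>]*content=["'][^"']*noindex/i.test(html)
}

type PageFetch = { url: string; status: number; html: string }

// Problems found for one sitemap URL; an empty array means it passed.
function auditPage(page: PageFetch): string[] {
  const problems: string[] = []
  if (page.status !== 200) {
    problems.push(`status ${page.status}`)
    return problems // canonical and robots checks only make sense on a 200
  }
  const canonical = extractCanonical(page.html)
  if (canonical === null) {
    problems.push('missing canonical')
  } else if (new URL(canonical, page.url).href !== new URL(page.url).href) {
    // Comparing parsed URLs treats "https://host" and "https://host/" as equal.
    problems.push(`canonical points to ${canonical}`)
  }
  if (hasNoindex(page.html)) problems.push('noindex')
  return problems
}

// Fetching is left to the caller. In Node, fetch(url, { redirect: 'manual' })
// reports a 301 or 302 as-is instead of silently following it.
```

A driver script would fetch the sitemap, pass the body to extractLocs, request each URL with redirects disabled, and print any URL for which auditPage returns a non-empty list.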

Run this validation on launch day and again 2–3 weeks after launch, after your deployment has stabilized and any post-launch fixes have been applied. The SEO Parity Audit includes this validation as a standard component of any migration review.

FAQ

Does submitting a sitemap to Google Search Console guarantee indexing?

No. A sitemap tells Google which pages you want indexed, but Google makes its own decision about whether to index each URL based on quality signals, content relevance, and crawl budget. Pages that are thin on content, duplicate other pages, or return errors will not be indexed simply because they appear in a sitemap. Submitting a sitemap speeds up discovery and helps Google process your URL structure more efficiently, but it does not override Google's quality-based indexing decisions.

How often should I update my XML sitemap?

For a static Next.js site, the sitemap regenerates on every build. For dynamic sitemaps fetched from a headless CMS, the sitemap reflects the current state of your content on every request (or at build time if you are using static generation with on-demand revalidation). You do not need to manually update or resubmit the sitemap every time you publish content — Google will recrawl it periodically. However, after a major change like a migration, a manual resubmission in Search Console is worthwhile to prompt faster recrawling.

What's the maximum number of URLs in a single sitemap file?

The Google-specified limit is 50,000 URLs per sitemap file and 50MB uncompressed file size. In practice, well under 50,000 URLs per file is safer for performance — very large sitemaps can be slow to generate and parse. If your site has more than 10,000–15,000 URLs, consider splitting into a sitemap index with multiple child sitemaps organized by content type (one for blog posts, one for product pages, one for static pages). This also makes it easier to diagnose indexing issues since you can see coverage statistics per sitemap in Search Console.

Should I include blog posts in my sitemap even before they're indexed?

Yes. Include all published, indexable blog posts in your sitemap from the moment they are published. Alongside internal links, the sitemap is one of the most efficient ways for Google to discover new content. There is no benefit to withholding a page from the sitemap until after it is indexed: the sitemap is a discovery mechanism, not a confirmation of indexed status. The only exception is draft or preview pages that you do not want indexed. Those should never appear in a production sitemap regardless of their status in your CMS.

Can I have multiple sitemaps for a Next.js site?

Yes, and for large sites this is the recommended approach. Use Next.js's generateSitemaps function to create multiple sitemap files. You can also create multiple sitemap.ts files in different route segments of the App Router (for example, app/blog/sitemap.ts and app/products/sitemap.ts), each returning entries for its respective content type. Verify whether your Next.js version produces a sitemap index that ties these files together; if it does not, generate the index yourself. Submit the index URL to Search Console and it will process all child sitemaps, or submit each child sitemap individually.
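If your setup does not produce a sitemap index for you, the file is simple enough to build by hand. The sketch below assumes hypothetical names (ChildSitemap, buildSitemapIndex) and only produces the XML string; serving it, for example from a Route Handler at a path of your choosing, is left to your application.

```typescript
type ChildSitemap = { url: string; lastModified?: Date }

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
}

// Build a sitemap index document that references each child sitemap.
function buildSitemapIndex(children: ChildSitemap[]): string {
  const items = children.map((child) => {
    const lastmod = child.lastModified
      ? `<lastmod>${child.lastModified.toISOString()}</lastmod>`
      : ''
    return `  <sitemap><loc>${escapeXml(child.url)}</loc>${lastmod}</sitemap>`
  })
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...items,
    '</sitemapindex>',
  ].join('\n')
}

// Possible Route Handler usage (path and child URLs are assumptions):
//   export function GET() {
//     const xml = buildSitemapIndex([
//       { url: 'https://yourdomain.com/blog/sitemap.xml' },
//       { url: 'https://yourdomain.com/products/sitemap.xml' },
//     ])
//     return new Response(xml, { headers: { 'Content-Type': 'application/xml' } })
//   }
```

Whichever URL ends up serving the index is the one to submit in Search Console.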

Next Steps

A correctly generated, accurately populated sitemap is a small but consequential piece of a successful migration. It takes a few hours to implement correctly — and the payoff is faster indexing, more efficient crawl budget use, and a clear signal to Google about your site's intended structure.
