Legacy CMS Migration Playbook — URL inventory lock, redirect map generation, GSC post-cutover

A methodology for migrating off WordPress, Wix, or Squarespace to an AI-native Astro + Cloudflare Workers static stack without losing search equity. It covers the URL inventory lock, the redirect map generation pattern, the GSC post-cutover verification checklist, and the six failure modes that produce avoidable traffic loss.

A botched migration doesn't announce itself. It shows up ninety days later as an organic-traffic graph bending the wrong way and a quarter of your pipeline that quietly went somewhere else.

Moving off WordPress, Wix, or Squarespace onto a static AI-native stack is a measurable-risk operation, not a copy-paste. Three things are at stake the moment DNS flips — your organic search equity, your ad platforms' conversion history, and every bookmark and inbound link pointing at the old URLs. Three disciplines protect all three: a locked URL inventory, a redirect map built from that inventory, and a 30-day Search Console verification. Skip them and the new site rebuilds from a baseline instead of inheriting the equity you spent years earning.

Why migration is a measurable-risk operation

In the four-week AI-native build, the cutover lands in week 4 — the single highest-stakes day of the project. Three categories of risk make it consequential.

Organic traffic loss. The legacy site carries a Google-indexed URL inventory, a backlink profile, and months or years of accumulated ranking signal. Cut over without disciplined 301s and that signal collapses. The migration retrospectives from practitioners like Aleyda Solis, Glenn Gabe, and Lily Ray are full of poorly executed cutovers that shed a large share of organic traffic for the first 90 days; disciplined ones hold transient loss to single digits and recover inside 30.

Ad-platform conversion history loss. Google Ads, Meta CAPI, and Microsoft Ads each match conversion actions against URL patterns. URLs that move without redirect updates produce conversion-action mismatches that suppress bidding signal exactly when you can least afford it.

Bookmark and inbound-link integrity. Customers, partners, and existing case studies link to specific legacy pages. A 404 storm in week one is a trust failure visible to users and to Search Console alike.

The playbook below kills all three risks with three disciplines.

The URL inventory lock — every legacy URL enumerated

The inventory is one markdown file listing every public URL on the legacy site (markdown for diff-readability; YAML or CSV work too). The minimum schema:

legacy_url               new_url                              redirect_status  page_type       priority  notes
/                        /                                    200 (canonical)  home            1.0       -
/about/                  /about/                              301              about           0.8       -
/services/               /services/                           301              services_hub    0.9       -
/services/seo/           /services/seo-content-at-scale/      301              service_detail  0.8       URL changed in the new IA
/blog/                   /blog/                               301              blog_hub        0.7       -
/blog/old-post-slug/     /blog/new-post-slug/                 301              blog_detail     0.5       Slug normalized
/blog/category/seo/      /tag/seo/                            301              tag_index       0.4       /tag/ namespace
/?p=1234                 /blog/specific-post/                 301              blog_detail     0.4       Parameterized legacy URL
/wp-content/uploads/...  https://cdn.example.com/uploads/...  301              asset           0.2       Asset migration
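
One way to keep that schema honest is to type it and validate each row before it reaches the redirect builder. The sketch below is illustrative only: the InventoryRow shape, parseStatus, and validateRow are hypothetical names, and it assumes status cells may carry an annotation such as "200 (canonical)".

```typescript
// Hypothetical typing of the inventory schema; field names mirror the table columns.
const ALLOWED_STATUSES = ['200', '301', '410', '451'] as const
type RedirectStatus = (typeof ALLOWED_STATUSES)[number]

interface InventoryRow {
  legacy_url: string
  new_url: string
  redirect_status: string // raw cell text, e.g. "301" or "200 (canonical)"
  page_type: string
  priority: number
  notes: string
}

// Extracts the leading three-digit code from a status cell.
// Returns null when the code is missing or not one of the four allowed values.
function parseStatus(cell: string): RedirectStatus | null {
  const match = /^\s*(\d{3})\b/.exec(cell)
  if (match === null) return null
  const code = match[1]
  return (ALLOWED_STATUSES as readonly string[]).indexOf(code) !== -1
    ? (code as RedirectStatus)
    : null
}

// Returns a list of human-readable problems; an empty list means the row passes.
function validateRow(row: InventoryRow): string[] {
  const problems: string[] = []
  const status = parseStatus(row.redirect_status)
  if (status === null) {
    problems.push(`${row.legacy_url}: unknown status "${row.redirect_status}"`)
  }
  if (status === '301' && row.new_url.trim() === '') {
    problems.push(`${row.legacy_url}: a 301 needs a new_url`)
  }
  if (!(row.priority >= 0 && row.priority <= 1)) {
    problems.push(`${row.legacy_url}: priority must be within 0.0 to 1.0`)
  }
  return problems
}
```

Running a check like this in CI on every change to the inventory file would catch a malformed row at review time rather than on cutover day.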

Six rules apply:

  • Enumerate exhaustively. Crawl with Screaming Frog or Sitebulb, merge with the legacy XML sitemap, then merge with Search Console's own URL inventory (Indexing → Pages and the Performance URL filter). The three sources together catch URLs no single source does.
  • Resolve every parameterized URL. WordPress sites serve content at ?p=1234, ?page_id=5678, and category/tag combinations. Each needs an explicit entry — broad regex catches miss the edge cases.
  • Commit it on day one of week 1. The inventory is the source of truth for the redirect map; edits go through normal PR review.
  • The status column is authoritative. 200 stays unchanged; 301 is a permanent redirect; 410 is gone-for-real (sparingly — only content with no replacement and no value); 451 is legal removal (rare).
  • Priority seeds the new sitemap. The new sitemap.xml emits a <priority> from page type and topic-graph pagerank; this column is the seed value.
  • Notes document the non-obvious. URL changes, slug normalizations, deferred redirects — six months later this column is your audit trail.
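
The first rule, merging three sources, reduces to a normalize-and-union step. Here is a minimal sketch, assuming each source has already been exported to a list of URL strings; mergeUrlSources and toInventoryKey are hypothetical helper names, and www.example.com stands in for the legacy host.

```typescript
// Reduces a URL to path + query string so the same page reported by
// different sources collapses to a single inventory key.
function toInventoryKey(rawUrl: string, origin: string): string {
  const url = new URL(rawUrl, origin) // relative inputs resolve against origin
  return url.pathname + url.search // fragments are dropped
}

// Unions any number of URL lists into one sorted, de-duplicated list of keys.
function mergeUrlSources(origin: string, ...sources: string[][]): string[] {
  const seen = new Set<string>()
  for (const source of sources) {
    for (const rawUrl of source) {
      seen.add(toInventoryKey(rawUrl, origin))
    }
  }
  return Array.from(seen).sort()
}

// Example inputs standing in for the crawler export, the XML sitemap,
// and the Search Console export.
const crawl = ['https://www.example.com/about/', 'https://www.example.com/blog/']
const sitemap = ['https://www.example.com/about/', 'https://www.example.com/services/']
const searchConsole = ['https://www.example.com/?p=1234', '/blog/']

const merged = mergeUrlSources('https://www.example.com', crawl, sitemap, searchConsole)
// merged is ['/?p=1234', '/about/', '/blog/', '/services/']
```

The sketch deliberately keeps trailing-slash and query-string variants distinct, since each variant is a legacy URL that needs its own inventory entry.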

The redirect map — built from the inventory, never by hand

The redirect map is the runtime instruction set for inbound requests to legacy URLs. An exporter reads the inventory and emits the rules the Cloudflare Worker (or origin server) consumes:

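// scripts/build-redirect-map.ts
// Reads the URL inventory and writes the redirect rules consumed at the edge.
import { writeFileSync } from 'node:fs'
import { readInventory } from './lib/inventory'
// Assumption: the formatter lives in a local helper module; its path is illustrative.
import { formatCloudflareRedirects } from './lib/format-redirects'

const inventory = readInventory('_meta/migration/URL-INVENTORY.md')

// Only rows that change the response need a rule; 200 rows are served as-is.
const redirects = inventory
  .filter(row => row.redirect_status === '301' || row.redirect_status === '410')
  .map(row => ({
    source: row.legacy_url,
    destination: row.new_url,
    status: Number.parseInt(row.redirect_status, 10), // explicit radix
  }))

writeFileSync('dist/_redirects', formatCloudflareRedirects(redirects))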
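
The formatter itself is not shown in the exporter, so here is one hedged guess at what it might look like. Two assumptions are baked in: the static _redirects format is taken to cover redirect statuses only (301, 302, 303, 307, 308), so 410 rows are collected separately for the Worker to answer, and sources containing a query string are skipped because they are resolved by a Worker-side parser instead. The gonePaths helper is hypothetical.

```typescript
interface Redirect {
  source: string
  destination: string
  status: number
}

// Statuses assumed to be valid in a static _redirects file.
const REDIRECT_STATUSES = new Set([301, 302, 303, 307, 308])

// Emits one "source destination status" line per path-based redirect rule.
// Non-redirect statuses and query-string sources are left out on purpose.
function formatCloudflareRedirects(redirects: Redirect[]): string {
  const lines = redirects
    .filter(r => REDIRECT_STATUSES.has(r.status))
    .filter(r => r.source.indexOf('?') === -1)
    .map(r => `${r.source} ${r.destination} ${r.status}`)
  return lines.join('\n') + '\n'
}

// Paths the Worker should answer with 410 Gone (assumed to be handled in code,
// outside the _redirects file).
function gonePaths(redirects: Redirect[]): string[] {
  return redirects.filter(r => r.status === 410).map(r => r.source)
}
```

Keeping the 410 list separate makes the split explicit: the file carries redirects, and the Worker carries everything the file cannot express.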

The Worker applies dist/_redirects at the edge; every cutover-day request to a legacy URL returns its 301 and the browser follows in about 50ms — fast enough that nobody notices. Three patterns to pin down:

  • Single-hop discipline. A chain (/old-1/ → /old-2/ → /new/) suppresses link equity and crawl efficiency. Every legacy URL redirects to its final destination in one hop — and if a new_url is itself a redirect source, the inventory is broken.
  • Wildcards for parameterized URLs. Cloudflare's _redirects supports * wildcards (referenced as :splat) and named :placeholder captures, but those match on the path, not the query string. Parameterized entries therefore usually need a Worker-side query-string parser that maps, say, the p parameter to the new URL via a lookup table.
  • Asset redirects. WordPress uploads at /wp-content/uploads/… either redirect to the new CDN or — better — get re-hosted at the same path under the new domain (Cloudflare R2 plus a build-time asset import handles it cleanly).
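
The query-string parser mentioned in the second pattern can be a small pure function in front of a lookup table. This is a sketch under assumptions: POST_ID_REDIRECTS and resolveLegacyQuery are hypothetical names, the table would in practice be generated from the inventory, and only the p parameter is handled here.

```typescript
// Hypothetical lookup table: legacy WordPress post IDs to their new paths.
// In practice this would be generated from the inventory at build time.
const POST_ID_REDIRECTS = new Map<string, string>([
  ['1234', '/blog/specific-post/'],
])

// Returns the new path for a parameterized legacy URL,
// or null when the request carries no known p parameter.
function resolveLegacyQuery(requestUrl: string): string | null {
  const url = new URL(requestUrl)
  const postId = url.searchParams.get('p')
  if (postId === null) return null
  return POST_ID_REDIRECTS.get(postId) ?? null
}

// Example lookups against the table above.
const hit = resolveLegacyQuery('https://www.example.com/?p=1234') // '/blog/specific-post/'
const miss = resolveLegacyQuery('https://www.example.com/?p=9999') // null
```

A Worker's fetch handler would call this first and, on a hit, answer with a 301 to the absolute form of the returned path; on a miss it falls through to the path-based rules.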

The Search Console verification checklist

Google Search Console is the canonical proof surface. Five checks across a 30-day window:

  • Day 0: submit the new sitemap.xml and leave the legacy one submitted too; Search Console accepts both independently during the window.
  • Day 1: URL-inspect five representative pages (home, services hub, a service detail, a blog post, a case study) via Test Live URL. The live test should report "URL is available to Google", and the indexed result should read "URL is on Google" once the page has been processed, which this playbook expects within 24 hours.
  • Day 7: in Indexing → Pages, legacy URLs should show "Page with redirect" — not "Not found (404)", which means a redirect rule missed one. New URLs show "Indexed" or "Discovered – currently not indexed" (normal on the recovery curve).
  • Day 14: in Performance, single-digit transient loss is normal; loss above 20% means one of the six failure modes is firing.
  • Day 30: the indexed-URL count should match or beat the pre-cutover baseline. A lag over 5% means tracing each missing URL to a failure mode.
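
The Day 14 and Day 30 thresholds are simple percentage checks, sketched below. The function names are hypothetical, and the middle "watch" band between 10% and 20% is an assumption added for illustration; the checklist itself only names single-digit loss as normal and loss above 20% as a failure signal.

```typescript
type Verdict = 'normal' | 'watch' | 'investigate'

// Percentage drop from baseline to current; negative values mean growth.
function percentLoss(baseline: number, current: number): number {
  if (baseline <= 0) throw new Error('baseline must be positive')
  return ((baseline - current) / baseline) * 100
}

// Day 14: single-digit loss is normal; loss above 20% points at a failure mode.
// The "watch" band in between is an assumed label, not part of the checklist.
function day14Verdict(baselineClicks: number, currentClicks: number): Verdict {
  const loss = percentLoss(baselineClicks, currentClicks)
  if (loss > 20) return 'investigate'
  if (loss >= 10) return 'watch'
  return 'normal'
}

// Day 30: the indexed-URL count should match or beat the baseline;
// a lag over 5% means tracing each missing URL.
function day30NeedsTracing(baselineIndexed: number, currentIndexed: number): boolean {
  return percentLoss(baselineIndexed, currentIndexed) > 5
}
```

For example, 1,000 baseline clicks against 750 current clicks is a 25% loss and returns 'investigate', while 400 indexed URLs at baseline against 390 now is a 2.5% lag and needs no tracing.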

The six failure modes

Six failure modes account for nearly all avoidable traffic loss in service-business migrations.

  • Redirect chains. Validate at inventory time that no new_url is itself a source URL.
  • Missing 301s on parameterized URLs. A Worker-side query-string parser plus an inventory entry for every known parameter.
  • A stale sitemap on the legacy domain. At cutover, 301 the legacy sitemap path or replace it with a one-line file redirecting to the new sitemap.
  • Missing canonicals on duplicate paths. Trailing slash, HTTP/HTTPS, www/non-www — pick one canonical form at the Worker boundary, emit <link rel="canonical"> on every page, and 301 the rest.
  • robots.txt blocking the new tree. A legacy Disallow: /staging/ that happens to match the new path during cutover. Review robots.txt on day 0; the new one should Allow: / and reference the new sitemap.
  • Over-aggressive 410s. Marking old posts "gone" forfeits their link equity; a 301 to a topical destination preserves it. Reserve 410 for content with no destination and no backlinks.
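
The first failure mode is cheap to catch mechanically. A minimal sketch of the inventory-time check, assuming the inventory has been reduced to source and destination pairs; findRedirectChains is a hypothetical name. Rows whose destination equals their source are treated as unchanged pages and ignored.

```typescript
interface RedirectRule {
  source: string
  destination: string
}

// Returns every rule whose destination is itself redirected somewhere else,
// i.e. the first hop of a chain such as /old-1/ -> /old-2/ -> /new/.
function findRedirectChains(rules: RedirectRule[]): RedirectRule[] {
  // Sources that actually move; unchanged rows (source === destination) are excluded.
  const moving = new Set(
    rules.filter(r => r.source !== r.destination).map(r => r.source),
  )
  return rules.filter(r => r.source !== r.destination && moving.has(r.destination))
}

const exampleRules: RedirectRule[] = [
  { source: '/old-1/', destination: '/old-2/' },
  { source: '/old-2/', destination: '/new/' },
  { source: '/about/', destination: '/about/' },
  { source: '/team/', destination: '/about/' },
]

const chains = findRedirectChains(exampleRules)
// chains contains only the /old-1/ -> /old-2/ rule
```

Failing the build when this list is non-empty keeps chains from ever reaching the redirect map.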

Wix and Squarespace specifics

Wix historically prefixed blog posts with /post/ and served a set of system URLs (/blog-feed.xml, /_files/). Enumerate every one; Wix's "Export site" surfaces a partial list, and Search Console catches the rest. Squarespace's URLs are more predictable, but its asset hosting at images.squarespace-cdn.com needs explicit migration — the build-time asset importer downloads each image, re-hosts it at the same canonical path, and updates the markdown bodies. Both platforms allow only limited pre-migration redirect config, so the canonical move is to point DNS at the new edge runtime on day 0 and let it handle every redirect from there.

Closing

The migration is the highest-stakes day of the four-week build, and the disciplines above — exhaustive inventory, build-time redirect map, 30-day Search Console verification — are the difference between single-digit transient loss and a crater that takes 90 days to climb out of. The boring stack that prints doesn't skip the playbook. Every cutover runs the same inventory file, the same redirect builder, the same checklist. The discipline is the moat.
