How to Perform a Deep SEO Analysis and Fix What Google Already Knows About Your Site
A tactical guide to detecting spam signals, pruning low-value pages, consolidating content, forcing indexation on 400+ URLs, and building a 90-day traffic recovery plan.
Most SEO audits stop at a Screaming Frog crawl and a few meta-tag tweaks. But if your site has hundreds of pages with flat or declining traffic, the problem runs deeper. Google may already have quietly deprioritised your domain — and a surface-level fix won’t move the needle. This guide shows you how to go further: diagnose spam signals, make ruthless page-level decisions, and execute a structured 90-day recovery.
Does Google Think Your Site Is Programmatic or Spam?
Google doesn’t usually send you a letter. Instead, it quietly demotes your pages through algorithmic filters — the HCU (Helpful Content Update), SpamBrain, and Panda-era signals that still live inside the core algorithm. Here is how to read the signals yourself.
Check Search Console for the red flags
Open Google Search Console → Pages → filter by “Not indexed.” The reasons Google gives you are diagnostic gold.
Run the HCU / site-quality self-test
Google’s Helpful Content system evaluates your entire domain, not just individual pages. Ask yourself — honestly — the following:
- Were many of these pages created primarily to rank, not to serve a reader’s actual need?
- Does the site have a large volume of pages that cover the same keyword cluster from slightly different angles?
- Is there thin or auto-generated content anywhere on the domain — even a single section?
- Do pages lack clearly demonstrated author expertise (bylines, credentials, first-hand experience)?
- Is the click-through rate (CTR) in Search Console consistently below 1% across many high-impression pages?
If your site dropped sharply during a Helpful Content Update (September 2023 being the most significant) or during the March 2024 core update, which folded the helpful content system into core ranking, and has not recovered, treat it as a confirmed quality classification, not a ranking fluctuation. Recovery requires content removal, not content addition.
The site: operator test
Search site:yourdomain.com in Google. Compare the number of URLs returned against the number of pages you know exist. The site: count is only an estimate, so confirm it against the indexed total in Search Console → Pages. If Google is surfacing far fewer than your actual page count, it is likely crawling and indexing your domain selectively. Below 50% visibility is a strong spam/quality signal.
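The 50% rule of thumb is simple arithmetic. A minimal sketch, assuming you have an indexed-page count (from the site: query or Search Console) and a known-page count from your own crawl; the function names are illustrative:

```python
def index_coverage(indexed_count: int, known_count: int) -> float:
    """Share of known pages that Google appears to surface (0.0 to 1.0)."""
    if known_count <= 0:
        raise ValueError("known_count must be positive")
    return indexed_count / known_count


def is_low_visibility(indexed_count: int, known_count: int,
                      threshold: float = 0.5) -> bool:
    """Flag the domain when coverage falls below the 50% rule of thumb."""
    return index_coverage(indexed_count, known_count) < threshold


# Example: 300 URLs surfaced out of 800 known pages is below the threshold.
print(is_low_visibility(300, 800))
```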
SpamBrain indicators (manual action check)
Go to Search Console → Security & Manual Actions. A manual action is an explicit notification. But the absence of one doesn’t mean you’re clean — algorithmic suppression is invisible. Cross-reference with:
- Ahrefs / Semrush traffic history vs. Helpful Content Update rollout dates (August 2022, December 2022, September 2023) and the March 2024 core update.
- Sudden drop in Referring Domains — lost backlinks can indicate a deindexation of link-source sites.
- Anchor text patterns in your backlink profile that are over-optimised (exact-match anchors above 40–50%).
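To put a number on the anchor-text check, you can compute the exact-match share from a backlink export. This is a rough sketch: it assumes you already have the anchor texts as a list of strings (for example, from an Ahrefs or Semrush export) and a set of keywords you consider "exact match"; both inputs are illustrative.

```python
def exact_match_share(anchors: list[str], target_keywords: set[str]) -> float:
    """Fraction of backlink anchors whose text exactly equals a target keyword."""
    if not anchors:
        return 0.0
    targets = {k.strip().lower() for k in target_keywords}
    hits = sum(1 for a in anchors if a.strip().lower() in targets)
    return hits / len(anchors)


anchors = ["best crm software", "Best CRM Software", "yourdomain.com", "click here"]
share = exact_match_share(anchors, {"best crm software"})
print(f"{share:.0%} exact-match anchors")  # compare against the 40–50% warning range
```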
Which Pages Should Be Deleted?
Page deletion is the highest-leverage action on a content-heavy site. It feels counterintuitive — less content, more traffic — but Google rewards domain quality, not quantity. Here is the decision framework.
Export your full Search Console URL performance report (last 16 months) before removing anything. You need data, not memory, to make these calls.
The 4-column decision matrix
Score each page against these seven dimensions. A page that fails three or more should be deleted and 301-redirected.
| Dimension | Delete If… | Data Source | Action |
|---|---|---|---|
| Organic Clicks | 0–5 clicks over 16 months | Search Console → Performance | Delete |
| Impressions | Under 50 impressions (low demand signal) | Search Console → Performance | Delete |
| Backlinks | 0 referring domains, 0 internal links | Ahrefs / Screaming Frog | Delete |
| Word Count / Depth | Under 300 words, no original insight | Screaming Frog / Manual check | Merge or expand |
| Revenue / Conversion | Zero conversions in GA4 ever | Google Analytics 4 | Evaluate |
| Duplicate Intent | Same topic covered on 2+ other pages | Manual audit / SERP check | Merge |
| Ranking Position | Position 50+ for all tracked keywords | GSC / Semrush | Improve or delete |
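The matrix can be applied to a spreadsheet export with a small script. The sketch below covers the six dimensions that can be checked numerically (duplicate intent still needs a manual audit) and applies the "fails three or more" rule. The `PageStats` fields are hypothetical names for columns you would assemble from Search Console, Ahrefs, Screaming Frog, and GA4.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    clicks_16mo: int
    impressions_16mo: int
    referring_domains: int
    internal_links: int
    word_count: int
    conversions: int
    best_position: float


def failed_dimensions(p: PageStats) -> list[str]:
    """Return the names of the matrix dimensions this page fails."""
    checks = {
        "clicks": p.clicks_16mo <= 5,
        "impressions": p.impressions_16mo < 50,
        "backlinks": p.referring_domains == 0 and p.internal_links == 0,
        "depth": p.word_count < 300,
        "conversions": p.conversions == 0,
        "position": p.best_position >= 50,
    }
    return [name for name, failed in checks.items() if failed]


def verdict(p: PageStats) -> str:
    """Pages failing three or more dimensions are deletion candidates."""
    return "delete" if len(failed_dimensions(p)) >= 3 else "keep-or-review"


stub = PageStats("/coming-soon/", 2, 30, 0, 0, 150, 0, 72.0)
print(stub.url, verdict(stub), failed_dimensions(stub))
```

Treat the output as a shortlist for human review rather than an automatic delete list.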
Categories to delete with confidence
- Tag and category archives with fewer than 5 posts — these are almost always thin and should be noindex’d or deleted.
- Date-based archives (WordPress’s default /2021/03/ URLs) — zero SEO value, high crawl waste.
- Author pages for users who have only written 1–2 posts.
- Paginated pages beyond page 2 (/page/3/, /page/4/) unless they get actual traffic.
- Old press releases and announcements that rank for nothing and link to no current content.
- Stub product/service pages — pages that were “coming soon” and never filled in.
How to delete properly
- 301 redirect deleted pages to the closest relevant live page (never to the homepage by default).
- Submit a request in GSC’s Removals tool only for URLs that must drop out of results quickly. It hides a URL temporarily (about six months) and does not replace the 301 itself.
- Remove internal links pointing to deleted URLs within the same deployment.
- Update your XML sitemap the same day to exclude deleted URLs.
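Before deploying, it is worth sanity-checking the redirect map against the rules above. A minimal sketch, assuming the map is a plain dict of old path to new path and that you have a set of live paths from a crawl; the names are illustrative:

```python
def validate_redirects(redirects: dict[str, str], live_urls: set[str],
                       homepage: str = "/") -> list[str]:
    """Return human-readable problems found in a deletion redirect map."""
    problems = []
    for old, new in redirects.items():
        if new == homepage:
            problems.append(f"{old}: redirects to the homepage")
        elif new in redirects:
            problems.append(f"{old}: target {new} is itself being deleted")
        elif new not in live_urls:
            problems.append(f"{old}: target {new} is not a live page")
    return problems


redirects = {"/2021/03/": "/", "/old-press/": "/news/", "/stub/": "/old-press/"}
for problem in validate_redirects(redirects, live_urls={"/", "/news/"}):
    print(problem)
```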
Sites that delete 20–40% of their lowest-quality content typically see organic traffic lift within 60–90 days on remaining pages. The domain-level quality signal improves across the board.
Which Pages Should Be Merged?
Merging is the right call when two or more pages are competing for the same searcher intent — a problem known as keyword cannibalization. Instead of one strong page, you have two weak ones. Merging consolidates backlink equity, signals, and topical depth into a single, authoritative piece.
How to identify merge candidates
Method 1 — The GSC Cannibalization Check
In Search Console, export the Performance report. Sort by query. Look for the same keyword appearing in the “Top pages” column across multiple different URLs. If keyword X brings traffic to both /blog/a/ and /blog/b/, you have a cannibalization problem.
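If you have the data as query/page pairs, the check is a simple grouping. The sketch below assumes rows of `(query, page, clicks)` tuples, for example pulled through the Search Console API with both the query and page dimensions; the row shape and function name are assumptions for illustration.

```python
from collections import defaultdict


def find_cannibalized_queries(rows, min_clicks=1):
    """Map each query to its competing pages when two or more pages earn clicks."""
    clicks_by_query = defaultdict(dict)
    for query, page, clicks in rows:
        if clicks >= min_clicks:
            pages = clicks_by_query[query]
            pages[page] = pages.get(page, 0) + clicks
    return {
        query: sorted(pages, key=pages.get, reverse=True)  # strongest page first
        for query, pages in clicks_by_query.items()
        if len(pages) > 1
    }


rows = [
    ("crm pricing", "/blog/a/", 40),
    ("crm pricing", "/blog/b/", 25),
    ("crm demo", "/blog/c/", 10),
]
print(find_cannibalized_queries(rows))
```

The first page listed for each query is the one with the most clicks, which is usually a reasonable starting candidate for the merge winner.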
Method 2 — Ahrefs / Semrush Organic Keywords Overlap
Pull organic keyword reports for suspected duplicate pages. If two URLs rank in positions 6–20 for the same keyword, they are likely splitting authority that could push a single combined page into the top 3.
Method 3 — Semantic clustering
Group all your URLs by topic cluster. Any cluster with more than 3–4 URLs covering the same core question is a merging candidate. Tools like Screaming Frog + a custom extraction, or Keyword Insights’ clustering feature, can automate this.
Merge execution checklist
- Choose the winner: The URL with the most backlinks and best historical traffic becomes the canonical destination.
- Combine the content: Take the best unique sections from each page and rewrite them into a single, longer, more comprehensive piece.
- 301 all losers → winner: Every merged page gets a 301 redirect to the canonical URL.
- Update internal links: Replace all internal links pointing to the old URLs with the new canonical URL.
- Preserve heading structure: Use H2/H3 to logically organise the merged content — don’t just paste two articles together.
- Re-submit the winner URL: Use GSC’s URL Inspection → Request Indexing after the merge is live.
If the page you’re eliminating has at least 1 referring domain or has appeared in GSC with 10+ impressions in the last year, always merge: fold its useful content into the winner, then 301 redirect. If it has zero backlinks and zero impressions, delete the content outright and 301 to the closest relevant page.
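Written as code, the rule looks like the sketch below. Note that the rule as stated leaves a gap: a page with no backlinks and 1 to 9 impressions matches neither case, so this sketch sends it to manual review. That fallback is an assumption, not part of the rule.

```python
def retirement_action(referring_domains: int, impressions_last_year: int) -> str:
    """Decide how to retire a page: merge its content, or delete it outright."""
    if referring_domains >= 1 or impressions_last_year >= 10:
        return "merge"
    if referring_domains == 0 and impressions_last_year == 0:
        return "delete"
    # No backlinks and 1-9 impressions: not covered by the rule (assumed fallback).
    return "review"


print(retirement_action(referring_domains=2, impressions_last_year=0))
print(retirement_action(referring_domains=0, impressions_last_year=0))
print(retirement_action(referring_domains=0, impressions_last_year=4))
```

In every case the retired URL still gets a 301; the function only decides what happens to the content.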
How to Get Your 412 Unindexed Pages Indexed
Having hundreds of unindexed pages is both a symptom and a cause of poor domain health. Here is a systematic approach to force Google to re-evaluate your content — after you’ve completed the deletion and merging steps above.
Do not try to index 412 pages before you’ve deleted the low-quality ones. Asking Google to index weak content accelerates the quality signal problem. Prune first. Then index.
Tier the pages before you act
Not all 412 pages deserve to be indexed. Categorize them:
- Tier 1 — Index now: Pages with strong content, internal links, and at least 1 backlink. These are Google’s oversight, not yours.
- Tier 2 — Index after improvement: Pages with decent content but low word count or missing meta data. Fix first, request after.
- Tier 3 — Do not index: Thin, duplicate, or intentionally noindex’d URLs. Leave them alone.
Crawl budget recovery
If Google isn’t indexing your pages, it has likely reduced its crawl budget allocation for your domain. Fix these first:
- Eliminate redirect chains: A→B→C should become A→C. Every unnecessary hop wastes budget.
- Fix broken internal links (4xx): Broken links signal abandonment to Googlebot.
- Compress and speed up pages: An LCP under 2.5 s (the “Good” Core Web Vitals threshold) correlates with higher crawl frequency.
- Reduce crawl noise: Block search, filter, and infinite-scroll URLs in robots.txt that generate duplicate or low-value pages.
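Redirect chains are the easiest item on this list to fix mechanically. A minimal sketch, assuming your redirects are available as a dict of source path to target path (for example, exported from a crawler); it rewrites every source to point at its final destination and refuses to continue if it finds a loop:

```python
def flatten_redirects(redirects: dict[str, str]) -> dict[str, str]:
    """Collapse chains such as A→B→C so that every source points at its final target."""
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        while target in redirects:  # the target is itself redirected: keep following
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


print(flatten_redirects({"/a/": "/b/", "/b/": "/c/"}))
```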
robots.txt audit
Check your robots.txt at yourdomain.com/robots.txt. Common mistakes that block indexation:
# Common robots.txt mistakes that block good pages:
# ❌ Over-broad disallow
Disallow: /blog/ # blocks ALL blog pages
# ✅ Corrected — only block specific patterns
Disallow: /blog/tag/
Disallow: /blog/page/
Disallow: /?s= # search results
Disallow: /wp-admin/
# Ensure your sitemap is declared
Sitemap: https://yourdomain.com/sitemap.xml
XML sitemap hygiene
- Only include URLs that return a 200 status and are not noindex’d.
- Remove deleted, redirected, and noindex’d pages from your sitemap immediately.
- Keep the sitemap under 50,000 URLs; split into sitemap index files if larger.
- Set accurate lastmod dates; only update this when content actually changes.
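These rules can be enforced when the sitemap is generated. The sketch below assumes each page is a dict with hypothetical `loc`, `status`, `noindex`, and `lastmod` keys taken from your own crawl data. It filters out ineligible URLs, splits the rest into files of at most 50,000 URLs, and renders each file using the standard sitemaps.org namespace.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000


def eligible(pages):
    """Keep only pages that return 200 and are not noindex'd."""
    return [p for p in pages if p["status"] == 200 and not p["noindex"]]


def chunk(items, size=MAX_URLS_PER_SITEMAP):
    """Split a list into sitemap-sized pieces."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def render_sitemap(pages):
    """Render one sitemap file; lastmod should reflect real content changes."""
    entries = "".join(
        f"  <url><loc>{escape(p['loc'])}</loc><lastmod>{p['lastmod']}</lastmod></url>\n"
        for p in pages
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}</urlset>\n"
    )


pages = [
    {"loc": "https://yourdomain.com/guide/", "status": 200, "noindex": False, "lastmod": "2025-01-15"},
    {"loc": "https://yourdomain.com/old/", "status": 301, "noindex": False, "lastmod": "2023-06-01"},
]
for part in chunk(eligible(pages)):
    print(render_sitemap(part))
```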
Internal linking injection
Google discovers pages through links, not sitemaps alone. Unlinked or orphaned pages will not be indexed reliably. For each Tier 1 page:
- Add at least 3 contextual internal links from relevant, already-indexed pages.
- Include the target URL in a site-wide navigation element if it’s a pillar page.
- Build a topic hub page that links to all cluster articles — this is your fastest internal link boost.
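To find which Tier 1 pages fall short of the three-link minimum, count distinct linking pages from a crawl export. A rough sketch, assuming `links` is a list of `(source, target)` pairs from a crawler and `indexed` is the set of URLs you know to be indexed; both inputs are illustrative:

```python
from collections import defaultdict


def underlinked_pages(targets, links, indexed, minimum=3):
    """Return {target: inbound_count} for targets with too few indexed linking pages."""
    sources = defaultdict(set)
    for src, dst in links:
        if src in indexed and src != dst:  # ignore self-links and unindexed sources
            sources[dst].add(src)
    return {t: len(sources[t]) for t in targets if len(sources[t]) < minimum}


links = [("/hub/", "/guide/"), ("/blog/a/", "/guide/"), ("/blog/b/", "/guide/"),
         ("/hub/", "/orphan/")]
indexed = {"/hub/", "/blog/a/", "/blog/b/"}
print(underlinked_pages(["/guide/", "/orphan/", "/unlinked/"], links, indexed))
```

A count of zero marks a true orphan; those pages are the first to fix.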
GSC Indexing Request (use sparingly)
GSC’s “Request Indexing” button is rate-limited. Google does not publish the exact quota; commonly reported figures are roughly 10–12 requests per day, with a monthly ceiling often cited at around 200. Use it only for Tier 1 pages after you’ve applied all the above fixes. Sending indexing requests on broken or thin pages wastes quota and doesn’t help.
The Google Indexing API is officially supported only for job postings and livestream pages. It is widely used for other URL types with a script that submits up to 200 URLs/day (the default quota), but Google does not endorse that use, so treat it as an experiment. Pair it with IndexNow for simultaneous Bing/Yandex submission at no cost.
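For the IndexNow side, a submission is a single JSON POST. The sketch below builds the request following the public IndexNow protocol (`host`, `key`, `keyLocation`, `urlList`) without sending it, and splits a URL list into daily batches of 200. The key value and the key-file location are placeholders: you would generate your own key and host it on your domain.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) an IndexNow bulk-submission request."""
    payload = {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",  # assumed location of the key file
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def daily_batches(urls: list[str], per_day: int = 200) -> list[list[str]]:
    """Split a URL list into batches that respect a per-day submission limit."""
    return [urls[i:i + per_day] for i in range(0, len(urls), per_day)]


tier1 = [f"https://yourdomain.com/page-{i}/" for i in range(412)]
batches = daily_batches(tier1)
request = build_indexnow_request("yourdomain.com", "your-indexnow-key", batches[0])
print(len(batches), request.get_method(), request.full_url)
# To actually submit: urllib.request.urlopen(request)
```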
The 90-Day Plan to Increase Traffic
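Recovery is not linear, but it is predictable if you follow a sequenced plan. The plan below runs in three phases ordered by dependency: weeks 1–4 (pruning and technical cleanup), weeks 5–8 (strengthening the pages you keep), and weeks 9–12 (new content and measurement). Each phase unlocks the next. Do not skip ahead.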
- Week 1: Export full GSC performance (16 months). Tag every URL as Delete / Merge / Keep / Improve using the decision matrix.
- Week 1–2: Execute all deletions. 301 redirect each to the nearest live page. Remove from sitemap immediately.
- Week 2: Execute all page merges. Rewrite merged content — don’t concatenate. Update all internal links. Request re-indexing via GSC.
- Week 2–3: Full technical audit: redirect chains, broken links, robots.txt, canonical tags, Core Web Vitals. Fix every issue found.
- Week 3–4: Sitemap rebuild. Submit clean sitemap. Implement IndexNow. Fix crawl budget drains (session IDs, infinite pagination, parameter URLs).
- Week 4: Baseline measurement: snapshot current indexed page count, average position, clicks, impressions. This is your recovery benchmark.
- Week 5: Identify your 10 highest-potential “keep” pages (most impressions, positions 11–30). These are your quick-win targets.
- Week 5–6: Deep-rewrite each target page: expand word count, add original research/data, improve headings, add schema markup, add author bio with credentials.
- Week 6: Internal link audit. Every target page should have 5+ contextual internal links from already-indexed, relevant pages. Add them.
- Week 6–7: Start a digital PR / link-building sprint. Aim for 3–5 relevant, editorial backlinks to your core pages. Broken link building and resource page outreach are fastest.
- Week 7–8: Build or improve one topical hub page per content cluster. Link all cluster articles from it. Submit hub pages for indexing via API.
- Week 8: Check GSC for any new indexation of previously unindexed Tier 1 pages. Celebrate small wins; document what’s working.
- Week 9: Using GSC query data, identify “rising” keywords (impressions up, clicks flat). Create optimized content targeting these — you already have partial signals.
- Week 9–10: Publish 2–4 new high-quality posts targeting mid-funnel intent. Each must be 1,500+ words, expert-authored, internally linked from day one.
- Week 10: Update your 5 oldest high-performing posts. Add 2025 data, new sections, updated statistics. Republish with updated dates. Submit for re-indexing.
- Week 10–11: Second link-building sprint targeting new content. Use HARO / journalist outreach to land at least 2 authoritative mentions.
- Week 11–12: Measure full-funnel: organic sessions (GA4), impressions, clicks, average position (GSC), indexed pages, referring domains (Ahrefs). Compare to the Week 4 baseline.
- Week 12: Write your next 90-day plan based on what moved and what didn’t. The compounding phase is about to start — don’t stop here.
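Two of the selection steps in the plan (the Week 5 quick-win pages and the Week 9 rising keywords) can be pulled straight from Search Console data. A minimal sketch, assuming page and query metrics are available as plain dicts; the field names and the 25% growth cutoff are assumptions, not figures from the plan:

```python
def quick_win_pages(pages, limit=10):
    """Pages ranking in positions 11-30, ordered by impressions (highest first)."""
    candidates = [p for p in pages if 11 <= p["position"] <= 30]
    return sorted(candidates, key=lambda p: p["impressions"], reverse=True)[:limit]


def rising_queries(previous, current, min_growth=0.25):
    """Queries whose impressions grew while clicks stayed flat or fell."""
    rising = []
    for query, cur in current.items():
        prev = previous.get(query)
        if not prev or prev["impressions"] == 0:
            continue
        growth = (cur["impressions"] - prev["impressions"]) / prev["impressions"]
        if growth >= min_growth and cur["clicks"] <= prev["clicks"]:
            rising.append(query)
    return rising


pages = [
    {"url": "/a/", "position": 12.0, "impressions": 900},
    {"url": "/b/", "position": 4.0, "impressions": 5000},
    {"url": "/c/", "position": 25.0, "impressions": 1500},
]
print([p["url"] for p in quick_win_pages(pages)])
```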
KPIs to track weekly
| Metric | Tool | Target (90 days) | Priority |
|---|---|---|---|
| Indexed pages | Search Console | +30–50% of cleaned URL set | Critical |
| Average position | Search Console | Improve by 5–10 positions on target pages | Critical |
| Organic clicks | Search Console / GA4 | +20–40% vs. 90-day prior period | Critical |
| Referring domains | Ahrefs | +10–20 new RDs | High |
| Core Web Vitals | GSC / PageSpeed | All pages: Good (green) | High |
| Crawl coverage | Search Console | Reduce “Crawled – currently not indexed” by 60% | Medium |
| CTR | Search Console | >2.5% average across all pages | Medium |
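Most of these targets are percentage changes against a baseline, so a weekly check can be a few lines. A small sketch with illustrative numbers; the baseline and current values are made up for the example:

```python
def pct_change(baseline: float, current: float) -> float:
    """Percentage change from baseline to current."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) * 100 / baseline


def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate as a percentage; 0.0 when there are no impressions."""
    return clicks * 100 / impressions if impressions else 0.0


baseline = {"clicks": 1000, "indexed_pages": 200}
current = {"clicks": 1300, "indexed_pages": 270}
for metric in baseline:
    print(f"{metric}: {pct_change(baseline[metric], current[metric]):+.1f}%")
print(f"CTR: {ctr(25, 1000):.1f}%")  # compare against the 2.5% target
```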
The Core Principle: Quality Over Quantity, Always
Google’s algorithm has shifted permanently. The era of publishing 50 thin posts a month and watching rankings climb is over. What works now is fewer pages, stronger signals, deeper content, and cleaner technical infrastructure.
The five-step framework above — detecting spam signals, deleting low-value pages, merging cannibalized content, fixing indexation, and executing a structured 90-day plan — is the modern SEO playbook for any site that needs to recover or grow beyond its current ceiling.
None of it is fast. All of it works. Start with the most uncomfortable step first: deleting pages you spent time creating. Everything else flows from that decision.
A site with 200 excellent, indexed, interlinked pages will consistently outperform a site with 800 mediocre ones. Your goal for the next 90 days is to become the 200-page site.