Scanning

Auditoro scans your website to discover pages and run quality checks. This page explains how scanning works and the options available.

How Scans Work

1. Page Discovery

Auditoro discovers pages using two methods:

Sitemap-based discovery (preferred):

  • Reads your sitemap.xml file
  • Follows sitemap index files
  • Discovers all listed URLs

Crawl-based discovery (fallback):

  • Starts from your homepage
  • Follows internal links
  • Discovers pages recursively

If a sitemap is available, Auditoro uses it for faster, more complete discovery. If not, it crawls from your homepage.
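The sitemap-first, crawl-as-fallback logic can be sketched as follows. This is a minimal illustration, not Auditoro's actual implementation: the function names, the injected `fetch` callable (which returns the body text or `None`), the `/sitemap.xml` location, and the `max_pages` cap are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Optional
from urllib.parse import urljoin, urlparse

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
# fetch(url) returns the response body as text, or None if unavailable (assumed helper).
Fetch = Callable[[str], Optional[str]]


def urls_from_sitemap(url: str, fetch: Fetch, seen: Optional[set] = None) -> list[str]:
    """Collect page URLs from a sitemap, following sitemap index files."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against sitemaps that reference each other
        return []
    seen.add(url)
    body = fetch(url)
    if body is None:
        return []
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return []
    locs = [el.text.strip() for el in root.iter(NS + "loc") if el.text]
    if root.tag == NS + "sitemapindex":
        # An index lists child sitemaps rather than pages.
        pages = []
        for child in locs:
            pages.extend(urls_from_sitemap(child, fetch, seen))
        return pages
    return locs


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.links.append(href)


def urls_from_crawl(homepage: str, fetch: Fetch, max_pages: int = 500) -> list[str]:
    """Breadth-first crawl of internal links, starting from the homepage."""
    host = urlparse(homepage).netloc
    seen, queue, found = {homepage}, deque([homepage]), []
    while queue and len(found) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        found.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href).split("#")[0]  # resolve and drop fragments
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return found


def discover(homepage: str, fetch: Fetch) -> list[str]:
    """Prefer the sitemap; fall back to crawling when it yields nothing."""
    pages = urls_from_sitemap(urljoin(homepage, "/sitemap.xml"), fetch)
    return pages or urls_from_crawl(homepage, fetch)
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library and makes it easy to try against canned responses.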

2. Page Fetching

For each discovered page, Auditoro:

  • Requests the page HTML
  • Respects robots.txt directives
  • Follows redirects (up to a limit)
  • Captures response headers
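As a rough sketch of the redirect handling described above: the loop below follows `Location` headers up to a fixed number of hops and records the chain. The limit of 5, the `Response` shape, and the injected `send` callable are assumptions for illustration; the actual limit is not stated here. Robots.txt checks are left out to keep the example short.

```python
from typing import Callable, NamedTuple
from urllib.parse import urljoin

MAX_REDIRECTS = 5  # hypothetical value; the real limit is not documented here
REDIRECT_CODES = {301, 302, 303, 307, 308}


class Response(NamedTuple):
    status: int
    headers: dict[str, str]
    body: str


class FetchResult(NamedTuple):
    final_url: str
    response: Response
    redirect_chain: list[str]


def fetch_page(url: str, send: Callable[[str], Response]) -> FetchResult:
    """Request a page, following redirects up to MAX_REDIRECTS hops."""
    chain = []
    for _ in range(MAX_REDIRECTS + 1):
        response = send(url)
        location = response.headers.get("Location")
        if response.status not in REDIRECT_CODES or not location:
            # Final response: its headers are kept for later checks.
            return FetchResult(url, response, chain)
        chain.append(url)
        url = urljoin(url, location)  # Location may be relative
    raise RuntimeError(f"more than {MAX_REDIRECTS} redirects, stopped at {url}")
```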

3. Check Execution

Each page is analyzed with 20+ quality checks across:

  • SEO (titles, meta, headings)
  • Performance and delivery (compression, cache headers)
  • Security (HTTPS, headers)
  • Accessibility (alt text, lang)
  • Content and runtime quality (broken links, spelling, JS errors)

4. Score Calculation

After all checks complete, Auditoro:

  • Aggregates all detected issues
  • Calculates the health score
  • Updates trend data
  • Sends notifications (if configured)

Scan Types

On-Demand Scans

Start a scan manually at any time from the site dashboard. Use on-demand scans to:

  • Audit a new site
  • Check fixes you just deployed
  • Get updated results before a meeting
  • Investigate a reported problem

Scheduled Scans

Configure automatic recurring scans to monitor your site continuously.

Frequency options:

  • Weekly - Best for active sites (Growth and Scale plans)
  • Monthly - Available on all plans

See Scheduled Scans for setup instructions.

Scan Limits

Each plan includes a monthly scan allowance:

Plan      Scans/Month
Starter   30
Growth    100
Scale     500

What counts as a scan:

  • Each on-demand scan counts as 1
  • Each scheduled scan counts as 1

Your scan allowance resets each month on your billing cycle date.
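To estimate how much of an allowance is left in a billing cycle, the counting rule can be written out directly. This is only a sketch: the function name is made up for the example, and it assumes on-demand and scheduled scans draw from the same allowance with no rollover.

```python
# Monthly allowances per plan, as listed in the table above.
MONTHLY_ALLOWANCE = {"Starter": 30, "Growth": 100, "Scale": 500}


def scans_remaining(plan: str, on_demand: int, scheduled: int) -> int:
    """Scans left in the current billing cycle; every scan counts as 1."""
    used = on_demand + scheduled
    return max(MONTHLY_ALLOWANCE[plan] - used, 0)
```

For example, `scans_remaining("Growth", on_demand=12, scheduled=4)` returns 84.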

Page Limits

There's no strict page limit per scan, but very large sites may be crawled incrementally. Auditoro prioritizes pages in your sitemap.

For sites with thousands of pages:

  • Ensure your sitemap lists priority pages
  • Critical pages are always included
  • Less important pages may be sampled

Scan Duration

Scan duration depends on:

  • Site size - More pages take longer
  • Server speed - Slow servers extend scan time
  • Check complexity - Browser-based checks like external links and JavaScript errors take longer

Typical scan times:

  • 10-50 pages: 1-3 minutes
  • 50-200 pages: 3-10 minutes
  • 200-1000 pages: 10-30 minutes

You can close the browser during a scan; it continues in the background. You'll receive a notification when the scan completes.

Robots.txt Respect

Auditoro respects your robots.txt file:

  • Pages disallowed for all bots are skipped
  • Crawl-delay directives are honored
  • User-agent specific rules are followed

The Auditoro crawler identifies as:

User-agent: Auditoro

To give Auditoro full access while keeping other bots out of /private/:

User-agent: *
Disallow: /private/

User-agent: Auditoro
Allow: /
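One way to sanity-check rules like these before a scan is Python's standard-library robots.txt parser. This is a sketch: `can_scan` is a hypothetical helper, and Python's parser is only an approximation of how any particular crawler, Auditoro included, interprets the file.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Auditoro
Allow: /
"""


def can_scan(robots_txt: str, url: str, user_agent: str = "Auditoro") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The Auditoro group applies to Auditoro; everyone else falls back to the * group.
print(can_scan(ROBOTS_TXT, "https://example.com/private/report"))              # Auditoro
print(can_scan(ROBOTS_TXT, "https://example.com/private/report", "OtherBot"))  # another bot
```

With the rules above, the first call reports that Auditoro may fetch the page and the second reports that a different bot may not.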

Scan Failures

Scans may fail for any of these reasons:

  • Site unreachable - Server down or blocking requests
  • No pages found - Sitemap empty or crawl blocked
  • Authentication required - Login-protected pages
  • Rate limiting - Site blocking rapid requests

If a scan fails, check that:

  1. Your site is accessible in a browser
  2. Robots.txt isn't blocking Auditoro
  3. No firewall rules are blocking the crawler
  4. Your sitemap is valid and accessible
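Parts of this checklist can be automated. The sketch below covers items 1, 2, and 4; the `preflight` function, the injected `get` callable (returning a status code and body text), and the `/robots.txt` and `/sitemap.xml` locations are assumptions for the example. Firewall rules (item 3) cannot be verified this way, because a request from your own machine does not come from the crawler's network.

```python
import xml.etree.ElementTree as ET
from typing import Callable
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# get(url) returns (status_code, body_text); assumed to be supplied by the caller.
Get = Callable[[str], tuple[int, str]]


def preflight(homepage: str, get: Get, user_agent: str = "Auditoro") -> list[str]:
    """Return a list of findings that could explain a failed scan."""
    findings = []

    # 1. Is the site reachable?
    status, _ = get(homepage)
    if status >= 400:
        findings.append(f"homepage returned HTTP {status}")

    # 2. Does robots.txt block the crawler?
    status, body = get(urljoin(homepage, "/robots.txt"))
    if status == 200:
        robots = RobotFileParser()
        robots.parse(body.splitlines())
        if not robots.can_fetch(user_agent, homepage):
            findings.append(f"robots.txt disallows {user_agent}")

    # 4. Is the sitemap accessible, well-formed, and non-empty?
    status, body = get(urljoin(homepage, "/sitemap.xml"))
    if status != 200:
        findings.append(f"sitemap.xml returned HTTP {status}")
    else:
        try:
            root = ET.fromstring(body)
            if not any(el.tag.endswith("}loc") or el.tag == "loc" for el in root.iter()):
                findings.append("sitemap.xml lists no URLs")
        except ET.ParseError:
            findings.append("sitemap.xml is not valid XML")
    return findings
```

An empty list means none of these checks found a problem. A missing sitemap is reported but is not necessarily fatal, since discovery can fall back to crawling from the homepage.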

Real-Time Progress

During a scan, the dashboard shows:

  • Pages discovered
  • Pages scanned
  • Issues found so far
  • Current check being run

You can watch progress in real time or return later for results.