Robots.txt Checker - CMS-Aware Analysis with AI Recommendations
Validate robots.txt syntax, detect CMS patterns, and get AI-powered optimization advice.
By John Rippy | johnrippy.link
🏆 2025 Zapier Automation Hero of the Year: Project Phoenix, a 95-step AI sales pipeline cutting development time by 50%.
---
What This Actor Does
The Robots.txt Checker provides comprehensive analysis of your robots.txt file:
1. Syntax Validation - Detect parsing errors and malformed directives
2. CMS Detection - Identify WordPress, Shopify, Drupal, and 6+ other CMS platforms
3. Best Practice Checks - Verify sitemap declarations, crawl delays, blocked paths
4. Companion File Checks - Validate sitemap.xml, llms.txt, security.txt
5. AI Recommendations - CMS-specific optimization suggestions
---
Why Use This Actor?
The Problem with Manual Checking
Most developers paste robots.txt into a validator and see only syntax errors, which misses:
- CMS-specific paths that should be blocked
- Missing sitemap declarations
- Accidental blocking of important content
- Security and AI crawler considerations
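One of the gaps listed above, accidental blocking of important content, can be illustrated with Python's standard-library urllib.robotparser. This is a minimal sketch, not the actor's implementation: the find_blocked_paths helper, the sample rules, and the list of important paths are all hypothetical. Python's parser applies the first matching rule, which can differ from the longest-match precedence that Google documents, so treat the result as a rough check.

```python
from urllib.robotparser import RobotFileParser


def find_blocked_paths(robots_txt: str, paths: list[str], user_agent: str = "*") -> list[str]:
    """Return the paths that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    base = "https://example.com"  # placeholder host; the matcher ignores it
    return [p for p in paths if not parser.can_fetch(user_agent, base + p)]


# Hypothetical robots.txt that blocks the blog by mistake.
sample = """\
User-agent: *
Disallow: /private/
Disallow: /blog/
"""

blocked = find_blocked_paths(sample, ["/", "/blog/", "/products/"])
print(blocked)
```

Any path returned here is one a crawler matching the wildcard group would be told not to fetch.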
CMS-Aware Intelligence
This actor detects your CMS and tailors its recommendations to that platform.
---
Use Cases
1. SEO Audits
Verify clients' robots.txt files don't accidentally block important content.
2. Pre-Launch Checks
Ensure robots.txt is properly configured before launching a new site.
3. Competitor Analysis
Compare robots.txt configurations across competitor sites.
4. Security Compliance
Check for security.txt and ensure proper crawler access controls.
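As a rough illustration of the security.txt check, the sketch below looks for the two fields that RFC 9116 marks as required, Contact and Expires. It is an assumption about how such a check could work, not the actor's code: check_security_txt is a hypothetical helper, and the result keys simply mirror the hasContact and hasExpires names shown in the Output Format section. It does not validate the Expires date itself.

```python
def check_security_txt(text: str) -> dict[str, bool]:
    """Report whether Contact and Expires fields appear in security.txt text."""
    fields = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields.add(name.strip().lower())  # field names are case-insensitive
    return {"hasContact": "contact" in fields, "hasExpires": "expires" in fields}


# Hypothetical file contents for illustration.
example = "Contact: mailto:security@example.com\nExpires: 2030-01-01T00:00:00.000Z\n"
print(check_security_txt(example))
```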
---
Quick Start Examples
Example 1: Single URL Analysis
{
  "url": "https://example.com",
  "includeAIRecommendations": true
}
Example 2: Batch Analysis with All Checks
{
  "urls": [
    "https://yoursite.com",
    "https://competitor1.com",
    "https://competitor2.com"
  ],
  "includeSitemapCheck": true,
  "includeLlmsTxtCheck": true,
  "includeSecurityTxtCheck": true
}
Example 3: Demo Mode (Free Testing)
{
  "demoMode": true
}
Example 4: With AI Enhancement (BYOK)
{
  "url": "https://example.com",
  "includeAIRecommendations": true,
  "anthropicApiKey": "sk-ant-..."
}
---
Input Parameters
*Either url or urls is required unless demoMode is enabled.
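The requirement above can be enforced before a run is started. The following is a minimal sketch: build_actor_input is a hypothetical helper, the field names are taken from the Quick Start examples, and how the actor treats url and urls supplied together is not stated here, so the sketch passes both through unchanged.

```python
from typing import Any, Optional, Sequence


def build_actor_input(
    url: Optional[str] = None,
    urls: Optional[Sequence[str]] = None,
    demo_mode: bool = False,
    **options: Any,
) -> dict:
    """Build an input payload, enforcing 'url or urls unless demoMode'."""
    if not demo_mode and not url and not urls:
        raise ValueError("Provide url or urls, or enable demoMode")
    payload = dict(options)  # e.g. includeSitemapCheck=True
    if demo_mode:
        payload["demoMode"] = True
    if url:
        payload["url"] = url
    if urls:
        payload["urls"] = list(urls)
    return payload


print(build_actor_input(url="https://example.com", includeAIRecommendations=True))
```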
---
Output Format
{
  "url": "https://example.com",
  "robotsTxtUrl": "https://example.com/robots.txt",
  "timestamp": "2024-12-25T12:00:00.000Z",
  "status": "found",
  "score": 85,
  "rules": [
    {
      "userAgent": "*",
      "disallow": ["/admin/", "/private/"],
      "allow": ["/admin/login"]
    }
  ],
  "sitemaps": ["https://example.com/sitemap.xml"],
  "hasWildcardUserAgent": true,
  "syntaxErrors": [],
  "warnings": [],
  "bestPractices": {
    "hasSitemapDeclaration": true,
    "hasReasonableCrawlDelay": true,
    "blocksImportantPaths": [],
    "allowsSearchEngines": true
  },
  "detectedCms": "WordPress",
  "cmsRecommendations": [
    "Consider adding Disallow: /wp-json/ to prevent REST API indexing"
  ],
  "sitemapXml": {
    "exists": true,
    "url": "https://example.com/sitemap.xml",
    "urlCount": 245
  },
  "llmsTxt": {
    "exists": false,
    "url": "https://example.com/llms.txt"
  },
  "securityTxt": {
    "exists": true,
    "url": "https://example.com/.well-known/security.txt",
    "hasContact": true,
    "hasExpires": true
  },
  "recommendations": [
    {
      "priority": 1,
      "category": "cms_specific",
      "issue": "WordPress optimization opportunity",
      "recommendation": "Block /wp-json/ to prevent REST API indexing",
      "impact": "medium"
    }
  ]
}
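A result in this shape can be reduced to a short report. The sketch below is illustrative only: summarize_result is a hypothetical helper, it assumes that a lower priority number means a more urgent recommendation (the sample shows only priority 1, so this is a guess), and it makes no assumption about the shape of syntaxErrors entries beyond printing them.

```python
def summarize_result(result: dict) -> list[str]:
    """Turn one analysis result into human-readable report lines."""
    lines = [f"{result['url']}: status={result['status']}, score={result.get('score')}"]
    for err in result.get("syntaxErrors", []):
        lines.append(f"syntax error: {err}")
    # Assumption: priority 1 is the most urgent, so sort ascending.
    recs = sorted(result.get("recommendations", []), key=lambda r: r.get("priority", 0))
    for rec in recs:
        lines.append(f"[{rec.get('impact')}] {rec.get('recommendation')}")
    return lines


# Trimmed-down result using field names from the output above.
sample_result = {
    "url": "https://example.com",
    "status": "found",
    "score": 85,
    "syntaxErrors": [],
    "recommendations": [
        {"priority": 1, "impact": "medium",
         "recommendation": "Block /wp-json/ to prevent REST API indexing"},
    ],
}
report = summarize_result(sample_result)
print("\n".join(report))
```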
---
Scoring System
The actor calculates a 0-100 score for each analyzed robots.txt file.
---
AI Recommendations
Without Anthropic API Key
Generates rule-based recommendations from:
- Detected CMS patterns
- Common SEO best practices
- Security standards
With Anthropic API Key (BYOK)
Enhanced analysis using Claude to:
- Identify subtle configuration issues
- Provide context-aware suggestions
- Prioritize recommendations by impact
---
CMS Detection
Detects these platforms by analyzing robots.txt patterns:
- WordPress - /wp-admin/, /wp-content/, /wp-includes/
- Shopify - /admin/, /cart/, /checkout/, /collections/
- Drupal - /node/, /user/, /sites/
- Joomla - /administrator/, /components/, /modules/
- Magento - /admin/, /checkout/, /customer/, /catalog/
- Wix - /_api/, /_files/, /_partials/
- Squarespace - /config/, /api/, /static/
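Pattern-based detection of this kind could be sketched as counting how many of a platform's signature paths appear in the Allow and Disallow rules. This is an assumption about the general approach, not the actor's actual algorithm: detect_cms and the min_hits threshold are hypothetical, and the signature table simply copies the list above. Requiring at least two hits is one way to avoid guessing from a shared path such as /admin/, which appears for both Shopify and Magento.

```python
from typing import Optional

# Signature paths copied from the list above.
CMS_SIGNATURES = {
    "WordPress": ["/wp-admin/", "/wp-content/", "/wp-includes/"],
    "Shopify": ["/admin/", "/cart/", "/checkout/", "/collections/"],
    "Drupal": ["/node/", "/user/", "/sites/"],
    "Joomla": ["/administrator/", "/components/", "/modules/"],
    "Magento": ["/admin/", "/checkout/", "/customer/", "/catalog/"],
    "Wix": ["/_api/", "/_files/", "/_partials/"],
    "Squarespace": ["/config/", "/api/", "/static/"],
}


def detect_cms(robots_txt: str, min_hits: int = 2) -> Optional[str]:
    """Guess the CMS from rule paths; return None when evidence is weak."""
    paths = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() in ("allow", "disallow"):
            paths.add(value.strip())
    best, best_hits = None, 0
    for cms, signatures in CMS_SIGNATURES.items():
        hits = sum(1 for sig in signatures if any(p.startswith(sig) for p in paths))
        if hits >= min_hits and hits > best_hits:
            best, best_hits = cms, hits  # ties keep the earlier platform
    return best


wordpress_like = "User-agent: *\nDisallow: /wp-admin/\nAllow: /wp-admin/admin-ajax.php\nDisallow: /wp-includes/\n"
print(detect_cms(wordpress_like))
```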
---
Webhook Integration
Webhook Payload
{
  "event": "robots_txt_analysis_complete",
  "timestamp": "2024-12-25T12:00:00.000Z",
  "actor": "robots-txt-checker",
  "status": "success",
  "urlsAnalyzed": 3,
  "avgScore": 82,
  "results": [...]
}
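A receiver for this payload might classify each delivery before acting on it. The sketch below is a hypothetical handler, not part of the actor: classify_webhook and the min_avg_score threshold of 70 are illustrative choices, and only the event, status, and avgScore fields shown above are read.

```python
import json


def classify_webhook(body: str, min_avg_score: float = 70) -> str:
    """Return 'ignored', 'failed', 'alert', or 'ok' for a webhook body."""
    payload = json.loads(body)
    if payload.get("event") != "robots_txt_analysis_complete":
        return "ignored"  # not an event this handler understands
    if payload.get("status") != "success":
        return "failed"
    if payload.get("avgScore", 0) < min_avg_score:
        return "alert"  # average score below the chosen threshold
    return "ok"


# Example body; "results" is left empty here for brevity.
body = json.dumps({
    "event": "robots_txt_analysis_complete",
    "status": "success",
    "urlsAnalyzed": 3,
    "avgScore": 82,
    "results": [],
})
print(classify_webhook(body))
```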
---
Perfect For
SEO Agencies
- Client onboarding audits
- Competitor analysis
- Pre-launch checklists
Web Developers
- CI/CD integration for robots.txt validation
- CMS migration checks
- Security compliance
Marketing Teams
- Ensure content is indexable
- Verify proper crawler access
---
Pricing
- Demo Mode: Free (sample data)
- Standard Usage: Apify compute units only
- AI Recommendations: Rule-based recommendations are free; enhanced Claude analysis requires your own Anthropic API key (BYOK)
---
Related Actors
- Technical SEO Auditor - Full on-page SEO analysis
- Sitemap Generator - Create valid sitemaps
- PageSpeed Intelligence - Performance + Tech Stack analysis
---
Built by John Rippy | johnrippy.link
---
Keywords
robots.txt checker, robots.txt analyzer, robots.txt validator, wordpress robots.txt, shopify robots.txt, seo audit, sitemap validation, llms.txt, security.txt, crawl directives, search engine crawler, googlebot, cms detection, technical seo, ai recommendations