Some important page is blocked by robots.txt …
-
The blog I’m working on, http://adventuresincardiology.com/, was not in the drop-down list above.
Webmaster Tools gives the message: “Some important page is blocked by robots.txt …”
http://adventuresincardiology.com/robots.txt
Any thoughts?
The blog I need help with is: (visible only to logged in users)
-
http://adventuresincardiology.com/ redirects to http://collateral-damage.net/ .
This is the standard robots.txt file here at WordPress.com, and the search engines should not be looking at anything it disallows.
Sitemap: http://collateral-damage.net/sitemap.xml
User-agent: IRLbot
Crawl-delay: 3600

User-agent: *
Disallow: /next/
# har har
User-agent: *
Disallow: /activate/
User-agent: *
Disallow: /signup/
User-agent: *
Disallow: /related-tags.php
# MT refugees
User-agent: *
Disallow: /cgi-bin/
User-agent: *
Disallow:
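If you want to double-check what those rules actually block, here is a minimal sketch using Python’s standard-library `urllib.robotparser` with a trimmed copy of the `Disallow` rules above (the example post URL is made up):

```python
from urllib import robotparser

# Trimmed copy of the catch-all ("User-agent: *") rules from the
# robots.txt listing above.
rules = """\
User-agent: *
Disallow: /next/
Disallow: /activate/
Disallow: /signup/
Disallow: /related-tags.php
Disallow: /cgi-bin/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Ordinary blog content is allowed; only the listed back-end paths are blocked.
print(rp.can_fetch("*", "http://collateral-damage.net/2011/01/some-post/"))  # True
print(rp.can_fetch("*", "http://collateral-damage.net/signup/"))             # False
```

In other words, the only things being “blocked” are the signup/activation/back-end paths, which is exactly what you want.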
-
I’m sorry. I misunderstood your reply. Are you saying that the redirect caused the block and that that’s OK?
-
You are welcome.
The search engines are seriously snoopy (much like the TSA) and would like to look at every single file on a website, including the backend script files and everything. They always whine when they don’t get to look at everything. They can still see all the places where your actual content lives.
-
No, the redirect is not causing the issue. It is just the overly snoopy nature of search engines who want to strip-search websites.
-
So, is it a problem? If so, how to fix it?
Here is the “parsed result,” which looks like your WP template to me ….
# If you are regularly crawling WordPress.com sites please use our firehose to receive real-time push updates instead.
# Please see http://en.wordpress.com/firehose/ for more details.

Sitemap: http://collateral-damage.net/sitemap.xml

User-agent: IRLbot
Crawl-delay: 3600

User-agent: *
Disallow: /next/
# har har
User-agent: *
Disallow: /activate/
User-agent: *
Disallow: /wp-login.php
User-agent: *
Disallow: /wp-admin/
User-agent: *
Disallow: /signup/
User-agent: *
Disallow: /related-tags.php
# MT refugees
User-agent: *
Disallow: /cgi-bin/
User-agent: *
Disallow:
Actually, thinking about it, the redirect could be the issue. By redirecting, http://adventuresincardiology.com/ is effectively turned into just a signpost that directs any calls to that URL to the other URL. Google might see that there is no reference to http://adventuresincardiology.com/ in the robots.txt file, and that might be confusing it.
The thing is, with the redirect in place, you should be setting up and monitoring http://collateral-damage.net/, because http://adventuresincardiology.com/ isn’t really a website anymore; it is just a signpost.
-
Ah, I’m beginning to understand. Actually, collateral-damage.net has been up and running for a while, but the numbers have been way down for a long time. I was thinking that the robots.txt error message might be part of the problem….
So maybe it is OK?
-
Yeah, I’m sure you are OK. My site traffic goes in waves or cycles and sometimes those cycles are shorter and sometimes longer.
-
- The topic ‘Some important page is blocked by robots.txt …’ is closed to new replies.