31 May 2016

How To Add Custom Robots.txt File in Blogspot Blog?

Robots.txt contains the rules that tell bot crawlers how to crawl and index your website or blog. Blogger allows you to set some very basic SEO settings easily from the dashboard. One of them is setting a custom robots.txt for your blog.

When a search engine crawler visits a website, the robots.txt file is the first thing it looks for. As a Blogger user, you have the option to control what the search engine crawlers should follow and index from your website or blog.

Every Blogger blog has a default robots.txt, but with the advanced settings in Blogger you can change it to suit your needs. In this post, you will learn about Blogger's default robots.txt, how to add or edit a custom robots.txt for your blog, and some useful examples, including an AdSense-friendly one. So let's get started.


Read Also: How To Add Custom Robots Header Tags Settings In Blogger

 Default Custom Robots.txt of Blogger Blog

Every time you create a blog on Blogger, a default robots.txt is created, and it remains the same until you change it from the settings in the dashboard.
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://Trueblogmoney.blogspot.com/feeds/posts/default?orderby=UPDATED

It is the same for every blog, and it is AdSense friendly. Let's break down what each of these lines means.

    User-agent: Mediapartners-Google

    This rule is for the Google AdSense robot and helps it serve better ads on your blog. Whether or not you use Google AdSense on your blog, simply leave it as it is.

    User-agent: *

    The asterisk (*) means this rule applies to all robots. In the default settings, our blog's label links are blocked from being indexed by search crawlers, which means the web crawlers will not index our label page links because of the code below.

        Disallow: /search

    That means any link with the keyword search directly after the domain name will be ignored. See the example below, which is a link to a label page named SEO.
        http://myblog.blogspot.com/search/label/SEO

    And if we remove Disallow: /search from the code above, crawlers will be able to access the entire blog and crawl and index all of its content and pages.
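You can check how these default rules behave with Python's built-in robots.txt parser. Here is a quick sketch (the blog URL is just a placeholder):

```python
from urllib.robotparser import RobotFileParser

# Blogger's default robots.txt, as shown above.
rules = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Label pages under /search are blocked for ordinary crawlers...
print(parser.can_fetch("*", "http://myblog.blogspot.com/search/label/SEO"))       # False
# ...but individual posts and the homepage are allowed.
print(parser.can_fetch("*", "http://myblog.blogspot.com/2015/03/post-url.html"))  # True
# The AdSense crawler (Mediapartners-Google) may fetch everything.
print(parser.can_fetch("Mediapartners-Google",
                       "http://myblog.blogspot.com/search/label/SEO"))            # True
```

This is handy for testing a custom robots.txt before pasting it into Blogger.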

    Allow: /

This refers to the homepage, meaning web crawlers can crawl and index our blog's homepage.

Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED

This line points to the sitemap of our blog. By adding the sitemap link here, we are simply optimizing our blog's crawl rate: whenever the web crawlers scan our robots.txt file, they find a path to our sitemap, where all the links to our published posts are listed. This makes it easy for the crawlers to find every post, so there is a better chance that they crawl all of our blog posts without ignoring a single one.

However, by default this feed lists only the 25 most recent posts, so if you want to increase the number of indexed posts, replace the sitemap link with this one:

    Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
 
And if you have more than 500 published posts on your blog, you can use two sitemap lines like below (start-index is 1-based, and max-results is capped at 500 per page):
    Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500

    Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500
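The pattern behind these paginated sitemap lines is simple enough to generate. Here is a minimal sketch, assuming Blogger's 1-based start-index and its 500-results-per-page limit (the blog URL and post count are placeholders):

```python
# Generate paginated Blogger sitemap lines for a blog with many posts.
# Assumes start-index is 1-based and max-results caps at 500 per page.
def sitemap_lines(blog_url, total_posts, page_size=500):
    lines = []
    for start in range(1, total_posts + 1, page_size):
        lines.append(
            f"Sitemap: {blog_url}/atom.xml?redirect=false"
            f"&start-index={start}&max-results={page_size}"
        )
    return lines

# A blog with 1200 posts needs three sitemap lines
# (posts 1-500, 501-1000, and 1001-1200).
for line in sitemap_lines("http://example.blogspot.com", 1200):
    print(line)
```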

Read Also: How To Submit Blogger Sitemap To Google Webmaster Tools?

How to prevent posts/pages from being indexed and crawled?

In case you haven't yet figured it out yourself, here is how to stop spiders from crawling and indexing particular pages or posts.

    Disallow Particular Post

        Disallow: /yyyy/mm/post-url.html

    The yyyy and mm refer to the publishing year and month of the post respectively. For example, if we published a post in March 2015, we would use the format below.

        Disallow: /2015/03/post-url.html

Disallow Particular Page

If we need to disallow a particular page, we can use the same method as above. Simply copy the page URL and remove the blog address from it, which will look something like this:

    Disallow: /p/page-url.html
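Putting it all together, here is what a complete custom robots.txt might look like if you also wanted to block the example post and page above (the URLs are placeholders; replace them with your own):

    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /search
    Disallow: /2015/03/post-url.html
    Disallow: /p/page-url.html
    Allow: /

    Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED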

 Adding Custom Robots.txt to Blogger

Now for the main part of this tutorial: how to add a custom robots.txt in Blogger. Follow the steps below.
  •     Go to your blogger blog.
  •     Navigate to Settings ›› Search Preferences ›› Crawlers and indexing ›› Custom robots.txt ›› Edit ›› Yes
  •     Now paste your robots.txt file code in the box.


  •     Click on Save Changes button.
  •     Congratulations, you are done!

 Bottom Line

It was easy once you knew what those code words meant. If you didn't get it the first time, just go through the tutorial and read it again.

In any case, for SEO and site rankings it is important to make these tiny changes to your robots.txt file, so don't be a sloth. Learning is fun, as long as it's free. Cheers :)

Read Also: How To SEO Optimize Your Blogspot Blog Post Titles
