What is the robots.txt?

Discussion in 'Search Engine Optimization' started by aaateleshopping, Feb 27, 2012.

  1. #1
    Hello friends, please tell me: what is robots.txt, how can I use it, and where can I download it?
     
    aaateleshopping, Feb 27, 2012 IP
  2. poojauni

    poojauni Peon

    Messages:
    22
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Suppose you do not want Google to index a page on your site: you put a robots meta tag on that page. For the details, look up how Google handles the robots tag.
     
    poojauni, Feb 27, 2012 IP
  3. signorm68

    signorm68 Well-Known Member

    Messages:
    982
    Likes Received:
    5
    Best Answers:
    0
    Trophy Points:
    108
    #3
    signorm68, Feb 27, 2012 IP
  4. Nate55

    Nate55 Peon

    Messages:
    51
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #4
    It is a simple text file that you place in your public html directory. It specifies which folders and directories on your domain Google and other search engine bots can crawl and which they cannot, which in turn affects what they can index.
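
    To illustrate where crawlers look for the file, here is a minimal Python sketch that derives the robots.txt address from any page address. The function name robots_url and the example.com address are assumptions made for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the address where a crawler would look for robots.txt."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt sits at the top of the site.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/shop/item.html?id=3"))
```

    Whatever page the crawler starts from, the file it asks for is the one at the top level of that host, which is why it has to be uploaded there.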
     
    Nate55, Feb 27, 2012 IP
  5. dawsonryna

    dawsonryna Peon

    Messages:
    18
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    The robots.txt file is intended for search engine "spiders". Choose the robots.txt rules most appropriate to your situation, whether that means keeping search engines away from certain pages or letting them in. You can find sample code online with a simple search.

    First of all, list the areas of your website that you want to mention in the robots.txt file.
     
    dawsonryna, Feb 27, 2012 IP
  6. ivickon

    ivickon Peon

    Messages:
    32
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site.
     
    ivickon, Feb 27, 2012 IP
  7. jitendrashukla

    jitendrashukla Member

    Messages:
    194
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    26
    #7
    robots.txt is a file that helps Google's crawler while it crawls: it allows or disallows the web pages that we do or do not want the crawler to read.
     
    jitendrashukla, Feb 27, 2012 IP
  8. jvfconsulting

    jvfconsulting Active Member

    Messages:
    1,089
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    90
    #8
    Log in to your Google Webmaster Tools account; there is an area there that will help you build an effective robots.txt file.
     
    jvfconsulting, Feb 27, 2012 IP
  9. AshishBedi

    AshishBedi Peon

    Messages:
    8
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #9
    robots.txt is a file that tells Google not to crawl a particular web page. For example, if a site contains a payment page, robots.txt rules can be used to ask crawlers to stay away from it, which helps avoid violating customers' privacy.
     
    AshishBedi, Feb 27, 2012 IP
  10. aliviamallan

    aliviamallan Active Member

    Messages:
    163
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    53
    #10
    The Robot Exclusion Standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web crawlers and other web robots from accessing all or part of a website which is otherwise publicly viewable. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code. The standard is different from, but can be used in conjunction with, Sitemaps, a robot inclusion standard for websites.
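
    As a rough illustration of using the two together, a robots.txt file may carry a Sitemap line next to its exclusion rules. The sketch below reads both with Python's standard urllib.robotparser module (site_maps() needs Python 3.8 or later). The rules and the example.com addresses are made up for the example.

```python
from urllib import robotparser

# Hypothetical robots.txt content: one exclusion rule plus a sitemap pointer.
RULES = """\
User-agent: *
Disallow: /tmp/

Sitemap: https://example.com/sitemap.xml
"""

parser = robotparser.RobotFileParser()
parser.parse(RULES.splitlines())

# Exclusion: cooperating robots are asked to stay out of /tmp/.
print(parser.can_fetch("*", "https://example.com/tmp/cache.html"))
# Inclusion: the sitemap address advertised in the same file.
print(parser.site_maps())
```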
     
    aliviamallan, Feb 27, 2012 IP
  11. Mithuasha

    Mithuasha Banned

    Messages:
    1,056
    Likes Received:
    5
    Best Answers:
    0
    Trophy Points:
    0
    #11
    Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
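
    The "unlocked door" point can be sketched in a few lines of Python: the check lives in the crawler's own code, so a crawler that skips it sees everything. This is only an illustration; the function urls_to_visit, the ExampleBot name, and the example.com addresses are assumptions, and the rules are parsed with the standard urllib.robotparser module.

```python
from urllib import robotparser

# Hypothetical robots.txt content asking robots to stay out of /private/.
RULES = """\
User-agent: *
Disallow: /private/
"""

URLS = [
    "https://example.com/index.html",
    "https://example.com/private/report.html",
]


def urls_to_visit(urls, respect_robots=True, user_agent="ExampleBot"):
    """Return the URLs a crawler would go on to request."""
    if not respect_robots:
        # A rude crawler never consults the file, so nothing is filtered.
        return list(urls)
    parser = robotparser.RobotFileParser()
    parser.parse(RULES.splitlines())
    return [u for u in urls if parser.can_fetch(user_agent, u)]


print(urls_to_visit(URLS))                        # polite crawler
print(urls_to_visit(URLS, respect_robots=False))  # crawler that ignores the note
```

    Nothing on the server side changes between the two calls, which is why really sensitive pages need real access control.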
     
    Mithuasha, Feb 27, 2012 IP
  12. kshuklaindia

    kshuklaindia Greenhorn

    Messages:
    51
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    16
    #12
    A robots.txt file helps you disallow unwanted pages (pages not meant for users), such as admin, index.html, redirects, and duplicate pages.

    What to put in it
    The "/robots.txt" file is a text file with one or more records. It usually contains a single record that looks like this:
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Disallow: /junk/
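
    To see how a crawler might read that record, here is a small Python sketch using the standard urllib.robotparser module. The example.com host and the three test paths are assumptions chosen for illustration.

```python
from urllib import robotparser

# The single record quoted above.
RECORD = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

parser = robotparser.RobotFileParser()
parser.parse(RECORD.splitlines())

# Ask, for a few hypothetical paths, whether any robot ("*") may fetch them.
for path in ("/index.html", "/tmp/old.html", "/cgi-bin/form.cgi"):
    allowed = parser.can_fetch("*", "https://example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```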
     
    kshuklaindia, Feb 27, 2012 IP
  13. petervanlier

    petervanlier Peon

    Messages:
    260
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #13
    A robots.txt file is a set of instructions that tells search engine robots which pages of your site should be crawled and indexed. Robots.txt helps tell spiders what is useful and public for sharing in the search engine indexes and what is not. It can improve site indexation by telling search engine crawlers to index only your content pages and to ignore other pages. It should also be noted that not all search spiders will follow the instructions left in the robots.txt file.
     
    petervanlier, Feb 28, 2012 IP
  14. azharSEO

    azharSEO Active Member

    Messages:
    1,468
    Likes Received:
    7
    Best Answers:
    1
    Trophy Points:
    88
    #14
    The robots exclusion protocol (REP), or robots.txt, is a text file that webmasters create to instruct robots (typically search engine robots) on how to crawl and index pages on their website.
     
    azharSEO, Feb 28, 2012 IP
  15. miajones

    miajones Peon

    Messages:
    16
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #15
    Robots.txt is a file that a search engine fetches first, before crawling the rest of the site. In this file you list the pages that you don't want the crawler to crawl.
     
    miajones, Feb 28, 2012 IP
  16. larryllison

    larryllison Greenhorn

    Messages:
    16
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #16
    A robots.txt file can allow all search engine robots to visit every page of a website. The following record is the allow-all setting:

    User-agent: *
    Disallow:

    Note that "Disallow: /" (with a slash) does the opposite and blocks the whole site.
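
    An empty Disallow and "Disallow: /" differ by a single slash, so a quick check is worthwhile before uploading. The sketch below compares the two with Python's standard urllib.robotparser module. The helper name allowed and the example.com address are assumptions for illustration.

```python
from urllib import robotparser


def allowed(rules, url="https://example.com/page.html"):
    """Report whether any robot ("*") may fetch url under the given rules."""
    parser = robotparser.RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("*", url)


ALLOW_ALL = "User-agent: *\nDisallow:"    # empty value: nothing is excluded
BLOCK_ALL = "User-agent: *\nDisallow: /"  # "/" matches every path on the site

print(allowed(ALLOW_ALL))
print(allowed(BLOCK_ALL))
```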
     
    larryllison, Feb 28, 2012 IP
  17. mumtazseo

    mumtazseo Peon

    Messages:
    39
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #17
    Robots.txt is a file that gives search engine spiders instructions about which page or pages of a website to index or follow. It is normally used to stop search engine spiders from indexing unfinished pages of a website during its development phase. Many webmasters also use this file to avoid spamming.

    You can simply create robots.txt in Notepad, disallow any web page you do not want indexed by search engines, and upload the file to the root directory of your website.
     
    mumtazseo, Feb 28, 2012 IP
  18. Rishi tomar

    Rishi tomar Greenhorn

    Messages:
    21
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #18
    If your site has the same content on two different pages, you may want to show one to the crawlers and hide the other from them. Robots.txt is used for this purpose.
     
    Rishi tomar, Feb 28, 2012 IP
  19. shophia

    shophia Active Member

    Messages:
    231
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    58
    #19
    Robots.txt is simply a plain-text file that a web publisher should place in the root directory of their site. This file works on a per-bot basis. When these "robots" visit your website, the first thing they do is look for the robots.txt file. They listen to your demands and won't visit web pages that you've disallowed.
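
    The per-bot behaviour can be sketched with Python's standard urllib.robotparser module. In this made-up example, a robot calling itself ExampleBot is shut out entirely while every other robot is only kept out of /private/. The bot names and the example.com addresses are assumptions for illustration.

```python
from urllib import robotparser

# Hypothetical rules: one record for a named robot, one for everyone else.
RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = robotparser.RobotFileParser()
parser.parse(RULES.splitlines())

page = "https://example.com/index.html"
print(parser.can_fetch("ExampleBot", page))  # matched by its own record
print(parser.can_fetch("OtherBot", page))    # falls back to the "*" record
print(parser.can_fetch("OtherBot", "https://example.com/private/a.html"))
```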
     
    shophia, Feb 28, 2012 IP
  20. ark_parmar

    ark_parmar Peon

    Messages:
    23
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #20
    We include robots.txt on our website so that we can list in it whatever pages we don't want search engines to crawl. That way Google crawls only the pages we are doing SEO work on.

    To make a robots.txt file, use Webmaster Tools and add the rules or pages you want to show to, or block from, search engine crawlers.
     
    ark_parmar, Feb 28, 2012 IP