What is Sitemap & Robots.txt? Where are they located?

Discussion in 'Search Engine Optimization' started by Tuhin Eternal Spring, Mar 28, 2015.

  1. #1
    What is Sitemap & Robots.txt? Where are they located?
     
    Tuhin Eternal Spring, Mar 28, 2015
  2. Mkcoy

    Mkcoy Well-Known Member

    #2
    A site map (or sitemap) is a list of a website's pages that is accessible to crawlers or users. It can be a planning document used during web design, a web page that lists the site's pages (typically organized hierarchically), or, for search engines, an XML file such as sitemap.xml that lists the URLs you want crawled.

    Website owners use the /robots.txt file to give instructions about their site to web robots; this is called the Robots Exclusion Protocol. For example, a "User-agent: *" line means the section applies to all robots, and a "Disallow: /" line tells the robot that it should not visit any pages on the site.
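To see how a crawler reads those two directives, here is a minimal sketch using Python's standard `urllib.robotparser` module. The `is_allowed` helper and the `Googlebot` user-agent string are only illustrative choices, and `site.com` is the placeholder domain used in this thread.

```python
from urllib.robotparser import RobotFileParser

# The two directives described above, as they would appear in a robots.txt file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse the text directly, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://site.com/", "http://site.com/any/page.html"):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "blocked")
```

With "Disallow: /" every URL on the site is reported as blocked. An empty robots.txt, by contrast, leaves everything allowed.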

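    Both are usually located at the root of your site. robots.txt has to be at the root for crawlers to find it; the sitemap can live elsewhere as long as you point crawlers to it, for example with a "Sitemap:" line in robots.txt.

    E.g. site.com/sitemap.xml and site.com/robots.txt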

    Some platforms, such as WordPress, can generate a "virtual" robots.txt file that is served on request even though no physical file exists on disk.

    Now you know.
     
    Mkcoy, Mar 28, 2015
  3. braulio

    braulio Active Member

    #3
    robots.txt and sitemap.xml are located in your root folder.

    robots.txt tells the Google, Yahoo, and Bing crawlers what not to crawl (take an inventory of) on your site. It works in conjunction with your sitemap.xml file, which tells the crawlers what pages you have and where they are located. If you do not have a robots.txt file, the crawler is free to crawl your whole site. If your robots.txt file blocks page A and you also list page A in your sitemap.xml, Google Webmaster Tools reports the conflict (a sitemap URL blocked by robots.txt) in its panel.
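One way to catch that kind of conflict before a search engine does is to compare the two files yourself. The sketch below is only an illustration: the helper names, the sample file contents, and the `page-a.html` URL are made up for the example, and it assumes a standard sitemap that uses the sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical file contents: page A is blocked in robots.txt but listed in the sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page-a.html
"""

SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://site.com/</loc></url>
  <url><loc>http://site.com/page-a.html</loc></url>
</urlset>
"""


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Collect every <loc> entry from a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


def blocked_sitemap_urls(robots_txt: str, sitemap_xml: str, user_agent: str = "Googlebot") -> list[str]:
    """Return the sitemap URLs that robots.txt forbids user_agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls(sitemap_xml) if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    for url in blocked_sitemap_urls(ROBOTS_TXT, SITEMAP_XML):
        print("Listed in sitemap but blocked by robots.txt:", url)
```

Any URL it prints should either be removed from the sitemap or unblocked in robots.txt, depending on whether you actually want it crawled.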

    Here is an example of how we use it.

    We have a test directory on our server where we upload sites to be tested before we go public. Obviously, we do not want the crawlers to take note (inventory) of this directory, since it is for testing only. Below are the contents of our robots.txt file.

    # www.robotstxt.org/
    User-agent: *
    Disallow: /testing/
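If you want to confirm what a rule like this covers before uploading it, a quick local check is possible with Python's standard `urllib.robotparser`. This is just a sketch: the `crawlable` helper and the sample paths are invented for illustration, and `site.com` stands in for your own domain.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt contents shown above.
ROBOTS_TXT = """\
# www.robotstxt.org/
User-agent: *
Disallow: /testing/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the text locally, nothing is fetched


def crawlable(path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if user_agent may crawl path on the placeholder site."""
    return parser.can_fetch(user_agent, "http://site.com" + path)


if __name__ == "__main__":
    for path in ("/testing/new-site/index.html", "/index.html", "/testing"):
        print(path, "->", "crawlable" if crawlable(path) else "blocked")
```

Rules are matched as path prefixes, so everything under /testing/ is blocked, while /testing without the trailing slash is a different path and stays crawlable.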

    Good luck Tuhin.

    Braulio
     
    braulio, Mar 28, 2015