Hello friends, this is the robots.txt of one of my clients: datingsitefree4all.com/robots.txt. Does it mean that the site is disallowed for search engines? The source of the index page, however, shows the robots meta tag set to allow. I'm a bit confused! Please advise.
Search engine crawlers read the robots.txt file before the meta tags. This means they skip the pages blocked by robots.txt and never fetch them in the first place, so they never get to see the meta robots element.
Your robots.txt file restricts search engine robots, but ONLY for the directories listed: /include, /design, /plugins, and /site. Pages in other directories can still be crawled if they are in a sitemap or linked from other pages (on the same or different sites).
Does "Disallow: /site/" mean the search engine crawler will not visit my whole site? Please advise. Thanks in advance.
Actually, even if the link is found via another source (an external link), if the page is not currently indexed and is blocked via the robots.txt file, it won't get indexed. The first thing a "respectable" web robot following the robots exclusion protocol does is check robots.txt to see whether the current web page or its parent directory is blocked, before it starts crawling from that page.
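To make the order of checks concrete, here is a minimal Python sketch of what a well-behaved crawler might do, going only by what has been said in this thread. It is not how any particular search engine is implemented. The names crawl_decision, MetaRobotsParser and fetch_page are made up for illustration, and example.com is a placeholder.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def crawl_decision(robots_lines, url, fetch_page):
    """Check robots.txt first; only fetch the page (and read its meta tag) if allowed.

    fetch_page is a caller-supplied function returning the page HTML (an assumption
    of this sketch, so that no network access is needed).
    """
    robots = RobotFileParser()
    robots.parse(robots_lines)
    if not robots.can_fetch("*", url):
        # The page is never downloaded, so its meta robots tag is never seen.
        return "skipped: disallowed by robots.txt"

    meta = MetaRobotsParser()
    meta.feed(fetch_page(url))
    if "noindex" in meta.directives:
        return "fetched, not indexed: meta robots noindex"
    return "fetched and indexable"
```

With "Disallow: /site/" in robots.txt, a URL such as http://example.com/site/public/a.html is rejected at the first step, so whatever the page's own meta robots tag says makes no difference to a crawler that honours robots.txt.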
Under the site folder there are lots of folders and files (the web pages), e.g. /public_html/site/public. So, tell me, will those pages not be viewed by search engines? Thanks.
User-agent: *
Disallow: /

That will block the entire website, including all sub-directories and web pages. Using * as the user-agent tells all search engines that follow the robots exclusion protocol not to index anything from this website.
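If you want to try this rule without waiting for a crawler, Python's standard urllib.robotparser module can evaluate it locally. This is only a rough check against that module's reading of the rules. The is_allowed helper, the "ExampleBot" name and example.com are placeholders invented for this sketch.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" has no section of its own, so the "*" section applies to it.
for path in ("/", "/index.html", "/site/public/page.html"):
    print(path, is_allowed(ROBOTS_TXT, "ExampleBot", "http://example.com" + path))
```

Every path comes back as not allowed, because each URL path starts with "/" and therefore matches the single Disallow rule.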
I was talking about pages from other directories not listed in his robots.txt: "Pages from other directories could be crawled ..." Your robots.txt excludes these directories, and all files and directories under them:
yoursite.com/include
yoursite.com/design
yoursite.com/plugins
yoursite.com/site
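The same point can be checked with a short Python sketch using the standard urllib.robotparser module. The robots.txt text below is an assumption reconstructed from the directories named in this thread (the real file may differ, for example in its trailing slashes), and blocked_paths and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# ASSUMED contents, rebuilt from the directories mentioned in the thread.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /include/
Disallow: /design/
Disallow: /plugins/
Disallow: /site/
"""


def blocked_paths(robots_txt, paths, user_agent="*"):
    """Return the subset of paths that the robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        p for p in paths
        if not parser.can_fetch(user_agent, "http://yoursite.com" + p)
    ]


sample = ["/index.html", "/site/public/index.html", "/design/style.css", "/blog/post.html"]
print(blocked_paths(ASSUMED_ROBOTS_TXT, sample))
```

Only the paths under the listed directories are reported as blocked. /index.html and /blog/post.html stay crawlable, which is the distinction being made above.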
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note saying "Please, do not enter" on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.

The location of robots.txt is very important. It must be in the main directory, because otherwise user agents (search engines) will not be able to find it; they do not search the whole site for a file named robots.txt. Instead, they look first in the main directory, and if they don't find it there, they simply assume that this site does not have a robots.txt file and therefore index everything they find along the way. So, if you don't put robots.txt in the right place, do not be surprised that search engines index your whole site.

The concept and structure of robots.txt was developed more than a decade ago, and if you are interested in learning more about it, you can go straight to the Standard for Robot Exclusion, because in this article we will deal only with the most important aspects of a robots.txt file. Next we will continue with the structure of a robots.txt file.
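To illustrate the point about location: whatever page a crawler starts from, the robots.txt it looks for sits at the root of that host. A small Python sketch of that lookup, where robots_txt_url is a made-up helper name and www.example.com is a placeholder:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the root-level robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/site/public/page.html?x=1"))
```

A file saved as /site/robots.txt would never be requested under this scheme, which is why a misplaced robots.txt has the same effect as having none at all.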
robots.txt just gives directions to search engine crawlers and other crawlers about what to do with the site. It's a good way to ask them not to crawl your copyrighted images or important files. Do note that crawlers are not bound to follow it; if they are not programmed to follow it, then you can't do anything.
I would suggest you go through this: http://www.robotstxt.org/robotstxt.html Hope all your doubts are cleared now!
Hey Sanjoy (Wildstone)!!!! It means you are a criminal and have committed a great deal of FRAUD and THEFT against fellow DP'ers! You owe a LOT of people a LOT of money and you need to take responsibility for your actions! Come back to your thread and start taking a list of people you owe money to and start making payments! http://forums.digitalpoint.com/showthread.php?t=288426