My friend asked me a question I don't know how to answer ..... He has owned a site for two years, but not a single page of it has been indexed by any search engine .... his site is http://www.sppsb.com.my
It is not my site .... it's my friend's site, and he just wonders why it has not been indexed in the two years it has existed ....
Which means the person who developed my friend's site must have a robots.txt blocking all bots .... I wonder why they would do that ...
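Just to show what I mean: if they really did have a robots.txt blocking everything, the file would contain something like this (only a guess at what might be in there, I have not seen their file):

    User-agent: *
    Disallow: /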
They do NOT have a robots.txt. If you request "http://www.sppsb.com.my/robots.txt" in the browser you get a 404 error because the file does not exist. So it's not robots.txt preventing them from being indexed.

Another way to block a page from being indexed is to include a <meta name="robots" content="noindex"> HTML element in the <head> element of the page's HTML. But your friend's site does not have this either; I checked the <head> of their home page (an example of what it would look like if it were there is at the end of this post). So it's not a noindex element preventing your friend's site from being indexed either.

The reason your friend's site has not been indexed is that the search engines do not have Extra Sensory Perception (ESP). How are they supposed to know the site even exists? When I search Yahoo! for link:http://www.sppsb.com.my it shows that Yahoo! doesn't know of a single site linking to your friend's site.

Your friend should either:

1) Submit his home page URL to the engines to ask them to index it,
2) Submit a sitemap.xml to the engines listing the URLs of all the pages on the site and asking them to index them (a minimal example is at the bottom of this post), or
3) Simply get links to his site from pages on other sites that the search engines have already indexed. The engines will follow the links from those pages and discover your friend's site.

These are typically the 3 ways to get a site discovered and indexed by the engines. #3 is the preferable method IMO, because it not only gets the site discovered but the backlinks also help ensure the pages stay indexed.
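For reference, if the developer had wanted to block indexing with the meta tag, the home page <head> would contain something like this (just an illustration, not what is actually on the site):

    <head>
      <title>Example page</title>
      <meta name="robots" content="noindex">
    </head>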
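And here is roughly what a minimal sitemap.xml for option 2 would look like. The page paths below are made up for illustration; your friend would list the real URLs of the pages on his site:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.sppsb.com.my/</loc>
      </url>
      <url>
        <loc>http://www.sppsb.com.my/about.html</loc>
      </url>
    </urlset>

He would upload the file to the site root (e.g. http://www.sppsb.com.my/sitemap.xml) and then submit that URL to the engines through their webmaster tools.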