Folks, I have been using mod_rewrite and the Guest Sessions mod (http://www.my.com/....html). May I use Google Sitemaps? Are there any problems if I use it, given that I am already using mod_rewrite and the Guest Sessions mod? Also, do I need <meta name="Googlebot" Content="index,follow">? Thanks for the help.
By default the bot will index and follow, so you don't need to put that meta tag in. As long as you have a robots.txt file for the bot to read, so it knows which pages should not be crawled, you are OK.
This is my robots.txt file:

User-agent: *
Disallow: /admin/
Disallow: /db/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /templates/
Disallow: /index.php
Disallow: /common.php
Disallow: /config.php
Disallow: /faq.php
Disallow: /groupcp.php
Disallow: /login.php
Disallow: /memberlist.php
Disallow: /modcp.php
Disallow: /posting.php
Disallow: /privmsg.php
Disallow: /profile.php
Disallow: /search.php
Disallow: /viewonline.php
Disallow: /linktous.php
Disallow: /contact.php
Disallow: /partner.php

Is this right? I want to avoid duplicate pages... Thanks.
Looks good. Entries take the form /directory/ for a whole directory or /page.html for a single page. What you have will keep bots from crawling everything listed: all content in the listed directories, plus the individual pages. You can add to it as needed. Good job, post that sucker.
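If you want to double-check a rule set before uploading it, here is a minimal sketch using Python's standard urllib.robotparser module. It uses a shortened copy of the rules posted earlier in the thread, and /viewtopic.php is only an assumed example of a page that is not listed; the is_allowed helper is a made-up name for illustration.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules from the thread (not the full file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /images/
Disallow: /login.php
Disallow: /search.php
"""


def is_allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # /viewtopic.php is an assumed, unlisted page used only for comparison.
    for path in ("/admin/index.php", "/login.php", "/viewtopic.php"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Rules are matched as path prefixes, so a directory entry such as /admin/ covers every file underneath it, while a page entry such as /login.php covers that page and any URL that starts with it.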
Take a look at M-Benz's robots.txt file. The User-agent line sets the file up to be used by all bots (search engines). Then it lists the directories that they should not go into, followed by specific pages to skip. Basically, you list any directory or page that you don't want indexed: include files and directories, CGI, admin, etc. In his case he also listed specific pages, such as login, that he didn't want indexed. This keeps robots away from the pages and content in the list so that they won't get indexed. Do a search on robots.txt; there are plenty of tutorials out there.
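The layout described above (one User-agent line, then directories, then individual pages) can be sketched as a small Python helper. This is only an illustration: build_robots_txt is a made-up name, and the directory and page names in the usage part are examples taken from the thread.

```python
def build_robots_txt(directories, pages, agent="*"):
    """Build robots.txt text: one User-agent line, then one Disallow per entry."""
    lines = [f"User-agent: {agent}"]
    # Directories end with a slash so the rule covers everything inside them.
    for directory in directories:
        lines.append("Disallow: /" + directory.strip("/") + "/")
    # Individual pages are listed by their path from the site root.
    for page in pages:
        lines.append("Disallow: /" + page.lstrip("/"))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example entries only; list whatever you do not want bots to visit.
    print(build_robots_txt(["admin", "includes"], ["login.php", "search.php"]), end="")
```

Save the result as robots.txt in the root of the site, since that is where bots look for it.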