Hi, the automatic sitemap generator is picking up all the URLs from our site, but we don't need a lot of them. How do we stop them from being added to the sitemap.xml? We are using http://www.freesitemapgenerator.com/ to generate the sitemap, so it doesn't get refreshed automatically, right? Does that mean I need to regenerate the sitemap for our site and upload it again every 2-3 months?
You can't restrict the tool itself, since it's automated and simply crawls whatever it finds, but you can stop Google from crawling or indexing those URLs afterwards: robots.txt is the standard mechanism for telling crawlers which paths to skip, and .htaccess rules can block access to them outright.
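For instance, a minimal robots.txt placed in your site root would keep well-behaved crawlers out of specific folders (the paths /private/ and /tmp/ here are just placeholders for whatever you want excluded):

```text
# robots.txt — applies to all crawlers
User-agent: *
Disallow: /private/
Disallow: /tmp/
```

Note this only asks crawlers not to fetch those URLs; it doesn't stop a third-party sitemap generator from listing them if the generator ignores robots.txt.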
Okay, could you give an example? And do you know a better sitemap generator, one that would automatically refresh the sitemap.xml placed on the server?
Hi, some sitemap tools allow you to:
* Use exclude filters of some kind (e.g. both crawler "analysis filters" and sitemap "output filters")
* Automate the job from the command line, e.g. scan + upload + ping automatically every weekend.
You can always ask the developer of the sitemap tool you are using; maybe you missed an option or something.
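As a rough sketch of that weekend automation, a cron entry could chain the three steps. Everything here is a placeholder: "mysitemapgen" stands in for whatever command-line interface your sitemap tool provides, and the FTP host/credentials are examples, not real values:

```shell
# Crontab fragment: every Sunday at 03:00, regenerate the sitemap,
# upload it by FTP, then ping Google with the new sitemap URL.
# "mysitemapgen" is a hypothetical CLI — substitute your tool's command.
0 3 * * 0 mysitemapgen --url http://www.yourdomain.com --out /tmp/sitemap.xml \
  && curl -T /tmp/sitemap.xml ftp://ftp.yourdomain.com/public_html/ --user user:pass \
  && curl "http://www.google.com/ping?sitemap=http://www.yourdomain.com/sitemap.xml"
```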
.htaccess files may certainly be helpful for you, but note what the rule `RewriteRule .? http://www.yourdomain.com%{REQUEST_URI} [R=301,L]` actually does: it 301-redirects every request to www.yourdomain.com, which is useful for canonicalizing your domain, not for excluding pages from a sitemap. Whatever rules you use, upload the .htaccess file to your site root by FTP.
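If the goal is to keep specific pages out of Google's index via .htaccess, one approach is sending an X-Robots-Tag header. This is a sketch assuming Apache with mod_headers enabled; "private-page.html" is a placeholder filename:

```apache
# .htaccess fragment — requires mod_headers.
# Tells search engines not to index or follow links on this file.
<Files "private-page.html">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>
```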
There are many sitemap generators, both paid and free. Paid versions usually have options to include/exclude certain pages, set a refresh rate, etc., whereas most free versions add every URL on your site and very few have a refresh option. If you need to keep search engine bots away from particular pages, disallow them in robots.txt (or block access with an .htaccess file), or else add <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> to each page/URL you want kept out of the index.