Hi all! I have a problem and would appreciate some help. I created a Google sitemap. In the configuration file I set the generator to scan my entire public HTML directory, then generated the sitemap, but I didn't check it before submitting it to Google through my webmaster account. When I finally checked the sitemap, everything was in it: cgi-bin, all kinds of files, and all the internal PHP pages of my SMF forum. My webmaster account shows about 1000 files picked up! Questions: 1) Is that good? Should I restrict the pages in my Google sitemap? 2) How can I do that? Please point me to a good tutorial for newbies. 3) Should the sitemap include the locations of the pictures? 4) How can I set up the sitemap for a PHP forum? Other details: my site has HTML pages on the main domain, and I created a subdomain for the forum. Thanks
That is always the risk of using a utility that runs on the server and creates your file. If you have fewer than a few hundred thousand files, use a tool that reads your pages from outside your server, such as this webmaster tool. If you use Xenu at all, be careful with the option to follow external links: it will follow the AdSense links on every page! Not good if the requests come from an IP address that you use to manage your AdSense account! If your sitemap includes files that you didn't want visitors to know about, Google will find them and serve them to the general public. I would fix the sitemap ASAP, edit your robots.txt file, and hope for the best. Also, if they are really important files, you need to rename them if you can, or move them to a protected directory.
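If hand-editing the sitemap is too tedious, a small script can strip the unwanted entries instead. This is only a rough sketch, not what any particular generator does: it assumes the sitemap uses the sitemaps.org 0.9 namespace (older Google sitemaps may declare a different one), and the excluded prefixes and the `filter_sitemap` name are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: the sitemap declares the sitemaps.org 0.9 namespace.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical path prefixes to keep out of the sitemap; adjust to your site.
EXCLUDED_PREFIXES = ("/cgi-bin/", "/private/")


def filter_sitemap(xml_text, excluded_prefixes=EXCLUDED_PREFIXES):
    """Return the sitemap XML with every <url> under an excluded prefix removed."""
    # Keep the default namespace unprefixed in the serialized output.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(xml_text)
    # Iterate over a copy, because entries are removed from the root as we go.
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        # str.startswith accepts a tuple, so one call checks all prefixes.
        if urlparse(loc).path.startswith(excluded_prefixes):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")
```

You would read your existing sitemap file into a string, pass it to `filter_sitemap`, and write the result back out before resubmitting. Keep in mind that removing a URL from the sitemap does not hide the file itself; anything sensitive still needs to be renamed or protected as described above.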
Thank you very much for your reply. I manually edited the sitemap and removed many links. Then I resubmitted the sitemap to Google, and now my webmaster account shows just 111 links (down from 1000). I will edit robots.txt to keep crawlers out of the forum files, but I have a question. How can I tell Google to crawl the messages generated on the forum? Where are those files located? I'm using Simple Machines Forum.
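One way to sanity-check a robots.txt before uploading it is to test a few URLs against it with Python's standard `urllib.robotparser`. The rules, the directory names, and the `forum.example.com` URLs below are assumptions made up for illustration, not a recommended configuration for SMF; the idea is just to confirm that internal files are blocked while topic pages stay crawlable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the forum subdomain (directory names are assumptions).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /Sources/
Disallow: /Themes/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed URL shapes: a topic page and an internal script.
print(is_crawlable(ROBOTS_TXT, "http://forum.example.com/index.php?topic=1.0"))
print(is_crawlable(ROBOTS_TXT, "http://forum.example.com/Sources/Load.php"))
```

With these rules the topic URL should come back as allowed and the internal script as blocked. Real crawlers may interpret some rules differently from this parser, so treat it as a rough check only.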
In general, you will want Google to find the threads in your forum. I suggest you try some other tools and see what results they give. Even if your website reaches 10,000 or perhaps even 100,000 pages, you should be OK using external tools (at least if you have some patience). An alternative may be to find a plugin for Simple Machines Forum that can read directly from the DB and generate the sitemap for you.
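To give an idea of what such a generator boils down to, here is a minimal sketch that builds a sitemap from a list of topic IDs. It is not an actual SMF plugin: the IDs would come from the forum database, which is not shown here, and the `index.php?topic=<id>.0` URL pattern, the base URL, and the `build_topic_sitemap` name are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_topic_sitemap(base_url, topic_ids):
    """Build sitemap XML with one <url> entry per forum topic ID."""
    # Keep the default namespace unprefixed in the serialized output.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for topic_id in topic_ids:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
        # Assumed topic URL pattern; check what your forum actually produces.
        loc.text = f"{base_url}/index.php?topic={topic_id}.0"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical topic IDs; in practice these would be read from the forum DB.
print(build_topic_sitemap("http://forum.example.com", [1, 2, 3]))
```

Because only topic URLs are written, internal PHP pages never end up in the file, which avoids the problem from the first post.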