Help please: sitemap with 1000 files?

Discussion in 'Google Sitemaps' started by loeches, Feb 15, 2007.

  1. #1
    Hi all!

    I have a problem and would appreciate some help.

    I created a Google sitemap. In the configuration file I set it to scan my entire public HTML directory and then generated the sitemap, but I didn't check it before submitting it to Google through my webmaster account. When I checked the sitemap afterwards, everything was in it: cgi-bin, all kinds of files, and all the internal PHP pages of my SMF forum. My webmaster account shows about 1000 files picked up!

    Questions:

    1) Is that good? Should I restrict which pages go in my Google sitemap?

    2) How can I do that? Please point me to a good tutorial for a newbie.

    3) Must the sitemap include the addresses (URLs) of the pictures?

    4) How can I set up the sitemap for a PHP forum?


    Other details: I have a site with HTML pages on the main domain, and I created a subdomain for the forum.

    Thanks
     
    loeches, Feb 15, 2007 IP
  2. Litho

    #2
    That is always the risk of using a utility that runs on the server and creates your file.

    If you have fewer than a few hundred thousand files, use a tool that reads your pages from outside your server, such as this webmaster tool.

    If you use Xenu at all, be careful with the "follow external links" option: it will follow your AdSense ads on every page! That is not good if the requests come from an IP address that you use to manage your AdSense account!

    If files that you didn't want visitors to know about are included in your sitemap, Google will find them and serve them up to the general public.

    I would modify the sitemap ASAP, edit your robots.txt file, and hope for the best. Also, if they are really important files, you need to rename them if you can, or move them to a protected directory.
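
    One way to check that a robots.txt edit really covers the paths you want kept out of the crawl is Python's standard urllib.robotparser module. This is only a sketch: the rules, the directory names, and the example.com URLs are made-up placeholders, not taken from the site discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the directory names are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT,
                 agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/cgi-bin/form.cgi"):
        print(url, "allowed" if is_crawlable(url) else "blocked")
```

    Keep in mind that robots.txt only asks crawlers to stay away; it does not protect the files themselves, which is why moving them to a protected directory still matters.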
     
    Litho, Feb 15, 2007 IP
  3. loeches

    #3
    Thank you very much for your reply. I manually edited the sitemap and removed many links. Then I resubmitted the sitemap to Google, and now my webmaster account shows just 111 links (down from 1000). I will edit robots.txt to keep crawlers out of the forum files, but I have a question: how can I tell Google to crawl the messages generated on the forum? Where are those files located? I'm using Simple Machines Forum.
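
    For anyone who would rather not trim the sitemap by hand, the same cleanup can be scripted. The sketch below assumes the sitemap uses the sitemaps.org 0.9 namespace and that unwanted entries can be recognised by a substring of their URL; the function name and the example paths are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org 0.9 protocol (assumed for this sitemap).
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def filter_sitemap(xml_text: str, blocked_parts: list[str]) -> str:
    """Drop every <url> entry whose <loc> contains one of blocked_parts."""
    # Keep the default namespace unprefixed when serialising.
    ET.register_namespace("", NS)
    root = ET.fromstring(xml_text)
    # Iterate over a copy, because entries are removed while looping.
    for url in list(root.findall(f"{{{NS}}}url")):
        loc = url.findtext(f"{{{NS}}}loc", default="")
        if any(part in loc for part in blocked_parts):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        f'<urlset xmlns="{NS}">'
        "<url><loc>http://example.com/index.html</loc></url>"
        "<url><loc>http://example.com/cgi-bin/form.cgi</loc></url>"
        "</urlset>"
    )
    print(filter_sitemap(sample, ["/cgi-bin/"]))
```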
     
    loeches, Feb 16, 2007 IP
  4. websitetools

    #4
    In general, you will want Google to find the threads in your forums.
    I suggest you try some other tools and see what results they give.

    Even if your website reaches 10,000 or perhaps even 100,000 pages, you should be OK using external tools (at least if you have some patience).

    An alternative may be to find a plugin for Simple Machines Forum that can read directly from the DB and generate the sitemap for you.
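
    To give a rough idea of what such a plugin would do: forum messages are stored in the database rather than as files on disk, so the sitemap has to be built from topic IDs. The sketch below is not an actual SMF plugin. It assumes the topic IDs have already been read from the database and that topics are reachable at index.php?topic=ID.0; the function name and the forum.example.com address are placeholders.

```python
from xml.sax.saxutils import escape


def build_forum_sitemap(base_url: str, topic_ids: list[int]) -> str:
    """Build a sitemap with one <url> entry per forum topic."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for topic_id in topic_ids:
        # Assumed topic URL pattern; adjust it to match your forum.
        loc = f"{base_url}/index.php?topic={topic_id}.0"
        # escape() protects characters such as & in the URL.
        lines.append(f"  <url><loc>{escape(loc)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Topic IDs would normally come from a database query.
    print(build_forum_sitemap("http://forum.example.com", [1, 2, 3]))
```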
     
    websitetools, Feb 21, 2007 IP