How does the crawler work?

Discussion in 'Google' started by Yankee85, Mar 27, 2008.

  1. #1
    If a crawler visits a webpage, what happens? Is it like a browser? Does it "browse" the page and then parse the HTML code?
    After all, any webpage, whatever language it is written in, generates HTML as the final result.

    Thanks!
     
    Yankee85, Mar 27, 2008 IP
  2. The Stealthy One

    #2
    Yes, that is basically how it works. http://www.seo-browser.com is a neat tool that shows you what the search engines see when they visit your site.
     
    The Stealthy One, Mar 27, 2008 IP
  3. Kaizoku

    #3
    It crawls your site, gives weight to the text inside certain tags (their innerHTML), then updates or inserts the results into the search engine's database.
     
    Kaizoku, Mar 27, 2008 IP
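    The "weight by tag, then store" idea in post #3 can be sketched roughly as follows. This is only an illustration using Python's standard html.parser: the TAG_WEIGHTS values, the WeightedTextParser class, the index_page function, and the in-memory index dict are all made up for the example, since search engines do not publish how they actually weigh tags.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights, chosen only for illustration.
TAG_WEIGHTS = {"title": 5, "h1": 3, "a": 2}


class WeightedTextParser(HTMLParser):
    """Scores each word by the heaviest tag it is nested in."""

    def __init__(self):
        super().__init__()
        self.open_tags = []      # stack of currently open tags
        self.scores = Counter()  # word -> accumulated weight

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        weight = max((TAG_WEIGHTS.get(t, 1) for t in self.open_tags), default=1)
        for word in data.lower().split():
            self.scores[word] += weight


def index_page(html):
    parser = WeightedTextParser()
    parser.feed(html)
    parser.close()
    return dict(parser.scores)


# Stand-in for the search engine's database: url -> word scores.
index = {}
index["http://example.com/"] = index_page(
    "<html><head><title>Crawler basics</title></head>"
    "<body><h1>Crawler</h1><p>A crawler reads pages</p></body></html>"
)
```

    Here a word that appears in the title, the h1, and a paragraph ends up with a higher score than a word that appears only in a paragraph. Re-indexing the same URL simply overwrites its entry, which plays the role of the "update" step.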
  4. Yankee85

    #4
    Thanks! Rep added :)
     
    Yankee85, Mar 27, 2008 IP
  5. alexf2000

    #5
    Some crawlers are built on regular browsers and run as a browser extension. In that case the crawler can also read content generated by JavaScript.
     
    alexf2000, Mar 28, 2008 IP
  6. websitetools

    #6
    Yep, you got it. It does not matter what language a website is written in or whether it has a database backend; only the output (HTML) matters.
     
    websitetools, Mar 28, 2008 IP
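    The fetch-then-parse step described in this thread might look something like the sketch below, using only Python's standard library. The names fetch, extract_links, and LinkParser, the User-Agent string, and the example.com URLs are invented for the example; a real crawler would also need robots.txt handling, politeness delays, and error handling, all of which are left out here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkParser(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def fetch(url):
    # Plain HTTP GET, like a browser's first request; no JavaScript is run.
    request = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(html, base_url):
    parser = LinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# The server could have built this with PHP, ASP, or static files;
# the crawler only ever sees the resulting HTML.
sample_html = '<a href="/about.php">About</a> <a href="http://example.org/">Other</a>'
links = extract_links(sample_html, "http://example.com/index.php")
```

    In a real run, the HTML would come from fetch(url) instead of the hard-coded sample_html string, and each extracted link would be queued for a later visit.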
  7. Devilfish

    #7
    That's why it's NOT a good idea to have too much Flash content on your site: it doesn't output HTML and so won't get indexed. ;)
     
    Devilfish, Mar 28, 2008 IP
  8. Yankee85

    #8
    My dilemma is then... why is content generated by AJAX pages not well indexed by search engines?
     
    Yankee85, Mar 28, 2008 IP
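    One likely reason, in line with post #5, is that a crawler which only parses the fetched HTML never runs the page's JavaScript, so content loaded by AJAX is simply not in what it reads. The sketch below assumes such a non-JavaScript crawler; the VisibleTextParser class, the visible_text function, and the sample page (including its /products.html URL) are invented for the example.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text outside <script> and <style>, as a non-JS crawler might."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


AJAX_PAGE = """
<html><body>
<h1>Products</h1>
<div id="content"></div>
<script>
  // Runs only in a browser: fills the div after the page has loaded.
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/products.html");
  xhr.onload = function () {
    document.getElementById("content").innerHTML = xhr.responseText;
  };
  xhr.send();
</script>
</body></html>
"""

text_seen = visible_text(AJAX_PAGE)
print(text_seen)
```

    The only text this crawler sees is the heading; the "content" div is empty in the HTML that was served, and whatever /products.html would have put there never reaches the index unless the crawler executes scripts the way a browser does.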