How does the Google crawler work?

Discussion in 'Google' started by Flightsandfly1, May 25, 2012.

  1. #1
    How does the Google crawler work?
     
    Flightsandfly1, May 25, 2012
  2. sultanofseo

    sultanofseo Notable Member

    #2
    It's just like any other web crawler: it crawls the web and finds sites. The crawler, known as Googlebot, visits a site and fetches its data. Google then analyzes the data and figures out what to do with it. The main purpose of the crawler is to fetch data for Google.
     
    sultanofseo, May 25, 2012
  3. hope2life

    hope2life Active Member

    #3
    Yes, the Google crawler crawls websites with good content and neglects the bad ones. So build a site with good content to keep the Google crawler happy.
     
    hope2life, May 25, 2012
  4. rising_sun

    rising_sun Peon

    #4
    How can Googlebot (the Google crawler) identify good content when it's just software?
     
    rising_sun, May 25, 2012
  5. henrywilliams

    henrywilliams Peon

    #5
    When the crawler comes to your site, it first looks for robots.txt and then crawls the associated pages it is allowed to fetch. That leads to the second phase of crawling: inspecting all the links present on each page and following them from one page to the next.
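
    The robots.txt step can be sketched with Python's standard library. This is only an illustration, not how Googlebot itself is implemented: the robots.txt content, the "ExampleBot" user agent, the example.com URLs, and the allowed_urls helper are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed from memory so no network is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical candidate pages on an example site.
candidates = [
    "http://example.com/",
    "http://example.com/private/report.html",
    "http://example.com/blog/post-1.html",
]

# Only the pages outside /private/ are kept for crawling.
print(allowed_urls(ROBOTS_TXT, "ExampleBot", candidates))
```

    A real crawler would download robots.txt from the site first (RobotFileParser also offers set_url and read for that) and only then start fetching the allowed pages.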
     
    henrywilliams, Jun 1, 2012
  6. DavidTiger

    DavidTiger Greenhorn

    #6
    The Google crawler gets most of the attention, but what about the Yahoo and Bing crawlers? How do they work? I have seen that most people rank well in Google but struggle in Yahoo and Bing, so what can we do about that?
     
    DavidTiger, Jun 1, 2012
  7. Jaydeep

    Jaydeep Greenhorn

    #7
    Crawling is the process by which Googlebot discovers new and updated pages to be added to the Google index.

    Google's crawl process begins with a list of web page URLs, generated from previous crawl processes, and augmented with Sitemap data provided by webmasters. As Googlebot visits each of these websites it detects links on each page and adds them to its list of pages to crawl.
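
    The list-driven process described above can be sketched as a simple queue (often called a crawl frontier). This is a toy model under stated assumptions, not Google's actual code: LINK_GRAPH is a made-up in-memory stand-in for the web, and crawl and fetch_links are hypothetical names.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page -> links found on it.
LINK_GRAPH = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1", "/about"],
    "/blog/post-1": ["/blog"],
    "/sitemap-only": [],
}


def crawl(seed_urls, sitemap_urls, fetch_links):
    """Visit pages breadth-first, starting from seed and Sitemap URLs."""
    frontier = deque()  # pages waiting to be crawled
    seen = set()        # pages already queued, so nothing is queued twice
    for url in list(seed_urls) + list(sitemap_urls):
        if url not in seen:
            seen.add(url)
            frontier.append(url)

    visited = []
    while frontier:
        url = frontier.popleft()
        visited.append(url)
        # Newly detected links join the list of pages to crawl.
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


order = crawl(["/"], ["/sitemap-only"], lambda url: LINK_GRAPH.get(url, []))
print(order)
```

    Note that "/sitemap-only" has no inbound links, so in this sketch it is reached only because it was supplied up front, which is the role Sitemap data plays in the description.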
     
    Jaydeep, Jun 1, 2012
  8. Mega B

    Mega B Well-Known Member

    #8
    So what is a Google "crowler"??? Never heard of one of those things before.
     
    Mega B, Jun 2, 2012
  9. ashleyjohn2347

    ashleyjohn2347 Peon

    #9
    The Google crawler checks site content and penalizes duplicate and raw sites.
     
    ashleyjohn2347, Jun 2, 2012
  10. ryan_uk

    ryan_uk Illustrious Member

    #10
    Googlebot really just does the discovery of content. It makes some determinations about whether to crawl a link, but only on a basic level.

    However, regarding software and identifying content: it's not difficult, even on a simple level, to identify bad content. It's all about finding particular signals/matches. For example, if a page has no words but hundreds of links, then you know there's a problem. These are just the basics. Categorising content is a bit trickier, but if a page appears to be about cats and then mentions apples, something is wrong with the content. Look up regular expressions and you could write some in PHP to analyse a page and make some determinations. (Simplest way if you want to experiment.)
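
    As a rough sketch of the "no words but hundreds of links" signal, here is one way to experiment with regular expressions (written in Python rather than PHP). The function names and the min_links threshold are assumptions chosen for illustration, and regexes are only a crude way to read HTML, so treat this as a toy.

```python
import re

# Whole anchor elements, e.g. <a href="/x">text</a>
ANCHOR_RE = re.compile(r"<a\s[^>]*>.*?</a>", re.IGNORECASE | re.DOTALL)
# Any remaining HTML tag
TAG_RE = re.compile(r"<[^>]+>")
# A crude definition of a "word"
WORD_RE = re.compile(r"[A-Za-z']+")


def page_signals(html):
    """Return (link_count, word_count), counting only words outside links."""
    link_count = len(ANCHOR_RE.findall(html))
    text = TAG_RE.sub(" ", ANCHOR_RE.sub(" ", html))
    return link_count, len(WORD_RE.findall(text))


def looks_like_link_farm(html, min_links=100):
    """Flag pages with many links and fewer words than links.

    min_links is an arbitrary threshold picked for this example.
    """
    links, words = page_signals(html)
    return links >= min_links and words < links


# Hypothetical test pages: one made only of links, one mostly text.
link_farm = "<html><body>" + "".join(
    f'<a href="/p{i}">page {i}</a>' for i in range(200)
) + "</body></html>"
article = "<p>" + "Cats sleep most of the day. " * 50 + '<a href="/more">more</a></p>'

print(looks_like_link_farm(link_farm), looks_like_link_farm(article))
```

    A single signal like this is easy to fool, which is why it only covers "the basics" mentioned above.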
     
    ryan_uk, Jun 2, 2012
  11. jessylogan

    jessylogan Peon

    #11
    Google knows better than us :)
     
    jessylogan, Jun 3, 2012
  12. tridosil

    tridosil Peon

    #12
    tridosil, Jun 3, 2012
  13. josefaryan

    josefaryan Member

    #13
    A good source can explain this better. Read through the following Wikipedia article on web crawlers:

    http://en.wikipedia.org/wiki/Web_crawler
     
    josefaryan, Jun 3, 2012