Difference between indexing and crawling

Discussion in 'Google' started by sathish007, Jul 25, 2010.

  1. #1
Hi, can anyone help me out with this? I just want to know the exact difference between crawling and indexing. A short answer would be appreciated, please.

    Thanks,
    Sathish.
     
    sathish007, Jul 25, 2010 IP
  2. Alexandros1

    Alexandros1 Peon

    Messages:
    332
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #2
Search engine indexing collects, parses, and stores data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, physics and computer science. An alternate name for the process, in the context of search engines designed to find web pages on the Internet, is Web indexing. When you hear "index," chances are the person is talking about Google's index. For your page to be displayed on Google, it has to be stored in and served from that index.
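To make the "parses and stores data for fast retrieval" part concrete, here's a toy sketch of an inverted index (the page texts and the `build_index`/`search` helpers are made up for illustration; real indexes are vastly more sophisticated):

```python
# Toy inverted index: map each word to the set of pages containing it,
# so a word lookup is a single dictionary access instead of a scan of
# every page -- roughly what a search engine's index does at scale.
def build_index(pages):
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, word):
    # Return the pages containing the word, in a stable order.
    return sorted(index.get(word.lower(), set()))

pages = {
    "a.html": "web crawlers visit pages",
    "b.html": "the index stores pages for fast retrieval",
}
index = build_index(pages)
print(search(index, "pages"))  # → ['a.html', 'b.html']
```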

    A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner or in an orderly fashion. Other terms for Web crawlers are ants, automatic indexers, bots, Web spiders, or Web robots.
    This process is called Web crawling or spidering. Many sites, in particular search engines, use spidering as a means of providing up-to-date data. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine that will index the downloaded pages to provide fast searches. Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code. Also, crawlers can be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for spam).
    A Web crawler is one type of bot, or software agent. In general, it starts with a list of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl frontier. URLs from the frontier are recursively visited according to a set of policies.
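The seed/frontier loop described above can be sketched in a few lines. This is a toy version that walks an in-memory "web" (a dict standing in for real HTTP fetches and HTML parsing, which a real crawler would do instead):

```python
from collections import deque

# Made-up link graph: each "URL" maps to the links found on that page.
WEB = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a", "d"],
    "d": [],
}

def crawl(seeds, web):
    frontier = deque(seeds)  # URLs waiting to be visited (the crawl frontier)
    seen = set(seeds)        # avoid queueing the same URL twice
    visited = []
    while frontier:
        url = frontier.popleft()
        visited.append(url)              # "download" the page
        for link in web.get(url, []):    # "parse" it for hyperlinks
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

print(crawl(["a"], WEB))  # → ['a', 'b', 'c', 'd']
```

Real crawlers layer politeness policies (robots.txt, rate limits) and revisit policies on top of this same loop.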
     
    Alexandros1, Jul 25, 2010 IP
  3. tompatrick

    tompatrick Peon

    Messages:
    118
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Consider Google as a database where things are kept on a priority basis. Once you create a website and it's kept in Google's database, that means it's been indexed, but it doesn't mean that the website has been crawled.

    Once the site is indexed, it will be crawled by Google when its turn comes. That means if the URL is indexed, we can't be sure whether it's been crawled or not, but if the URL is crawled, we can be sure that it's been indexed. It's a predefined process that Google follows.
     
    tompatrick, Jul 26, 2010 IP
  4. bogs

    bogs Active Member

    Messages:
    2,142
    Likes Received:
    16
    Best Answers:
    0
    Trophy Points:
    80
    #4
    Crawling is the way of reading your site; indexing is when the bots store the information on your site (the contents of your site).
    So even if you exclude a page from indexing via robots.txt, search engines are still going to crawl it.
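For what it's worth, you can check what a robots.txt file tells a crawler it may fetch using Python's standard `urllib.robotparser` (the rules below are a made-up example). Note that robots.txt governs *crawling*; keeping a page out of the index is handled separately (e.g. a noindex meta tag):

```python
from urllib.robotparser import RobotFileParser

# Parse a made-up robots.txt and ask whether a crawler may fetch each URL.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

print(rp.can_fetch("*", "http://example.com/private/page.html"))  # False
print(rp.can_fetch("*", "http://example.com/public/page.html"))   # True
```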
     
    bogs, Jul 26, 2010 IP
  5. sathish007

    sathish007 Peon

    Messages:
    20
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Thanks to all for your reply.
     
    Last edited: Jul 28, 2010
    sathish007, Jul 28, 2010 IP
  6. sathish007

    sathish007 Peon

    Messages:
    20
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Thanks, but actually I was confused by your reply. To my knowledge, crawling is the first process and indexing is the second one. Which means: if the URL is crawled, we can't be sure whether the URL is indexed or not; but if the URL is indexed, we can be sure that the URL got crawled.
     
    sathish007, Jul 28, 2010 IP
  7. MLMOA

    MLMOA Peon

    Messages:
    22
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Hi,

    Crawling is when the spiders (little software programs) visit your site and "crawl" through it to gather information.

    Indexing is the process of categorizing what the spiders find on your site so that the search engines can decide the most appropriate places to list you site.

    Regards,

    Dave
     
    MLMOA, Jul 28, 2010 IP
  8. ritesh_83

    ritesh_83 Greenhorn

    Messages:
    11
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #8
    Cache means the way the Web page looked when Google's spiders indexed it.

    Index means pages added to Google's database.

    I replied the same in another thread here.

    Hope it's in the simple form you required, friend.
     
    ritesh_83, Sep 21, 2010 IP