Efficient/best way to index a site?

Discussion in 'Programming' started by batman4444, Nov 13, 2008.

  1. #1
    What is the most efficient way to index a particular site? I am currently looking at using PHP to pull all the text from each page, strip it of formatting, and then save it to a MySQL database, but this seems to be really slow. Are there more efficient or faster ways to do this? How do search engines do it?
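
    For reference, here is a rough sketch of the pipeline described above (pull the text, strip the formatting, save it). It is written in Python rather than PHP and uses the standard-library sqlite3 module as a stand-in for MySQL, so treat it as an illustration only. The names TextExtractor, html_to_text and store_pages, and the pages table layout, are made up for this example. It writes all rows in a single transaction, on the assumption that per-row commits are part of what makes the current approach slow; that may not be the actual bottleneck.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, skipping anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def store_pages(conn, pages):
    """Store (url, html) pairs as plain text, all in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)"
    )
    rows = [(url, html_to_text(html)) for url, html in pages]
    with conn:  # commits once for the whole batch
        conn.executemany(
            "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)", rows
        )
```

    Usage would look something like store_pages(sqlite3.connect("index.db"), fetched_pages), where fetched_pages is a list of (url, html) tuples you have already downloaded.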
     
    batman4444, Nov 13, 2008 IP
  2. rene7705

    rene7705 Peon

    Messages:
    233
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    0
  3. VarriaStudios

    VarriaStudios Member

    Messages:
    246
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    28
    #3
    In my opinion:

    Submit the URL to Google, Yahoo and a few directories.

    Make sure to insert the necessary meta tags.
     
    VarriaStudios, Nov 13, 2008 IP
  4. WeWatch

    WeWatch Active Member

    Messages:
    75
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    50
    #4
    You can use some Perl commands or PHP.

    I got started with wget after working with various spidering programs. You can learn a lot from a book, "Spidering Hacks" from O'Reilly.

    There are Perl scripts, PHP libraries, libcurl, wget, etc. It all depends on what you're most comfortable with.

    Tell me more about what your background is and what you want to do.
     
    WeWatch, Nov 16, 2008 IP
  5. Napoleon

    Napoleon Peon

    Messages:
    732
    Likes Received:
    20
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Napoleon, Nov 16, 2008 IP
  6. Christ

    Christ Active Member

    Messages:
    720
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    53
    #6
    Go to digg.com and submit a few articles from your website. Your site will be indexed within an hour.
     
    Christ, Nov 17, 2008 IP
  7. gummyworms

    gummyworms Active Member

    Messages:
    126
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    53
    #7
    There seem to be three different opinions here. Are you trying to index other sites from your own site? If so, then Google and the like are not the answer, because they index you; they don't help you index someone else. Sphider is a script that indexes your own server directories, so essentially you index yourself.
     
    gummyworms, Nov 17, 2008 IP
  8. Napoleon

    Napoleon Peon

    Messages:
    732
    Likes Received:
    20
    Best Answers:
    0
    Trophy Points:
    0
    #8
    You can also index external sites using sphider.
     
    Napoleon, Nov 17, 2008 IP
  9. Aweb

    Aweb Peon

    Messages:
    31
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #9
    The best way is to check where the part you want to index begins and where it ends.

    Then you have to find a pattern where you can start "cutting" the text you are interested in, and a pattern where you will stop "cutting" the text.

    I've been using this method for years, and it works unless they change the code.
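
    A minimal sketch of that start/stop "cutting" idea, in Python. The function name cut_between and the markers in the usage note are hypothetical; you would pick the real markers by looking at the source of the page you are indexing.

```python
def cut_between(html, start_marker, end_marker):
    """Return the text between the first start_marker and the next
    end_marker after it, or None if either marker is missing."""
    start = html.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)  # begin cutting just after the start pattern
    end = html.find(end_marker, start)
    if end == -1:
        return None
    return html[start:end].strip()
```

    For example, cut_between(page, '<div id="content">', '</div>') would return whatever sits inside that div. Returning None when a marker is missing gives you a cheap way to notice that the site has changed its code.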
     
    Aweb, Nov 17, 2008 IP
  10. batman4444

    batman4444 Active Member

    Messages:
    211
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    51
    #10
    Thanks for the help. Yes, I am trying to index text from other sites, not my own. Looks like PHP will do the trick.
     
    batman4444, Nov 17, 2008 IP
  11. aamirprince20

    aamirprince20 Peon

    Messages:
    52
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #11
    Has anybody used Google's tool to build a sitemap?
     
    aamirprince20, Nov 17, 2008 IP