Hey guys, just wondering how search engine spiders work. Does it take a lot of effort to make one? Thanks!
For a really comprehensive read-up on spiders/crawlers, have a look at http://en.wikipedia.org/wiki/Web_spider. I found http://www.ibm.com/developerworks/linux/library/l-spider/?ca=dgr-lnxw01WebSpiderLinux really useful, as it has *nix, Ruby & a simple Python spidering script, plus links to source code in the "Resources" section, including some Java spiders, as Blue Star Ent points out. Again, it would be nice to know how you get on.
It's not too difficult to write one in PHP; you could do it in a couple of hours. However, for a proper spider you also need to handle robots.txt properly. All you need to write one is a database, cURL and preg_match_all.
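To illustrate the cURL + preg_match_all approach mentioned above, here's a minimal sketch of the two core pieces: fetching a page and pulling the links out of it. The function names and the user-agent string are my own for the example, and a real spider would still need robots.txt handling, relative-URL resolution, and a database on top of this.

```php
<?php
// Fetch a page body as a string with cURL. Assumes the curl extension
// is installed; error handling is kept minimal for the sketch.
function fetch_page($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow redirects
    curl_setopt($ch, CURLOPT_USERAGENT, 'MySpider/0.1'); // identify your bot
    $html = curl_exec($ch);
    curl_close($ch);
    return $html;
}

// Pull the value of every href="..." attribute out of the HTML.
// A regex is crude compared to a real HTML parser, but it's the
// quick-and-dirty method the post is describing.
function extract_links($html) {
    preg_match_all('/href="([^"]+)"/i', $html, $matches);
    return array_unique($matches[1]);
}
```

From there it's a loop: fetch, extract, store the links in your database, and move on to the next URL.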
I looked up Sphider; it's here: http://www.sphider.eu/ If people use it and find it useful, don't forget to donate (http://www.sphider.eu/donate.php).
Hey, thanks for gathering that link.. I guess I forgot to do that. Anyway, hopefully that helps people.
No problem at all, always happy to help out. And thanks for pointing out Sphider, I'll have a proper look at that later myself.
Not a problem either. I was going to use an existing spider for a site, but it didn't do what I wanted, so I created my own. The spider I created only crawls one level deep: the links on the starting page. I was working on making it crawl the crawled links too, but anyway. Not bad for a first PHP spider attempt.