How does a search engine decide which duplicate to show in search results? Following up in my series of articles about duplicate content, it seems fitting that we would next discuss how search engines determine which article to show when they have dozens or even hundreds of duplicates to choose from.

Let's start with a question we have all thought about at one point or another, a question that our past two days' articles have been leading up to: "How does a search engine decide which duplicate to show in search results, and which ones not to show?"

How do they choose? PageRank? The first one published? The shortest URL? The article with the most links? It doesn't seem to be any one signal. It's not PageRank alone, or distance from the root directory. It's probably not the first one published, because many sites are dynamic: the timestamp on the original may be later than the one on the copy, and the first copy spidered might be the one the search engines think is the oldest. It doesn't appear to be perceived authority alone either. It could have something to do with the number and quality of inbound and outbound links on a page. It could be a mix of all of those things and others.

So what is it then? Let's dive into some research papers and find out!
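Purely as an illustration of the "mix of signals" idea above, here is a toy sketch of how duplicate copies might be scored to pick one canonical result. Every signal, weight, and name here is invented for the example; the real signals and weights used by any search engine are undisclosed and certainly far more complex.

```python
# Hypothetical sketch: score duplicate copies of a page on a weighted mix
# of signals and pick the highest-scoring one as the "canonical" result.
# All signals and weights are made up for illustration.
from dataclasses import dataclass

@dataclass
class Copy:
    url: str
    authority: float     # link-based authority score, assumed in 0..1
    inbound_links: int   # number of links pointing at this copy
    crawl_order: int     # lower = spidered earlier

def canonical_score(c: Copy) -> float:
    # Authority dominates, link count helps, and being crawled
    # earlier gives a small tie-breaking bonus.
    link_signal = min(c.inbound_links / 100.0, 1.0)  # cap so huge counts saturate
    crawl_penalty = 0.01 * c.crawl_order
    return 0.6 * c.authority + 0.3 * link_signal - crawl_penalty

def pick_canonical(copies: list[Copy]) -> Copy:
    return max(copies, key=canonical_score)

copies = [
    Copy("http://example.com/original", authority=0.4, inbound_links=80, crawl_order=1),
    Copy("http://scraper.example/copy", authority=0.7, inbound_links=5, crawl_order=3),
]
print(pick_canonical(copies).url)  # the well-linked original wins here
```

Note that in this toy example the original's many inbound links outweigh the copy's higher raw authority, which matches the intuition (discussed in the comments below) that the original version tends to accumulate the majority of references over time.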
PageRank seems to be the first and most important factor. But you're right with your suggestions; there are some more issues taken into consideration.
I doubt very much that search engines would use the timestamps provided on a website, as these are easily faked. Almost certainly a search engine would count the first copy it spiders as the original, although it's difficult for a spider to figure out the original if it finds two copies in a short time frame. Thus, by my reckoning, the only other (almost) reliable indicator a spider can use is which copy has the most links, as the original version is more likely to have gathered the majority of references.
I have some experience with that. Once I wrote a unique article in some unknown article directory and left it there for a while. It got cached within a few weeks. When I searched for it in Google, I saw somebody else had also published it - there were two results total. Then I copy-pasted it into an old Blogspot blog. And as you will probably guess, the Blogspot copy took the lead in just a few weeks. So trust is involved for sure.
Hmm, that's not too conclusive though. Remember, Blogspot is owned by Google, so they might give their own pages a bias. For instance, YouTube videos often rank higher than they should, and it's been reported that Google Knol pages are also doing better than they should.
My view is that there's a lot of stuff that goes into this, although I wouldn't use PR here; I'd use site reputation, which is actually the real PageRank - toolbar PR is just a ruse. In addition, the normal factors that would rank a site higher in a search anyway weigh in pretty heavily. Ken King Cobra Poker
All search engines are striving to create good user experiences for people who search using their services. All of them want to avoid duplicate results filling up the early spots on search result pages.