Hi DPers, what are some effective methods to beat the Google duplicate content filter for a directory like this: http://www.google.com/search?hl=en&q=site:www.directoryone.info&btnG=Google+Search ? I believe that in a couple of months all the pages will be dropped and there will be nothing left in the index, so I'm requesting suggestions on beating the dup filter somehow. Thanks in advance.
You cannot beat the dup detection because it is query specific and operates on parts of the text. In the final ranking phase Google looks for the query keywords in the resulting pages, extracts snippets of text that contain the keywords, and matches them against the snippets from the other pages in the SERPs. When there is dup content (based only on the snippets, not whole documents), Google keeps only the one page it considers most authoritative (oldest, highest PR, something like that).

Most directory submissions are dup text. The way to beat that is: have different categories that distribute the listings onto different pages (that way keywords that match different listings won't get filtered out), OR edit the directory listing descriptions yourself (that's a lot of work).
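Roughly, the idea looks something like this (just my own sketch of the concept in Python, not Google's actual code; the snippet window and similarity threshold are made-up numbers):

# Rough sketch of query-specific snippet dedup (my guess at the idea,
# not Google's real algorithm).
import re
from difflib import SequenceMatcher

def snippet(text, keyword, window=80):
    """Grab the chunk of text surrounding the first keyword hit."""
    m = re.search(re.escape(keyword), text, re.IGNORECASE)
    if not m:
        return text[:window]
    start = max(0, m.start() - window // 2)
    return text[start:start + window]

def filter_dupes(results, keyword, threshold=0.9):
    """Keep only the first (most 'authoritative') page per near-identical snippet."""
    kept, seen = [], []
    for page in results:          # results assumed already ordered by authority
        snip = snippet(page["text"], keyword)
        if any(SequenceMatcher(None, snip, s).ratio() > threshold for s in seen):
            continue              # looks like dup content for this query
        seen.append(snip)
        kept.append(page)
    return kept

The point is that two directory pages carrying the same listing description produce the same snippet for that listing's keywords, so only one of them survives for that query.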
Thanks nohaber, but I'm looking for other methods like inserting random RSS feeds, inserting some unique text, etc. I'm just not sure how effective they can be. Regards, Tuning
Nope, this is not a DMOZ copy, but looking at the site: command results I'm not seeing any descriptions for the indexed pages.
The same happened with one site of mine as well. I don't know why Google can't pick up the descriptions while all the tags are in place and everything seems to be working fine.
Yes, that's OK, but without a keyword it gives these blank pages, which shows that the blank pages exist in Google as well, and that's a bad thing to see.
I have had great success getting all pages of a directory indexed by means of RSS. The directory itself has about 5,000 categories and only 144 links are placed, so most pages used to have no content at all and therefore were not cached. What I did was use random RSS feeds, most of them related to the subject: all categories under the continent Africa get random RSS feeds about Africa, categories under Asia get random RSS feeds about Asia, etc. This way the on-page text is somewhat relevant, and now more than 6,500 pages are really cached, checking with the API. Normal results show 35,000 pages, but not all of them are cached according to the API. In short, random RSS feeds did wonders for my site! That, and unique titles without any keyword/description tags. I noticed your keyword/description tags are the same for all pages; better to leave them out entirely if you cannot make them unique.
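To give an idea of what I mean, here is a rough sketch in Python using the feedparser library (the feed URLs and the topic mapping are placeholders, not my real setup):

# Sketch: drop a few random, topic-related RSS items onto an otherwise
# empty category page. Feed URLs per topic are placeholders.
import random
import feedparser  # pip install feedparser

FEEDS_BY_TOPIC = {
    "africa": ["http://example.com/news/africa.rss"],   # placeholder URLs
    "asia":   ["http://example.com/news/asia.rss"],
}

def rss_filler(topic, items=5):
    """Return a small block of HTML with random items from a topic-related feed."""
    feed_urls = FEEDS_BY_TOPIC.get(topic, FEEDS_BY_TOPIC["africa"])
    feed = feedparser.parse(random.choice(feed_urls))
    entries = random.sample(feed.entries, min(items, len(feed.entries)))
    html = []
    for e in entries:
        html.append('<p><a href="%s">%s</a><br>%s</p>'
                    % (e.link, e.title, e.get("summary", "")))
    return "\n".join(html)

Swap in your own feeds and category mapping; the point is just that the filler text differs per page and roughly matches the topic.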
When you use site:www.domain.com without keywords, you only specify where to search, but don't supply a query. Google handles it strangely and returns weird results.
You can certainly beat dup content filters, but it's a case of whether you want the copy to make sense or not.
I couldn't agree more, and in fact it is essential to use this technique in a new directory if you are using AdSense. This is because your empty categories are in violation of the TOS, but if you add content dynamically, you are fine.
Does that mean changing all the category names from singular to plural? Anyway, I will add a dynamic RSS feed and see how it works. Thanks.
First, what I would do is create a rewrite module for the directory. Then create a description for every category and include that description in the page body and in the meta tags. That way you always have a description on the pages. I did it with my directory http://www.dirspace.com/; it's a lot of work but it really pays off. About 3,800 pages are indexed. Also put your <head></head> at the top of the HTML page, and then insert the table with the picture and the AdSense code.
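As an illustration only, here is a minimal sketch in Python (I don't know what your directory script is written in; the category table and the template are assumptions, not dirspace.com's code):

# Sketch: give every category page its own title and meta description
# instead of one shared keyword/description tag. The dict stands in
# for whatever database the directory script actually uses.
CATEGORY_DESCRIPTIONS = {
    "web-hosting": "Hand-edited directory of web hosting companies and reviews.",
    "travel":      "Travel agencies, guides and booking sites, sorted by region.",
}

PAGE_TEMPLATE = """<html>
<head>
<title>{title} Directory</title>
<meta name="description" content="{description}">
</head>
<body>
<!-- table with picture and AdSense code goes here, after the head -->
{listings}
</body>
</html>"""

def render_category(slug, listings_html):
    """Build a category page with a unique title and meta description."""
    desc = CATEGORY_DESCRIPTIONS.get(slug, "")
    title = slug.replace("-", " ").title()
    return PAGE_TEMPLATE.format(title=title, description=desc, listings=listings_html)

However the pages are generated, the point is one unique title and description per category, with the head section first in the output.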
Hey, just a question (off the top of my head): what is the PageRank of http://www.yourdomain.com/ (it looks to me like it's 0), and why isn't the lucky owner of this site doing something about it? How come such a popular site (it has inbound links all over the place) isn't being taken advantage of?