Duplicate Pages Found

Stupidav Guest

Messages:: 16

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#1

I have messed around with what I called a website for years, never really taking the time to learn or do it properly. I finally sat down and started paying attention and focusing on the details behind the pages. I have learned a lot lately and want to learn more. I recently ran a couple of different crawlers on my site to see what kinds of results they would come up with, and I am getting some results that I am questioning.

The main one that I am having trouble understanding is that several of the crawlers are saying most of my pages are duplicates. I have had some difficulty working out what the crawlers are looking for, so I can't tell what the problem is. It is only my guess, but something like Googlebot might be coming to the same conclusion. Can anyone help me with what the bots are looking for so that I can prevent the "duplicate pages found" errors? My page is located at www.stupidav.com

Stupidav, Jan 19, 2006 IP

classifieds Sopchoppy Flash

Messages:: 825

Likes Received:: 51

Best Answers:: 0

Trophy Points:: 150

#2

I didn't have time to look very deep but here's an example:

On this page: http://www.stupidav.com/cool_stuff.shtml

You have a collection of paragraphs that came from some other source such as:

"Spyware is often associated with software that displays advertisements (called adware) or software that tracks personal or sensitive information. That does not mean all software which provides ads or tracks your online activities is bad. For example, you might sign up for a free music service, but "pay" for the service by agreeing to receive targeted ads. If you understand the terms and agree to them, you may have decided that it is a fair tradeoff. You might also agree to let the company track your online activities to determine which ads to show you."

Doing an exact-phrase search in Google returns 231 other sites that contain that paragraph verbatim.

I suspect that many of your pages are suffering from the same problem.

Try adding / writing unique text. If you are going to include boilerplate affiliate content then you definitely need to rewrite it so that you don't look like all the other affiliates.
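If you want a rough way to compare two pieces of text yourself, here is a minimal Python sketch. It is only an illustration of one common idea (comparing overlapping runs of words, often called shingles); I am not claiming this is what Google or any particular crawler actually does. The function names, the 5-word window, and the "rewritten" sample sentence are my own assumptions.

```python
import re

def shingles(text, size=5):
    """Return the set of overlapping word sequences (shingles) of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat the whole thing as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(text_a, text_b, size=5):
    """Jaccard similarity of the two shingle sets: 0.0 (no overlap) to 1.0 (identical)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# First sample is the opening of the quoted paragraph; second is a made-up rewrite.
copied = "Spyware is often associated with software that displays advertisements"
rewritten = "Many programs that show ads on your screen get labeled as spyware"

print(similarity(copied, copied))     # same text compared with itself
print(similarity(copied, rewritten))  # reworded text compared with the original
```

A score near 1.0 means the two texts share most of their word runs; a score near 0.0 means they share few or none.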

Good luck!

-jay

classifieds, Jan 20, 2006 IP

Sly Well-Known Member

Messages:: 26

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 116

#3

Which crawlers did you use that identified the duplicate content?
(I would like to do a similar check on my own site)

Sly, Jan 20, 2006 IP

Stupidav Guest

Messages:: 16

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#4

For starters, the GSite Crawler, but I was under the impression that it only scans my site and doesn't look any further. Thanks to my friend for the writeup on the spyware stuff; I will rewrite that right away. Sorry to all for that one. Other than the News, I know all of it is unique, because I wrote the rest of it.

Stupidav, Jan 20, 2006 IP

classifieds Sopchoppy Flash

Messages:: 825

Likes Received:: 51

Best Answers:: 0

Trophy Points:: 150

#5

Stupidav (do you have a name? I hate calling you stupid),

I looked at a few more of your pages and did searches in Google and Yahoo on randomly selected paragraphs and got no matches.

Did the crawlers you used tell you what pages were dups and where the dups are located?

btw you have no backlinks to the site. You should spend a little time in the directory forum here and find directories to submit to.

-jay

classifieds, Jan 21, 2006 IP

Stupidav Guest

Messages:: 16

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#6

Sorry I haven't gotten back sooner, I've been out of town.

My name is Dave, although I answer to Stupid, Stupidav, or all of the above.

Since I last posted I redid the site to put SSI to work, which saves time when updating the links. I broke up the pages, separating the header and footer as well as the headlines and links into their own files, and used the include virtual directive to put everything together. This has eliminated the duplicate pages issue.

Doing this leaves me with another question, which I might need to post separately. Now that the pages hold only the actual content, and the header, footer, headlines, and links are separate, when a crawler looks at a page does it see the un-assembled version without the server-side includes, or does it see the assembled version?

Thank you for the tip Jay. I have been trying to get it linked to more and more as I have time, although I haven't focused on that 100% yet, hoping to get the site right first.

Stupidav, Jan 29, 2006 IP

classifieds Sopchoppy Flash

Messages:: 825

Likes Received:: 51

Best Answers:: 0

Trophy Points:: 150

#7

Dave,

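The crawler will see the fully assembled version - just like an end user's browser (minus any JavaScript or CSS formatting). The server processes the includes before it sends the page, so nothing that requests the page ever sees the separate pieces.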
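To make "assembled version" concrete, here is a small Python sketch that imitates what the server does with include virtual directives before a page goes out. It is a toy model, not how Apache is implemented: the file names and contents in FILES are made up for illustration, and real SSI supports more directives than this handles.

```python
import re

# Hypothetical file contents standing in for the files on the server.
FILES = {
    "/header.shtml": "<h1>My Site</h1>",
    "/footer.shtml": "<p>Footer links</p>",
    "/page.shtml": (
        '<!--#include virtual="/header.shtml" -->\n'
        "<p>Unique page content.</p>\n"
        '<!--#include virtual="/footer.shtml" -->'
    ),
}

# Matches a directive such as <!--#include virtual="/header.shtml" -->
INCLUDE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')

def assemble(path, files=FILES, depth=0):
    """Replace each include directive with the (recursively assembled) target file."""
    if depth > 10:
        raise RecursionError("include nesting too deep")
    return INCLUDE.sub(lambda m: assemble(m.group(1), files, depth + 1), files[path])

# This assembled text is what a browser or a crawler would receive.
print(assemble("/page.shtml"))
```

The printed result contains the header, the content, and the footer in one document, with no include directives left in it.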

-jay

classifieds, Jan 30, 2006 IP

Stupidav Guest

Messages:: 16

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#8

Thanks Jay,

I just kind of wonder why it was showing that before. Oh well, as long as it is not showing it now, I guess I might be OK.

Dave

Stupidav, Jan 30, 2006 IP

classifieds Sopchoppy Flash

Messages:: 825

Likes Received:: 51

Best Answers:: 0

Trophy Points:: 150

#9

Dave,

I left this out.

Here are a few tools you can use to verify what the spiders see:

http://www.webconfs.com/search-engine-spider-simulator.php

http://www.delorie.com/web/ses.cgi
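If you would rather check a page locally, the rough idea behind a spider simulator can be sketched in a few lines of Python using only the standard library. This is my own simplified approximation, not how either of the tools above works internally; the class name, function name, and sample HTML are made up for illustration.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # greater than 0 while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())

def spider_view(html):
    """Return the page's text content with markup, scripts, and styles removed."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

# Made-up sample page.
sample = """<html><head><title>Cool Stuff</title>
<style>p { color: red; }</style></head>
<body><script>document.write("hi");</script>
<p>Unique page content.</p></body></html>"""

print(spider_view(sample))
```

Feed it the HTML of a page as served (after the includes have been assembled) and you get the plain text that is left once the markup, JavaScript, and CSS are stripped away.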

classifieds, Jan 30, 2006 IP
