Duplicate articles on blog

Discussion in 'Search Engine Optimization' started by shahab6, Apr 26, 2008.

  1. #1
    shahab6, Apr 26, 2008 IP
  2. astup1didiot

    astup1didiot Notable Member

    Messages:
    5,926
    Likes Received:
    270
    Best Answers:
    0
    Trophy Points:
    280
    #2
    No, your site won't be penalized, since it's all on the same domain name. However, the duplicated pages probably won't be indexed, and if they are, they'll never show up in search results. Just block your blog via the robots.txt file.
     
    astup1didiot, Apr 26, 2008 IP
  3. shahab6

    shahab6 Well-Known Member

    Messages:
    2,351
    Likes Received:
    28
    Best Answers:
    0
    Trophy Points:
    138
    #3
    How do I block it?
     
    shahab6, Apr 26, 2008 IP
  4. astup1didiot

    astup1didiot Notable Member

    #4
    Do you have a robots.txt file? What are the actual URLs of your site and the blog?
     
    astup1didiot, Apr 26, 2008 IP
  5. bs1

    bs1 Peon

    Messages:
    45
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Yes, duplicate blog content does dilute your PR (PageRank). It's a big problem. Here's my WordPress 2.5 robots.txt, which I use to block my duplicate content. Works like a charm.

    Just copy the entire contents of the file below into a blank text file called robots.txt and upload it to the root directory (the main directory) of your site:


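    # Rules for all crawlers
    User-agent: *
    # Server and WordPress system directories
    Disallow: /cgi-bin/
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-content/
    # Trackbacks, feeds, and archive-style listings that repeat post content
    Disallow: /trackback/
    Disallow: /feed/
    Disallow: /tag/
    Disallow: /author
    Disallow: /comments/
    Disallow: /category/*/*
    Disallow: /trackback
    Disallow: /*trackback
    Disallow: /*trackback*
    Disallow: /*/trackback
    # Any URL containing a query string
    Disallow: /*?*
    # URLs ending in these suffixes ($ anchors the end of the URL)
    Disallow: /*.html/$
    Disallow: /*.php$
    Disallow: /*.js$
    Disallow: /*.inc$
    Disallow: /*.css$
    Disallow: /*feed*
    # Registration and login pages
    Disallow: /wp-register.php
    Disallow: /wp-login.php
    # Date-based paths and stats
    Disallow: /2007/
    Disallow: /2008/
    Disallow: /stats/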

    # Google Image
    User-agent: Googlebot-Image
    Disallow:
    Allow: /*

    # Google AdSense
    User-agent: Mediapartners-Google
    Disallow:
    Allow: /*

    # Internet Archive Wayback Machine
    User-agent: ia_archiver
    Disallow: /

    # digg mirror
    User-agent: duggmirror
    Disallow: /
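For anyone wondering what the wildcard rules in a file like this actually catch, here is a rough Python sketch. It assumes the Google-style reading of patterns, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names `pattern_to_regex` and `is_blocked` are made up for illustration, and the sketch ignores `Allow` lines and rule precedence, so treat it as a way to reason about Disallow patterns rather than as a real crawler.

```python
import re


def pattern_to_regex(pattern):
    """Translate one Disallow pattern into a compiled regex (illustrative helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any pattern matches the path from its first character."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)


# A few of the patterns from the file above.
patterns = ["/*?*", "/*.php$", "/2008/"]

checks = {
    "/?p=12": is_blocked("/?p=12", patterns),
    "/index.php": is_blocked("/index.php", patterns),
    "/about/": is_blocked("/about/", patterns),
    "/2008/04/some-post/": is_blocked("/2008/04/some-post/", patterns),
}
for path, blocked in checks.items():
    print(path, "blocked" if blocked else "allowed")
```

Note that under this reading `/2008/` also matches any post whose permalink starts with the year, so if your permalinks are date-based, those rules would cover the posts themselves and not just the archive pages.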
     
    bs1, Apr 26, 2008 IP
  6. shahab6

    shahab6 Well-Known Member

    #6
    shahab6, Apr 26, 2008 IP
  7. astup1didiot

    astup1didiot Notable Member

    #7
    http://www.bestcreditrates.net/robots.txt gives me an error, and that is where the robots.txt file should be located. You want to block the /b2evolution/ directory:

    
    Disallow: /b2evolution/
    
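If you want to sanity-check a rule like this before uploading, Python's standard `urllib.robotparser` can parse the text locally. This is only a sketch: it assumes the Disallow line sits under a `User-agent: *` group, and the `/b2evolution/index.php` path is a made-up example page, not something taken from the real site.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

SITE = "http://www.bestcreditrates.net/"

# robots.txt always lives at the root of the host, whatever page you start from.
ROBOTS_URL = urljoin(SITE, "/robots.txt")

# Assumed file contents: the Disallow rule under a catch-all user-agent group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /b2evolution/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical blog page under the blocked directory.
blog_allowed = parser.can_fetch("*", urljoin(SITE, "/b2evolution/index.php"))
# The home page is outside the blocked directory.
home_allowed = parser.can_fetch("*", SITE)

print(ROBOTS_URL)
print("blog crawlable:", blog_allowed)
print("home crawlable:", home_allowed)
```

Keep in mind that `urllib.robotparser` does plain prefix matching, so it is fine for a simple directory rule like this one but will not interpret `*` or `$` wildcards the way the big search engines do.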
     
    astup1didiot, Apr 26, 2008 IP
  8. shahab6

    shahab6 Well-Known Member

    #8
    shahab6, Apr 26, 2008 IP
  9. astup1didiot

    astup1didiot Notable Member

    #9

    
    User-agent: *
    Disallow: /b2evolution/
    Sitemap: <absolute URL of XML Sitemap>
    
    Remove the Sitemap line if you don't have an XML sitemap. Actually, leave it in instead: create an XML sitemap and upload it to the same directory as the robots.txt. See my XML Sitemap FAQ that's stickied in the sitemap sub-forum on digitalpoint.
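If you don't have a sitemap yet, a bare-bones one can be generated with a few lines of Python. This is a minimal sketch under assumptions: the `build_sitemap` helper is made up for illustration, the page list contains only the home page, and it emits just the required `<loc>` element per URL as defined by the sitemaps.org protocol. Save the result as, for example, `sitemap.xml` (an assumed filename) next to robots.txt and point the Sitemap line at its full URL.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap string listing the given absolute URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Assumed page list; replace with the pages you actually want indexed.
pages = ["http://www.bestcreditrates.net/"]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```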
     
    astup1didiot, Apr 26, 2008 IP