Moving directory content - duplicate content?

Discussion in 'Search Engine Optimization' started by fadetoblack22, Jun 23, 2009.

  1. #1
    I am going to be moving some files from one directory to another on my site.

    The old directory will be deleted, but will there be a problem with duplicate content if the old files are still indexed when the new ones get indexed?

    thanks.
     
    fadetoblack22, Jun 23, 2009
  2. freelistfool
    #2
    You should 301 redirect all the files in the old directory to the new directory to avoid duplicate content. If you're moving every file in the directory, a single RewriteRule in your .htaccess file will do it.
     
    freelistfool, Jun 23, 2009
  3. technomart
    #3
    You should 301 redirect all the files in the old directory to the new directory to avoid duplicate content.
     
    technomart, Jun 23, 2009
  4. hans2100
    #4
    roger that.
     
    hans2100, Jun 23, 2009
  5. ravont
    #5
    If the old files are still indexed in Google, you should not delete them from the old directory yet. Take advantage of the pages that are already indexed: use 301 redirects so that you don't lose your links.
     
    ravont, Jun 23, 2009
  6. fadetoblack22
    #6
    I have all the files in the new directory and they are more recent.

    How would I take advantage of the links to the old directory?
     
    fadetoblack22, Jun 26, 2009
  7. karpok
    #7
    Yes, all the new pages will be detected as duplicate content, because Google caches and stores a copy of the entire web. To avoid this problem, you would have to remove all the old pages using WMT (Webmaster Tools).

    Otherwise, you can use redirection, as freelistfool mentioned. With this technique you don't lose the link juice of your pages.

    All the best...
     
    karpok, Jun 26, 2009
  8. Canonical
    #8
    If you're running on an Apache web server, use Mod Rewrite to 301 redirect all old URLs to the corresponding new URLs. It's quite simple to implement redirects in Mod Rewrite if you are familiar with regular expressions. IMO this is much better than implementing 301 redirects w/ server side scripting by modifying the pages at the old URLs to perform the 301 redirect.

    The following will explain how to 301 redirect files using Mod Rewrite and .htaccess files.


    Moving Files from /folder1/subfolder/ to /folder2/ but Leaving Page Names the Same:

    If the pages used to live at:

    http://www.example.com/folder1/subfolder/page1.html
    http://www.example.com/folder1/subfolder/page2.html
    http://www.example.com/folder1/subfolder/page3.html
    http://www.example.com/folder1/subfolder/page4.html

    and now they live at:

    http://www.example.com/folder2/page1.html
    http://www.example.com/folder2/page2.html
    http://www.example.com/folder2/page3.html
    http://www.example.com/folder2/page4.html

    respectively then you could place a .htaccess file at /folder1/subfolder/.htaccess within your web containing a rule similar to the following:

    RewriteRule (.*) http://www.example.com/folder2/$1 [R=301,L]


    Moving Files from /folder1/subfolder/ to /folder2/ and Changing the Page Names:

    If the pages used to live at:

    http://www.example.com/folder1/subfolder/page1.html
    http://www.example.com/folder1/subfolder/page2.html
    http://www.example.com/folder1/subfolder/page3.html
    http://www.example.com/folder1/subfolder/page4.html

    and now they live at:

    http://www.example.com/folder2/page5.html
    http://www.example.com/folder2/page6.html
    http://www.example.com/folder2/page7.html
    http://www.example.com/folder2/page8.html

    respectively then you could place a .htaccess file at /folder1/subfolder/.htaccess within your web containing a set of rules similar to the following:

    RewriteCond $1 ^page1\.html$
    RewriteRule (.*) http://www.example.com/folder2/page5.html [R=301,L]

    RewriteCond $1 ^page2\.html$
    RewriteRule (.*) http://www.example.com/folder2/page6.html [R=301,L]

    RewriteCond $1 ^page3\.html$
    RewriteRule (.*) http://www.example.com/folder2/page7.html [R=301,L]

    RewriteCond $1 ^page4\.html$
    RewriteRule (.*) http://www.example.com/folder2/page8.html [R=301,L]


    NOTE: Once the .htaccess file is in place at /folder1/subfolder/.htaccess and is successfully redirecting requests for the old URLs to the new URLs, you can delete the pages from the old folder. You simply need to leave the .htaccess file there. Mod Rewrite will see the requests and issue the redirect, preventing the web server from ever even attempting to render the pages at the old URLs, so they are no longer needed.
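
    A variation on the above, in case the old folder is going to be removed completely (which would take its .htaccess file with it): the same redirect can be written in the .htaccess file at the root of the web instead. This is only a sketch under a few assumptions: the root .htaccess is honored for these URLs, and the rule is placed above any other rewrite rules already in that file (a WordPress install, for instance, usually has its own block there).

    ```apache
    # /.htaccess at the root of the web (assumed location)
    # Has no effect if rewriting is already switched on; required otherwise
    RewriteEngine On

    # In a root-level .htaccess the pattern is matched against the path
    # without its leading slash, so the old folder becomes part of the pattern
    RewriteRule ^folder1/subfolder/(.*)$ http://www.example.com/folder2/$1 [R=301,L]
    ```

    With this in place, /folder1/subfolder/page1.html should redirect to /folder2/page1.html even after /folder1/subfolder/ itself has been deleted.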
     
    Canonical, Jun 26, 2009
  9. fadetoblack22
    #9
    Thanks for the advice. Do I have to redirect every page?

    It's a WordPress install, so there are a lot of pages/permalinks.

    Is it possible to make it so when e.g. folder1 is reached it redirects to folder2 to find the same page but in the new folder?

    thanks.
     
    fadetoblack22, Jun 26, 2009
  10. Canonical
    #10
    If the page names are exactly the same and the pages simply live in a new folder, then the first example I gave above will work. One rule will redirect all pages. If, however, you changed the names of the pages AND changed folders, then you have to do a page-by-page redirect.

    The reason you should 301 redirect is so that the new pages will get credit for all inbound links that your old pages have acquired and so that other sites that have links to those old pages don't end up with broken links on their sites.

    Why don't you post examples of some of the old and new URLs and I'll try to help you write the rules.
     
    Canonical, Jun 26, 2009
  11. fadetoblack22
    #11
    Thanks for the help.

    I didn't post the examples because I didn't want the new one to get indexed before I did the redirect.

    Basically the old directory is: http://www.bigfreebet.com/BetAdvice/

    and the new one has /previews/ instead of /BetAdvice

    thanks.
     
    fadetoblack22, Jun 27, 2009
  12. Canonical
    #12
    Then the examples that I gave above should work. Create a simple text file named .htaccess and place it at /betadvice/.htaccess. If the page names in /previews remain the same as they were in /betadvice (i.e. were only moved to a new folder, not renamed) then your .htaccess file would need the following:
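
    The rule itself appears to be missing from this post. Going by the first example in post #8, with /folder2/ replaced by /previews/, it was presumably along these lines (the RewriteEngine line is an addition, in case rewriting is not already switched on for the site):

    ```apache
    # /betadvice/.htaccess
    RewriteEngine On
    RewriteRule (.*) http://www.example.com/previews/$1 [R=301,L]
    ```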

    Just be sure to replace www.example.com in the RewriteRule above with www.bigfreebet.com.

    NOTE: The above will effectively move EVERYTHING that lives beneath /betadvice to the same place under /previews. This includes not only the pages in /betadvice but also any that might live in subfolders of /betadvice like /betadvice/somefolder/page.html


    The RewriteRule takes the general form:
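
    The line showing the general form seems to have been dropped from the post. In the Apache documentation's terms, and matching the names used in the rest of this explanation, it is:

    ```apache
    RewriteRule pattern substitution [flags]
    ```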

    and works like this:

    (.*) is a regular expression matching pattern. The '.' says match on any single character. Having the '*' after the period says match on zero or more of the things that the '.' matches on. In other words, the pattern will match any string, including an empty string. Any time a pattern match occurs, the text matched inside the parentheses is assigned to a variable. In this case $1 contains the text that matched.

    Some examples of the types of rewrites that will be handled by the RewriteRule above are:

    /betadvice/ -->301--> /previews/
    /betadvice/page1.html -->301--> /previews/page1.html
    /betadvice/folder/page2.html -->301--> /previews/folder/page2.html

    If you not only moved the pages but also changed their names then you'll probably need something like the following in the /betadvice/.htaccess file:
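
    The rules are missing from the post at this point too. Adapting the second example in post #8 to the /previews/ folder (the new names page5.html and page6.html are just the placeholder names from that example), they would look something like:

    ```apache
    # /betadvice/.htaccess
    # RewriteEngine On is an addition here; it is harmless if already enabled
    RewriteEngine On

    RewriteCond $1 ^page1\.html$
    RewriteRule (.*) http://www.example.com/previews/page5.html [R=301,L]

    RewriteCond $1 ^page2\.html$
    RewriteRule (.*) http://www.example.com/previews/page6.html [R=301,L]

    # ...and so on, one RewriteCond/RewriteRule pair per renamed page
    ```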

    When you throw in RewriteCond directives, how Mod Rewrite evaluates the rule is a little different. For example, in the case of:
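
    The example referred to here did not survive either. Judging by the walkthrough further down, which redirects page1.html to /previews/page5.html, it was most likely the first pair of rules:

    ```apache
    RewriteCond $1 ^page1\.html$
    RewriteRule (.*) http://www.example.com/previews/page5.html [R=301,L]
    ```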

    Mod Rewrite skips the RewriteCond and goes to the RewriteRule first. It first looks to see if the pattern in the rule actually gets a match. Since we used (.*) which is basically a wildcard as our pattern, it will always match so the $1 variable is assigned whatever value it matched on in the URL (i.e. everything after the /betadvice/).

    If the RewriteRule pattern got a match it then jumps up and evaluates the RewriteCond. The RewriteCond has the general form:
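
    The general form is missing here as well. Using the same names as the explanation that follows (teststring, condpattern), it is:

    ```apache
    RewriteCond teststring condpattern
    ```

    In the first RewriteCond from post #8, teststring is $1 and condpattern is ^page1\.html$.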

    Mod Rewrite applies the condpattern to the teststring to see if it can find a match. If the match fails, Mod Rewrite abandons the RewriteRule it's currently evaluating and drops through to evaluate the next RewriteRule in the file, if one exists. If the match succeeds, it goes back to the RewriteRule and substitutes the value of substitution for the requested URL.

    The flags [R=301,L] tell Mod Rewrite that if the RewriteRule succeeded and a substitution was done, then it should 301 redirect (R=301) to the current (substituted) URL, and that this is the last rule (L) that should be evaluated in the .htaccess file. So it redirects and stops evaluating subsequent rules.

    For example, if the requested URI was /betadvice/page1.html then the RewriteRule would have assigned the $1 variable the value "page1.html". Since the RewriteRule pattern matching succeeded, it jumps up to the RewriteCond directive and checks whether $1 matches "page1.html". I had to 'escape' the '.' in the condpattern with the preceding '\' because '.' is a special character in regular expressions that means any character, and I want it to match the literal character '.', not any character. Since the teststring ($1) does match the condpattern ("page1.html"), the RewriteCond matching succeeds and Mod Rewrite goes back to the RewriteRule, substitutes http://www.example.com/previews/page5.html for the requested URL, performs a 301 redirect to that new URL and abandons processing of the rest of the .htaccess file because of the 'L' flag.

    You can find out more about Mod Rewrite on the Apache.org site. This old version of the Mod Rewrite documentation has a good picture of how the RewriteCond directives and RewriteRule are evaluated in the Ruleset Processing section about 20% of the way down the page.

    Hope that helps.
     
    Canonical, Jun 27, 2009
  13. fadetoblack22
    #13
    Thanks, the first bit worked perfect as all the page names are kept the same.

    As you seem to know a lot about .htaccess: the site that I didn't want to get indexed has got indexed. It is a test site and I don't want any of the pages to get indexed.

    Is there a way to stop all of them getting indexed through .htaccess? None of the pages have a robots file.

    thanks.
     
    fadetoblack22, Jun 27, 2009
  14. Canonical
    #14
    You can prevent those pages from being indexed by:

    1) adding a <meta name="robots" content="noindex"> element to the <head> element in the HTML of the /previews/* pages, or
    2) adding the following to the robots.txt file in the root of your web:

    User-agent: *
    Disallow: /previews/
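
    Since the question was about .htaccess specifically: the same noindex instruction can also be sent as an HTTP response header instead of a meta element. This is a sketch that assumes mod_headers is enabled on the server and that the file goes in the folder you want kept out of the index (or the root of the test site):

    ```apache
    # .htaccess in the folder to keep out of the index (assumed location)
    <IfModule mod_headers.c>
        # Ask search engines not to index anything served from this folder
        Header set X-Robots-Tag "noindex"
    </IfModule>
    ```

    Note that a crawler has to be able to fetch a page in order to see this header (or the meta element), so it should not be combined with a robots.txt Disallow for the same URLs.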

    If you actually put the .htaccess file in place, Google probably visited your site for a scheduled crawl of your old pages (or followed a link from a site that linked to your old pages), discovered the 301s, and indexed the new URLs.

    You can go to each of the engines and request a URL removal if you think those pages are being shown in the SERPs.
     
    Canonical, Jun 28, 2009
  15. kbeus21
    #15
    kbeus21, Jun 28, 2009
  16. Canonical
    #16
    Google treats duplicate content within the same site MUCH differently than duplicate content that shows up on other sites. If Google discovered the /previews/* pages because of the .htaccess file in the /betadvice folder then there is NO WAY this would be considered duplicate content because of the 301 redirects. 301 redirects have long been the search engines' preferred solution for fixing canonical issues/duplicate content issues within the same site. Matt Cutts and others have been preaching this forever.
     
    Canonical, Jun 28, 2009
  17. fadetoblack22
    #17
    Thanks for the help, I have used what you said.
     
    fadetoblack22, Jun 28, 2009
  18. ericajoieake
    #18
    The only solution for that is a 301 redirect, so you can keep your backlinks and also your returning visitors.
     
    ericajoieake, Jun 28, 2009