Robots.txt, duplicate content and Google

Discussion in 'Google' started by fsmobilez, Sep 23, 2008.

  1. #1
    Hi,

    Some pages of my site have been crawled by Google, and I have only just found out about it.

    I was told to block those pages with robots.txt, and I did. But two weeks later all of the blocked pages still appear in Google, although their cached copies are no longer being updated.

    I want to remove them completely. Note that I can't use noindex on the site, because it is a dynamic site and all pages share one header.

    My first question: will robots.txt help remove pages that Google has already crawled?

    My second question: say I have these two URLs in Google:

    www.example.com/demo/jokes_category.php?cat_id=78

    www.example.com/demo/jokes_category.php?cat_id=78&jtype=1

    The second one is still in Google but is blocked by robots.txt.

    Will Google still treat it as duplicate content, or will it just ignore the second URL?

    Please give me an exact answer.
     
    fsmobilez, Sep 23, 2008 IP
  2. IEmailer.com

    #2
    Besides blocking access with the robots.txt file, try using the NOINDEX, NOFOLLOW and NOARCHIVE meta tags on those pages.
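
    A shared header does not have to mean a site-wide tag, since the header can decide per request whether to print it. Below is a rough sketch of that decision in Python (the site in question is PHP, so only the logic carries over). The names `DUPLICATE_PARAMS` and `robots_meta_tag` are made up for illustration, and treating `jtype` as the marker of a duplicate page is an assumption based on the example URLs in the first post. Keep in mind that a crawler has to fetch a page to see its meta tags, so the tag would likely go unnoticed for as long as robots.txt blocks the URL.

```python
from urllib.parse import parse_qs

# Hypothetical rule: a request carrying any of these parameters is a duplicate view.
DUPLICATE_PARAMS = {"jtype"}


def robots_meta_tag(query_string):
    """Return the robots meta tag for the shared header, or "" for normal pages."""
    params = parse_qs(query_string, keep_blank_values=True)
    if DUPLICATE_PARAMS.intersection(params):
        return '<meta name="robots" content="noindex, nofollow, noarchive">'
    return ""


# The shared header would print the returned string inside <head>.
print(robots_meta_tag("cat_id=78&jtype=1"))
print(repr(robots_meta_tag("cat_id=78")))
```

    With this approach the main category page stays indexable and only the parameterised variant carries the tag.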

    If both URLs show the same content and both are indexed and cached in Google, they will be considered duplicate content, and one of them will be ignored.

    Try redirecting the unwanted URLs to the one you want to show in the SERPs, either at the source-code level by handling the parameters, or with 301 redirects in .htaccess.
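
    To illustrate the "handling the parameters" route, here is a minimal Python sketch that works out where a duplicate URL should 301 to. It assumes `jtype` never changes what the page shows, which is only a guess from the example URLs; `REDUNDANT_PARAMS`, `canonical_url` and `redirect_target` are hypothetical names, and the real script would do the same thing in PHP before sending any output.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical: query parameters assumed not to change the page content.
REDUNDANT_PARAMS = {"jtype"}


def canonical_url(url):
    """Return the URL with the redundant query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in REDUNDANT_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def redirect_target(url):
    """Return the 301 target for a duplicate URL, or None if it is already canonical."""
    target = canonical_url(url)
    return target if target != url else None


print(redirect_target("http://www.example.com/demo/jokes_category.php?cat_id=78&jtype=1"))
print(redirect_target("http://www.example.com/demo/jokes_category.php?cat_id=78"))
```

    When `redirect_target` returns a URL, the page would answer with status 301 and a `Location` header pointing at it; when it returns `None`, the page is served as usual.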

    Hope that was clear and helpful. Rep is much appreciated ;)
     
    IEmailer.com, Sep 23, 2008 IP
  3. magda

    #3
    There's a place in Webmaster Tools where you can ask to have URLs removed.
     
    magda, Sep 23, 2008 IP
  4. hardevzala

    #4
    Use Webmaster Tools.
     
    hardevzala, Sep 23, 2008 IP
  5. fsmobilez

    #5
    >> Besides blocking access with the robots.txt file, try using the NOINDEX, NOFOLLOW and NOARCHIVE meta tags on those pages.

    As I mentioned before, it is a dynamic, script-based site that uses one header for all pages, so if I add "noindex, nofollow" it will be added to the whole site, which could cause problems.

    >> If both URLs show the same content and both are indexed and cached in Google

    Yes, they have the same content.

    >> they will be considered duplicate content, and one of them will be ignored.

    I have blocked the duplicate URLs with robots.txt. Google is showing the one I asked it to show in its search results, and yes, it is ignoring the blocked one, but the blocked one still appears if I search for my site only, e.g.

    site:example.com

    What I want to know is this: if Google is ignoring the second URL but it still appears in Google even though it is blocked by robots.txt, will Google penalize my site in the future?

    >> There's a place in Webmaster Tools where you can ask to have URLs removed.

    There are almost 8,000 URLs that I want to remove. How can I remove them all? I also tried to remove one of them, but Google denied the request because I can't make those files return a 404.

    Please clarify this for me and reply.
     
    fsmobilez, Sep 23, 2008 IP