After changing robots.txt, my page disappeared from Google

Discussion in 'Google' started by yaseen4u, May 10, 2009.

  1. #1
    Hi all,
    I just changed my site's robots.txt from this:
    # Google - Most Important bot
    #   Unfortunately a robots.txt will only stop it crawling certain urls, and NOT adding any
    #   urls which it comes across into its index. So we're relying on a meta noindex tag.
    User-agent: Googlebot
    # Don't index mobile versions
    Disallow: /forum/index.php?*;wap
    Disallow: /forum/index.php?*;wap2
    Disallow: /forum/index.php?*;imode
    
    User-agent: ia_archiver
    Disallow: /
    
    # Yahoo - Too aggressive
    #   So limit it as much as possible.
    User-agent: Slurp
    # Disallow Everything
    # Now allow bits and then disallow bits
    Allow: /forum/sitemap.xml$
    Allow: /forum/robots.txt$
    Allow: /forum/index.php$
    Allow: /forum/index.php?topic=*.0$
    Allow: /forum/index.php?topic=*.*0$
    Allow: /forum/index.php?topic=*.*5$
    Allow: /forum/index.php?board=*.0$
    Allow: /forum/index.php?board=*.*0$
    Allow: /forum/index.php?board=*.*5$
    # But don't allow these
    Disallow: /forum/index.php?*.msg
    Disallow: /forum/index.php?topic=*.msg*0$
    Disallow: /forum/index.php?topic=*.msg*5$
    Disallow: /forum/index.php?*.new
    # Anything with a ; disallow
    Disallow: /forum/index.php?*;*
    
    # Bad bot - Often ignores robots.txt - Waste of bandwidth
    #   Despite claiming on their website to be a search engine in development
    #   I'm suspicious as to whether they are a harvester pretending to be SE
    User-agent: Twiceler
    Disallow: /
    
    User-Agent: W3C-checklink
    Disallow: /
    
    User-agent: TurnitinBot
    Disallow: /
    
    # Stop following PHPSESSID's
    User-Agent: MJ12bot
    Disallow: /forum/index.php?PHPSESSID
    
    # Catch all (remainder)
    #   Will be followed by any bots other than ones identified above
    #   Uses BASIC robots.txt directives without wildcards, end-anchors etc
    #   So Spiders should understand these (including MSNBOT)
    User-agent: *
    # Default SMF Folders
    Disallow: /forum/attachments/
    Disallow: /forum/Packages/
    Disallow: /forum/Smileys/
    Disallow: /forum/Sources/
    Disallow: /forum/Themes/
    # Default SMF Actions
    Disallow: /forum/index.php?action=activate
    Disallow: /forum/index.php?action=admin
    Disallow: /forum/index.php?action=calendar
    Disallow: /forum/index.php?action=emailuser
    Disallow: /forum/index.php?action=findmember
    Disallow: /forum/index.php?action=help
    Disallow: /forum/index.php?action=helpadmin
    Disallow: /forum/index.php?action=login
    Disallow: /forum/index.php?action=logout
    Disallow: /forum/index.php?action=mlist
    Disallow: /forum/index.php?action=modifykarma
    Disallow: /forum/index.php?action=pm
    Disallow: /forum/index.php?action=post
    Disallow: /forum/index.php?action=printpage
    Disallow: /forum/index.php?action=profile
    Disallow: /forum/index.php?action=recent
    Disallow: /forum/index.php?action=register
    Disallow: /forum/index.php?action=reminder
    Disallow: /forum/index.php?action=search
    Disallow: /forum/index.php?action=theme
    Disallow: /forum/index.php?action=unread
    Disallow: /forum/index.php?action=unreadreplies
    Disallow: /forum/index.php?action=verificationcode
    Disallow: /forum/index.php?action=who
    Disallow: /forum/index.php?theme
    Disallow: /forum/index.php?action=stats;expand
    Disallow: /forum/index.php?action=stats;collapse
    
    Code (markup):
    to

    
    User-agent: *
    Disallow: /
    
    Code (markup):
    After this change, my page suddenly disappeared from Google.

    I made the change because I wanted the site indexed not only in Google but also in Yahoo and MSN.

    Can anyone help me get my page indexed in Yahoo and MSN?

    I have since changed back to the previous one.

    Anyway, please give me a robots.txt that lets MSN, Google and Yahoo index the site.
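A side note on the wildcard rules in the original file above: as far as I know, Google documents `*` as "any run of characters" and a trailing `$` as "end of URL", with every rule otherwise treated as a prefix match. The sketch below turns such a rule into a regular expression so individual URLs can be checked by hand. The helper names `rule_to_regex` and `rule_matches` are made up for this illustration, and other crawlers may not interpret wildcards the same way.

```python
import re


def rule_to_regex(rule):
    """Translate a wildcard robots.txt path rule into a compiled regex.

    Assumed semantics: '*' matches any characters, a trailing '$' anchors
    the end of the URL, and everything else is a literal prefix match.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with '.*' for each '*'.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def rule_matches(rule, url_path):
    """Return True if the rule applies to the given path and query string."""
    return rule_to_regex(rule).match(url_path) is not None


# Rules copied from the original file; the URLs are invented examples.
print(rule_matches("/forum/index.php?*;wap", "/forum/index.php?topic=1.0;wap"))
print(rule_matches("/forum/index.php?*;wap", "/forum/index.php?topic=1.0"))
print(rule_matches("/forum/index.php?topic=*.0$", "/forum/index.php?topic=12.0"))
```

This only tests whether a single rule matches. Deciding between competing Allow and Disallow lines is a separate step and is not covered here.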
     
    yaseen4u, May 10, 2009 IP
  2. digitalextrememediagroup

    #2
    I hate robots.txt. You can just remove pages and directories in Google Webmaster Tools.
     
  3. .htaccess

    #3
    User-agent: *
    Disallow: /

    User-agent: * applies the rule to every bot, and Disallow: / blocks everything from the root folder down, i.e. the whole site!

    Now that you have removed that robots.txt and restored the original, just resubmit your links to Yahoo and MSN:

    https://siteexplorer.search.yahoo.com/
    and
    http://search.live.com/docs/submit.aspx
    and
    http://www.google.com/submityourcontent/index.html
    Code (markup):
    There is no need to make any changes to your original robots.txt file.
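    Always keep a robots.txt file to steer well-behaved bots away from the parts of your site you don't want crawled. It is advisory only, so it will not stop hackers or spam bots by itself.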
     
    .htaccess, May 11, 2009 IP
  4. SearchBliss

    #4
    Ouch, you shot yourself in the foot! List only the directories and files you want to Disallow, not your entire site. Then do what .htaccess said and resubmit through Google Webmaster Tools.
     
    SearchBliss, May 11, 2009 IP
  5. angilina

    #5
    If you want all search bots to crawl everything, then you need to use this code:

    User-agent: *
    Disallow:

    On the other hand, the code you used, which is:

    User-agent: *
    Disallow: /

    is used to stop all search bots, including Google, Yahoo and MSN, from crawling any page on your site.
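To see the difference between the two files without waiting for a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper `can_crawl`, the constants and the `example.com` URL are assumptions made for this illustration. Real crawlers may also treat wildcard rules differently from this parser, so take it as a sanity check for simple rules only.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An empty Disallow value blocks nothing, so every bot may crawl.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# Disallow: / blocks every path for every bot.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical URL, used only for the check.
URL = "http://example.com/forum/index.php"

for agent in ("Googlebot", "Slurp", "msnbot"):
    print(agent, can_crawl(ALLOW_ALL, agent, URL), can_crawl(BLOCK_ALL, agent, URL))
```

With these two files, each bot should be reported as allowed under the first and blocked under the second.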
     
    angilina, May 11, 2009 IP
  6. jasonsc

    #6
    Now this is stupid. Why didn't you just remove robots.txt if you didn't know what you were doing? You don't NEED to have a robots.txt. Simply remove it and the bots will come back in a few days.
     
    jasonsc, May 11, 2009 IP
  7. mr_vampire

    #7
    Hey, you have put a restriction on all the search engine crawlers.
     
    mr_vampire, May 11, 2009 IP
  8. junvalasek

    #8
    Remove robots.txt... it won't help you.
    I don't use any robots.txt.
     
    junvalasek, May 11, 2009 IP