Is Dreamhost support stupid or am I missing something?

Discussion in 'robots.txt' started by johan-cr, Dec 27, 2006.

  1. #1
    Today I got an email from DreamHost support saying that a bot was going through all the pages on all my domains. They identified this bot as Googlebot.

    DreamHost support then added a robots.txt file with the following content to ALL of my domains (100+):

    # go away
    User-agent: *
    Disallow: /
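
    As I understand it, "User-agent: *" matches every crawler and "Disallow: /" blocks the entire site. For comparison, a robots.txt that allows full crawling would look like this (hypothetical example, not what they installed):

    # non-blocking version, for comparison: an empty Disallow allows everything
    User-agent: *
    Disallow: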

    Doesn't this mean that all search engine bots will now skip my domains, and that in the end all my pages will be deindexed?

    I never agreed to DreamHost doing something like this, but maybe I am missing something and what they did was actually a good thing?
     
    johan-cr, Dec 27, 2006 IP
  2. dshah

    dshah Well-Known Member

    Messages:
    1,840
    Likes Received:
    69
    Best Answers:
    0
    Trophy Points:
    115
    #2
    That's damn scary, if DreamHost really did it.
     
    dshah, Dec 27, 2006 IP
  3. johan-cr

    johan-cr Well-Known Member

    Messages:
    2,034
    Likes Received:
    170
    Best Answers:
    0
    Trophy Points:
    135
    #3
    I have an email from their support where they explain what they have done; they also clearly state that they think Googlebot is a big problem on my sites (I think it is great that it crawls my sites all the time). There are already indications from Google that it cannot reach my sitemaps...

    I have sent an email back to their support asking them to explain their actions, and I am still awaiting that response.

    I have also just finished deleting all the robots.txt files that they added...
     
    johan-cr, Dec 28, 2006 IP
  4. maiahost

    maiahost Guest

    Messages:
    664
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    0
    #4
    Did they come up with some "reasonable" explanation like "Googlebot is crawling too fast and that's causing high server load", or did they simply tell you that there's a problem with it?
     
    maiahost, Dec 28, 2006 IP
  5. koolasia

    koolasia Banned

    Messages:
    1,413
    Likes Received:
    59
    Best Answers:
    0
    Trophy Points:
    0
    #5
    That sucks. How can a host tell its customer, "please don't let Google see your site"?
     
    koolasia, Dec 28, 2006 IP
  6. johan-cr

    johan-cr Well-Known Member

    Messages:
    2,034
    Likes Received:
    170
    Best Answers:
    0
    Trophy Points:
    135
    #6
    I am using DreamHost for my new sites that have little or no traffic, so the ratio of Googlebot hits to real visits is extreme. That is also what DreamHost has picked up on.

    I am at 1% of my bandwidth allowance, so that is not the problem. It should also be my decision, not theirs, whether I want to use my bandwidth for Googlebot or for other things...
     
    johan-cr, Dec 28, 2006 IP
  7. maiahost

    maiahost Guest

    Messages:
    664
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Now that's plain dumb if you ask me. Who are they to decide whom you want to accommodate and how you want to use your website? If I were you, I'd call the supervisor of that support person and give them hell.
     
    maiahost, Dec 28, 2006 IP
  8. FireStorM

    FireStorM Well-Known Member

    Messages:
    2,579
    Likes Received:
    88
    Best Answers:
    0
    Trophy Points:
    175
    #8
    Lol, that's crazy. They are dumb, I think. Stupid.
     
    FireStorM, Dec 28, 2006 IP
  9. explorer

    explorer Well-Known Member

    Messages:
    463
    Likes Received:
    40
    Best Answers:
    0
    Trophy Points:
    110
    #9
    A while back, one of my hosts blocked Googlebot at the server level - not in the robots.txt files of individual sites. Earnings from a site I had on the server soon began to fall, but it took me a while to figure out what had happened. The host claimed they had blocked access to the whole server because one particular site on the server (not mine) was being hammered by Googlebot. To say I was annoyed would be a considerable understatement.
     
    explorer, Dec 28, 2006 IP
  10. Coupons

    Coupons Active Member

    Messages:
    889
    Likes Received:
    42
    Best Answers:
    0
    Trophy Points:
    70
    #10
    That is really serious! Have you posted this in their forum? I would like to see official replies.
     
    Coupons, Dec 29, 2006 IP
  11. Rogem

    Rogem Peon

    Messages:
    171
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #11
    I assume it's just them trying to save on bandwidth; I think 2 TB of bandwidth costs something like $100 a month per server. You should complain to them that they're holding your sites back by stopping Googlebot from crawling them.
     
    Rogem, Dec 29, 2006 IP
  12. michael_dreamhost

    michael_dreamhost Peon

    Messages:
    3
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #12
    It is possible to create a script that just consumes more and more memory and more and more CPU, but never actually transfers any information and only uses a tiny amount of disk space. If a user had a script that did this, we would disable it. Pretty much everyone would agree that this was a good idea, especially the other users on the same machine in a shared hosting environment.

    The other end of the spectrum is a plain HTML file with a link to a video. Someone could go to your site and download the video, and it would use a lot of bandwidth but very little CPU or memory on the web server.

    Many web hosting customers use third-party scripts or write their own code that has not been optimized very well. This is usually OK because the website often does not get much traffic. Once a website starts to get more popular, though, good web designers go back and improve on their initial design to make it more efficient.

    DreamHost does not in general add a robots.txt file to a customer's account, but if, as in this case, the code is very inefficient and Googlebot is hammering it, we will add the file to protect the server and then contact the customer to work with them on improving their website. The key here is that it is the code on the website that needs improving.

    I saw that support said the following:

    "We're not asking for you to completely block out bots that crawl your site, but we are asking for you to slow it down. Please read our wiki article here:

    http://wiki.dreamhost.com/index.php/Bots_spiders_and_crawlers
    "

    We were clear that the user above could update the robots.txt to be whatever he wanted. The user has said he will use the information in the article above and Google's Webmaster Tools to slow down the bot. I will also instruct the admins to try the slow-down method first as well.
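
    For anyone curious, a minimal sketch of the kind of crawl-throttling robots.txt that approach implies might look like the following; the number is illustrative, not our specific recommendation. Note that Googlebot ignores the Crawl-delay directive, so Google's crawl rate has to be set in Webmaster Tools instead, while bots such as Yahoo's Slurp and msnbot do honor it.

    # illustrative only - throttle compliant bots instead of blocking them
    User-agent: *
    Crawl-delay: 10
    # asks for ten seconds between requests; Googlebot ignores this,
    # so its rate must be set in Google Webmaster Tools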

    If a site is affecting the performance of a server, we reserve the right to shut it down completely until the site can be fixed. Still, we work hard to find other solutions, and in this case we merely blocked Googlebot until the problem could be resolved. Overall, there is nothing dumb or evil or sneaky about trying to keep a server up and functioning well. If we didn't stop sites from running out of control, there would be ten times the number of customers complaining that our servers were slow and crashy.

    Overall, the server resources, professional communication, and troubleshooting provided to this customer at our well-known low prices are astounding! Just to put it in perspective, the user above is in the top 10 CPU users on all of DreamHost (top 10 users, not top 10 percent). There are thousands and thousands of other users with more traffic and less CPU usage. He is currently using a dedicated server's worth of CPU resources, and most other hosts probably would have just forced him to move to a dedicated server.

    We try very hard to be accommodating and it does seem like the above complaint will be resolved to everyone’s liking.
     
    michael_dreamhost, Dec 29, 2006 IP
  13. johan-cr

    johan-cr Well-Known Member

    Messages:
    2,034
    Likes Received:
    170
    Best Answers:
    0
    Trophy Points:
    135
    #13
    Sounds like a good approach.

    True, DreamHost support has been very good since that first email where they shut out all bots. Everything is resolved for now.
     
    johan-cr, Dec 29, 2006 IP
  14. Coupons

    Coupons Active Member

    Messages:
    889
    Likes Received:
    42
    Best Answers:
    0
    Trophy Points:
    70
    #14
    Thank you, Michael. I asked this in DreamHost's forum so that we could get an official reply.
    And here it is, really quick :)
     
    Coupons, Dec 30, 2006 IP
  15. relysites

    relysites Active Member

    Messages:
    359
    Likes Received:
    17
    Best Answers:
    0
    Trophy Points:
    60
    #15
    That was an interesting read, and it's good that DreamHost actually posted in the thread. Glad it worked out.
     
    relysites, Feb 19, 2007 IP
  16. ZaxiHosting

    ZaxiHosting Well-Known Member

    Messages:
    1,997
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    130
    #16
    Unprofessional behaviour, I think.
     
    ZaxiHosting, May 15, 2007 IP
  17. ishan

    ishan Prominent Member

    Messages:
    2,212
    Likes Received:
    88
    Best Answers:
    0
    Trophy Points:
    325
    #17
    Overselling is costing them now, I think. I never believed in DreamHost packages. Could you please check Server Status from your cPanel and see if the server load is high? That may be the reason they are trying to slow down Googlebot.
    According to our company's hosting policy, it doesn't matter what kind of bot visits your website, even if it consumes more bandwidth than normal websites do.
    Slowing down a bot is not good for indexing, according to what I read in my Google Webmaster Tools account; they recommend the crawl speed be set to Normal.

    Just my $0.02 though.

    Thanks
    Ishan
     
    ishan, May 15, 2007 IP
  18. inworx

    inworx Peon

    Messages:
    4,860
    Likes Received:
    201
    Best Answers:
    0
    Trophy Points:
    0
    #18
    Googlebot normally uses about 0.5% of server resources on a Celeron 2 GHz with 512 MB RAM. So there are probably lots of sites being crawled by Googlebot at the same time. The control panel shows each user as 1, so all of Googlebot's activity counts as 1 user.
     
    inworx, May 16, 2007 IP
  19. neonKnight

    neonKnight Peon

    Messages:
    36
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #19
    Good read. Are these sorts of problems common with DreamHost, or not really? I was considering getting an account with them, but after this I have my doubts.
     
    neonKnight, May 16, 2007 IP
  20. michael_dreamhost

    michael_dreamhost Peon

    Messages:
    3
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #20
    Ishan, can you please reference the URL and post the quote here that you are referring to? At DreamHost we have been in direct contact with Google engineers, and this does not match what we are hearing from them.

    Inworx, what you are saying is technically off the mark as far as measuring the effect Googlebot can have on a server. It includes no mention of the number of pages being crawled, nor the type of pages being crawled. Imagine, for instance, a page such as http://example.com/crashme.cgi that just recursively allocates memory. A single view of that page will obviously have a much larger effect than thousands of views of small static HTML pages.

    Also, you reference the DreamHost control panel without any knowledge of how it works. What exactly does "panel shows 1 user as 1. googlebot=1 user" mean? If you are trying to say that we cannot tell which domains and scripts are being visited, or which of the many IPs that Googlebot connects from is the culprit, you guessed incorrectly in this case.

    neonKnight, to give you an idea why we might stop Googlebot temporarily: we have had cases where it gets confused by a blog that has only a couple of posts in total, but since the blog has many dynamic elements and is circularly linked, Googlebot will hit the site thousands of times in a very short period as it tries to follow all the circular links. We work closely with our customers to keep their sites running smoothly. The only time we will take corrective measures in the meantime is if the site is affecting the server for the other customers. It doesn't do anyone any good to let the server spiral out of control.
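
    As a hypothetical illustration, a targeted rule can often break such a crawl loop without blocking the whole site, assuming the looping URLs share a recognizable pattern. (Wildcard matching in Disallow is a Googlebot extension, not part of the original robots.txt standard.)

    # hypothetical targeted rules instead of a blanket "Disallow: /"
    User-agent: Googlebot
    Disallow: /*?          # skip URLs with query strings (calendar/tag loops)
    Disallow: /blog/tag/   # illustrative path only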

    Hope this helps!
     
    michael_dreamhost, May 16, 2007 IP