Digital Point Forums
  #1  
Dec 27th 2006, 11:42 pm
johan-cr
of the Nightfall
Join Date: Sep 2006
Location: Shanghai China, Ystad Sweden
Posts: 2,035
Is DreamHost support stupid, or am I missing something?

Today I got an email from DreamHost support saying that a bot was going through all the pages on all my domains. They identified the bot as Googlebot.

What DreamHost support then did was add a robots.txt file with the following content to ALL of my domains (100+).

# go away
User-agent: *
Disallow: /

Doesn't this mean that all search engine bots will skip my domains, and that in the end all my pages will be deindexed as well?

I never agreed to DreamHost doing something like this, but maybe I am missing something and what they did was actually a good thing?
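
For reference, the effect of the file quoted above can be checked with Python's standard `urllib.robotparser`. This is only a minimal sketch: the helper name `is_allowed` and the example.com URLs are made up for illustration, and real crawlers may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The exact file quoted in the post above.
ROBOTS_TXT = """\
# go away
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a crawler obeying robots_txt may fetch url.

    is_allowed is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com stands in for any of the affected domains.
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/sitemap.xml"))
```

Both calls print `False`, so a crawler that obeys robots.txt will not fetch anything on the domain, sitemaps included. How quickly pages then drop out of an index is up to the search engine.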
__________________
Zlatan Ibrahimovic seo web hosting

  #2  
Dec 27th 2006, 11:46 pm
dshah
of the Nightfall
Join Date: Aug 2005
Location: San Francisco CA
Posts: 1,838
That's damn scary, if DreamHost did it.
  #3  
Dec 28th 2006, 12:01 am
johan-cr
of the Nightfall
Join Date: Sep 2006
Location: Shanghai China, Ystad Sweden
Posts: 2,035
I have an email from their support where they explain that they did it. They also clearly mention that they think Googlebot is a big problem on my sites (I think it is great that it crawls my sites all the time). I am already seeing some indications from Google that it cannot reach my sitemaps...

I have sent an email back to their support asking them to explain their actions, and I am still awaiting a response.

I have also just finished deleting all the robots.txt files that they added...
__________________
Zlatan Ibrahimovic seo web hosting

  #4  
Dec 28th 2006, 12:11 am
maiahost
Twilight Vanquisher
Recent Blog: Mokal Aura
Join Date: Oct 2006
Posts: 664
Did they come up with some "reasonable" explanation like "Googlebot is crawling too fast and that's causing high server load", or did they simply tell you that there's a problem with it?
__________________
Cheap web hosting | Joomla 1.5 Hosting | |
  #5  
Dec 28th 2006, 12:22 am
koolasia
Banned
Join Date: Sep 2006
Location: www.robertmakesmoney.com
Posts: 1,413
That sucks. How can a host tell its customer "please don't let Google see your site"?
  #6  
Dec 28th 2006, 12:22 am
johan-cr
of the Nightfall
Join Date: Sep 2006
Location: Shanghai China, Ystad Sweden
Posts: 2,035
I am using DreamHost for my new sites that have little or no traffic, so the ratio of Googlebot hits to real visits is extreme. That is also what DreamHost has picked up on.

Quote:
As you can see googlebot is most of the hits for the site - driving up
your usage just to accommodate a bot isn't probably worth it so adding the
robots.txt will take care of it.
I am at 1% of my bandwidth usage, so that is not the problem. It should also be my decision, not theirs, whether I want to use my bandwidth for Googlebot or other things...
__________________
Zlatan Ibrahimovic seo web hosting

  #7  
Dec 28th 2006, 12:30 am
maiahost
Twilight Vanquisher
Recent Blog: Mokal Aura
Join Date: Oct 2006
Posts: 664
Now that's plain dumb if you ask me. Who are they to decide what or whom you want to accommodate and how you want to use your website? If I were you, I'd call the supervisor of that support person and give them hell.
__________________
Cheap web hosting | Joomla 1.5 Hosting | |
  #8  
Dec 28th 2006, 9:19 am
FireStorM
of the Nightfall
Join Date: Dec 2005
Location: 0101010101001011
Posts: 2,072
Lol, that's crazy. They are dumb, I think. Stupid.
__________________
"Every blade of grass is a study; and to produce two, where there was but one, is both a profit and a pleasure."
(Abraham Lincoln 1859)
  #9  
Dec 28th 2006, 4:31 pm
explorer
Hand of A'dal
Join Date: Feb 2005
Posts: 454
A while back, one of my hosts blocked googlebot at the server level - not in the robots.txt files of individual sites. Earnings from a site I had on the server soon began to fall but it took me a while to figure out what had happened. The host claimed they had blocked access to the whole server because one particular site on the server (not mine) was being hammered by googlebot. To say I was annoyed would be a considerable understatement.
  #10  
Dec 29th 2006, 6:17 am
Coupons
Twilight Vanquisher
Join Date: Sep 2005
Location: Coupons.FM
Posts: 844
That is really serious! Have you posted this in their forum? I would like to see official replies.
__________________
Special - Free Hosting - at quality webhoster, for 12 months, no catch.
Discount - LunarPages coupon, Dreamhost promo code, Servage coupon
  #11  
Dec 29th 2006, 6:24 am
Rogem
Champion of the Naaru
Recent Blog: Programmers are?
Join Date: Dec 2006
Location: United Kingdom
Posts: 171
I assume it's just them trying to save on bandwidth. I think 2TB of bandwidth costs something like $100 a month per server. You should complain to them that they're stopping you from doing well by stopping Googlebot from crawling your site.
__________________
Full On Design
  #12  
Dec 29th 2006, 7:28 pm
michael_dreamhost
Peon
Join Date: Dec 2006
Posts: 3
Introduction to Server Administration

It is possible to create a script that just consumes more and more memory and more and more cpu, but never actually transfers any information and only uses a tiny amount of disk space. If a user had a script that did this we would disable it. Everyone would be pretty much in agreement that this was a good idea, especially the other users on the same machine in a shared hosting environment.

The other end of the spectrum is a plain html file with a link to a video. Someone could go to your site and download the video and it would use a lot of bandwidth, but very little cpu or memory on the web server.

Many web hosting customers use third-party scripts or write their own code that has not been optimized very well. This is usually OK because the website often does not get much traffic. Once a website starts to get more popular, though, good web designers go back and improve their initial design to make the website more efficient.

DreamHost does not, in general, add a robots.txt file to a customer's account, but if, as in this case, the code is very inefficient and Googlebot is hammering it, we will add the file to protect the server and then contact the customer to work with them on improving their website. The key here is that it is the code on the website that needs improving.

I saw that support said the following:

"We're not asking for you to completely block out bots that crawl your site, but we are asking for you to slow it down. Please read our wiki article here:

http://wiki.dreamhost.com/index.php/Bots_spiders_and_crawlers
"

We were clear that the user above could update the robots.txt to be whatever he wanted. The user has said he will use the information in the article above and google's webmaster tools to slow down the bot. I will also instruct the admins to try the slow down method first as well.
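
As a rough illustration of what a less drastic robots.txt could look like, here is a sketch checked with Python's standard `urllib.robotparser`. Everything in the file is an assumption made for the example: the 10 second `Crawl-delay`, the `/cgi-bin/` path, and the crawler name `ExampleBot` are hypothetical and are not taken from the wiki article. `Crawl-delay` is a non-standard directive that only some crawlers honor; as far as I know Googlebot ignores it, and its crawl rate is adjusted through Google's webmaster tools instead, as mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical alternative to the blanket "Disallow: /" file:
# ask crawlers to pause between requests and keep them out of one
# expensive script directory, while leaving the rest of the site open.
THROTTLED_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(THROTTLED_ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up crawler name that falls under the "*" group.
delay = parser.crawl_delay("ExampleBot")
home_ok = parser.can_fetch("ExampleBot", "http://example.com/")
script_ok = parser.can_fetch("ExampleBot", "http://example.com/cgi-bin/search.cgi")

print(delay, home_ok, script_ok)
```

This prints `10 True False`: the home page stays crawlable, the script directory is excluded, and a crawler that supports the directive is asked to wait 10 seconds between requests.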

If a site is affecting the performance of a server we reserve the right to even shut it down completely until the site can be fixed. Still, we do work hard to find other solutions and in this case we merely blocked google bot until the problem could be resolved. Overall there is nothing dumb or evil or sneaky about trying to keep a server up and functioning well. If we didn't stop sites from running out of control there would be ten times the number of customers complaining that our servers were slow and crashy.

Overall, the server resources, professional communication and troubleshooting provided to this customer at our well-known low prices is astounding! Just to put it in perspective the user above is in the Top 10 cpu users on all of DreamHost (not percent). There are thousands and thousands of other users with more traffic and less cpu usage. He is currently using a dedicated server’s worth of cpu resources and most other hosts probably would have just forced him to move to a dedicated server.

We try very hard to be accommodating and it does seem like the above complaint will be resolved to everyone’s liking.
  #13  
Dec 29th 2006, 10:53 pm
johan-cr
of the Nightfall
Join Date: Sep 2006
Location: Shanghai China, Ystad Sweden
Posts: 2,035
Quote:
I will also instruct the admins to try the slow down method first as well.
Sounds like a good approach.

Quote:
We try very hard to be accommodating and it does seem like the above complaint will be resolved to everyone’s liking.
True, Dreamhost support has been very good after the first email where they shut down all bots. Everything resolved for now.
__________________
Zlatan Ibrahimovic seo web hosting

  #14  
Dec 30th 2006, 3:36 am
Coupons
Twilight Vanquisher
Join Date: Sep 2005
Location: Coupons.FM
Posts: 844
Thank you, Michael. I asked this in DreamHost's forum so that we could get an official reply.
And here it is, really quick.
__________________
Special - Free Hosting - at quality webhoster, for 12 months, no catch.
Discount - LunarPages coupon, Dreamhost promo code, Servage coupon
  #15  
Feb 19th 2007, 6:29 am
relysites
Hand of A'dal
Join Date: Jul 2006
Posts: 358
That was an interesting read, and it's good that DreamHost actually posted in it. Glad it worked out.
  #16  
May 15th 2007, 7:29 am
ZaxiHosting
of the Nightfall
Join Date: May 2007
Posts: 1,958
Unprofessional behaviour, I think.
  #17  
May 15th 2007, 9:19 am
ishan
of the Nightfall
Join Date: Oct 2006
Location: India
Posts: 2,054
Overselling is costing them now, I think. I never believed in DreamHost packages. Could you please check Server Status in your cPanel and see if the server load is high? That may be the reason they are trying to slow down Googlebot.
According to our company's hosting policy, it doesn't matter if any kind of bot visits your website, even if it is consuming more bandwidth than normal websites.
Slowing down a bot is not good for indexing according to what I read in my Google WebMaster Tools account. They have recommended Speed to be Normal.

Just my $0.02 though

Thanks
Ishan
__________________
Ishan Talathi | LaceHost - Cheap Web Hosting
500+ DP Members HOST with us, So can YOU - , Check it out, TODAY.
  #18  
May 16th 2007, 2:36 am
inworx
Starcaller
Join Date: Oct 2006
Location: India
Posts: 4,869
Googlebot normally uses 0.5% of server resources on a 2 GHz Celeron with 512 MB RAM. So there are probably lots of sites being crawled by Googlebot at the same time. The control panel shows 1 user as 1. So, all Googlebot = 1 user.
__________________
Universal Hosts - Premium Managed Shared and reseller Web Hosting starts at just $1.99
Now selling Managed Dedicated Servers and Virtual private Servers
WebHostTalk - All about web hosting
  #19  
May 16th 2007, 8:47 am
neonKnight
Grunt
Join Date: May 2007
Posts: 36
Good read. Are these sorts of problems with DreamHost common or not really? I was considering getting an account with them, but after this I have my doubts.
  #20  
May 16th 2007, 5:08 pm
michael_dreamhost
Peon
Join Date: Dec 2006
Posts: 3
Quote:
Originally Posted by ishan View Post
Slowing down a bot is not good for indexing according to what I read in my Google WebMaster Tools account. They have recommended Speed to be Normal.
Ishan, can you please reference the url and post the quote here that you are referring to? At DreamHost we have been in direct contact with Google engineers and this does not match what we are hearing from them.

Quote:
Originally Posted by inworx View Post
Googlebot normally uses 0.5% of server resources on a 2 GHz Celeron with 512 MB RAM. So there are probably lots of sites being crawled by Googlebot at the same time. The control panel shows 1 user as 1. So, all Googlebot = 1 user.
Inworx, what you are saying is technically off the mark as far as measuring the effect Googlebot can have on a server. It includes no mention of the number of pages being crawled nor the type of pages being crawled. Imagine, for instance, a site such as http://example.com/crashme.cgi that just recursively allocates memory. This single page view will obviously have a much larger effect than thousands of small static HTML pages.

Also, you reference the DreamHost control panel but without any knowledge of how it works. "panel shows 1 user as 1. googlebot=1 user" means what exactly? If you are trying to say that we cannot tell which domains and scripts are being visited, or which of the many IPs that Googlebot connects from is the culprit, you guessed incorrectly in this case.

Quote:
Originally Posted by neonKnight View Post
Good read. Are these sorts of problems with DreamHost common or not really? I was considering getting an account with them, but after this I have my doubts.
neonKnight, to give you an idea of why we might stop Googlebot temporarily: we have had cases where it gets confused by a blog that might only have a couple of posts in total, but since the blog has many dynamic elements and is circularly linked, Googlebot will hit the site thousands of times in a very short period as it tries to follow all the circular links. We work closely with our customers to keep their sites running smoothly. The only time we will take corrective measures in the meantime is if the site is affecting the server for the other customers. It doesn't do anyone any good to let the server spiral out of control.

Hope this helps!
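
The circular-link situation described above can be sketched with a toy crawler. This is an illustration under stated assumptions, not a description of how Googlebot works: the two-post blog, the `hop` query parameter, and the function names `outgoing_links` and `crawl` are all invented for the example. The point is only that when every link carries a changing dynamic parameter, a crawler that treats each distinct URL as a new page never runs out of pages, while one that keys on the path alone stops after two fetches.

```python
from collections import deque
from urllib.parse import parse_qs, urlsplit


def outgoing_links(url):
    """Links found on a page of a hypothetical two-post blog.

    Each post links to the other one and appends a dynamic "hop"
    counter, so every generated URL is textually new.
    """
    parts = urlsplit(url)
    hop = int(parse_qs(parts.query).get("hop", ["0"])[0])
    other = "/post/2" if parts.path == "/post/1" else "/post/1"
    return [f"{other}?hop={hop + 1}"]


def crawl(start, canonicalize, max_fetches=1000):
    """Breadth-first crawl; returns how many pages were fetched.

    With canonicalize=True the query string is ignored when deciding
    whether a page has been seen before. max_fetches is a safety cap.
    """
    seen = set()
    queue = deque([start])
    fetches = 0
    while queue and fetches < max_fetches:
        url = queue.popleft()
        key = urlsplit(url).path if canonicalize else url
        if key in seen:
            continue
        seen.add(key)
        fetches += 1
        queue.extend(outgoing_links(url))
    return fetches


print(crawl("/post/1", canonicalize=False))  # stops only at the cap: 1000
print(crawl("/post/1", canonicalize=True))   # 2
```

On the site side, the equivalent fix is to avoid generating links whose URLs differ only in parameters that do not change the content.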
All times are GMT -8.