OK, maybe a n00b question (I'm a n00b, so probably). How do I stop THIS happening?!: http://www.google.com/search?q=www.itsgottabered.com+fatal+error&btnG=Search Have a look at the cached version of my site, if I have not been recrawled since then. Google crawled me at the exact point (not hard, given the recent outages with my host) when the MySQL server was down - the site is working fine otherwise - and consequently my Google entry is just a PHP error page!! That CAN'T be good for business!! How do I trap this error, so that browsers, and in particular robots, are informed that the site is temporarily down and to come back later? Some fairly simple PHP code, I am guessing...! Thanks, markowe
The easiest way to do this is just to suppress the output from mysql_connect. Try the following:

PHP:
@mysql_connect("host", "username", "password") or die("Site down temporarily, please try again");

Something like that would suffice. The other option is something like this:

PHP:
$linkId = @mysql_connect("host", "username", "password");
if (!$linkId) {
    // do everything you want here, including maybe redirecting to an "offline" page or
    // something. This is what I used to do on shared accounts.
}
That would help for real users, naturally, but I don't know of any way to inform a spider to come back later...
Sorry for my first post, apparently I didn't fully understand. Here is another shot.

The first thing I would do is switch hosts. This shouldn't be a problem you have to deal with, because your MySQL host shouldn't be down often enough to warrant this kind of solution. That being said, if I had to solve this problem, this is how I would do it. This solution assumes you know a good bit of PHP:

1 - Submit a Google sitemap to force a re-spider sooner rather than later. More information can be found at google.com/webmaster. This way you will request a spider, and it should come in the next 2 months. Google doesn't tell us how often they spider our sites, so this is usually a good idea.

2 - When your mysql_connect call fails, as in the code I posted above, write a robots.txt file to the site root disallowing everything. This should force the bot to move on. If your site can connect, delete the robots file (or just make sure it's not there). A rough sketch of this is below.

There are numerous downsides to this solution, namely that it sucks to have that overhead. Like I said, it's way easier to just find a new host. If anyone else has any solutions, I would be interested to see their approach to solving this problem.
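Just to illustrate, here is a minimal sketch of what step 2 could look like. The credentials and file path are placeholders, and this is only one possible way to handle the robots.txt part:

PHP:
<?php
// Placeholder credentials - replace with your own.
$linkId = @mysql_connect("host", "username", "password");

$robotsFile = $_SERVER['DOCUMENT_ROOT'] . '/robots.txt';

if (!$linkId) {
    // Database is down: block all spiders until it comes back.
    file_put_contents($robotsFile, "User-agent: *\nDisallow: /\n");
    die("Site down temporarily, please try again later.");
} elseif (file_exists($robotsFile)) {
    // Database is back: remove the blocking robots.txt again.
    unlink($robotsFile);
}
?>

Note that if you already keep a real robots.txt, you would want to restore that file instead of just deleting it when the database comes back.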
I do not believe any spiders listen to the requests of webmasters, other than to consider the contents of the robots.txt file. Google indexes sites on a long or short cycle, depending on how they rate your site. However, using the Google sitemap should help ensure your site is indexed faster when new pages are added or current ones are substantially changed.

The cached page at Google is not good for business. It tells evil hackers something about your site and, if the file belongs to a well-known project, what kind of CMS or e-commerce suite you are running, and perhaps even hints at the version based on the line number at which the error occurred. This kind of information can be a significant security issue. Therefore, whenever errors occur in your scripts, you want to make sure the only responses made and actions taken are ones that give nothing away to "black hats".
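For what it's worth, one common way to keep raw PHP/MySQL error details out of the page output is to turn off display_errors and log errors instead. This is just a generic sketch (the log path is a placeholder), not specific to any particular host:

PHP:
<?php
// Report everything internally, but never print errors to visitors.
error_reporting(E_ALL);
ini_set('display_errors', '0');

// Log errors to a file instead (path is a placeholder).
ini_set('log_errors', '1');
ini_set('error_log', '/path/to/php-error.log');
?>

The same settings can also go in php.ini, or in .htaccess if your host allows it.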
libneiz's idea of writing the robots.txt file on failure is interesting, so I would like to suggest the 'reverse'. Assuming you are using Apache, you could actually make robots.txt an alias for a script: have that script check the status of MySQL and decide whether to return a 'full access' or a 'no access' robots.txt. Just a thought, something along the lines of the sketch below.
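Roughly like this, perhaps. The rewrite rule, the script name (robots.php) and the credentials are all placeholders:

PHP:
<?php
// In .htaccess (or the vhost config), something like:
//   RewriteEngine On
//   RewriteRule ^robots\.txt$ /robots.php [L]

header('Content-Type: text/plain');

// Placeholder credentials.
$linkId = @mysql_connect("host", "username", "password");

if ($linkId) {
    // Database is up: serve the normal, permissive robots.txt.
    echo "User-agent: *\nDisallow:\n";
} else {
    // Database is down: tell spiders to stay away for now.
    echo "User-agent: *\nDisallow: /\n";
}
?>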
Hmmm, all pretty complex solutions for a two-bit programmer like me. I guess the solution is to wait to get crawled again...?! But surely web crawlers listen to, say, 301 redirects, so why wouldn't one at least go, "aha, it's temporarily down, I'll come back in LESS than the usual 5 days or whatever"? Ah well... Gonna have to get a bit more rigorous about my error trapping... darn, that's all I need, more work...!
By the way, re: changing my host, I use Netfirms, who have a pretty good reputation - they claim they have been subject to DoS attacks over the last few days. I will give them the benefit of the doubt for the moment. As it happens, things are working MUCH better this morning, and as for DoS attacks, they can hit any provider, so it wouldn't help much to change. I can recommend Netfirms - the pricing is very reasonable, LOADS of options, great control panel. (Hey, look, not an affiliate code in sight! Just information, that's all!)
Well, first of all I would use the mysqli functions, if you are not using them already: http://php.net/mysqli

Then the best way to trap a MySQL error is:

PHP:
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");

# check connection
if (mysqli_connect_errno()) {
    # here you catch the error with mysqli_connect_error(), so you can also log it, maybe mail it to yourself
}
?>

http://php.net/manual/en/function.mysqli-connect.php

When you encounter the error I would do a 302 redirect (not a 301: remember, 301 -> moved permanently, 302 -> moved temporarily) to your home page or to an error page. You can do this with a simple 'Location' header:

PHP:
header("Location: http://www.example.com/");

since, as stated here: http://php.net/header, 'Location' already issues a 302.

This way Google will keep the original page in its index and will crawl it again whenever it feels like it (with a 301 it would think that the page has permanently moved and no longer exists).

HTH, cheers
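Putting those two pieces together, a minimal sketch could look like this. The credentials and the error-page URL are placeholders, and the exit after the header is just to make sure nothing else runs while the database is down:

PHP:
<?php
// Placeholder credentials.
$mysqli = @new mysqli("localhost", "my_user", "my_password", "world");

if (mysqli_connect_errno()) {
    // Optionally log the real reason for yourself.
    error_log("MySQL connection failed: " . mysqli_connect_error());

    // Temporary redirect (302) to a static error page, then stop.
    header("Location: http://www.example.com/offline.html");
    exit;
}

// ...normal page code continues here...
?>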
Dude, THAT is what I was looking for! It's gonna take me a while to get my head round it, but it's pretty simple, I can see. Will ask stupid questions if I get stuck Many thanks, markowe