Site is waaaay worse today. I have about 50 posts from users who want my head on a platter, haha. I turned persistent connections back on; I feel like I should up my database connections again as well, at least it'd prevent database errors :\ But I do understand your point - this stuff shouldn't be happening at all.
How much traffic are you getting? That's an awful lot of concurrent sessions happening in the same second. Also how big are all your databases combined (GB-wise)? And yeah... if it's better with the old settings, revert for now until you can fix the underlying issue. I'm wondering if you simply need more memory for the server. MySQL will only perform well if there's enough FREE memory for MySQL to store all databases in-memory (cache).
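If you want a rough sanity check on that (just a sketch, and assuming MyISAM since that's what vBulletin usually runs on): the table data itself gets cached by the OS file cache, and only the indexes go into MySQL's key buffer, so something like this shows whether the indexes are being served from memory or from disk:

```
# How big the MyISAM index cache is allowed to be
mysql -u root -p -e "SHOW VARIABLES LIKE 'key_buffer_size';"
# Key_reads vs Key_read_requests: if Key_reads is a tiny fraction of
# Key_read_requests, index blocks are coming from memory; a high ratio
# means MySQL is hitting the disk for them.
mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'Key_read%';"
# What the OS actually has free/cached
free -m
```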
My database is about 4GB compressed. I think it's closer to 7 or 8 on the disk. My vBulletin "active users" section is listing about 5500 users online at the moment, although that's clearly a fudge factor - I think it's based on unique IP access over the last 20-30 minutes. I'll check the swap stats to see if the box was thrashing when I had issues today.
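For anyone following along, the kind of check I mean is something like this (just a sketch with standard Linux tools; sar only has history if the sysstat collector is enabled):

```
# si/so columns are pages swapped in/out per second; sustained non-zero
# values while the site is slow would point at real memory pressure.
vmstat 5 5
# Snapshot of current swap usage
free -m
# Historical swapping stats, if sysstat has been collecting them
sar -W
```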
Well even if it's not hitting swap for normal memory usage, you need more memory for MySQL... if you have the memory, MySQL will store cached versions of your tables in-memory. If you don't, it will read records from the tables as needed from the disk. Disk swap isn't going to show that, since MySQL won't try to use more memory than it's been given. But that being said, you definitely need more memory. You should check the disk I/O when things are slow and see if your server is being bottlenecked by how fast it can read stuff from disk. I'm guessing your disk is working overtime, seeking all over the place to get data to answer all the concurrent queries. You literally can't have too much memory in your DB server, so get more... get as much as you can. My servers have 256GB RAM each (overkill, but like I said... the more the better).
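Something along these lines is what I mean by checking disk I/O (a rough sketch; iostat/iotop come from the sysstat and iotop packages, and the column names vary a bit by version):

```
# Extended per-device stats every 5 seconds; watch %util and await --
# a disk pinned near 100% util with high await while queries pile up
# is the read-bottleneck symptom I'm describing.
iostat -x 5
# Per-process I/O, only showing processes actually doing I/O
iotop -o
```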
Hahaha. Okay, I'll talk to my hardware guys and see what they can do for me. [edit] Do you know if you can just shut down the box, add the memory, and have it get picked up by a Linux box when you turn it back on?
So, I beefed up my server to 32GB of RAM and started it up. My post table is now corrupt and I'm running a myisamchk -r on it; hopefully it's not hosed.
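(For anyone who finds this thread later, the general shape of what I'm running is below; the datadir path and database name are placeholders rather than my actual setup, and mysqld should be stopped so nothing has the table open while the repair runs.)

```
# Placeholder paths/names -- adjust to your own datadir and database
cd /var/lib/mysql/forum
myisamchk -r post     # standard recovery of the post table
myisamchk -o post     # safe-recover: slower, but copes with more damage
```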
Yeah, normally you can just add memory and that's all you need to do unless it's some ancient BIOS on the hardware. Did you just unplug it rather than shutting it down??? Lol (well, not funny really)
I ran `shutdown -h now`, and it should've shut down mysql properly, but I noticed a message in the logs saying it failed. I'm guessing my box just killed the process while it was shutting down, causing the table to become corrupt.
And I wasn't able to repair the table, so now I get to restore a backup. Fun stuff. Luckily my mirror wasn't corrupt, so I could make a backup from just before the site went down, but it looks like I'm going to be working a 14 hour day. woooo
Crappy... Just for future reference, never force kill mysqld if you can help it. If you *must*, shut down the web server (so nothing is writing anything to the database), then run this SQL command: FLUSH TABLES; Then kill it... That will flush all the tables to disk properly. Strange that the repair couldn't fix it... I've never run into a situation where REPAIR TABLE couldn't repair a MyISAM table.
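In shell terms, the sequence I'm describing is roughly this (service names are examples - apache2 vs httpd, mysql vs mysqld, depends on the distro):

```
service apache2 stop                   # stop anything writing to the DB
mysql -u root -p -e "FLUSH TABLES;"    # flush MyISAM tables to disk
mysqladmin -u root -p shutdown         # ask mysqld to stop cleanly;
                                       # only force-kill as a last resort
```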
Thanks for the advice -- definitely will use that in the future if I need to. Also, yeah, that repair was super strange. I ended up trying to run myisamchk -r -f on it after it failed, as plenty of other folks online suggested would work, and it actually segmentation faulted, which I found really difficult to believe. But it is what it is.
It was showing up in top. The network admin and the other guys around suggested we take out the new memory (the segmentation fault and the memory change seemed like too much of a coincidence to them) and try running the repair again to see if that would work, and it's been running for a long time now (way longer than before) while I'm also unzipping a backup on that machine. Honestly, I'd rather still have the additional RAM, since I'm sure this table is just fubar. And they've all gone home and left me to clean up the mess, so I'm not certain I can even add the additional RAM back now if I wanted to.
Well, it's worth a shot... you could also just transfer the corrupt table files to a different server and try repairing there if you have another server available.
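A MyISAM table is just three files on disk (.frm, .MYD, .MYI), so moving it is basically a copy while mysqld isn't touching it. Sketch only - the paths and hostname here are made up:

```
# Hypothetical datadir, database name, and destination host
cd /var/lib/mysql/forum
scp post.frm post.MYD post.MYI otherserver:/tmp/repair/
# then on the other box:
#   myisamchk -r /tmp/repair/post
```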
*sighs* http://docs.oracle.com/cd/E17952_01/refman-5.0-en/myisam-crash-recovery.html Apparently yes. Not to mention the fact that I didn't turn off all of my webservers, so the table was still getting hit with some requests. Possibly why the repair failed.
Yup, all 32GB of memory up and running. And I haven't had any warnings or alerts from the site since it came back up. Fingers crossed, but I'll wait a few days until I start feeling better.