PHP/MySQL running slowly, possibly due to amount of queries. Bit of a long read.

Discussion in 'PHP' started by TSelbeck, May 20, 2011.

  1. #1
    Hi,

    First of all, the server runs a Core i7 930, 16GB of RAM and a 20mbps port.

    The server hosts many smaller sites, but has 1 large site – the main website. The main website receives around 15,000 unique visitors per day, and around 500,000 page views per day. It runs from 1 database, with around 3,000 tables, with a combined 4,000,000 rows. The site is a target for other sites hot linking images, and scraping content (both of which are needed for the site to grow).

    Problems with the site started around 5 days ago. I have the site set up to my iPhone, so whenever MySQL fails – I get an alert straight to my iPhone, and once a minute until it is resolved. I looked into the problem a little, and was getting max_user_connections being reached in MySQL (which is currently set to something ridiculously high like 600 anyway).

    I concluded that it must just be lots of page loads that are causing the problem, so I decided to write some sort of caching system to minimise the database queries. This meant that after the 4,000,000 pages of the site had been ‘cached’, very little MySQL interaction would be needed, unless the user was logged in (in which case, a MySQL connection would be needed). However, the problem still persisted.

    I decided to look into the MySQL process list in WHM, and noticed that for the majority of the time, there were about 10 or so queries going on at once. Not a problem. However, when the alerts started arriving to my iPhone, I refreshed the list and noticed it was up to 1,500 queries on the list.

    I wasn’t sure as to whether these were all queries at the same time, or whether something was causing slow page loads and causing the queries to hang. So, I decided to install a little script to track page views of each individual IP address, finding out when they first access the site, when they last viewed a page, and how many pages they had loaded. This worked fine, until the inevitable spike in the MySQL processes list again. All of the queries were either reading or writing from the table I’d just created to keep track of the page views.

    I did manage to get a list of hundreds of IPs over the space of 30 minutes or so, and find which of them was loading lots of page views. When the MySQL spike came, I cleared the database and started a fresh to see if I could catch out 1 individual IP. I did. There was 1 IP that loaded ~2500 pages in 35 seconds. So I blocked his IP address from accessing the server.

    Problem still persisted.

    I removed the IP tracking script from the site to see if that helped, problem still persisted.

    I got no alerts to my phone from the site over the course of the night, so all seems fine there for the moment (however, the spikes could be anywhere from 20 minutes to 12 hours apart).

    The site this morning seems to be almost fine. However, some things are *really* slow. The majority of the pages on the main site load fine, however some pages are barely loading. The admin pane for instance, only makes 1 query to the database (checking the login details), yet takes an eternity to load. Other sites on the server are almost non-responsive too, and PHPmyAdmin, pretty much at a standstill, and even some basic HTML pages are sluggish

    I’m a little stuck of where to go really. Tried caching, narrowing the problem down to just 1 IP, upgraded PHP/Apache, optimized MySQL… and still having problems.

    Problem is driving me absolutely nuts. I’d essentially love to find out what pages or queries are causing the problem, or what IPs are causing the problem (sever stats disabled due to time/resources it takes to generate)

    Any help would greatly, greatly be appreciated. I’m certain there is something wrong. The site isn’t that big (compared to similar sites), yet they can cope fine with just 1 server. (For the record, the site isn’t anything related to downloads/porn/music/warez/torrents/etc)

    Apologies for the long read

    Tom
     
    TSelbeck, May 20, 2011 IP
  2. ssmm987

    ssmm987 Member

    Messages:
    180
    Likes Received:
    4
    Best Answers:
    3
    Trophy Points:
    43
    #2
    Have you checked the basic stuff like CPU, memory usage, upload/download speed, logs etc. already? If a high amount of queries would be the problem, that would be clearly visible in the statics, also it may show you if the hardware is working correctly.

    If these data shows that something is wrong, while there is no dramatic increase in traffic, than there is probably one bad php page. Scan the sources of the recently changed pages for loops, and queries. If there is a dramatic increase in traffic, than you might consider upgrading the server.
     
    ssmm987, May 20, 2011 IP
  3. TSelbeck

    TSelbeck Peon

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Hey ssmm, thanks for the reply.

    CPU Usage: 16.7%
    Memory usage: 32.44%

    Server uptime: 25 minutes 5 seconds
    Total accesses: 92556 - Total Traffic: 6.8 GB

    61.5 requests/sec - 4.6 MB/second - 76.8 kB/request
    256 requests currently being processed, 0 idle workers

    Just had a look at RAW access logs, but it's almost 2GB. Not even sure how to harvest the data from them :X

    Everything looks sort of fine from what I can tell? I agree that it could be 1 bad page causing the problems, however - no code has changed on the site in almost 2 months. Even with the site only getting around 50% of the visitors/bandwidth/page-views

    Even text files that are only a couple of kb big are taking a couple of seconds to load.

    Thanks for the help :)
     
    Last edited: May 20, 2011
    TSelbeck, May 20, 2011 IP
  4. ssmm987

    ssmm987 Member

    Messages:
    180
    Likes Received:
    4
    Best Answers:
    3
    Trophy Points:
    43
    #4
    CPU and Memory usages are looking just fine, and shouldn't be a problem like this. (I don't know if these stats are fetched when the server was slow, or fast)

    I read your post again, and came across this piece, which I overread the first time:
    
    
    I did manage to get a list of hundreds of IPs over the space of 30 minutes or so, and find which of them was loading lots of page views. When the MySQL spike came, I cleared the database and started a fresh to see if I could catch out 1 individual IP. I did. There was 1 IP that loaded ~2500 pages in 35 seconds. So I blocked his IP address from accessing the server.
    
    Code (markup):
    Looks like someone is DDossing your server (Overloading by sending a lot of requests). The fact that that no code has changed in the last months also backs up this suggestion. You can verify this by looking at the logs (Most recent data is added either on top, or on the bottom, and you may want to empty the logs if they are already almost 2 GB), If someone is Ddossing your site, you might contact their provider, or consider legal action. As I'm not a lawyer, and not familiar with laws and stuff, it's better if you contact someone who is.
     
    ssmm987, May 20, 2011 IP
  5. TSelbeck

    TSelbeck Peon

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Thanks again for the reply ssmm.

    I combed through the entire site looking for pages which could potentially be a problem and resolved the issue with those.

    The entire site has been working fine for around the last 3 hours (sure, it has been going down every 30 minutes or so, but only for a few seconds). Mysql process list in WHM shows like 3 active queries whenever I load it. However, at about 9:05pm (GMT), same thing has happened again... site froze, even sites that don't require MySQL, and mysql process list showed around 1,500 active queries.

    I thought that a Dos attack for a long while at the beginning, but it seemed strange that it only lasted for around 20-30 minutes at a time. OF course, the IP logging system that I had before would have worked a treat.. but kind of stumbles with the read/writes when the requests come flooding in, when it's needed the most.

    I might take a peep at the logs. I'll look into how to clear them, and then see if I can have a look through those.

    Thanks for the help so far. If I don't find the problem soon, I think I'm going to start to cry.
     
    TSelbeck, May 20, 2011 IP
  6. Aotearoa

    Aotearoa Member

    Messages:
    40
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    43
    #6
    The only lasting for 20-30 minutes sounds more like a misconfigured spider than a DDOS attack.

    Try adding a "Crawl-delay" line to your robots.txt to see if that helps.

    e.g. for 10 second delay between requests add a line that says:
    Crawl-delay: 10

    HTH

    Bruce
     
    Aotearoa, May 20, 2011 IP
  7. TSelbeck

    TSelbeck Peon

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Hey Aotearoa, that's not a bad idea! Thanks :D

    Think I'm getting a little further. Blocked a couple more IPs. Had the IP checker script running again, left it on for about 15 minutes. Total page views, 15,000 in 15 minutes. So nothing too crazy, really.

    Think i'll look into who is hotlinking. Kinda tricky as I don't have server stats enabled. Infact, I shall start a new thread for the hotlinking issue that I'm having.

    Thanks for the help so far :)
     
    TSelbeck, May 22, 2011 IP