100+ Requests a Second, Two Servers?

Discussion in 'Site & Server Administration' started by arunforce, May 8, 2010.

  1. #1
    I'm an administrator on MPGH, a large forum running vBulletin + vBSEO.

    We have two quad-core dedicated servers: one for PHP (LiteSpeed) and one for MySQL. We have had 1600+ guests and members online in the last 30 minutes.

    Our server load averages are 57.99, 59.24, 58.09.

    Right now, the bottleneck is the PHP server. The MySQL server is delivering content quickly, but the PHP server's processes shoot up to 100% CPU usage, then die out.

    We are considering upgrading to a Xeon server for PHP, but I'm not sure the PHP server is behaving correctly. Is this normal? Is there anything we can do to reduce the load?
     
    arunforce, May 8, 2010 IP
  2. RHS-Chris

    RHS-Chris Well-Known Member

    Messages:
    1,007
    Likes Received:
    35
    Best Answers:
    10
    Trophy Points:
    150
    #2
    Make sure you have some sort of PHP caching system installed on the web server. There is definitely an issue with loads like that.
     
    RHS-Chris, May 8, 2010 IP
  3. arunforce

    arunforce Peon

    Messages:
    10
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    We have XCache installed.
     
    arunforce, May 8, 2010 IP
  4. p.hall

    p.hall Guest

    Messages:
    26
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #4
    Do you have free memory, or is the server swapping? When you run top, is the load coming from "waiting" (I/O wait) or from something else? Also try sar -u 2, which comes from the sysstat package.
     
    Last edited: May 8, 2010
    p.hall, May 8, 2010 IP
  5. arunforce

    #5
    top - 21:14:31 up 32 days, 23:39, 1 user, load average: 59.96, 57.14, 53.96
    Tasks: 190 total, 61 running, 126 sleeping, 3 stopped, 0 zombie
    Cpu(s): 94.2%us, 4.0%sy, 0.0%ni, 0.5%id, 0.0%wa, 0.3%hi, 1.0%si, 0.0%st
    Mem: 3361384k total, 2190180k used, 1171204k free, 131016k buffers
    Swap: 4875716k total, 46440k used, 4829276k free, 1114128k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    15698 nobody 20 0 156m 19m 11m R 14.1 0.6 0:05.27 lsphp5
    15744 nobody 20 0 157m 20m 10m S 13.1 0.6 0:04.49 lsphp5
    15700 nobody 20 0 159m 22m 10m R 10.1 0.7 0:05.09 lsphp5
    15750 nobody 20 0 158m 20m 10m R 10.1 0.6 0:04.46 lsphp5
    15499 nobody 20 0 160m 25m 12m S 9.1 0.8 0:07.87 lsphp5
    15509 nobody 20 0 160m 22m 10m S 9.1 0.7 0:07.58 lsphp5
    15799 nobody 20 0 160m 21m 9280 R 9.1 0.7 0:02.59 lsphp5
    15491 nobody 20 0 160m 24m 12m S 8.1 0.8 0:07.89 lsphp5
    15505 nobody 20 0 160m 23m 10m S 8.1 0.7 0:08.47 lsphp5
    15516 nobody 20 0 161m 25m 11m R 8.1 0.8 0:08.28 lsphp5
    15522 nobody 20 0 160m 23m 10m S 8.1 0.7 0:07.74 lsphp5
    15551 nobody 20 0 157m 24m 13m R 8.1 0.7 0:07.97 lsphp5
    15752 nobody 20 0 157m 20m 10m R 8.1 0.6 0:04.93 lsphp5
    15791 nobody 20 0 156m 18m 10m R 8.1 0.6 0:04.33 lsphp5
    15511 nobody 20 0 157m 20m 11m R 7.1 0.6 0:07.90 lsphp5
    15515 nobody 20 0 160m 27m 14m S 7.1 0.8 0:08.21 lsphp5
    15535 nobody 20 0 159m 22m 11m R 7.1 0.7 0:07.89 lsphp5
    15557 nobody 20 0 158m 22m 11m R 7.1 0.7 0:07.56 lsphp5
    15615 nobody 20 0 156m 21m 12m R 7.1 0.6 0:07.28 lsphp5
    15749 nobody 20 0 157m 20m 10m R 7.1 0.6 0:04.46 lsphp5
    15751 nobody 20 0 157m 21m 11m R 7.1 0.7 0:04.91 lsphp5
    15798 nobody 20 0 159m 21m 9984 R 7.1 0.7 0:02.55 lsphp5
    15809 nobody 20 0 156m 17m 8496 R 7.1 0.5 0:01.79 lsphp5
    15870 nobody 20 0 159m 19m 8680 R 7.1 0.6 0:00.40 lsphp5
    15521 nobody 20 0 160m 23m 10m S 6.0 0.7 0:08.18 lsphp5
    15524 nobody 20 0 160m 22m 9.8m R 6.0 0.7 0:06.59 lsphp5
    15525 nobody 20 0 161m 26m 12m S 6.0 0.8 0:08.06 lsphp5
    15586 nobody 20 0 159m 24m 11m R 6.0 0.7 0:07.92 lsphp5
    15616 nobody 20 0 160m 22m 9.9m R 6.0 0.7 0:06.74 lsphp5
    15657 nobody 20 0 157m 21m 11m R 6.0 0.7 0:07.22 lsphp5
    15738 nobody 20 0 157m 21m 11m R 6.0 0.7 0:04.67 lsphp5
    15739 nobody 20 0 157m 18m 9344 R 6.0 0.6 0:04.62 lsphp5
    15740 nobody 20 0 157m 20m 10m R 6.0 0.6 0:05.14 lsphp5
    15743 nobody 20 0 157m 19m 9.9m R 6.0 0.6 0:04.57 lsphp5
    15801 nobody 20 0 157m 19m 10m R 6.0 0.6 0:02.28 lsphp5
    15518 nobody 20 0 161m 26m 12m S 5.0 0.8 0:07.72 lsphp5
    15519 nobody 20 0 160m 25m 12m S 5.0 0.8 0:07.79 lsphp5
    15520 nobody 20 0 159m 21m 10m R 5.0 0.7 0:07.90 lsphp5
    15526 nobody 20 0 160m 23m 10m R 5.0 0.7 0:08.24 lsphp5
    15528 nobody 20 0 159m 22m 10m R 5.0 0.7 0:07.81 lsphp5
    15546 nobody 20 0 160m 23m 10m R 5.0 0.7 0:06.83 lsphp5
    15699 nobody 20 0 160m 22m 10m R 5.0 0.7 0:04.51 lsphp5
    15701 nobody 20 0 160m 22m 9.8m R 5.0 0.7 0:05.08 lsphp5
    15755 nobody 20 0 157m 19m 10m R 5.0 0.6 0:04.54 lsphp5
    15790 nobody 20 0 159m 21m 9.8m R 5.0 0.6 0:04.06 lsphp5
    15792 nobody 20 0 160m 23m 10m R 5.0 0.7 0:04.00 lsphp5
    15800 nobody 20 0 160m 21m 8832 R 5.0 0.7 0:02.16 lsphp5
    15805 nobody 20 0 158m 20m 9488 R 5.0 0.6 0:01.67 lsphp5
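
A rough way to read the summary lines of the top output above is sketched below. It parses the Cpu(s) line and divides the 1-minute load average by the core count. The thresholds and the function names are illustrative assumptions, not tuning advice, and the 4 cores are taken from the "quad-core" description in the first post.

```python
import re

def parse_cpu_line(line):
    """Turn top's 'Cpu(s): 94.2%us, 4.0%sy, ...' line into {'us': 94.2, 'sy': 4.0, ...}."""
    return {name: float(value) for value, name in re.findall(r"([\d.]+)%(\w+)", line)}

def classify(cpu, io_threshold=10.0, busy_threshold=80.0):
    # Both thresholds are arbitrary illustration values (assumptions).
    if cpu.get("wa", 0.0) >= io_threshold:
        return "io-bound"
    if cpu.get("us", 0.0) + cpu.get("sy", 0.0) >= busy_threshold:
        return "cpu-bound"
    return "not saturated"

# Cpu(s) line copied from the top output in this post.
cpu = parse_cpu_line(
    "Cpu(s): 94.2%us, 4.0%sy, 0.0%ni, 0.5%id, 0.0%wa, 0.3%hi, 1.0%si, 0.0%st"
)
verdict = classify(cpu)

# 1-minute load average from the same output, spread over an assumed 4 cores.
load_per_core = 59.96 / 4

print(verdict, round(load_per_core, 2))
```

With almost no time in wa (I/O wait) and nearly all of it in us (user space), the numbers point at CPU-bound PHP work rather than disk or swap pressure.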
     
    arunforce, May 8, 2010 IP
  6. p.hall

    #6
    Can you check, either in the logs or with netstat -na | grep :80 | sort, whether you have many connections from a single IP? You might be the victim of a DoS attack. If so, you can install mod_evasive to mitigate the attack.
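
Sorted netstat output can be long, so counting connections per remote address is often easier to read. Below is a minimal sketch that assumes the usual six-column netstat layout (proto, recv-q, send-q, local address, foreign address, state). The sample lines use made-up documentation addresses, and connections_per_ip is a hypothetical helper name.

```python
from collections import Counter

def connections_per_ip(netstat_lines, port=80):
    """Count connections to the given local port, grouped by remote IP."""
    counts = Counter()
    suffix = ":" + str(port)
    for line in netstat_lines:
        fields = line.split()
        # Skip anything that is not a six-column line for the chosen local port.
        if len(fields) < 6 or not fields[3].endswith(suffix):
            continue
        remote_ip = fields[4].rsplit(":", 1)[0]
        counts[remote_ip] += 1
    return counts

# Made-up sample in the same shape as `netstat -na` output (not real data).
sample = [
    "tcp 0 0 192.0.2.10:80 198.51.100.7:11308 TIME_WAIT",
    "tcp 0 0 192.0.2.10:80 198.51.100.7:11309 TIME_WAIT",
    "tcp 0 0 192.0.2.10:80 198.51.100.7:11310 ESTABLISHED",
    "tcp 0 0 192.0.2.10:80 203.0.113.5:4296 ESTABLISHED",
    "tcp 0 0 192.0.2.10:22 203.0.113.5:5000 ESTABLISHED",
]

counts = connections_per_ip(sample)
for ip, n in counts.most_common():
    print(n, ip)
```

A handful of addresses holding far more connections than everyone else would suggest an attack; a flat distribution would suggest ordinary traffic.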
     
    p.hall, May 8, 2010 IP
  7. arunforce

    #7
    We often get DDoSed, but I do not believe this to be one.

    tcp 0 0 93.190.140.127:80 125.166.93.210:11308 TIME_WAIT
    tcp 0 0 93.190.140.127:80 125.166.93.210:11309 TIME_WAIT
    tcp 0 0 93.190.140.127:80 125.166.93.210:11310 TIME_WAIT
    tcp 0 0 93.190.140.127:80 125.166.93.210:11311 TIME_WAIT
    tcp 0 0 93.190.140.127:80 125.166.93.210:11321 TIME_WAIT
    tcp 0 0 93.190.140.127:80 125.166.93.210:11322 TIME_WAIT
    tcp 0 0 93.190.140.127:80 131.191.81.102:55166 TIME_WAIT
    tcp 0 0 93.190.140.127:80 131.191.81.102:55604 TIME_WAIT
    tcp 0 0 93.190.140.127:80 131.191.81.102:55609 TIME_WAIT
    tcp 0 0 93.190.140.127:80 165.228.190.83:65127 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 173.174.51.254:4296 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.174.51.254:4365 TIME_WAIT
    tcp 0 0 93.190.140.127:80 173.174.51.254:4367 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.174.51.254:4368 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.174.51.254:4369 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 173.174.51.254:4370 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.2.98.190:52436 TIME_WAIT
    tcp 0 0 93.190.140.127:80 173.34.225.234:53613 TIME_WAIT
    tcp 0 0 93.190.140.127:80 173.34.225.234:53675 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.34.225.234:53697 TIME_WAIT
    tcp 0 0 93.190.140.127:80 173.34.225.234:53709 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.34.225.234:53710 TIME_WAIT
    tcp 0 0 93.190.140.127:80 173.34.225.234:53712 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.34.225.234:53756 ESTABLISHED
    tcp 0 0 93.190.140.127:80 173.34.225.234:53762 ESTABLISHED
    tcp 0 0 93.190.140.127:80 174.34.156.130:35711 ESTABLISHED
    tcp 0 0 93.190.140.127:80 174.92.135.44:4398 ESTABLISHED
    tcp 0 0 93.190.140.127:80 186.58.133.202:21587 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:2628 ESTABLISHED
    tcp 0 0 93.190.140.127:80 186.58.133.202:29230 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:30769 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 186.58.133.202:35529 ESTABLISHED
    tcp 0 0 93.190.140.127:80 186.58.133.202:37042 ESTABLISHED
    tcp 0 0 93.190.140.127:80 186.58.133.202:41924 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:4495 ESTABLISHED
    tcp 0 0 93.190.140.127:80 186.58.133.202:50886 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:51094 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:5405 ESTABLISHED
    tcp 0 0 93.190.140.127:80 186.58.133.202:55024 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:55104 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:57327 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:60305 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:61595 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:62560 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:64906 TIME_WAIT
    tcp 0 0 93.190.140.127:80 186.58.133.202:6684 ESTABLISHED
    tcp 0 0 93.190.140.127:80 186.58.133.202:8025 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 187.126.29.173:60165 TIME_WAIT
    tcp 0 0 93.190.140.127:80 187.126.29.173:60226 TIME_WAIT
    tcp 0 0 93.190.140.127:80 187.126.29.173:60236 TIME_WAIT
    tcp 0 0 93.190.140.127:80 187.126.29.173:60237 TIME_WAIT
    tcp 0 0 93.190.140.127:80 187.159.25.21:54699 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 187.35.236.226:57584 TIME_WAIT
    tcp 0 0 93.190.140.127:80 189.110.244.127:11680 ESTABLISHED
    tcp 0 0 93.190.140.127:80 189.175.3.10:1249 ESTABLISHED
    tcp 0 0 93.190.140.127:80 189.182.151.180:50545 ESTABLISHED
    tcp 0 0 93.190.140.127:80 189.46.4.141:51093 ESTABLISHED
    tcp 0 0 93.190.140.127:80 189.46.4.141:51094 ESTABLISHED
    tcp 0 0 93.190.140.127:80 189.69.151.219:2325 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 190.34.9.72:4659 TIME_WAIT
    tcp 0 0 93.190.140.127:80 190.34.9.72:4662 TIME_WAIT
    tcp 0 0 93.190.140.127:80 190.34.9.72:4663 ESTABLISHED
    tcp 0 0 93.190.140.127:80 192.71.148.10:32498 TIME_WAIT
    tcp 0 0 93.190.140.127:80 192.71.148.10:35757 TIME_WAIT
    tcp 0 0 93.190.140.127:80 192.71.148.10:35763 TIME_WAIT
    tcp 0 0 93.190.140.127:80 192.71.148.10:36740 TIME_WAIT
    tcp 0 0 93.190.140.127:80 193.126.159.58:62701 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 193.126.159.58:62707 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 193.126.159.58:62708 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 193.126.159.58:62709 ESTABLISHED
    tcp 0 0 93.190.140.127:80 193.126.159.58:62710 FIN_WAIT2
    tcp 0 0 93.190.140.127:80 193.126.159.58:62711 ESTABLISHED
    tcp 0 0 93.190.140.127:80 193.126.159.58:62720 ESTABLISHED
    tcp 0 0 93.190.140.127:80 198.53.161.147:60253 TIME_WAIT
    tcp 0 0 93.190.140.127:80 200.103.128.97:61147 TIME_WAIT

    AND

    [root@j5 ~]# sar -u 2
    Linux 2.6.28.2 (j5) 05/08/2010

    09:28:05 PM CPU %user %nice %system %iowait %steal %idle
    09:28:07 PM all 93.89 0.00 5.86 0.00 0.00 0.25
    Average: all 93.89 0.00 5.86 0.00 0.00 0.25
     
    arunforce, May 8, 2010 IP
  8. arunforce

    #8
    OK, I asked the other administrator: we already have a DDoS prevention system installed. We have a firewall limit of 50 connections per user. Do we need to lower it?
     
    arunforce, May 8, 2010 IP
  9. p.hall

    #9
    If you pasted all lines from netstat in the post above, then it definitely doesn't look like DoS, so there is no need to touch the limits.
     
    p.hall, May 8, 2010 IP
  10. arunforce

    #10
    That isn't the entire netstat output; the full output is HUGE.
     
    arunforce, May 8, 2010 IP
  11. arunforce

    #11
    OK, I'm assuming there is nothing that can be done, so I'll be upgrading.

    However, there is a disagreement between us about which to upgrade to.

    Upgrade the server to:
    - Core i7 (8 Cores)
    - Xeon (4 Cores)
    - Dual Xeon (Turn both servers into one, 2x4 Cores)

    Which would be the logical choice?
     
    arunforce, May 8, 2010 IP
  12. RHS-Chris

    #12
    IMO, upgrading will not help. A software-based DDoS deterrent system is only good for low-end attacks; you need a hardware-based system to mitigate large-scale attacks, assuming that is what this is. With the setup you currently have, you should be able to handle those loads with ease! Look through your log files, try tracing some of those processes, and figure out what they are doing. I would also try disabling your vB plugins for a bit to see whether the load comes down to a normal level. If it does, re-enable them one by one to see which one is the culprit. I think upgrading right now is just throwing money away; you need to troubleshoot what the issue is.
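
The one-by-one plugin check described above could be organised roughly as follows. This is only a sketch: find_culprits and measure_load are hypothetical names, the plugin names and load figures are invented, and in practice measure_load would mean toggling plugins in the admin panel and reading the load average after traffic settles.

```python
def find_culprits(plugins, measure_load, baseline, tolerance=1.0):
    """Enable plugins one at a time; return those that raise load by more than tolerance."""
    enabled = []
    deltas = {}
    previous = baseline
    for plugin in plugins:
        enabled.append(plugin)
        current = measure_load(list(enabled))
        # Attribute the change in load to the plugin that was just enabled.
        deltas[plugin] = current - previous
        previous = current
    return [p for p in plugins if deltas[p] > tolerance]

# Simulated measurements (assumption): each fake plugin adds a fixed amount of load.
fake_costs = {"plugin_a": 0.2, "plugin_b": 6.0, "plugin_c": 0.1}
baseline_load = 1.5

def simulated_load(enabled):
    return baseline_load + sum(fake_costs[p] for p in enabled)

culprits = find_culprits(list(fake_costs), simulated_load, baseline_load)
print(culprits)
```

Real load readings are noisy, so each measurement would need to be taken over a comparable traffic period for the differences to mean anything.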
     
    RHS-Chris, May 8, 2010 IP
  13. arunforce

    #13
    It's not just the DDoS; the server also lags on weekends. Generally, when there are 1200+ people on during a 30-minute period, our server slows down, with a load average of about 10.
     
    arunforce, May 8, 2010 IP
  14. sysadmin

    sysadmin Peon

    Messages:
    111
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #14
    Hello,

    Please use fast HDDs like SAS and good RAM. Maybe you can consider a small cluster too. :)
     
    sysadmin, May 9, 2010 IP
  15. p.hall

    #15
    His problem doesn't seem to be related to the disks or RAM. His server isn't swapping and there is not much I/O activity. It seems like his Apache / PHP configuration needs to be optimized.
     
    p.hall, May 10, 2010 IP
  16. arunforce

    #16
    We are upgrading to vB4 today, I think, without any extensions. This should help determine whether the issue is related to plugins (99% sure it isn't; we tried this previously).
     
    arunforce, May 10, 2010 IP