Can't seem to get php cpu spikes under control

Discussion in 'Site & Server Administration' started by sfraise, Sep 4, 2010.

  1. #1
    I'm trying to get this under control, but so far I'm having no luck.
    I'm running a Joomla 1.5 site under Apache 2.2, PHP 5.3.3, eAccelerator, Zend, memcached/mod_memcache, and FastCGI with the worker MPM.
    Server specs are:
    Intel Core 2 Quad Q9400 2.66 GHz
    8 GB DDR2 RAM
    64-bit
    1 TB HDD

    Everything runs smoothly, with the load average fluctuating between 0.5 and 3, then out of nowhere it spikes to 40, 50, or even 70. Looking at top, I don't see anything eating that much CPU other than php, and the error logs don't show any scripts with fatal errors stuck in an endless loop. Spamd fails when php spikes the load, and I get a system email notifying me. The load comes back under control after a few minutes, and then the cycle starts all over again.

    I'm running ClamAV, MailScanner, ConfigServer csf/lfd, mod_security, and maldetect once a day, and I run chkrootkit once in a while. A few months ago I was testing a new login component that turned out to have an exploit in it, and we got hammered. Everything seems to be cleaned out: nothing shows in any scans, and I don't see any offending scripts when going through the files by hand, but there's always the possibility that something is still in there somewhere undetected.

    I've gone through and turned off modules in Joomla trying to find the offending code, but it doesn't seem to matter. I've upgraded Joomla to the latest 1.5 release, which did seem to help a bit but still didn't stop the sudden spikes.

    I rolled PHP back to 5.3.2 but I'm still getting high CPU spikes, with php hogging the majority. Here's what a snapshot from top looks like:

    Tasks: 191 total, 2 running, 189 sleeping, 0 stopped, 0 zombie
    Cpu(s): 34.4%us, 3.3%sy, 0.0%ni, 60.6%id, 0.9%wa, 0.1%hi, 0.8%si, 0.0%st
    Mem: 8177492k total, 5500940k used, 2676552k free, 52472k buffers
    Swap: 2096472k total, 49036k used, 2047436k free, 2965196k cached

      PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    24164 oohyane  16   0  215m  63m  22m R 63.6  0.8   0:11.39 php
    24150 oohyane  16   0  220m  68m  22m S 29.6  0.9   0:17.10 php
    24170 oohyane  16   0  203m  51m  22m S 27.6  0.6   0:09.79 php
     5185 nobody   18   0  441m  42m 2692 S  9.0  0.5   4:49.70 httpd
     2718 mysql    15   0  611m 222m 4100 S  5.0  2.8 100:47.11 mysqld
     5310 nobody   18   0  436m  38m 2708 S  2.3  0.5   4:56.47 httpd
     5186 nobody   18   0 1249m  38m 2728 S  2.0  0.5   5:03.47 httpd
     5272 nobody   18   0  432m  37m 2692 S  2.0  0.5   5:21.10 httpd
    17024 nobody   18   0  431m  31m 2644 S  2.0  0.4   1:08.55 httpd
    19580 nobody   18   0  364m  27m 2624 S  1.3  0.3   0:37.67 httpd
     9978 nobody   18   0  436m  36m 2732 S  1.0  0.5   3:08.89 httpd
    17059 nobody   18   0  363m  26m 2344 S  1.0  0.3   1:03.24 httpd
    19841 nobody   18   0  358m  21m 2640 S  1.0  0.3   0:35.11 httpd
     2056 root     10  -5     0    0    0 S  0.7  0.0  35:31.76 kondemand/2
    17135 root     18   0  370m  69m 9.9m S  0.7  0.9   8:00.47 java
      566 root     11  -5     0    0    0 S  0.3  0.0   3:08.29 kjournald
    18221 root     20   0  118m  16m 1592 S  0.3  0.2   0:46.87 lfd

    I really need to get this figured out, I can't afford to keep spending the majority of my day messing with this instead of actually developing sites.
     
    Last edited: Sep 4, 2010
    sfraise, Sep 4, 2010 IP
  2. madaboutlinux

    #2
    Does this spike happen once a day or more often? If often, does it happen at the same time of day? Have you checked whether a cron job is causing the load spike?
     
    madaboutlinux, Sep 5, 2010 IP
  3. YoGem

    #3
    Mmm... maybe a series of stupid questions, but:

    Is your dedicated server accessible via IP? Are you testing scripts and websites on it? Have you looked at the Apache (or other HTTP server daemon) logs to see whether you're getting strange requests? It could be a brute-force attempt to find SQL injection holes, or XSS probing against your PHP scripts.
     
    YoGem, Sep 5, 2010 IP
  4. anands

    #4
    Have you set up a PHP opcode cache like APC, eAccelerator, or XCache? It can help reduce the load on the server. It looks like a lot of php processes are running on the server because of a high number of requests. A cache reduces how often scripts have to be recompiled.
     
    anands, Sep 5, 2010 IP
  5. AnthonyG

    #5
    1. The worker MPM for Apache doesn't work very well at all.
    2. I'd agree with anands, but would use XCache.
    3. I'd turn off mod_security and use Suhosin instead.
    4. Turn off the FastCGI manager and run PHP-FPM instead.
    5. Turn off Apache and use nginx instead.

    To truly see what's causing the spikes, I would strace the PIDs while the load is spiking.

    strace -p 24164

    Realistically you won't be able to read the strace live because it streams too fast, so I would write the output to a file and read over it afterwards.
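
    As a rough sketch of writing the trace to a file (the helper names and the default output path are made up for illustration): -o sends the trace to a file, -f follows child processes, -tt adds timestamps, and -T shows the time spent in each call. The second helper counts which syscalls show up most often in the saved trace, assuming the usual "PID TIMESTAMP syscall(args) = result" line layout.

```shell
# Hypothetical helper: attach strace to a running PID and save the trace to a file.
# The default output path is only an example.
trace_pid() {
    pid="$1"
    out="${2:-/root/strace-$pid.log}"
    # -f follow children, -tt wall-clock timestamps, -T time spent per syscall,
    # -o write to a file instead of the terminal, -p attach to an existing process
    strace -f -tt -T -o "$out" -p "$pid"
}

# Hypothetical helper: count syscalls by name in a saved trace, most frequent first.
# Lines that are not syscalls (signals, exit notices) are skipped.
summarize_trace() {
    sed -nE 's/^([0-9]+ +)?([0-9:.]+ +)?([a-z_0-9]+)\(.*/\3/p' "$1" |
        sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

    Something like trace_pid 24164 /root/php-24164.trace while the spike is happening, Ctrl-C once it calms down, then summarize_trace /root/php-24164.trace | head to see what the process was busy with.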
     
    AnthonyG, Sep 6, 2010 IP
  6. sfraise

    #6
    Thanks for the replies so far. As I mentioned in the original post, I am running eAccelerator as well as memcached/mod_memcache. I have Joomla set to use memcache as the cache option, and in php.ini I have memcached set to handle the PHP sessions.

    I've gone through and disabled extensions hoping to pinpoint which one could be causing the issue, but even with just the bare-bones essentials the CPU load still spikes at times. I'm at the point of thinking the template I'm using may be the cause, as it was originally built for 1.0 and I tweaked it to work with 1.5 when I migrated. I'm going to do a complete rebuild using a new template as soon as I get the database copied over to a sandbox test account. csf is telling me that processes suddenly run wild, over 93 at times, when the CPU load spikes, and I'm also concerned about the mail setup on the account, as exim, spamd, POP, and IMAP tend to fail when the CPU spikes.

    Here's some output from my csf/lfd log. What do you guys think of what it's showing?
    Sep 6 16:51:25 server lfd[23375]: *Email Queue* The exim delivery queue size is 196092
    Sep 6 16:58:50 server lfd[24042]: Directory Watching terminated after 22 seconds
    Sep 6 16:58:50 server lfd[24042]: LF_DIRWATCH taking 22 seconds, temporarily throttled to run every 360 seconds
    Sep 6 17:01:59 server lfd[24153]: *LOAD* 5 minute load average is 17.57, threshold is 6 - email sent
    Sep 6 17:04:29 server lfd[24235]: *Skipped File* /tmp/#sql_ab1_0.MYD - Too large to scan
    Sep 6 17:06:31 server lfd[24272]: *Excessive Processes* User:oohyane Kill:0 Process Count:16
    Sep 6 17:07:30 server lfd[24235]: Directory Watching terminated after 46 seconds
    Sep 6 17:07:30 server lfd[24235]: LF_DIRWATCH taking 46 seconds, temporarily throttled to run every 1080 seconds
    Sep 6 17:13:49 server lfd[25900]: 5 (sshd) login failures from 201.38.138.2 (BR/Brazil/-) in the last 300 secs - *Blocked in csf*
    Sep 6 17:14:34 server lfd[25981]: *SSH login* from 216.51.193.200 into the root account using password authentication
    Sep 6 17:51:39 server lfd[29370]: *Email Queue* The exim delivery queue size is 196099
    Sep 6 18:02:04 server lfd[30624]: *LOAD* 5 minute load average is 13.37, threshold is 6 - email sent

    Sep 7 00:37:59 server lfd[6525]: 5 (mod_security) rule triggers from 67.83.75.157 (US/United States/ool-43534b9d.dyn.optonline.net) in the last 300 secs - *Blocked in csf*
    Sep 7 00:40:46 server lfd[6689]: *Email Queue* Unable to obtain exim_outgoing.conf queue length within 30 seconds - Timed out
    Sep 7 00:42:16 server lfd[6707]: *Skipped File* /tmp/#sql_ab1_0.MYD - Too large to scan
    Sep 7 00:46:19 server lfd[6881]: *Excessive Processes* User:oohyane Kill:0 Process Count:16
    Sep 7 00:49:04 server lfd[7825]: *LOAD* 5 minute load average is 23.80, threshold is 6 - email sent
    Sep 7 00:54:20 server lfd[8203]: *Skipped File* /tmp/#sql_ab1_0.MYD - Too large to scan
    Sep 7 00:58:24 server lfd[8515]: 5 (sshd) login failures from 122.72.31.130 (CN/China/-) in the last 300 secs - *Blocked in csf*
    Sep 7 01:49:33 server lfd[13940]: *LOAD* 5 minute load average is 11.07, threshold is 6 - email sent
    Sep 7 02:00:08 server lfd[14733]: *System Integrity* has detected modified file(s): /usr/bin/pure-pw /usr/bin/pure-pwconvert /usr/bin/pure-statsdecode /usr/sbin/exim /usr/sbin/exim_dbmbuild /usr/sbin/exim_dumpdb /usr/sbin/exim_fixdb /usr/sbin/exim_lock /usr/sbin/exim_tidydb /usr/sbin/pure-authd /usr/sbin/pure-ftpd /usr/sbin/pure-ftpwho /usr/sbin/pure-mrtginfo /usr/sbin/pure-quotacheck /usr/sbin/pure-uploadscript /usr/sbin/runq /usr/sbin/sendmail
    Sep 7 02:26:52 server lfd[16805]: *Excessive Processes* User:oohyane Kill:0 Process Count:16
    Sep 7 02:40:14 server lfd[18280]: *WHM root access* from 216.51.193.200
    Sep 7 03:24:26 server lfd[21998]: *LOAD* 5 minute load average is 7.65, threshold is 6 - email sent
    Sep 7 03:51:26 server lfd[24104]: *Email Queue* Unable to obtain exim queue length within 30 seconds - Timed out
    Sep 7 03:53:11 server lfd[24171]: *Excessive Processes* User:oohyane Kill:0 Process Count:93
    Sep 7 03:54:11 server lfd[24188]: *User Processing* PID:23666 Kill:0 User:oohyane VM:219(MB) EXE:/usr/bin/php CMD:/usr/bin/php
    Sep 7 03:54:40 server lfd[23996]: Directory Watching terminated after 46 seconds
    Sep 7 03:54:40 server lfd[23996]: LF_DIRWATCH taking 46 seconds, temporarily throttled to run every 1080 seconds
    Sep 7 03:55:11 server lfd[24361]: *User Processing* PID:22040 Kill:0 User:oohyane VM:221(MB) EXE:/usr/bin/php CMD:/usr/bin/php
     
    sfraise, Sep 7, 2010 IP
  7. AnthonyG

    #7
    First, I'd start by checking why this user is causing log entries like these:
    *Excessive Processes* User:oohyane Kill:0 Process Count:93

    Second, I'd look into why MySQL is puking out, as well as creating tmp tables:
    *Skipped File* /tmp/#sql_ab1_0.MYD - Too large to scan
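
    To line the process counts up against the load alerts, something along these lines could pull both out of the lfd log. This is only a sketch: the function names are made up, and /var/log/lfd.log is assumed to be where csf writes the log on this box.

```shell
# Hypothetical helpers for pulling numbers out of an lfd log.
# Pass the log path as the first argument (assumed: /var/log/lfd.log).

# Print "month day time count" for every *Excessive Processes* entry.
excessive_procs() {
    grep -F '*Excessive Processes*' "$1" |
        awk '{ n = split($NF, part, ":"); print $1, $2, $3, part[n] }'
}

# Print "month day time load" for every *LOAD* alert.
load_alerts() {
    grep -F '*LOAD*' "$1" |
        awk '{
            for (i = 1; i < NF - 1; i++)
                if ($i == "average") {
                    v = $(i + 2)        # the value after "average is"
                    sub(/,$/, "", v)    # drop the trailing comma
                    print $1, $2, $3, v
                }
        }'
}
```

    Running excessive_procs /var/log/lfd.log and load_alerts /var/log/lfd.log side by side should show whether the big process counts land at the same times as the load spikes.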
     
    AnthonyG, Sep 7, 2010 IP
  8. RRWH

    #8
    There are so many parts here that are not helping you. A couple of suggestions:

    Your mail queue is around 190K messages, which could be the culprit.

    Every time the mail queue runs, the whole server will bog down.

    Sort out the MySQL issues: at least tune it and know exactly what it is doing.

    Once you get a handle on these two items, move on to looking at Apache.
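
    A minimal sketch for keeping an eye on the queue, assuming exim is on the PATH; the function name and the 1000-message threshold are arbitrary examples. exim -bpc prints the number of messages currently queued.

```shell
# Hypothetical helper: warn when the exim queue holds more messages than a limit.
# Returns 0 when the queue is under the limit, 1 when it is over, 2 if exim fails.
check_exim_queue() {
    limit="${1:-1000}"                 # example threshold, not a recommendation
    count=$(exim -bpc) || return 2     # exim -bpc prints the queue size
    if [ "$count" -gt "$limit" ]; then
        echo "exim queue: $count messages (limit $limit)"
        return 1
    fi
    echo "exim queue OK: $count messages"
}
```

    If the count is huge, exim -bp | exiqsumm gives a per-domain summary of what is actually sitting in the queue before you decide what to delete.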
     
    RRWH, Sep 8, 2010 IP
  9. sfraise

    #9
    I deleted all of the messages in the mail queue.
    MySQL seems to be running pretty well and isn't using many resources; I think I've got it tuned pretty tightly at this point.
    The only thing I really need to tackle on the database side now is the indexes, and making sure the scripts are taking advantage of them.
    The reason you're seeing the MySQL files being written to the tmp folder is caching, the same reason csf gave similar notices about eaccelerator.so files being skipped for being too large before I set it to ignore them. I did just bump up my tmp table size in my.cnf to make sure tmp tables aren't getting written to disk (phpMyAdmin's status page showed 23k had been written to disk).
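
    For what it's worth, here is a rough way to check the same counters from the shell instead of phpMyAdmin. It is only a sketch: the function name is made up, and it assumes mysqladmin can log in without extra options (for example through a ~/.my.cnf). One thing to keep in mind is that MySQL caps in-memory tmp tables at the smaller of tmp_table_size and max_heap_table_size, so raising only one of them may change nothing.

```shell
# Hypothetical helper: report how many tmp tables MySQL had to create on disk.
# Reads the Created_tmp_disk_tables and Created_tmp_tables status counters.
tmp_table_ratio() {
    mysqladmin extended-status |
        awk -F'|' '
            { name = $2; value = $3; gsub(/ /, "", name); gsub(/ /, "", value) }
            name == "Created_tmp_disk_tables" { disk = value + 0 }
            name == "Created_tmp_tables"      { total = value + 0 }
            END {
                if (total > 0)
                    printf "%d of %d tmp tables went to disk (%.1f%%)\n",
                           disk, total, 100 * disk / total
            }'
}
```

    The counters are cumulative since the last MySQL restart, so the useful thing is whether the disk number keeps climbing after a my.cnf change.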

    The major issue seems to be some bad PHP code somewhere causing php to eat huge portions of the CPU resources, but I'm not sure exactly what's causing it yet.
    I did narrow one issue down to sh404sef, the URL rewriting component I use, which causes the following 500 error in debug mode:
    JDatabaseMySQL::query: 1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 2 SQL=SELECT username FROM jos_users WHERE id=

    Call stack
    # Function Location
    1 JSite->render() /home/oohyane/public_html/index.php:79
    2 JDocumentHTML->render() /home/oohyane/public_html/includes/application.php:168
    3 JDocumentHTML->_parseTemplate() /home/oohyane/public_html/libraries/joomla/document/html/html.php:249
    4 JDocumentHTML->getBuffer() /home/oohyane/public_html/libraries/joomla/document/html/html.php:386
    5 JDocumentRendererModules->render() /home/oohyane/public_html/libraries/joomla/document/html/html.php:190
    6 JDocumentRendererModule->render() /home/oohyane/public_html/libraries/joomla/document/html/renderer/modules.php:41
    7 JModuleHelper->renderModule() /home/oohyane/public_html/libraries/joomla/document/html/renderer/module.php:84
    8 require() /home/oohyane/public_html/libraries/joomla/application/module/helper.php:173
    9 cbPluginHandler->trigger() /home/oohyane/public_html/modules/mod_cblogin/mod_cblogin.php:460
    10 cbPluginHandler->call() /home/oohyane/public_html/administrator/components/com_comprofiler/plugin.class.php:509
    11 call_user_func_array() /home/oohyane/public_html/administrator/components/com_comprofiler/plugin.class.php:551
    12 getprofilebookTab->onAfterLogoutForm()
    13 cbTabHandler->_getAbsURLwithParam() /home/oohyane/public_html/components/com_comprofiler/plugin/user/plug_cbprofilebook/cb.profilebook.php:169
    14 cbSef() /home/oohyane/public_html/administrator/components/com_comprofiler/plugin.class.php:3072
    15 CBframework->cbSef() /home/oohyane/public_html/administrator/components/com_comprofiler/plugin.foundation.php:2469
    16 call_user_func_array() /home/oohyane/public_html/administrator/components/com_comprofiler/plugin.foundation.php:2121
    17 JRoute::_()
    18 shRouter->build() /home/oohyane/public_html/libraries/joomla/methods.php:54
    19 JRouter->build() /home/oohyane/public_html/plugins/system/shsef.php:250
    20 shRouter->_buildSefRoute() /home/oohyane/public_html/libraries/joomla/application/router.php:167
    21 shSefRelToAbs() /home/oohyane/public_html/plugins/system/shsef.php:405
    22 sef_404->create() /home/oohyane/public_html/administrator/components/com_sh404sef/sh404sef.class.php:1665
    23 include() /home/oohyane/public_html/components/com_sh404sef/sef_ext.php:300
    24 JDatabaseMySQL->loadResult() /home/oohyane/public_html/components/com_sh404sef/sef_ext/com_comprofiler.php:184
    25 JDatabaseMySQL->query() /home/oohyane/public_html/libraries/joomla/database/database/mysql.php:355
    26 JError->raiseError() /home/oohyane/public_html/libraries/joomla/database/database/mysql.php:231
    27 JError->raise() /home/oohyane/public_html/libraries/joomla/error/error.php:171
    28 JException->__construct() /home/oohyane/public_html/libraries/joomla/error/error.php:136

    I upgraded sh404sef to the latest build and it seems to have helped the CPU load significantly; however, I'm still getting temporary high spikes.
    I'm going to try running with sh404sef completely disabled for a few hours and see what the CPU load looks like. I hate to do it, because running regular dynamic URLs instead of the SEF ones looks like duplicate content to the search engines and really hurts my ranking, but hopefully it won't get picked up in just a few hours.

    After sh404sef, the next place I'll look is Community Builder. I don't really see any errors from it without sh404sef enabled, but it's the biggest component on the site, so I want to go through it completely.
     
    sfraise, Sep 8, 2010 IP
  10. sfraise

    #10
    I upgraded sh404sef as well as Community Builder, and even though the spikes aren't as high or long-lasting, they're still occurring.
    I identified an integration issue between these two components and brought it to the sh404sef team's attention; they're releasing a new version with a fix in the next few days.
    However, even with that fix I don't think this is going to completely solve the CPU load issues.
    I still think something else is causing php to run off and eat resources.
     
    sfraise, Sep 10, 2010 IP
  11. sfraise

    #11
    I went back to prefork instead of worker and saw a tremendous increase in performance. I'm guessing some extension didn't cope well with multi-threading and ended up hurting performance.
    I also turned off gzip in the Joomla backend and switched to mod_deflate, which seemed to knock a couple of seconds off the page load time as well.

    Load has been running in the 0.45 range for the last couple of hours with no spikes, and the site just "feels" smoother and faster now.
    Not sure if this was the magic cure, but I think it's at least one more big step in the right direction.

    ** It's been a few more hours with no load spikes whatsoever. I just checked again and it was at 0.09! Lol, I didn't know that was even possible with this site, with everything it has going on.
    I can't even begin to explain how happy I am now lol.
     
    Last edited: Sep 14, 2010
    sfraise, Sep 14, 2010 IP
  12. sfraise

    #12
    After several hours I did get a spike in load for a couple of minutes.
    After a bit of investigating I noticed that /tmp and /var/tmp were 100% full.
    Since eAccelerator is set to write to the /tmp directory by default, it's clear I need to change its cache dir path to keep it from filling up /tmp (/tmp is symlinked to /var/tmp).
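
    In the meantime, a quick sketch for watching how full /tmp is and what is taking up the space. The function names are made up for illustration, and the defaults (/tmp, top 10 entries) are just examples.

```shell
# Hypothetical helper: print how full a filesystem is, as a bare percentage.
fs_usage_pct() {
    # -P forces the POSIX one-line-per-filesystem format, so column 5 is the usage
    df -P "${1:-/tmp}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: list the largest entries directly under a directory,
# biggest first, with sizes in kilobytes.
biggest_in() {
    du -sk "${1:-/tmp}"/* 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

    fs_usage_pct /tmp prints just the number, which makes it easy to drop into a cron check, and biggest_in /tmp shows whether it really is the eAccelerator cache that is filling the partition.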

    The only problem is that I installed eAccelerator through EasyApache, which is apparently quite different from installing it manually.
    It puts the eaccelerator.ini file in /home/cpeasyapache/src/eaccelerator/eaccelerator-0.9.6.1 instead of where most of the tutorials and manuals say it should be.
    When I change the values in that eaccelerator.ini, nothing happens; phpinfo shows the settings haven't changed at all. I also have no values for eAccelerator in the php.ini file, so I can't change them there either.

    Anyone know how to change the eAccelerator values when it's installed through EasyApache?
     
    sfraise, Sep 14, 2010 IP