Server stops responding for minutes, but pings fine $$ reward

Discussion in 'Apache' started by floodrod, Nov 26, 2007.

  1. #1
    If anyone can lead me to solving this, I will paypal $15 to the person who can help me kill this problem.. I can not give out root details, but I can follow directions and commands..

    A few times daily, my server stops responding.. Pages don't load on ALL domains hosted on my dedicated server..

    I wait a few minutes and it's back up... During the down time, the server pings fine...

    I contacted my dedicated server company several times and they keep trying things that don't work, and they say it's my ISP. It's not my ISP, as I can surf fine on all other sites when it happens..

    Server Swap done by the host
    Memory Tested fine (they say)

    Apache logs show nothing wrong, messages show nothing wrong, and other logs are clean..

    If someone is willing, I can follow instructions, and send them info such as php info, and whatnot..

    edit.. forgot to mention, it's a Linux dedicated cPanel box

    Thanks
     
    floodrod, Nov 26, 2007 IP
  2. InFloW

    #2
    Sounds like a load spike, maybe from a cron job running? I'd try to be logged in over SSH with top running when it happens, and see what the load average is at that moment.
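    Since the outage comes at random times, one option is to leave a small logger running so the load average is on record even when nobody is watching top. This is only a minimal sketch, assuming a Linux box where /proc/loadavg exists; the function name log_load, the log path and the 10 second interval are made up for illustration.

```shell
# Hypothetical helper: append one timestamped load-average sample to a file.
# Assumes Linux, where /proc/loadavg starts with the 1, 5 and 15 minute averages.
log_load() {
    out="$1"                      # file to append samples to
    src="${2:-/proc/loadavg}"     # source file, overridable for testing
    # The first three fields are the load averages; the rest is ignored.
    read -r l1 l5 l15 _rest < "$src"
    printf '%s load: %s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$l1" "$l5" "$l15" >> "$out"
}
```

    It could be left running in a spare SSH session with something like: while true; do log_load /root/load.log; sleep 10; done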
     
    InFloW, Nov 27, 2007 IP
  3. hans

    #3
    Log in directly using SSH!!

    Open 5 (five) SSH connections from your shell to your server.

    Then, on 4 of those connections, run the following on your remote server - ONE line per shell/connection:

    tail -f /var/log/warn
    tail -f /var/log/messages
    tail -f /var/log/apache2/error_log
    tail -f /var/log/apache2/access_log

    your 5th connection is then your "working" connection for entering bash commands.
    first check your apache config - you did NOT mention which apache version you run - 1.3, 2.0 or 2.2

    to check the apache config - typically for 2.2 on a SUSE 10.1 linux it would be:

    rcapache2 configtest
    if reply OK
    then run next:
    rcapache2 extreme-configtest

    if any of above NOT OK - then follow instructions and clean up your apache config

    if OK
    then do

    rcapache2 reload
    that is a soft-restart of apache and should give you in the

    tail -f /var/log/apache2/error_log

    a line similar to:
    [Tue Nov 27 23:30:03 2007] [notice] Graceful restart requested, doing restart
    [Tue Nov 27 23:30:07 2007] [notice] Apache configured -- resuming normal operations

    as you then can see the "downtime" during a soft restart of apache is in the range of 4 seconds.

    if your apache does a correct / timely reload
    then do a full restart as follows

    rcapache2 restart

    look again at your

    tail -f /var/log/apache2/error_log

    the downtime is a few seconds longer

    if either of the above 2 reload/restart commands creates a delayed restart of your apache like the one you experience in normal operations - then look at the details of the 3 OTHER tail -f commands / windows - check the seconds during and after your tests to get any possible feedback on possible problems

    I once had a non-starting apache a year ago - the cause was an "ln -s" symlink from the normal access_log file to an additional working copy for live traffic monitoring by a monitoring SW via browser. the "ln -s" link caused apache to stay down after the daily access_log rotation.

    think about WHAT SW or changes you installed / made just before this problem started to occur

    another check you can do is to compare the above 4 log files at the exact time of your reoccurring problems and look at the warn / messages or apache errors you find during the downtimes/problems you have!!
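    Comparing the logs at the exact time of a problem could be sketched roughly as below. The function name same_minute is hypothetical, and the pattern only covers the timestamp styles that appear in this thread (syslog and error_log style "Nov 28 09:47:07", access_log style "28/Nov/2007:09:47:01"); it assumes a two-digit day, since single-digit days may be padded differently in each log.

```shell
# Hypothetical helper: print the lines from several logs that fall in one minute.
# Usage: same_minute Nov 28 09:47 /var/log/messages /path/to/error_log ...
same_minute() {
    mon="$1"; day="$2"; hhmm="$3"
    shift 3
    # First alternative matches syslog and Apache error_log stamps,
    # second alternative matches Apache access_log stamps.
    pattern="($mon $day $hhmm:|$day/$mon/[0-9]{4}:$hhmm:)"
    for f in "$@"; do
        echo "== $f"
        # grep exits non-zero when nothing matches; that is not an error here.
        grep -E -- "$pattern" "$f" || true
    done
}
```

    Running it once for the minute before, the minute of, and the minute after a known outage puts all the logs side by side for the same period.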


    if above give you no clue and no help to your problem, then you may have to wait for someone HERE in DP with more experience in that kind of problem to help you further.


    pls note that your apache reload/restart does NOT affect your SSH connection !
    if however you would login to cpanel or similar - then your browser oriented connection might get lost = bad situation !!
     
    hans, Nov 27, 2007 IP
  4. floodrod

    #4
    Thanks for the outline:)

    Apache 1.3.39

    tail -f /var/log/warn
    No such file in that directory

    tail -f /var/log/messages
    Got It

    tail -f /var/log/apache2/error_log
    Only path to apache error log is /usr/apache/logs/error_log

    tail -f /var/log/apache2/access_log
    Only path to apache access log is /usr/local/apache/logs/access_log

    rcapache2 configtest
    I can only use /usr/local/apache/bin/apachectl configtest

    rcapache2 extreme-configtest
    No extreme configtest available

    rcapache2 reload
    no reload option available

    thats pretty much where I stand.. Any advice?



     
    floodrod, Nov 27, 2007 IP
  5. floodrod

    #5
    I'm upgrading to apache 2.2 with php 5 now.. We will see how it goes
     
    floodrod, Nov 27, 2007 IP
  6. hans

    #6
    what to do ?
    >> common sense!
    since you gave minimal system info = search until you find
    example:
    /var/log/warn

    every decent linux system has that warn-logfile
    find it and adapt path in your tail - command line
    like your
    Only path to apache error log is /usr/apache/logs/error_log
    etc !!

    same for OTHER missing commands - if you miss a particular command mentioned above, then research your own equivalent -> use Google to get instant replies.
    if apache 2.2 is installed - then most likely you have similar or equal tools to the ones mentioned above at your disposal.

    so what's the output of
    /usr/local/apache/bin/apachectl configtest
    ??
    and of your

    /usr/local/apache/bin/apachectl restart

    ??

    since u update from apache 1.3 to 2.2
    you may have a few minor problems specially may be using mod_rewrite
    we had ( me too ) such problems before and you find HERE in DP forum, topic apache (mod_rewrite) solutions for upgrade 1.3>2.x
     
    hans, Nov 27, 2007 IP
  7. floodrod

    #7
    config test results in

    syntax OK

    Apache restart

    [Tue Nov 27 16:38:55 2007] [notice] SIGHUP received. Attempting to restart
    [Tue Nov 27 16:38:56 2007] [notice] Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a mod_auth_passthrough/2.1 FrontPage/5.0.2.2635 mod_bwlimited/1.4 configured -- resuming normal operations


    took a few seconds. no hangtime


     
    floodrod, Nov 27, 2007 IP
  8. hans

    #8
    1.
    i see u have upgraded to apache 2.2.
    fine

    2.
    now observe and see if you still get your apache NOT restarting properly during certain daily situations
    best is to observe live using the tail -f commands mentioned above

    3.
    as said above
    use the 4 log files - specially the apache error_log
    look at the precise times of your downtimes - then see what errors occurred in the seconds right around each previous downtime

    you may download all the above-mentioned log files to your laptop for offline processing - for the periods in which you HAD known downtimes.

    4.
    also as said b4
    research when the first such occurrence appeared
    what did u change in your system ( config OR any link, symlink, new tool, new script installed, new site, etc ) immediately prior to the first occurrence

    as mentioned above,
    it happened to me once about a year ago. unfortunately i don't recall the exact details, but it was something to the extent of writing a full access_log live into the user-space of a domain account for real-time processing of traffic by the user.
    that kept my apache2.2 down until a manual restart
     
    hans, Nov 27, 2007 IP
  9. floodrod

    #9
    okie dokie... Thanks for the help hans, I will keep you posted..

    It's hard to catch it when it stops responding as it happens a few times a day at random times..
     
    floodrod, Nov 27, 2007 IP
  10. hans

    #10
    even if you don't catch it live when it happens,
    by looking at PREVIOUS incidents and searching ALL the above-mentioned log files at the minutes / seconds before it happened, you will almost surely find the action triggering the down of apache

    but may be now with apache2 it's all OK who knows

    if you find the exact incident triggering your apache down time - also check crontab if there is a scheduled task among other factors that might trigger the problem.

    you HAVE to find the problem - unless its solved by upgrading now. else you risk downtimes=loss of $ for all sites.

    however

    since you upgraded from 1.3 to 2.2 - make sure ALL your previous apache features are fully working, since there were a few changes in default configuration between 1.3 and 2.2.

    not sure if u have OTHER sites on your server as well or only your own site(s) - but test the mod_rewrite function in your new apache. it either works or does NOT at all. in my case latter after same upgrade - until i changed the apache config.

    and now as a final "homework" since you have another dist still unknown to us

    check all your apache commands to make sure you know your own functionalities

    http://httpd.apache.org/docs/2.2/programs/apachectl.html

    u see that my suse
    rcapache2 reload
    may find its equivalent in your

    apachectl graceful

    find out ALL your paths and names of the 4 ( four ) a.m. log files
    and know all apache commands of your current version/distribution.

    remember each time you change your apache config - you have to reload that new config by using your

    apachectl graceful

    and BEFORE reloading a possibly faulty modified apache config always run your

    apachectl configtest
    if OK then do your graceful or restart - else correct first.
    changes in .htaccess are instantly active - changes in apache main config files usually only after a reload of config or restart of server.
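    The "configtest first, then graceful" habit can be wrapped in a small shell function so the reload never runs against a broken config. A minimal sketch: safe_graceful is a hypothetical name, and the default path is simply the apachectl location floodrod reported, so pass your own path if it differs.

```shell
# Hypothetical wrapper: reload Apache only if the config test passes.
# Usage: safe_graceful [path-to-apachectl]
safe_graceful() {
    ctl="${1:-/usr/local/apache/bin/apachectl}"
    if "$ctl" configtest; then
        # Config is syntactically OK, so ask for a graceful restart.
        "$ctl" graceful
    else
        echo "configtest failed, not reloading" >&2
        return 1
    fi
}
```

    Note that configtest only checks syntax; a config that parses fine can still misbehave, so watching the error_log after the reload remains worthwhile.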

    for future questions it would be much more helpful to give FULL details of your linux dist and main setup versions - even a regular link to your site in question may help to help you more directly and more efficiently
     
    hans, Nov 27, 2007 IP
  11. floodrod

    #11
    OK, it seems to just have happened..

    messages are always filled with this, besides an occasional failed authentication try by hack attempts..

    Nov 28 09:10:59 jimbob kernel: ** RABHIT ** IN=eth0 OUT= MAC=00:10:dc:e2:cd:0f:00:18:19:cf:c1:f0:08:00 SRC=78.149.199.223 DST=69.72.214.58 LEN=40 TOS=0x00 PREC=0x00 TTL=53 ID=6071 PROTO=TCP SPT=113 DPT=59897 WINDOW=0 RES=0x00 ACK RST FIN URGP=0
    Nov 28 09:25:42 jimbob kernel: ** RABHIT ** IN=eth0 OUT= MAC=00:10:dc:e2:cd:0f:00:18:19:cf:c1:f0:08:00 SRC=190.198.248.23 DST=69.72.214.58 LEN=40 TOS=0x00 PREC=0x00 TTL=56 ID=6016 PROTO=TCP SPT=113 DPT=41871 WINDOW=0 RES=0x00 ACK RST FIN URGP=0
    Nov 28 09:41:35 jimbob kernel: ** RABHIT ** IN=eth0 OUT= MAC=00:10:dc:e2:cd:0f:00:18:19:cf:c1:f0:08:00 SRC=78.144.142.137 DST=69.72.214.58 LEN=40 TOS=0x00 PREC=0x00 TTL=52 ID=15585 PROTO=TCP SPT=113 DPT=53868 WINDOW=0 RES=0x00 ACK RST FIN URGP=0
    Nov 28 09:44:10 jimbob kernel: ** RABHIT ** IN=eth0 OUT= MAC=00:10:dc:e2:cd:0f:00:18:19:cf:c1:f0:08:00 SRC=201.9.15.152 DST=69.72.214.58 LEN=40 TOS=0x00 PREC=0x00 TTL=56 ID=6267 PROTO=TCP SPT=113 DPT=33936 WINDOW=0 RES=0x00 ACK RST FIN URGP=0
    Nov 28 09:44:29 jimbob kernel: ** RABHIT ** IN=eth0 OUT= MAC=00:10:dc:e2:cd:0f:00:18:19:cf:c1:f0:08:00 SRC=78.165.183.249 DST=69.72.214.58 LEN=40 TOS=0x00 PREC=0x00 TTL=53 ID=8621 PROTO=TCP SPT=113 DPT=39128 WINDOW=0 RES=0x00 ACK RST FIN URGP=0

    access log is always clean

    127.0.0.1 - - [28/Nov/2007:09:45:55 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:45:56 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:46:03 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:46:13 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:46:19 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:46:53 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:47:01 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:08 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:09 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:10 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:11 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:12 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:13 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:14 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:15 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:16 -0500] "GET / HTTP/1.0" 200 2860
    127.0.0.1 - - [28/Nov/2007:09:48:17 -0500] "GET / HTTP/1.0" 200 2860




    error logs just show bad requests or files that aren't there - I modified the paths as I pasted

    [Wed Nov 28 09:47:07 2007] [error] [client 82.195.137.125] File does not exist: /path/favicon.ico
    [Wed Nov 28 09:47:07 2007] [error] [client 82.195.137.125] File does not exist: /path/favicon.ico/404.shtml
    [Wed Nov 28 09:50:08 2007] [error] [client 74.224.203.239] File does not exist: /path/favicon.ico
    [Wed Nov 28 09:50:08 2007] [error] [client 74.224.203.239] File does not exist: /path/favicon.ico/404.shtml
    [Wed Nov 28 09:50:32 2007] [error] [client 74.224.203.239] File does not exist: /path/favicon.ico/favicon.ico
    [Wed Nov 28 09:50:32 2007] [error] [client 74.224.203.239] File does not exist: /path/favicon.ico/404.shtml
    [Wed Nov 28 09:53:18 2007] [error] [client 77.102.100.245] File does not exist: /path/favicon.ico/favicon.ico
    [Wed Nov 28 09:53:18 2007] [error] [client 77.102.100.245] File does not exist: /path/favicon.ico/404.shtml

    still can't find warn log.. when I locate warn, thousands of results come up..
     
    floodrod, Nov 28, 2007 IP
  12. hans

    #12
    1.
    you still never told me what linux you use
    how in heaven do u expect most time/resource-efficient help if i have to guess what u are using ??

    2.
    warn log
    search until you find it - whether it takes you 5 or 500 hrs, never mind, it IS there - in every reasonable linux install - any dist
    normal linux dists have most logs in
    /var/log
    normally in same folder as messages
    or in subfolders of above

    3.
    you have to be much more precise with your quoted log entries

    >>> what EXACT second did apache stop

    then show for ALL logs the seconds before until after apache stop

    in your quoted logs above, each excerpt starts in a different minute - hence none of the quoted log lines is of ANY value for analysis. ALL log lines must come from the SAME period of time, starting BEFORE the apache stop and running until AFTER the apache restart !!

    find ALL 4 above-mentioned logs - even if you have to go manually through 1000 warn files
    there is ONE warn log only - and it is easy to find for your dist if you know what dist you have and if you GOOGLE or read howtos

    since apache stops - you have to find entries in log files about that apache action ! the chance that apache stops without log entry is near zero
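    Rather than searching by file name, one way to find where warnings actually land is to read the syslog configuration itself. The sketch below assumes a classic sysklogd setup with /etc/syslog.conf, which is an assumption about floodrod's box; /var/log/warn is the name SUSE uses, and other distributions may route warnings into /var/log/messages or elsewhere. The function name syslog_targets is hypothetical.

```shell
# Hypothetical helper: list "selector -> destination" for each syslog rule.
# Usage: syslog_targets [path-to-syslog.conf]
syslog_targets() {
    conf="${1:-/etc/syslog.conf}"
    # Skip comment and blank lines; the last field of a rule is its destination.
    awk '!/^[[:space:]]*#/ && NF >= 2 { print $1, "->", $NF }' "$conf"
}
```

    Any rule whose selector mentions warn or warning points at the file to tail; a leading "-" on a destination is part of the syslog syntax and not part of the path.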
     
    hans, Nov 28, 2007 IP
  13. ray9

    #13
    Do you have apf and bfd installed? Is your firewall in demo mode?
    Are any of the IPs from the rabhit yours?
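    To see which addresses the ** RABHIT ** lines come from, and how often, the source IPs can be tallied straight from the messages log. A rough sketch, assuming the log format floodrod pasted (a "SRC=" field followed by a space); rabhit_sources is a made-up name and the default path is an assumption.

```shell
# Hypothetical helper: count RABHIT firewall log lines per source IP,
# most frequent first.
# Usage: rabhit_sources [logfile]
rabhit_sources() {
    logfile="${1:-/var/log/messages}"
    grep -F 'RABHIT' "$logfile" \
        | sed -n 's/.* SRC=\([0-9.]*\) .*/\1/p' \
        | sort | uniq -c | sort -rn
}
```

    If your own address, or one of your monitoring services, shows up near the top of that list, the firewall may be blocking legitimate traffic.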
     
    ray9, Nov 28, 2007 IP
  14. floodrod

    #14
    bfd= yes
    apf= yes
    demo mode.. don't think so. I am currently on a mobile device at work and can't check till later..

    rabhit- src ips not mine, but dst ips are mine.

    I can check More thoroughly tonight when I'm home

     
    floodrod, Nov 29, 2007 IP
  15. Ladadadada

    #15
    I have seen these symptoms before... check how many Apache processes are running at the time when it is not responding. There's a MaxClients setting in your Apache .conf file, and if you reach that number of Apache processes, Apache will stop serving new requests until one of them frees up.

    MaxClients is usually around 300 - 600. If you had a single user on a dial-up modem who requested 500 files simultaneously, he could cause your entire site to stop working until he finished downloading some of them.

    If your number of Apache processes is down around a normal level then this isn't your problem but it's worth checking.
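    The check could look roughly like this. Both function names are hypothetical, the httpd.conf path is only a guess based on the apachectl location reported earlier in the thread, and the process name "httpd" is an assumption (some distributions call the binary apache2). For reference, the Apache 2.2 documentation gives 256 as the prefork default for MaxClients when the directive is not set.

```shell
# Hypothetical helper: print every MaxClients value set in a config file.
# A config may set it once per MPM block, so there can be several lines.
max_clients() {
    conf="${1:-/usr/local/apache/conf/httpd.conf}"
    awk 'tolower($1) == "maxclients" { print $2 }' "$conf"
}

# Hypothetical helper: count running processes with the given command name.
apache_procs() {
    name="${1:-httpd}"
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    ps -e -o comm= | grep -c -x -- "$name" || true
}
```

    Running apache_procs during an outage and comparing the number with the output of max_clients shows whether the server is pinned at its limit.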
     
    Ladadadada, Nov 29, 2007 IP
  16. ray9

    #16
    floodrod, any news?
     
    ray9, Nov 30, 2007 IP
  17. floodrod

    #17
    Webhostgear runs a server diagnostic and repair service.. I emailed them and they think it is being caused by APF firewall and have offered a price to investigate and repair.

    I am in the process of gathering all the information on APF I can, and will try to investigate this before asking them to do it..

    I am going to start by following the manual http://rfxnetworks.com/appdocs/README.apf line for line
    Anyone have any other APF hints or resources?
     
    floodrod, Dec 3, 2007 IP
  18. ray9

    #18
    why do you think I asked for apf right away?
     
    ray9, Dec 4, 2007 IP
  19. floodrod

    #19
    because you are smart..

    I looked and it wasn't on test mode. I also changed some variables today. I will have to monitor it and see if it did anything.

    btw, server is pretty active. burns almost 100 gigs of b/w a month and serves tens of thousands of pages daily. I'm thinking it might have had something to do with the standard flood control number that was preset. I can't remember what it was called, but I adjusted it to 52000 instead of the 37000 (estimate) that comes preset.

    later I will post the exact details when I get off this crappy mobile device
     
    floodrod, Dec 4, 2007 IP