Nginx Marks Site Unavailable

Discussion in 'Nginx' started by Daniel Senior, Mar 30, 2020.

  1. #1
    I am running version 1.17.2 (the free version).
    It is set up as a load balancer with 2 machines behind it. Periodically Nginx marks both servers unavailable and we get back a 504. The 2 servers are just front-end servers running Apache, which forwards requests to other servers.
    Holding down Ctrl+R (constant refresh) causes no problem. If we pause and then start pressing Ctrl+R manually, we can see the page just spinning.
    TCPDUMP (firewall packet sniffer) shows the request going to the servers. On the servers we see the packet hit the interface, and after 30 seconds we see it move to the second server, but neither server sends a response. Telnet from the Nginx box to the servers gets no response, while other boxes can reach the port that Nginx is trying to reach (8800).
    The log shows that the 1st server is unavailable, then the second, and then we receive a 504 Gateway Timeout. Running the same test through an A10 instead never produces a failure.
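    The telnet test described above can be made repeatable with a small script. This is only a sketch: the helper names probe and probe_after_idle are made up for illustration, and the demo runs against a throwaway local listener rather than the real backends. On the load balancer you would point it at each backend on port 8800 and use longer idle gaps, to see whether the connect only fails after a pause.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Try one TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers timeouts, refusals, unreachable hosts
        return False, time.monotonic() - start, str(exc)


def probe_after_idle(host, port, idle_gaps, timeout=5.0):
    """Probe once after each idle gap (in seconds) and collect the results."""
    results = []
    for gap in idle_gaps:
        time.sleep(gap)
        ok, elapsed, err = probe(host, port, timeout)
        results.append((gap, ok, round(elapsed, 3), err))
    return results


# Self-contained demo against a throwaway local listener (an assumption
# for illustration). On the real load balancer, replace the host and port
# with a backend address and 8800, and use gaps such as [0, 30, 120].
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(5)
demo_port = listener.getsockname()[1]

demo_results = probe_after_idle("127.0.0.1", demo_port, idle_gaps=[0, 0.2])
listener.close()

# With the listener gone, the same probe should now fail.
closed_ok, _, closed_err = probe("127.0.0.1", demo_port, timeout=1.0)

for gap, ok, elapsed, err in demo_results:
    print(f"idle {gap}s -> {'connected' if ok else 'FAILED: ' + err}")
print("after listener closed ->", "connected" if closed_ok else "failed")
```

    Running the same probe from the Nginx box and from one of the boxes that can reach port 8800 would show whether the failure depends on the source host or on the idle time.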

    This is currently on a production LB. I will be creating a test LB with the same configs so that tests can be run to confirm the issue.
    Does the Nginx server think it is a DDoS? I do not see anything in the logs. Does the server have the capability to block telnet attempts from it?

    Has anyone ever seen this kind of behavior? Any ideas on what is occurring?

    Thanks
     
    Daniel Senior, Mar 30, 2020
  2. SolaDrive

    #2
    Who set this up? Do you not have a person who manages your servers? I would start by checking your access, nginx and TCP logs to see what's going on. It sounds like an issue with the LB, or with nginx not listening correctly.
     
    SolaDrive, Mar 30, 2020
  3. Daniel Senior

    #3
    I'm the one who set this server up. As for who manages it, it depends: monitoring is Ops; changes to the Linux kernel, System Engineers; installing and maintaining the LB (Nginx), Networking (me). All of it is under the umbrella of Infra.
    Now to answer your other questions:
    My access: root
    TCP Logs:
    Access Logs:
    64.14.35.6 - - [30/Mar/2020:11:12:25 -0400] "GET /webservices/gainloss/realized/account/73910000?detail=N HTTP/1.1" 504 167 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0" "-"
    [root@sciv5-fis-extlb01 nginx]# cat access.log-20200331 | grep '1.1" 502'
    64.14.35.6 - - [30/Mar/2020:17:26:46 -0400] "GET / HTTP/1.1" 502 559 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36" "-"
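    The grep above only finds 502 responses, while the earlier entry is a 504. A rough sketch of tallying every 5xx status in one pass follows. The regular expression assumes the access log layout shown in this thread, count_5xx is a made-up helper name, and the sample lines are the two entries quoted above.

```python
import re
from collections import Counter

# Matches the request and status fields of an access log line in the
# layout shown in this thread: ... "METHOD path HTTP/x.y" STATUS BYTES ...
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def count_5xx(lines):
    """Return a Counter of 5xx status codes found in the given log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status").startswith("5"):
            counts[match.group("status")] += 1
    return counts


# The two access log entries quoted in this thread.
sample = [
    '64.14.35.6 - - [30/Mar/2020:11:12:25 -0400] '
    '"GET /webservices/gainloss/realized/account/73910000?detail=N HTTP/1.1" '
    '504 167 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) '
    'Gecko/20100101 Firefox/74.0" "-"',
    '64.14.35.6 - - [30/Mar/2020:17:26:46 -0400] "GET / HTTP/1.1" 502 559 '
    '"-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36" "-"',
]

status_counts = count_5xx(sample)
for status, count in sorted(status_counts.items()):
    print(status, count)
```

    To run it against a real file, pass an open file object instead of the sample list, for example count_5xx(open("access.log-20200331")).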

    Error Log ( At the time of the error):
    2020/03/30 11:11:25 [warn] 99522#99522: *2234652 upstream server temporarily disabled while connecting to upstream,
    2020/03/30 11:11:25 [error] 99522#99522: *2234652 upstream timed out (110: Connection timed out) while connecting to upstream,
    2020/03/30 11:12:25 [warn] 99522#99522: *2234652 upstream server temporarily disabled while connecting to upstream,
    2020/03/30 11:12:25 [error] 99522#99522: *2234652 upstream timed out (110: Connection timed out) while connecting to upstream,

    2020/03/30 17:25:23 [warn] 99522#99522: *2268068 upstream server temporarily disabled while connecting to upstream
    2020/03/30 17:25:23 [error] 99522#99522: *2268068 upstream timed out (110: Connection timed out) while connecting to upstream

    2020/03/30 17:26:46 [warn] 99522#99522: *2268136 upstream server temporarily disabled while connecting to upstream
    2020/03/30 17:26:46 [error] 99522#99522: *2268136 upstream timed out (110: Connection timed out) while connecting to upstream

    2020/03/30 17:26:37 [warn] 99522#99522: *2268126 upstream server temporarily disabled while connecting to upstream, client:
    2020/03/30 17:26:37 [error] 99522#99522: *2268126 upstream timed out (110: Connection timed out) while connecting to upstream,
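    Grouping these error log entries by connection number (the *NNNN field) makes the timing easier to read. The following is a sketch under assumptions: the regular expression is written for the line layout shown above, the helper names are made up, and the sample holds the first six entries from this post exactly as quoted, truncation included. For connection *2234652 the two timeouts are 60 seconds apart, which would be consistent with nginx's default proxy_connect_timeout of 60s if that directive is not set in the config (the config is not shown here, so that is a guess).

```python
import re
from collections import defaultdict
from datetime import datetime

# Layout assumed from the log lines in this thread:
# DATE TIME [level] pid#tid: *connection message
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] \d+#\d+: \*(?P<conn>\d+) (?P<msg>.*)$"
)


def timeouts_by_connection(lines):
    """Map connection id -> sorted timestamps of 'upstream timed out' events."""
    events = defaultdict(list)
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match and "upstream timed out" in match.group("msg"):
            stamp = datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S")
            events[match.group("conn")].append(stamp)
    return {conn: sorted(stamps) for conn, stamps in events.items()}


def gaps_seconds(stamps):
    """Seconds between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# First six error log entries from this post, copied as quoted.
sample = [
    "2020/03/30 11:11:25 [warn] 99522#99522: *2234652 upstream server temporarily disabled while connecting to upstream,",
    "2020/03/30 11:11:25 [error] 99522#99522: *2234652 upstream timed out (110: Connection timed out) while connecting to upstream,",
    "2020/03/30 11:12:25 [warn] 99522#99522: *2234652 upstream server temporarily disabled while connecting to upstream,",
    "2020/03/30 11:12:25 [error] 99522#99522: *2234652 upstream timed out (110: Connection timed out) while connecting to upstream,",
    "2020/03/30 17:25:23 [warn] 99522#99522: *2268068 upstream server temporarily disabled while connecting to upstream",
    "2020/03/30 17:25:23 [error] 99522#99522: *2268068 upstream timed out (110: Connection timed out) while connecting to upstream",
]

grouped = timeouts_by_connection(sample)
for conn, stamps in grouped.items():
    print(f"*{conn}: {len(stamps)} timeout(s), gaps {gaps_seconds(stamps)}")
```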

    "nginx not listening correctly": what do you mean by that? It gets requests on 80 and 443 and forwards them to the appropriate server URL and port. On the Nginx outside interface it is listening on 80 and 443. On the inside interface it isn't listening on any ports, but it allows traffic out to the servers.

    So can you be more specific with your answer? I appreciate the answer, but I will need more feedback. I realize that there could be an issue, and that's why I'm on this forum.

    Thanks

    Dan
     
    Daniel Senior, Mar 31, 2020