Monitor server sensors with ipmitool

Discussion in 'Site & Server Administration' started by digitalpoint, Sep 19, 2013.

  1. #1
    A few months ago we migrated to new servers, and it was a bit of a rush to get them installed in the data center (the new servers physically arrived at the data center about 3 hours before the old servers were due to lose their power circuit... but that's a whole different topic).

    Long story short, the migration was such a rush that I didn't have time to build monitoring tools for the new servers, and I finally found a couple of hours today to get that done...

    The new servers have an "Intelligent Platform Management Interface" (IPMI for short) system where we can (among other things) monitor hardware even when the servers are powered off.

    I thought I was going to have to set up SNMP probes to monitor everything, but it turned out there's a Linux kernel module that you can dynamically load to access hardware sensor data directly from the shell. I figured this might save some time for someone looking to do the same thing with IPMI-enabled servers...

    First load the kernel modules (as root) if they aren't already loaded...
    modprobe ipmi_msghandler
    modprobe ipmi_devintf
    modprobe ipmi_si
    Code (markup):
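
    If you want to script that step, here is a rough sketch that only loads what is missing. The function name missing_ipmi_modules is made up for this example, and it assumes the drivers are built as loadable modules (drivers compiled into the kernel never show up in lsmod, so they would be reported as missing even though IPMI works).

```shell
#!/bin/sh
# Hypothetical helper: read an lsmod-style listing on stdin and print
# each of the three IPMI modules that does NOT appear in it.
missing_ipmi_modules() {
    awk '
        { loaded[$1] = 1 }                      # first column is the module name
        END {
            n = split("ipmi_msghandler ipmi_devintf ipmi_si", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in loaded)) print want[i]
        }
    '
}

# Usage (as root): load only the modules that are missing.
# lsmod | missing_ipmi_modules | while read -r m; do modprobe "$m"; done
```
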
    For what I'm trying to do now, I just wanted to read the sensors, so after the kernel modules are loaded, you can run this...
    ipmitool sensor
    Code (markup):
    Which outputs this for me...
    CPU1 Temp        | 52.000     | degrees C  | ok    | 0.000     | 0.000     | 0.000     | 97.000    | 100.000   | 102.000  
    CPU2 Temp        | 57.000     | degrees C  | ok    | 0.000     | 0.000     | 0.000     | 97.000    | 100.000   | 102.000  
    System Temp      | 38.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 80.000    | 85.000    | 90.000    
    Peripheral Temp  | 53.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 80.000    | 85.000    | 90.000    
    PCH Temp         | 55.000     | degrees C  | ok    | -11.000   | -8.000    | -5.000    | 90.000    | 95.000    | 100.000  
    FAN1             | 6525.000   | RPM        | ok    | 300.000   | 450.000   | 600.000   | 18975.000 | 19050.000 | 19125.000 
    FAN2             | 6600.000   | RPM        | ok    | 300.000   | 450.000   | 600.000   | 18975.000 | 19050.000 | 19125.000 
    VTT              | 1.040      | Volts      | ok    | 0.816     | 0.864     | 0.912     | 1.344     | 1.392     | 1.440    
    CPU1 Vcore       | 1.008      | Volts      | ok    | 0.480     | 0.512     | 0.544     | 1.488     | 1.520     | 1.552    
    CPU2 Vcore       | 1.024      | Volts      | ok    | 0.480     | 0.512     | 0.544     | 1.488     | 1.520     | 1.552    
    VDIMM AB         | 1.328      | Volts      | ok    | 1.104     | 1.152     | 1.200     | 1.648     | 1.696     | 1.744    
    VDIMM CD         | 1.328      | Volts      | ok    | 1.104     | 1.152     | 1.200     | 1.648     | 1.696     | 1.744    
    VDIMM EF         | 1.328      | Volts      | ok    | 1.104     | 1.152     | 1.200     | 1.648     | 1.696     | 1.744    
    VDIMM GH         | 1.328      | Volts      | ok    | 1.104     | 1.152     | 1.200     | 1.648     | 1.696     | 1.744    
    +1.1 V           | 1.088      | Volts      | ok    | 0.880     | 0.928     | 0.976     | 1.216     | 1.264     | 1.312    
    +1.5 V           | 1.472      | Volts      | ok    | 1.248     | 1.296     | 1.344     | 1.648     | 1.696     | 1.744    
    3.3V             | 3.264      | Volts      | ok    | 2.640     | 2.784     | 2.928     | 3.648     | 3.792     | 3.936    
    +3.3VSB          | 3.264      | Volts      | ok    | 2.640     | 2.784     | 2.928     | 3.648     | 3.792     | 3.936    
    5V               | 4.928      | Volts      | ok    | 4.096     | 4.288     | 4.480     | 5.504     | 5.696     | 6.912    
    +5VSB            | 4.992      | Volts      | ok    | 4.096     | 4.288     | 4.480     | 5.504     | 5.696     | 6.912    
    12V              | 11.766     | Volts      | ok    | 10.176    | 10.494    | 10.812    | 13.250    | 13.568    | 13.886    
    VBAT             | 3.168      | Volts      | ok    | 2.400     | 2.544     | 2.688     | 3.312     | 3.456     | 3.600    
    PS1 Status       | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na        
    PS2 Status       | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na
    Code (markup):
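    The first four columns (sensor name, reading, unit, status) are the important "current status" data. The other six columns are the alarm thresholds for each sensor: three lower limits (non-recoverable, critical, non-critical) followed by three upper limits (non-critical, critical, non-recoverable).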
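
    Since the output is just pipe-separated text, it is easy to filter. Below is a minimal sketch that prints only the sensors whose status (fourth column) is something other than "ok". The function name sensor_alerts is made up for this example. It assumes that discrete sensors (like the power supply lines above) report a bitmask rather than a status word, and that a status of "na" means the sensor isn't populated, so both are skipped.

```shell
#!/bin/sh
# Hypothetical helper: read `ipmitool sensor` output on stdin and print
# "name|reading|unit|status" for each threshold sensor that is not "ok".
sensor_alerts() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NF >= 4 {
            name = trim($1); reading = trim($2); unit = trim($3); status = trim($4)
            if (unit == "discrete") next   # assumption: bitmask, not a status word
            if (status == "na") next       # assumption: sensor not populated
            if (status != "ok") print name "|" reading "|" unit "|" status
        }
    '
}

# Usage: ipmitool sensor | sensor_alerts
```

    No output means every threshold sensor reported "ok", which makes it handy for a cron job that only sends mail when something is printed.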

    Now we have a quick way to see things like temperatures, voltages, fan RPM, power supply status, etc. Much simpler than setting up SNMP probes.

    You can also do all sorts of other interesting things if you need to. For example, this command will show you the status of the chassis...
    ipmitool chassis status
    Code (markup):
    System Power         : on
    Power Overload       : false
    Power Interlock      : inactive
    Main Power Fault     : false
    Power Control Fault  : false
    Power Restore Policy : previous
    Last Power Event     : 
    Chassis Intrusion    : inactive
    Front-Panel Lockout  : inactive
    Drive Fault          : false
    Cooling/Fan Fault    : false
    Code (markup):
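
    The chassis output is "key : value" text, so the same trick works here. This is a rough sketch that prints the name of any fault field set to "true". The function name chassis_faults is made up for this example, and it assumes the field names match the output shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `ipmitool chassis status` output on stdin and
# print every "... Fault" field (plus "Power Overload") whose value is "true".
chassis_faults() {
    awk -F':' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NF >= 2 {
            key = trim($1); val = trim($2)
            if ((key ~ /Fault$/ || key == "Power Overload") && val == "true")
                print key
        }
    '
}

# Usage: ipmitool chassis status | chassis_faults
```
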
     