I've had weird problems with one of my servers lately. It appears to have crashed completely once, and now it has suddenly become inaccessible and only responds to ping. We've looked through the server logs and found this (so far). Thing is, no one seems to know what it means.

Mar 22 14:57:33 andrea kernel: [bad_page+94/141] bad_page+0x5e/0x8d
Mar 22 14:57:33 andrea kernel: [prep_new_page+233/246] prep_new_page+0xe9/0xf6
Mar 22 14:57:33 andrea kernel: [buffered_rmqueue+205/309] buffered_rmqueue+0xcd/0x135
Mar 22 14:57:33 andrea kernel: [get_page_from_freelist+140/168] get_page_from_freelist+0x8c/0xa8
Mar 22 14:57:33 andrea kernel: [__alloc_pages+74/683] __alloc_pages+0x4a/0x2ab
Mar 22 14:57:33 andrea kernel: [change_clocksource+298/300] change_clocksource+0x12a/0x12c
Mar 22 14:57:33 andrea kernel: [__pagevec_lru_add_active+138/149] __pagevec_lru_add_active+0x8a/0x95
Mar 21 12:21:54 andrea kernel: Modules linked in: r8169 ide_cd cdrom rtc
Mar 21 12:21:54 andrea kernel: <1>Fixing recursive fault but reboot is needed!
Mar 21 12:21:54 andrea kernel: Eeek! page_mapcount(page) went negative! (-1)
Mar 21 12:21:54 andrea kernel: page->flags = 8000087c
Mar 21 12:21:54 andrea kernel: page->count = 2
Mar 21 12:21:54 andrea kernel: page->mapping = ed827cd0
Mar 21 12:21:54 andrea kernel: ------------[ cut here ]------------
Mar 21 12:21:54 andrea kernel: kernel BUG at mm/rmap.c:578!
Mar 21 12:21:54 andrea kernel: invalid opcode: 0000 [#1]
Well, it's not running out of memory. We are starting to suspect that the memory itself (the hardware) has gone bad, and that it's causing weird errors and sometimes fatal ones.
Yeah, it's the next step. They're gonna swap the memory, and if it's not the memory I'll have to suck up the downtime of a full hardware/server diagnostic. I hate troubleshooting weird computer errors :/
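
For anyone chasing similar symptoms, here is a rough sketch of how the kernel messages quoted in this thread could be tallied per day, to see whether the page-handling faults keep recurring. The signature list is an assumption built only from the excerpt above, the `count_signatures` helper is a hypothetical name, and the regex assumes the classic syslog prefix shown in those lines. It is not a diagnosis tool: a cluster of hits is consistent with bad RAM but does not prove it.

```python
import re
from collections import Counter

# Substrings taken from the log excerpt in this thread. This list is an
# assumption, not a complete catalogue of kernel memory-fault messages.
SIGNATURES = (
    "bad_page",
    "kernel BUG at mm/",
    "page_mapcount(page) went negative",
    "Fixing recursive fault",
)

# Classic syslog prefix: "Mar 22 14:57:33 host kernel: message"
LINE_RE = re.compile(r"^(\w{3}\s+\d{1,2}) \d{2}:\d{2}:\d{2} (\S+) kernel: (.*)$")


def count_signatures(lines):
    """Return a Counter keyed by (date, signature) for matching kernel lines."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a kernel line in the expected format
        date, _host, message = match.groups()
        for sig in SIGNATURES:
            if sig in message:
                hits[(date, sig)] += 1
    return hits


if __name__ == "__main__":
    sample = [
        "Mar 22 14:57:33 andrea kernel: [bad_page+94/141] bad_page+0x5e/0x8d",
        "Mar 21 12:21:54 andrea kernel: Eeek! page_mapcount(page) went negative! (-1)",
        "Mar 21 12:21:54 andrea kernel: kernel BUG at mm/rmap.c:578!",
    ]
    for (date, sig), n in sorted(count_signatures(sample).items()):
        print(f"{date}  {sig}: {n}")
```

To run it against a real log, pass an open file object instead of the sample list; where the kernel log lives on disk depends on the distribution.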