3.0.0-12-server #20-Ubuntu
I'll try to get someone out there to open up the box and check all the cables, memory modules, etc.
It could quite likely be something like that which came loose during the move.
Not running X.
Integrated NVIDIA graphics.
I have tested the memory without problems. I see this in the logs, right before the server dies (I think):

Code:
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.757782] [drm] nouveau 0000:01:00.0: Detected an NV50 generation card (0x0ace80b1)
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.765238] [drm] nouveau 0000:01:00.0: Attempting to load BIOS image from PRAMIN
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833606] [drm] nouveau 0000:01:00.0: ... appears to be valid
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833620] [drm] nouveau 0000:01:00.0: BIT BIOS found
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833628] [drm] nouveau 0000:01:00.0: Bios version 62.79.5f.00
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833637] [drm] nouveau 0000:01:00.0: TMDS table version 2.0
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833655] [drm] nouveau 0000:01:00.0: Found Display Configuration Block version 4.0
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833663] [drm] nouveau 0000:01:00.0: Raw DCB entry 0: 02000300 0000001e
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833670] [drm] nouveau 0000:01:00.0: Raw DCB entry 1: 02011332 00020010
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833677] [drm] nouveau 0000:01:00.0: Raw DCB entry 2: 0000000e 00000000
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833686] [drm] nouveau 0000:01:00.0: DCB connector table: VHER 0x40 5 16 4
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833693] [drm] nouveau 0000:01:00.0:   0: 0x00000000: type 0x00 idx 0 tag 0xff
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833700] [drm] nouveau 0000:01:00.0:   1: 0x00001161: type 0x61 idx 1 tag 0x07
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.833710] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 0 at offset 0xDEBA
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.918277] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 1 at offset 0xE17C
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.918285] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 2 at offset 0xE17E
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.918300] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 3 at offset 0xE23F
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.918317] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 4 at offset 0xE304
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.918324] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table at offset 0xE369
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.938180] [drm] nouveau 0000:01:00.0: 0xE369: Condition still not met after 20ms, skipping following opcodes
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.938214] [drm] nouveau 0000:01:00.0: timingset 255 does not exist
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.938221] [drm] nouveau 0000:01:00.0: timingset 255 does not exist
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.938228] [drm] nouveau 0000:01:00.0: timingset 255 does not exist
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.938234] [drm] nouveau 0000:01:00.0: timingset 255 does not exist
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957154] [drm] nouveau 0000:01:00.0: 4 available performance level(s)
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957168] [drm] nouveau 0000:01:00.0: 0: memory 0MHz core 200MHz shader 400MHz voltage 1010mV fanspeed 60%
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957179] [drm] nouveau 0000:01:00.0: 1: memory 0MHz core 300MHz shader 600MHz voltage 1010mV fanspeed 100%
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957189] [drm] nouveau 0000:01:00.0: 2: memory 0MHz core 350MHz shader 800MHz voltage 1010mV fanspeed 100%
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957200] [drm] nouveau 0000:01:00.0: 3: memory 0MHz core 450MHz shader 1100MHz voltage 1010mV fanspeed 100%
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957219] [drm] nouveau 0000:01:00.0: c: memory 0MHz core 450MHz shader 1100MHz
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957568] [TTM] Zone  kernel: Available graphics memory: 1721122 kiB.
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957575] [TTM] Initializing pool allocator.
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957612] [drm] nouveau 0000:01:00.0: Detected 32MiB VRAM
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957618] [drm] nouveau 0000:01:00.0: Stolen system memory at: 0x00de000000
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.957635] mtrr: type mismatch for e0000000,10000000 old: write-back new: write-combining
Nov 21 12:34:45 CH-SRV-1 kernel: [   18.961811] [drm] nouveau 0000:01:00.0: 64 MiB GART (aperture)
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.005187] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.005197] [drm] No driver support for vblank timestamp query.
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.064777] No connectors reported connected with modes
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.064789] [drm] Cannot find any crtc or sizes - going 1024x768
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.066850] [drm] nouveau 0000:01:00.0: allocated 1024x768 fb: 0x40000000, bo ffff8800d66f1000
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.067121] fbcon: nouveaufb (fb0) is primary device
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.067470] Console: switching to colour frame buffer device 128x48
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.068405] fb0: nouveaufb frame buffer device
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.068413] drm: registered panic notifier
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.068445] [drm] Initialized nouveau 0.0.16 20090420 for 0000:01:00.0 on minor 0
Nov 21 12:34:45 CH-SRV-1 kernel: [   19.575219] hda_codec: ALC889A: BIOS auto-probing.
Nov 21 12:34:45 CH-SRV-1 kernel: [   20.372804] type=1400 audit(1321875281.586:2): apparmor="STATUS" operation="profile_load" name="/sbin/dhclient" pid=512 comm="apparmor_parser"
Nov 21 12:34:45 CH-SRV-1 kernel: [   20.372833] type=1400 audit(1321875281.586:3): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient" pid=511 comm="apparmor_parser"
Nov 21 12:34:45 CH-SRV-1 kernel: [   20.374360] type=1400 audit(1321875281.586:4): apparmor="STATUS" operation="profile_load" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=511 comm="apparmor_parser"
Nov 21 12:34:45 CH-SRV-1 kernel: [   20.374388] type=1400 audit(1321875281.586:5): apparmor="STATUS" operation="profile_replace" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=512 comm="apparmor_parser"
Nov 21 12:34:45 CH-SRV-1 kernel: [   20.375313] type=1400 audit(1321875281.586:6): apparmor="STATUS" operation="profile_load" name="/usr/lib/connman/scripts/dhclient-script" pid=511 comm="apparmor_parser"
Nov 21 12:34:45 CH-SRV-1 kernel: [   20.375339] type=1400 audit(1321875281.586:7): apparmor="STATUS" operation="profile_replace" name="/usr/lib/connman/scripts/dhclient-script" pid=512 comm="apparmor_parser"
Nov 21 12:34:45 CH-SRV-1 kernel: [   22.785343] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
Nov 21 12:34:45 CH-SRV-1 kernel: [   23.299800] init: failsafe main process (650) killed by TERM signal
Nov 21 12:34:46 CH-SRV-1 kernel: [   24.888582] type=1400 audit(1321875286.102:8): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient" pid=765 comm="apparmor_parser"
Nov 21 12:34:46 CH-SRV-1 kernel: [   24.889773] type=1400 audit(1321875286.102:9): apparmor="STATUS" operation="profile_replace" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=765 comm="apparmor_parser"
Nov 21 12:34:46 CH-SRV-1 kernel: [   24.890488] type=1400 audit(1321875286.102:10): apparmor="STATUS" operation="profile_replace" name="/usr/lib/connman/scripts/dhclient-script" pid=765 comm="apparmor_parser"
Nov 21 12:34:46 CH-SRV-1 kernel: [   25.049188] type=1400 audit(1321875286.262:11): apparmor="STATUS" operation="profile_load" name="/usr/sbin/mysqld" pid=766 comm="apparmor_parser"
Nov 21 12:34:47 CH-SRV-1 kernel: [   26.050147] init: apport pre-start process (818) terminated with status 1
Nov 21 12:34:47 CH-SRV-1 kernel: [   26.062136] init: apport post-stop process (843) terminated with status 1
Nov 21 12:34:47 CH-SRV-1 kernel: [   26.079531] audit_printk_skb: 3 callbacks suppressed
Nov 21 12:34:47 CH-SRV-1 kernel: [   26.079540] type=1400 audit(1321875287.290:13): apparmor="STATUS" operation="profile_replace" name="/usr/sbin/mysqld" pid=851 comm="apparmor_parser"

Nov 21 12:34:53 CH-SRV-1 kernel: [   31.936027] eth0: no IPv6 routers present
Looks like something is wrong with the graphics... doesn't it?
Basically the same thing showed up today at 10:45, but I don't think the server died that time..?
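
One way to check whether lines like these were really logged before a crash, rather than during the boot that followed it: the bracketed number in each kernel line is seconds since boot, so a stamp like [   18.757782] was written roughly 19 seconds after a boot. Below is a minimal Python sketch, assuming a syslog-style file such as /var/log/kern.log. The path and the function name find_reboots are assumptions for illustration, not something taken from the log above. It finds the points where the uptime counter goes backwards and returns the line logged just before each of them.

```python
import re

# Matches the kernel uptime stamp, e.g. "kernel: [   18.757782]"
UPTIME_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def find_reboots(lines):
    """Return (line_before, first_kernel_line_after) pairs, one per reboot.

    A reboot is assumed wherever the kernel uptime stamp goes backwards.
    line_before is whatever line immediately precedes the first kernel
    line of the new boot; it may come from any logging source.
    """
    reboots = []
    prev_uptime = None
    prev_line = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = UPTIME_RE.search(line)
        if match:
            uptime = float(match.group(1))
            if prev_uptime is not None and uptime < prev_uptime:
                reboots.append((prev_line, line))
            prev_uptime = uptime
        prev_line = line
    return reboots


if __name__ == "__main__":
    # Assumed log location; adjust to wherever kernel messages are stored.
    with open("/var/log/kern.log", errors="replace") as log:
        for before, after in find_reboots(log):
            print("last line before reboot: ", before)
            print("first kernel line after: ", after)
            print()
```

If the "last line before reboot" entries turn out to be unrelated to the nouveau messages, the graphics lines are probably just normal boot output and the real clue, if there is one, is further up in the file.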
*********
We currently have, and have had for a while, major problems with our server. It crashes completely without warning and without leaving anything to troubleshoot from. The only way to get it running again is to restart it manually, which has to be done physically at the box, so it takes time to get someone to do it since the machine sits in a data center. If anyone has tips on what might be wrong, please leave a comment.

The problem shows up as the whole server dying: it cannot be reached at all via ssh, web or anything similar. When a monitor and keyboard were connected to the server, it turned out that the local console was completely dead too, and on one occasion "garbled".

What one might suspect, then, is that the integrated graphics card, or possibly the whole motherboard, is faulty.
Or what do you think?
Do you have any suggestions for what we can try in order to fix it?