Bug #39

closed

Toushirou is unstable

Added by Marc Dequènes about 14 years ago. Updated over 13 years ago.

Status: Resolved
Priority: High
Category: System :: Hardware
Start date: 2010-04-11
Due date:
% Done: 100%
Estimated time:
Patch Available: No
Confirmed: Yes
Branch:
Entity: DuckCorp
Security: No
Help Needed:

Description

Still trying to find out whether we can save Toushirou. It may not be a hardware problem.

Possible references to causes:
#1

Updated by Marc Dequènes almost 14 years ago

  • % Done changed from 30 to 40
  • Security set to No

Recently in kernel.log:

May 25 17:28:57 Toushirou kernel: [950699.816031] Disabling lock debugging due to kernel taint
May 25 17:28:57 Toushirou kernel: [950699.816062] Machine check events logged

I discovered mcelog, and obtained:

May 26 14:51:36 Toushirou mcelog: failed to prefill DIMM database from DMI data
May 26 14:51:36 Toushirou mcelog: Kernel does not support page offline interface
May 26 14:51:36 Toushirou mcelog: HARDWARE ERROR. This is *NOT* a software problem!
May 26 14:51:36 Toushirou mcelog: Please contact your hardware vendor
May 26 14:51:36 Toushirou mcelog: MCE 0
May 26 14:51:36 Toushirou mcelog: CPU 0 BANK 3 
May 26 14:51:36 Toushirou mcelog: ADDR 85aff40 
May 26 14:51:36 Toushirou mcelog: TIME 1274878296 Wed May 26 14:51:36 2010
May 26 14:51:36 Toushirou mcelog: MCG status:
May 26 14:51:36 Toushirou mcelog: MCi status:
May 26 14:51:36 Toushirou mcelog: Error enabled
May 26 14:51:36 Toushirou mcelog: MCi_ADDR register valid
May 26 14:51:36 Toushirou mcelog: MCA: Generic CACHE Level-2 Generic Error
May 26 14:51:36 Toushirou mcelog: STATUS 942000450001010a MCGSTATUS 0
May 26 14:51:36 Toushirou mcelog: MCGCAP 6 APICID 0 SOCKETID 0 
May 26 14:51:36 Toushirou mcelog: CPUID Vendor Intel Family 6 Model 15

Probably a real hardware problem linked to memory, though the CPU cache may be at fault instead. I hope it is the RAM, which is being replaced next Saturday.
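For reference, the STATUS word in the mcelog output above can be decoded by hand. The following is only a minimal sketch, assuming the architectural IA32_MCi_STATUS layout from the Intel SDM (flag bits 63 to 57, MCA error code in bits 15:0, cache hierarchy codes shaped as 000F 0001 RRRR TTLL). The function and table names are made up for illustration and are not part of mcelog.

```python
# Bit positions of the architectural flags in IA32_MCi_STATUS (assumed from the Intel SDM).
FLAG_BITS = [(63, "VAL"), (62, "OVER"), (61, "UC"), (60, "EN"),
             (59, "MISCV"), (58, "ADDRV"), (57, "PCC")]

# Field encodings of the "cache hierarchy" compound error code: 000F 0001 RRRR TTLL.
LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}
TYPES = {0: "instruction", 1: "data", 2: "generic"}
REQUESTS = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
            5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}


def decode_mci_status(status):
    """Hypothetical helper: decode flag bits and, if present, a cache hierarchy code."""
    flags = [name for bit, name in FLAG_BITS if (status >> bit) & 1]
    code = status & 0xFFFF  # architectural MCA error code, bits 15:0
    decoded = {"flags": flags, "mca_code": code, "cache": None}
    # Mask out bit 12 (the F bit), then match the 0000 0001 prefix in bits 15:8.
    if (code & 0xEF00) == 0x0100:
        decoded["cache"] = {
            "request": REQUESTS.get((code >> 4) & 0xF, "reserved"),
            "type": TYPES.get((code >> 2) & 0x3, "reserved"),
            "level": LEVELS[code & 0x3],
        }
    return decoded


if __name__ == "__main__":
    # STATUS value copied from the mcelog output in this ticket.
    print(decode_mci_status(0x942000450001010A))
```

Under these assumptions the value decodes to the flags VAL, EN and ADDRV with a generic L2 cache error, which agrees with what mcelog printed. The UC flag is not set, which would suggest the error was corrected by the hardware rather than left uncorrected.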

#2

Updated by Marc Dequènes over 13 years ago

  • Category changed from System :: Base to System :: Hardware

#3

Updated by Marc Dequènes over 13 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 40 to 100

The RAM was replaced, and its capacity increased in the process.

Uptime is now 113 days with no problems.

Old RAM => trashcan.
