Posted by drow on the 9th of August, 2007 at 8:28 pm under tech.    This post has 7 comments.

My desktop seems to have melted something precious in this morning’s lightning storm, despite the recent and rather hefty UPS sitting behind it. I am, to say the least, annoyed. It was a nice self-built Shuttle system from 2005, but now all it is is a machine check generator:

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 4 northbridge TSC 6eb92ec370b
  Northbridge Watchdog error
       bit57 = processor context corrupt
       bit61 = error uncorrected
  bus error 'generic participation, request timed out
      generic error mem transaction
      generic access, level generic'
STATUS b200000000070f0f MCGSTATUS 4
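
For anyone curious how that STATUS value maps onto the flags printed above, here is a minimal Python sketch. It assumes the usual x86 machine-check status layout (bit 63 valid, bit 61 uncorrected, bit 57 processor context corrupt) and the standard bus-error code format in the low 16 bits. The function name is made up for illustration, and this is not how the kernel or any decoding tool actually does it.

```python
def decode_mc_status(status: int) -> dict:
    """Pick apart a 64-bit machine-check status word (hypothetical helper)."""
    code = status & 0xFFFF  # low 16 bits: architectural error code
    return {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),      # "bit61" in the log
        "enabled": bool((status >> 60) & 1),
        "context_corrupt": bool((status >> 57) & 1),  # "bit57" in the log
        # Bus errors use the pattern 0000 1PPT RRRR IILL.
        "bus_error": (code & 0xF800) == 0x0800,
        "timed_out": bool((code >> 8) & 1),           # the T bit
        # Assumption: bits 16-19 hold a model-specific extended code.
        "extended_code": (status >> 16) & 0xF,
    }


flags = decode_mc_status(0xB200000000070F0F)
for name, value in flags.items():
    print(f"{name}: {value}")
```

The two flags the log spells out, bit57 and bit61, both come back set, and the low bits decode as a bus error with a timed-out request, which lines up with the text of the report.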

It’s not even bad RAM or a bad RAM socket, the way most memory errors seem to be; I had two DIMMs in two sockets so I was able to try for both possibilities. Unless it fried both somehow, of course. It seems to boot a little further when left cold for a while, and not as far just after crashing. I am inclined to suspect the motherboard.

I think I’m going to have to write it off as a loss. Which means I need a replacement. I can get by with just my laptop for a little while, but not indefinitely.

Are there any good options for pre-built Linux systems yet? I was somewhat underwhelmed by Dell’s. I usually build from parts, so looking at pre-built systems always has a certain amount of sticker shock. My main concerns are CPU / memory performance (a lot of compiling here), room for at least three hard drives, and reasonably quiet operation.

Otherwise it’s off to newegg again…

Or I could always buy an equivalent computer to my current one, for a fraction of the original price, at a local consumer electronics megastore! How’s that for a different idea.




Posted on the 9th of August, 2007 at 10:23 pm.

I had a very similar issue once that actually turned out to have a pretty cool fix. This was on a server at my house with a single dual-core Xeon and some ECC RAM. The problem was a little bit intermittent, and still persisted after swapping in some new RAM. I began to notice that after a cold, cold boot, things were fine, but started to get worse as the box heated up.

Upon examination of the motherboard, I noticed a small black band along one of the data lines by the first DIMM slot. After stripping off the solder mask a little, I could make out a really tiny hairline fracture. Both sides were perfectly conductive when the board was cold, but as I heated it up, the board would pull the traces apart just a little and open the line. I just soldered the two sides with some really soft solder and it’s been fine ever since.

Posted on the 9th of August, 2007 at 11:15 pm.

Was your UPS also protecting the xDSL line, if you have one? Other routes for lightning could be a TV antenna, etc.
Ideally, electricity, phone, and TV should all be grounded to the same point.

Posted on the 10th of August, 2007 at 1:25 am.

HP seems to make some good desktop systems, and you can get them with just FreeDOS rather than Windows.

Posted on the 10th of August, 2007 at 1:25 am.

Clarification regarding the HP desktops: look under “small business”, not “home”.

Posted on the 10th of August, 2007 at 2:37 am.

The UPS was not protecting the DSL line; that’s the only plausible route I can think of, though it’s quite a tangled and twisty one (through three devices, the others of which are fine).

HP does seem to make some pretty nice desktops even in the consumer space. I’m currently planning on picking up an HP a6152n; I actually use Windows occasionally, so avoiding it is less of a benefit to me than to many other Planet Debian readers :-) My budget’s a little thin for the quad core alternatives in their small business space, and my workload is somewhat unique (for the overall space of consumers, not for Planet Debian readers, I expect!) such that doubling the number of cores is still a Big Deal.

Posted on the 10th of August, 2007 at 9:26 am.

Hi Daniel,

I read your post through PDN: I had a similar problem a while ago, and all I needed to do to fix it was to remove all the components and reassemble the PC. Most probably a loose SATA cable was causing these problems.
I also updated my BIOS for good measure :)

HTH

Posted on the 12th of August, 2007 at 6:49 pm.

[…] The gloom from my last entry has more or less lifted. To refresh, my desktop died shortly after a lightning storm; no definite cause and effect relationship, but a strong correlation. I shut it down, and after the next reboot I got scary errors all over the place, starting with this disclaimer: […]