Posted by drow on the 7th of June, 2009 at 7:15 pm under tech.

My trusty Shuttle barebones desktop finally gave up the ghost on Monday. I got back from a week’s vacation, sat down to read email, and after about an hour it shut itself off and refused to turn on again (fans spin up, lights come on, then everything shuts off again within five seconds). My best guess is that the power supply is shot. I could probably have gotten a replacement, but the thing is four years old so this seemed like an auspicious moment to replace it.

This happened in the middle of the work day, so I needed a replacement promptly. Rather than building from components as I used to, I drove down to the local Best Buy and looked around. They had a plausible HP Phenom quad-core system, and one Dell Core i7-based system. Literally one - they don’t stock any other i7-based systems here, and this one was discontinued and sold out. So I got a nice discount on the floor model.

The system is a Dell Studio XPS 435MT (since it’s discontinued, I can’t find a shiny product page to link to, but here’s the system documentation - other Studio XPS systems like the Studio XPS 435 should be similar).

Installing Linux

I didn’t do a real install on it. Instead I left the internal 640GB drive with Windows, and transplanted the two drives of my Shuttle’s RAID array into the new case. There’s only one empty 3.5″ bay; I propped the second drive in the empty 5.25″ bay and ordered a mounting bracket from Newegg. While I was there I ordered 6G RAM to go with the 4G included (which involved pulling out one of the 1G sticks).

Getting it to boot in this setup took me several tries. I had a working grub configuration on what was now /dev/sdc ((hd2) in grub speak). But I needed to put the MBR on (hd0). First I fixed up drive names in /etc/fstab and /boot/grub/menu.lst using System Rescue CD. The live CD almost but not quite handles LVM on top of RAID; it starts them in the wrong order. So after booting, I ran vgscan and vgchange -ay by hand, and away we go.
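
The manual LVM step can be sketched as a small shell helper. This is only a sketch, assuming the RAID arrays are already assembled by the time it runs; the names run and activate_lvm and the DRY_RUN variable are made up for illustration and are not part of any tool.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN is set.
# (run, activate_lvm and DRY_RUN are hypothetical names for this sketch.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Activate LVM volumes; assumes the md arrays underneath are already up.
activate_lvm() {
    run vgscan          # rescan block devices for volume groups
    run vgchange -ay    # activate every logical volume that was found
}
```

Calling activate_lvm as root from the rescue shell runs both commands; DRY_RUN=1 activate_lvm only prints them.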

Also from the rescue CD, I needed to build or install a new kernel - I didn’t have the e1000e network driver or the ata_piix SATA driver enabled in the Shuttle’s kernel. It was easier to build a new kernel than to transfer the standard kernel .deb onto the system, so I did that. I also regenerated the initrd (see previous rant about how hard initrd makes this sort of transplant).

Several times I got dumped into the initrd while I was getting this right and my USB keyboard didn’t work. I never did figure out why; the right modules were there, but perhaps they aren’t loaded early enough.

Unfortunately grub’s “setup” command runs “install” with “/boot/grub/menu.lst” instead of “(hd2)/boot/grub/menu.lst”. Eventually I figured out what I needed to add, and ran first setup and then a modified version of the install command that setup echoed. And now it booted.

Graphics acceleration

X was very, very slow when the system first came up. Eventually I figured out that only current git kernels had the PCI IDs listed for the Radeon RV770 - 2.6.29 was too old. An update brought working DRM, though see below for more about X. I haven’t used X without 2D acceleration since I worked on the Power Mac controlfb driver; I’d forgotten how disgustingly slow it is!

Fans, temperature sensors, and power management

Windows utilities like SpeedFan and Everest report three internal fans. Everest labels them as CPU, Chassis, and Power Supply. Linux lm-sensors detects the IT8720F chip but only shows two fans; the power supply fan is presumably on the same sensor chip, but Linux is detecting it as not present for some reason. I’m going to come back to this.
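
One way to see what the kernel driver itself reports, independent of how lm-sensors formats it, is to read the raw fan inputs from sysfs. A rough sketch follows; list_fans is a hypothetical helper, and the two path patterns are assumptions, since the location of these attributes has varied between kernel versions.

```shell
# list_fans DIR: print every fanN_input value found under DIR.
# DIR would normally be /sys/class/hwmon; it is a parameter so the
# helper can be pointed at a test directory. (Hypothetical helper.)
list_fans() {
    base=$1
    for f in "$base"/hwmon*/fan*_input "$base"/hwmon*/device/fan*_input; do
        [ -r "$f" ] || continue    # skip patterns that matched nothing
        printf '%s %s RPM\n' "$f" "$(cat "$f")"
    done
}

list_fans /sys/class/hwmon
```

A fan that the driver knows about but reads as stopped would show up here with 0 RPM, while an input the driver does not expose at all would simply be absent.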

There are two temperature sensors (three show up, but -55C is pretty unlikely, so I presume it’s not connected). The motherboard sensor typically reads 42C in Windows and the “Aux” sensor (which I haven’t located) reads about 74C in Windows… and 84C in Linux.

The most likely source of heat in the case is the GPU (Radeon HD 4850, RV770 chip). After rebooting from Linux to Windows, Everest showed three additional temperature sensors on the GPU, which were falling rapidly. So Linux seemed to be driving the GPU unnecessarily hard. This seems to be a known issue; here’s some detailed information about Radeon power management, from Alex Deucher (written just last week).

So until DRM KMS (kernel mode setting) lands, or at least until X.org 6.13, I get to either ignore the extra heat and noise or else run the ATI fglrx driver. I reluctantly downgraded from 2.6.30-rc9 to Debian’s 2.6.29 kernels and installed fglrx. I had to remove fglrx-atieventsd to stop my disk churning, although unfortunately I can’t reproduce the problem now (bug filed anyway, in hopes someone recognizes it).

Now the temperatures are stable, although fglrx leaves scary kernel messages suggesting its memory management is buggered (#524871). *sigh*.

Also, those temperature sensors are apparently on a standard i2c bus on the Radeon. It looks like radeonfb is capable of exposing the bus to userland i2c tools or lm-sensors modules, but only for chips it supports - which the RV770 definitely isn’t. Hopefully this is another thing which can be fixed after DRM KMS lands. The current version only seems to expose the DDC devices, though maybe I’m misreading it.

Sound almost works

This system uses hda-intel. Sound does work - but any channel comes out of all speakers! I haven’t tracked down the problem yet. I’m used to having trouble with 5.1 surround sound and ALSA, but I’ve never had trouble with stereo before. It’s probably a configuration problem.

Disk temperature

The hard drive in the spare 5.25″ bay (with or without the mounting bracket) gets very warm - 49C instead of the 41C/42C my other drives run at. So I went looking for ways to keep it cool. I ended up with a clever plan: this machine spends most of its time in Linux, and there’s nothing Linux needs on that drive. So can I keep it spun down?

The answer is “yes, with effort”. With any recent kernel, hdparm can spin down SATA hard drives (hdparm -y /dev/sda). It spun right back up again with nasty messages in the kernel log, though. When searching for similar errors on Google I discovered that probing SMART data - including the temperature sensor in the drive - would cause the platters to spin up (irritating!). I eventually tracked this to hddtemp checking the temperature of both /dev/sda and /dev/sg0, and filed bug #531849, which has an easy workaround. Once hddtemp is configured to only check /dev/sda, it knows not to spin up the drive.
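
For illustration, restricting hddtemp to one device might look like the fragment below. This is a guess at the shape of the configuration, assuming the hddtemp daemon is started from Debian’s /etc/default/hddtemp and that its DISKS variable lists the devices to check; if hddtemp is launched some other way, the device list belongs on its command line instead.

```shell
# /etc/default/hddtemp (assumed location; read as shell by the init script)
# Check only the ATA device node, not the matching SCSI generic /dev/sg0.
DISKS="/dev/sda"
```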

Next I added this to /etc/rc.local:

# Spin down the drive with only Windows on it, and make it spin down
# again if idle.
hdparm -S120 /dev/sda
hdparm -y /dev/sda

That doesn’t work, for some reason; it does run, but either the disk doesn’t spin down or else something spins it back up shortly afterwards. So right now I have the hdparm commands in my $HOME/.xsession instead, and now it stays spun down all the time.
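
As an aside, the -S argument is an encoded timeout rather than a number of seconds. The sketch below decodes it following the encoding described in the hdparm man page; standby_seconds is a hypothetical helper written for this note, not part of hdparm, and it only handles the plain ranges.

```shell
# standby_seconds N: decode an hdparm -S value into seconds.
# (Hypothetical helper; covers 0 and the two linear ranges only.)
standby_seconds() {
    n=$1
    if [ "$n" -eq 0 ]; then
        echo 0                              # 0 disables the standby timer
    elif [ "$n" -ge 1 ] && [ "$n" -le 240 ]; then
        echo $(( n * 5 ))                   # units of 5 seconds
    elif [ "$n" -ge 241 ] && [ "$n" -le 251 ]; then
        echo $(( (n - 240) * 1800 ))        # units of 30 minutes
    else
        echo "special value, see the hdparm man page" >&2
        return 1
    fi
}

standby_seconds 120    # prints 600, i.e. ten minutes of idle time
```

To check whether the drive actually stayed down, hdparm -C /dev/sda reports its current power state.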


