The Zalman Reserator [or How I Horribly Broke my PC]

Oh, the gods were angry at me last week. Very angry. They not only put me through the hell of completely destroying my computer (or so I thought), they also influenced me to exhibit my geek stubbornness and not replace anything, but instead break out the soldering iron and spend some frustrating hours soldering together wire traces, a job that should never be attempted by mortal man.

In the end, I guess it was worth it. I mean that I stubbornly decided to fix things instead of junking my fantastic (and now old) computer. And I have the Zalman Reserator completely to blame for the whole thing.

The Zalman is a water cooling device that has no fans. Before, my computer had a total of 6 fans keeping it cool (CPU, Case, Graphics Card, and 3 in the power supply). Turning it on was similar to the noise of jet aircraft taking off from an aircraft carrier. Now it has just one, and it’s barely audible. The Zalman itself is inaudible and works flawlessly, so I feel somewhat bad for placing the blame on it. But how did this great travesty begin?

Due to the small yet real influence the angry gods had on me that fateful Thursday, my hand slipped as I installed the new heatsink of the Zalman onto the CPU, and suddenly a few copper traces were now exposed to the world in a way that god had not intended. They had, in fact, been severed by the edge of my needle nose pliers, and – as much as I tried to ignore the cut traces – the computer simply wouldn’t boot despite all of my anguished bleating. Yes, it was a sad time.

After a restless night of tossing and turning, dreams of soldering teeny tiny wires and conductive pens made for gnomes that could fix my sub-millimeter problems, I awoke the next day with newfound energy for fixing things, and putting my world back in order. I mean, how hard can it be to fix some silly wire traces coming from a CPU on a PCB? It’s got to be POSSIBLE, yes?

Now if I actually had some sort of measuring device to tell you how small these traces were, then I’d tell you. But I didn’t. My best guess is that they were 0.000001 mm wide. Well, they seemed that small anyway. Take a look at them – see the shiny colored copper areas? I had at this point scraped some of the coating off to inspect the damage and prepare for soldering.

Notice how there are two traces next to each other? Yes, that makes for a lot of fun – they are so close together that it’s not easy to get solder on just one trace without hitting the second one. To get an idea of the size of these traces, look at the socket on the left – the holes are for the CPU pins. The traces are less than half the width of the holes that the CPU pins fit into.

Now if I actually had a soldering iron worth a damn, or some kind of PCB repair kit or a conductive pen with a tip the size of a mosquito’s schwanz, then I’d have been optimistic. But all I have at home is a Harry-Homeowner-I-Solder-Once-A-Year iron to work with. On a positive note, I can now tell you that it can be done. Even with crappy tools.

So here is what I had to do. First of all, I had to make a suitable tool to solder with – which meant taking that ugly soldering iron tip and sharpening it to a point. Any self respecting geek will have a dremel tool (or a cheap knockoff like me) with the appropriate grinder tip to do this with.

It’s certainly not the best tool, but if you’re desperate like I was, it will do, and it’s pretty easy to make.

The next thing to do was to rip apart a copper stranded cable and get a thin coating of solder onto several strands. This part is pretty easy as well – it just meant grabbing any old electric cord, chopping out a section, and then getting a coating of solder onto each of the strands.

At this point, things were going pretty well. But after this point, things got a lot hairier.

I don’t know if anyone in the world does this kind of PCB repair for a living, but one thing is certain – they probably have the correct tools to do it with. And they must also have a lot of patience.

I spent a few hours fiddling with these small wire strands and attempting to solder them to the broken traces. And yes, I was extremely pleased when I finally got the traces repaired. This involved touching the wires quickly with the iron while they were on top of the metal traces (to get them to initially stick) and then afterwards applying a small bead of solder to make sure that they stayed stuck in place. This was a lot harder than it sounds. But in the end, the wires were stuck. I mean, how geeky is that? Here is what my terrible soldering job looked like (there is a metal stud where the hole was in the previous picture).

Ugly, but functional (or so I thought). After finally getting it to POST, I quickly reassembled everything in my ignorant optimism, only to have it refuse to POST at all once the assembly was complete.

After another deconstruction, lots of coffee, re-heating the solder contacts, and conductivity tests, it looked like things were actually going to work.

And here I am writing this blog entry on that very same computer, with those very horrifying looking solder joints (but with the naked eye you can’t really see how bad they look). And no, I didn’t have any microscope or special magnifying tool to help with this.

So what’s the point? Perhaps I can at least encourage someone out there that has jammed a screwdriver into their motherboard to try to fix it, even if you only have primitive tools – there is at least one other person out there that has successfully done it :)

Of course, now that I’m concerned that the solder joints might “go bad” in the future, I guess it’s time to start looking for an upgrade. Now that I have a good reason, that is, I’m looking forward to it!

BTW, the Reserator works great, but one missing point in the instructions led to me sticking a tool in the computer in the first place, which wouldn’t have happened otherwise. There is a mounting plate on the bottom of the motherboard directly below the CPU which has to be swapped when installing on a socket 939 board (in my case, with an ASUS board, it was glued on, so it took some careful prying to remove it). However, if you follow the picture instructions that come with the Reserator, you’ll install it upside down like I did, and then the nuts will pull through when you attempt to tighten the metal pegs that hold the heatsink bracket. And if you’re as unlucky as me, you might even cut some traces along the way.


Encrypt Files and Directories Easily

I was recently looking for an easy way to encrypt individual files and directories (recursively), and I ran across the Linux command mcrypt. This nifty little utility does just what I want, but doesn’t do anything fancy – it just does encryption on a single file or standard input.

With a wee bitty script, however, you can encrypt anything you like quite easily. You have to have mcrypt installed (and also tar & bzip2, but you’ve likely got that already). Check this out:

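#!/bin/bash
# Usage: encrypt [file/directory] [password] [outputname]
if [[ -z $3 ]]
then
echo "Use: encrypt [file/directory] [password] [outputname]"
exit 1
fi
echo "Encrypting $1 with password $2 into file $3"
# tar bundles the input; mcrypt -p compresses with bzip2 before encrypting.
# The arguments are quoted so that names containing spaces survive.
tar -c "$1" | mcrypt -p -q -k "$2" > "$3"
echo "Done with encryption."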

Save it as “encrypt.sh” or whatever other name floats your boat, give it execute permissions, and you’re all set. It will tar, compress, and encrypt your file(s) and directories into whatever output file you specify. Just make sure you don’t forget the password you use to encrypt the file with: there isn’t any easy way to find out what it was if you lose it.

In order to decrypt your data, use this little script:

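#!/bin/bash
# Usage: decrypt [encryptedfile] [password]
if [[ -z $2 ]]
then
echo "Use: decrypt [encryptedfile] [password]"
exit 1
fi
echo "Decrypting $1 with password $2"
# mdecrypt -p undoes the bzip2 compression; tar -x unpacks into the current directory.
cat "$1" | mdecrypt -q -p -k "$2" | tar -x
echo "Done with decryption."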

Save it as “decrypt.sh” and give it execute permissions, and now you can easily decrypt your data as well. It can’t really get much easier than that!


How to Import Thunderbird Mail into Outlook

There are lots of things out there on the net related to getting your mail out of Outlook and into Thunderbird, but not really anything that deals with the reverse. I’ve been using Thunderbird for a while, and as much as I like it, I really want to give the new Outlook a try.

Unfortunately, Outlook has pretty lame importing features, so you have to jump through hoops to get your mail into Outlook from Thunderbird. This unusual method should work for importing any mbox formatted mail (Thunderbird, for example) into Outlook. I tried it with Outlook 2007; it should work fine with all older versions as well.

You’ll need to have Outlook Express installed on your computer (if you have XP or 2000 installed it should already be there) and you will also need to install Eudora from http://www.eudora.com/ (you can uninstall it later when you are done).

The whole process is painless, but it feels really unnecessary to have to go through so many steps to import your email into Outlook. The basic problem is that Outlook can import neither mbox formatted email nor Eudora email. Considering that the mbox format has been around for well over a decade and the commercial version of Eudora for 16 years, I wonder what the holdup is.

Anyway, here is what you need to do:

  1. Install Eudora.
  2. Import your mbox file into Eudora.  To do this, you need to copy your mbox file to the default location of the Eudora mailbox file.  After you have done this, Eudora will open your mbox file and index it.
  3. Import your Eudora mailbox into Outlook Express (yuk!)
  4. Import your Outlook Express mail into Outlook.

So why can’t Outlook import mail as well as Outlook Express does (meaning you’d get to skip a step here)? Who knows. Someone somewhere at MS might know why Outlook has such lame importing features.

Another method that I’ve used is the following (if you use Linux):

  1. Import your Thunderbird mail into KMail (the KDE mail program).
  2. Select the mail messages you want to export.  You can use shift-click to select a whole group of messages at a time, but you’re likely going to have to do this for messages inside of each folder separately.
  3. Right click and then click “Save”.  It will save the selected emails into a single mbox file.
  4. Install IMAPSize.  This is a free Windows program.  You will have to now move your mbox files to your Windows computer.
  5. With IMAPSize, use the tool “mbox2eml” under the tools menu on your mbox files.  All individual emails will now be extracted from the mbox files into separate eml files.
  6. Now import your mail into Outlook Express by dragging the emails from Windows Explorer directly into the Outlook Express window.
  7. Finally, import the mail from Outlook Express into Outlook.
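
If you don't want to bother with IMAPSize for step 5, the mbox-to-eml split can be roughly approximated with a few lines of shell. This is only a minimal sketch and not what IMAPSize actually does: the function name `split_mbox` and the `msgNNNNN.eml` naming are my own inventions, and it assumes that any body line starting with "From " has been escaped in the mbox file (usually as ">From "), which is common but not guaranteed.

```shell
#!/bin/sh
# split_mbox MBOX_FILE OUTPUT_DIR
# Naive splitter: starts a new .eml file at every line beginning with "From ".
# Assumption: body lines that start with "From " are escaped in the mbox.
split_mbox() {
    mbox=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        /^From / {
            if (file != "") close(file)   # keep the number of open files small
            n++
            file = sprintf("%s/msg%05d.eml", dir, n)
            next                          # drop the mbox separator line itself
        }
        file != "" { print > file }
    ' "$mbox"
}
```

Calling `split_mbox Inbox.mbox eml-out` should leave one file per message in `eml-out`, ready to be dragged into Outlook Express as in step 6.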

Anyway, if you really really want to get your mail out of Thunderbird into Outlook, this is probably how you have to do it.


FSCK Fun - Fix a Corrupt Superblock (unsupported inode size problem)

I recently had an infrequently used hard drive fail to mount, and upon inspection I found that it was no longer recognizable and an error was being produced at the console:

mount: wrong fs type, bad option, bad superblock on /dev/hdc1,
missing codepage or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

And doing a dmesg gives the following output:

EXT3-fs: unsupported inode size: 0

So what to do, what to do? What does this mean?

What is happening is that the superblock is corrupted. Fortunately there is a backup of the superblock elsewhere, and the location of it depends on the block size used on the partition. To replace the main superblock with a backup (alternative) superblock, use this command:

fsck -b 32768 /dev/sdh1

Of course, the number after the “-b” switch should be one of these values, depending on block size used on your file system:

  • 1k blocks = 8193
  • 2k blocks = 16384
  • 4k blocks = 32768

The location of your partition ('/dev/sdh1' in my case) must be changed according to your actual partition location as well.
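
The three values above aren't magic numbers. As far as I understand it, the first backup superblock sits at the start of the second block group, and by default a block group holds 8 × block size blocks. Here is a small sketch of that arithmetic; the function name `first_backup_superblock` is made up, and it assumes the file system was created with the default blocks-per-group setting.

```shell
#!/bin/sh
# first_backup_superblock BLOCK_SIZE_IN_BYTES
# Assumes the default of 8 * block_size blocks per group, and that block
# numbering starts at 1 on 1k-block file systems and at 0 otherwise.
first_backup_superblock() {
    bs=$1
    case $bs in
        1024)      first_data_block=1 ;;
        2048|4096) first_data_block=0 ;;
        *) echo "unsupported block size: $bs" >&2; return 1 ;;
    esac
    echo $(( $bs * 8 + $first_data_block ))
}
```

So `fsck -b "$(first_backup_superblock 4096)" /dev/sdh1` would be the same as the command above.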

So how do you tell what size blocks your partition uses? Well, I have read a suggestion of running fsck with the ‘-n’ switch on your partition to get that information; that didn’t work at all for me. In fact, that would crash with the output of

fsck.ext3[5618] trap divide error rip:2aaefe7b7b57 rsp:7fffac521f50 error:0

So instead, I just made a guess that my drive used 4k blocks (which it does), and fsck worked like a charm. I can now mount the drive and get the data off that I thought was gone forever. (Note: you might want to make a backup copy using dd or something similar before messing with your partitions!)
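
For the backup mentioned in that note, something along these lines should do. It's a bare-bones sketch: the name `image_partition` is my own, and plain dd gives up at the first read error, so a drive with physically bad sectors would need a more forgiving tool.

```shell
#!/bin/sh
# image_partition SOURCE DEST
# Copies SOURCE (a partition such as /dev/sdh1, or any file) to DEST byte
# for byte, and refuses to overwrite an existing DEST.
image_partition() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    dd if="$src" of="$dest" bs=65536 2>/dev/null
}
```

Run it as root, for example `image_partition /dev/sdh1 /mnt/spare/sdh1.img`, and make sure the destination has at least as much free space as the partition is large.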


Easily Restore a Corrupt NTFS Boot Sector

A friend of mine brought me an NTFS (Windows XP) hard drive that had been “erased” by a virus of some kind. I said I’d try to recover the lost data, which was unreadable to Windows.

After wasting some time undeleting a bunch of garbage on some other unimportant partitions, I realized that the main NTFS partition had not been automatically mounted for me - meaning there might be a problem with the partition itself. After manually attempting to mount the partition, I got a message that the boot sector was corrupt. So what to do?

NTFS partitions have a backup of the boot sector located on the last sector of the NTFS partition. There are probably various programs out there that one can pay for to restore this backup copy to its rightful place. There might even be a “Microsoft way” of doing things, which I can only guess requires you to agree to the terms of some EULA and give away any rights you have to your great collection of polka MP3s. Instead, all you need to do is this one line (as root):

mount -t ntfs /dev/sdg1 /media/tmp -o errors=recover

where you need to replace “/dev/sdg1” with your NTFS partition location (I connected this drive with an external USB carrier) and “/media/tmp” with the location where you’d like to mount the fixed partition. That’s all! Once you’ve mounted it, it’s fixed automatically and might even be bootable again (if this is the only problem you have).

This will even work if you accidentally begin to copy data over the beginning of your NTFS partition, since the copy of the boot sector is at the end of the partition. Note: This only works with kernel versions 2.6 and newer.  Can’t get a much easier fix than that!
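
If you'd like to save a copy of that backup boot sector before letting the mount touch anything, the following sketch shows one way to do it. Both function names are made up, and it assumes 512-byte sectors and that the `blockdev` tool from util-linux is installed.

```shell
#!/bin/sh
# backup_boot_sector_index TOTAL_SECTORS
# Prints the zero-based index of the last sector of a partition.
backup_boot_sector_index() {
    echo $(( $1 - 1 ))
}

# save_backup_boot_sector DEVICE OUTPUT_FILE
# Copies the last 512-byte sector of DEVICE into OUTPUT_FILE.
# Assumption: blockdev --getsz reports the size in 512-byte sectors.
save_backup_boot_sector() {
    dev=$1
    out=$2
    total=$(blockdev --getsz "$dev") || return 1
    dd if="$dev" of="$out" bs=512 count=1 \
        skip="$(backup_boot_sector_index "$total")"
}
```

As root, `save_backup_boot_sector /dev/sdg1 bootsector.bak` would write a 512-byte file that you can keep around in case something goes wrong.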


Howto: Install SciPy on 64-bit Suse

The installation of SciPy from source would be straightforward if it weren’t for the additional libraries LAPACK and BLAS that need to be installed as well. While I’m not new to compiling packages from source and resolving dependencies, this one stumped me - I ended up with this error:

/usr/local/lib/libflapack.a(slaruv.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC

That’s telling me that I need to recompile using the “-fPIC” option, which I had already done. Strange. Anyway, if you want to get this installed the easy way, stop trying to compile this from source and do this instead:

Open up Yast, and add this repository to your sources:

http://repos.opensuse.org/science/SUSE_Linux_10.1/

If you are using Suse 9.3, 10.0, or 10.2, simply change the “SUSE_Linux_10.1” to “SUSE_Linux_9.3”, “SUSE_Linux_10.0”, or “openSUSE_10.2” respectively in the provided link. After you’ve added this repository, open “Software Management” and search for ‘SciPy’. You should find it. You can additionally search for ‘NumPy’, ‘Lapack’, and ‘blas’ (although the dependencies should be sorted out automatically).
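
To spell out the naming scheme, here is a tiny sketch that builds the repository link for each of the versions mentioned. The function name `science_repo_url` is my own, and it only knows about these four versions; anything else is rejected rather than guessed at.

```shell
#!/bin/sh
# science_repo_url SUSE_VERSION
# Prints the science repository URL for the Suse versions named in this post.
science_repo_url() {
    case $1 in
        9.3|10.0|10.1) dir="SUSE_Linux_$1" ;;
        10.2)          dir="openSUSE_$1" ;;
        *) echo "no repository name known for version $1" >&2; return 1 ;;
    esac
    echo "http://repos.opensuse.org/science/$dir/"
}
```

For example, `science_repo_url 10.2` prints the link to paste into Yast on openSUSE 10.2.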

I would recommend additionally installing ‘matplotlib’ for plotting. If you do that, you’ll need to grab the matplotlibrc file and stick it in your .matplotlib directory. The only option I had to change to get plotting to work was the backend - I chose “QtAgg”.

That’s it! When trying to set this up, I found lots of Suse users with installation/compilation issues - but nobody seemed to know this simple method of installation.


Linux Driver for the Hauppauge WinTV USB2

First of all, this device does work fine in Linux. But unfortunately, this USB device won't be recognized by the Linux kernel and so you won't be able to watch all your Family Guy, Simpsons, or Aqua Teen Hunger Force episodes on your PC without adding one line of code to the kernel module driver (perhaps new kernels will eventually recognize it).

There is more than one type of WinTV USB2 device: the one I have has "Model 42014 Rev D197 Lot # 4405" on the back of it. If you do a 'lsusb', you should see this somewhere in the output:

Bus 001 Device 005: ID 2040:4201 Hauppauge
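
If you'd rather not eyeball the lsusb output, a small filter can pick out the ID for you. This is just a sketch: the function name `wintv_id_status` and its three answers are made up, and it only knows about the two IDs discussed in this post.

```shell
#!/bin/sh
# wintv_id_status: reads `lsusb` output on stdin and prints one of
#   patch-needed  (a 2040:4201 device is present)
#   supported     (a 2040:4200 device is present)
#   not-found     (neither ID shows up)
wintv_id_status() {
    ids=$(sed -n 's/.* ID \(2040:[0-9a-fA-F]\{4\}\).*/\1/p')
    case "$ids" in
        *2040:4201*) echo "patch-needed" ;;
        *2040:4200*) echo "supported" ;;
        *)           echo "not-found" ;;
    esac
}
```

Use it as `lsusb | wintv_id_status`.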

The device ID is the problem: the driver for this particular model is looking for "2040:4200", not "2040:4201". So, you simply need to edit the driver code and add the right number. To do this, you need to have your kernel source installed and you'll have to know how to configure your kernel for your other hardware. If you're up to the task, then take your favorite editor and open the file '/usr/src/linux/drivers/media/video/em28xx/em28xx-cards.c'. At about line 249 you'll see this:

/* table of devices that work with this driver */
struct usb_device_id em28xx_id_table [] = {
	{ USB_DEVICE(0xeb1a, 0x2800), .driver_info = EM2800_BOARD_UNKNOWN },
	{ USB_DEVICE(0xeb1a, 0x2820), .driver_info = EM2820_BOARD_MSI_VOX_USB_2 },
	{ USB_DEVICE(0x0ccd, 0x0036), .driver_info = EM2820_BOARD_TERRATEC_CINERGY_250 },
	{ USB_DEVICE(0x2304, 0x0208), .driver_info = EM2820_BOARD_PINNACLE_USB_2 },
	{ USB_DEVICE(0x2040, 0x4200), .driver_info = EM2820_BOARD_HAUPPAUGE_WINTV_USB_2 },
	{ USB_DEVICE(0x2304, 0x0207), .driver_info = EM2820_BOARD_PINNACLE_DVC_90 },
	{ },
};

You'll want to change it to look like this:

/* table of devices that work with this driver */
struct usb_device_id em28xx_id_table [] = {
	{ USB_DEVICE(0xeb1a, 0x2800), .driver_info = EM2800_BOARD_UNKNOWN },
	{ USB_DEVICE(0xeb1a, 0x2820), .driver_info = EM2820_BOARD_MSI_VOX_USB_2 },
	{ USB_DEVICE(0x0ccd, 0x0036), .driver_info = EM2820_BOARD_TERRATEC_CINERGY_250 },
	{ USB_DEVICE(0x2304, 0x0208), .driver_info = EM2820_BOARD_PINNACLE_USB_2 },
	{ USB_DEVICE(0x2040, 0x4200), .driver_info = EM2820_BOARD_HAUPPAUGE_WINTV_USB_2 },
	{ USB_DEVICE(0x2040, 0x4201), .driver_info = EM2820_BOARD_HAUPPAUGE_WINTV_USB_2 },
	{ USB_DEVICE(0x2304, 0x0207), .driver_info = EM2820_BOARD_PINNACLE_DVC_90 },
	{ },
};

You're simply inserting this line:

	{ USB_DEVICE(0x2040, 0x4201), .driver_info = EM2820_BOARD_HAUPPAUGE_WINTV_USB_2 },
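
If you have to redo this after every kernel update, the same insertion could be scripted. The following is a sketch only: the function name `add_wintv_4201` is invented, and it assumes the existing 0x4200 entry is still written exactly as shown above. It prints the patched file to standard output instead of editing in place, so you can look at the result first.

```shell
#!/bin/sh
# add_wintv_4201 FILE
# Prints FILE with a 0x4201 entry added right after the existing 0x4200 entry.
# Prints the file unchanged if a 0x4201 entry is already present.
add_wintv_4201() {
    if grep -q 'USB_DEVICE(0x2040, 0x4201)' "$1"; then
        cat "$1"
        return 0
    fi
    awk '
        { print }
        /USB_DEVICE\(0x2040, 0x4200\)/ {
            print "\t{ USB_DEVICE(0x2040, 0x4201), .driver_info = EM2820_BOARD_HAUPPAUGE_WINTV_USB_2 },"
        }
    ' "$1"
}
```

For example, `add_wintv_4201 em28xx-cards.c > em28xx-cards.c.new`, then compare the two files with diff before replacing the original.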

Save the changes, then go back to your '/usr/src/linux' directory, and do a normal 'make' and 'make modules_install'. As long as you're running the same kernel as the one you are compiling the modules for, you can now do a "modprobe em28xx" and you should be in business! Of course, there are other modules you'll have to load (or compile into the kernel) to get video working in general (look at the 'Video For Linux' section); but this will at least get your hardware talking.


Howto: Getting NVidia Drivers to Work with Linux

Ok, this isn't really a complete "HOWTO" - it's just a summary of the experiences I had with getting NVidia drivers to work properly with a 64-bit Linux distribution with my current hardware (although this will equally apply to a 32-bit distro).

It seems that there are an ungodly number of things that can go wrong (when they do go wrong) with NVidia drivers on the Linux platform. Usually things work fine after installation, be it by using the supplied NVidia installer or whatever method your distribution suggests. Unfortunately, my problem was unsolvable by all means that you will find on the web. That's right: I went through everything short of replacing my entire PC to solve this problem. What was my final solution? Well, if you're looking for a solution to your own NVidia woes, I would suggest you not read the rest of this paragraph and skip to the next one where I give you some hints on what to try when troubleshooting. My solution: I flashed the video card BIOS to that of a completely different vendor. While it doesn't make sense that you'd have to do this (and you won't find anyone from NVidia who will suggest you should do it), it was what I had to resort to in order to get things to work.

Symptoms: When any 3D (OpenGL) application is started, the PC will lock up hard - usually requiring a manual reset (meaning finger push button reset). In some cases the PC may not lock up hard, but freeze and become garbled instead; you may even be able to continue working but with a garbled screen (I couldn't). To reproduce this problem, simply run 'glxinfo' in a console - that should be enough to do it.

This occurs at least with the following distributions: Gentoo Linux, Ubuntu, and Suse - regardless of whether a 32-bit or 64-bit distribution is being used.

My hardware:
Motherboard: ASUS A8N32-SLI Deluxe
Graphics card: Gainward 7800GT PCIe
HDD: 2x Maxtor 6V300F0 300GB
PSU: 650 Watts
CPU: AMD X2 4400
RAM: 2x 1GB Kingston DDR400

First of all, you have to have actually successfully installed the NVidia drivers. Unless you've disabled it, you will see an NVidia splash screen when X starts. If you don't have this part right, then there are many resources on the web to help you depending on your distribution. It's usually better to do the install using the method your distro prefers: emerge on Gentoo, apt-get for Ubuntu, or yast for Suse. If you instead try to use the supplied installer from NVidia's website, you could run into library problems later (which is the case with Gentoo).

Here is the list of things to check when you have the lockup problem (in no particular order):

1 - Passing Kernel Parameters

There are some kernel parameters that may or may not influence the performance or stability of the driver. They are:

noapic
acpi=off
noacpi
nolapic

You should try passing these to the linux kernel when you boot - or some combination of them. "noapic" and "acpi=off" seem to be popular.
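
After rebooting, it's worth confirming that the parameter actually made it to the kernel, since a typo in the boot loader config is easy to miss. Here is a minimal sketch; the function name `has_kernel_param` is my own, and it simply looks for the parameter as a whole word in /proc/cmdline (or in a string you hand it).

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# CMDLINE defaults to the contents of /proc/cmdline.
has_kernel_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *)            return 1 ;;
    esac
}
```

For example, `has_kernel_param noapic && echo "noapic is active"`. Matching whole words matters here, because "noapic" and "nolapic" differ by a single letter.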

2 - IRQ Conflicts

Your NVidia board should have an assigned IRQ. When the nvidia driver module is loaded, execute this:

cat /proc/interrupts

You should see something like this:

           CPU0       CPU1
  0:    7226051          0    IO-APIC-edge  timer
  8:          0          0    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 14:      16276          0    IO-APIC-edge  ide0
 15:      88998          0    IO-APIC-edge  ide1
 50:          0          0   IO-APIC-level  libata
 58:     243948          0   IO-APIC-level  ohci_hcd:usb1
 66:     113605          0   IO-APIC-level  NVidia CK804
 74:    3148469          0   IO-APIC-level  ohci1394, sky2, nvidia
225:    3586632          0   IO-APIC-level  libata, eth0
233:     693809          0   IO-APIC-level  libata, ehci_hcd:usb2
NMI:       3559       2995
LOC:    7226318    7226296
ERR:          0
MIS:          0

Of course your output will look different. The point here is that the "nvidia" module should have an interrupt number, and ideally it shouldn't share it with other peripherals. In my case you see that it is sharing an interrupt with the ohci1394 (firewire) and sky2 (ethernet) drivers. Try disabling or moving the peripheral that is sharing an IRQ number with the nvidia module. I disabled the hardware that shared with the nvidia card (of course, to no avail).

If you have no interrupt assigned to your video card at all, check in your motherboard's BIOS settings that you have "Assign IRQ to Video" enabled (or something similar).

One more note: the "type" of interrupt assigned to the nvidia board should be "level", not "edge". The driver module probably won't load if it is not "level".

It has been noted that Creative boards (SoundBlaster whatever, Audigy, etc) like to conflict with NVidia boards. If you have one, move it to a different slot or take it out temporarily for testing.
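
The IRQ checks in this section can be boiled down to a short filter. This is a sketch under assumptions: the function name `nvidia_irq_report` is made up, and it expects the /proc/interrupts layout shown above (newer kernels may format these lines differently). Device names that contain spaces get split up, so treat the "shared with" column as a hint only.

```shell
#!/bin/sh
# nvidia_irq_report: reads /proc/interrupts-style text on stdin and prints
# "IRQ TYPE SHARED_WITH" for the line that lists the nvidia module.
# SHARED_WITH is "-" when nvidia has the interrupt to itself.
nvidia_irq_report() {
    awk '
    {
        irq = $1; sub(/:$/, "", irq)
        i = 2
        while (i <= NF && $i ~ /^[0-9]+$/) i++   # skip the per-CPU counters
        type = $i
        found = 0; others = ""
        for (j = i + 1; j <= NF; j++) {
            dev = $j; sub(/,$/, "", dev)
            if (dev == "nvidia") found = 1
            else others = (others == "" ? dev : others "," dev)
        }
        if (found) print irq, type, (others == "" ? "-" : others)
    }'
}
```

Run it as `nvidia_irq_report < /proc/interrupts`. On my listing above it reports IRQ 74, a "level" type, and the two drivers that share the line.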

3 - Check For a Motherboard BIOS Upgrade

Normally there should be no reason to upgrade your motherboard BIOS. On occasion, something may have been fixed; but more often than not you're bound to introduce a new problem. If you start asking around for help from NVidia, they will tell you to do this, even if you're sure you don't need to (or if there is no update available for your board) - so you may as well get this item out of the way.

4 - Strange Motherboard Settings

Some BIOS settings can interfere with your NVidia card. If you have the NX/XD-Bit (NoeXecute/eXecuteDisable) disabled, you should try changing that. If you have onboard virus protection, disable it. If you have overclocking related BIOS options, turn them off. For example, my ASUS has an automatic overclocking feature - it should be disabled (later you can turn it on if you get things working, of course). Also, "PEG Link Mode" should be normal.

While you're at it, you can disable all onboard things you're not using: firewire, USB (if you can deal without USB mouse/keyboard), serial / parallel, audio, etc. That way you can eliminate them as possibly conflicting with your NVidia card.

5 - MEMTEST86

Ok, this should actually be #1, but I had assumed in the beginning that your system already has a known good and stable configuration. This is absolutely essential: run memtest86. Some say you should run it for 10 hours (or overnight); the only time I ever found errors on a bad RAM module, they appeared after a minute or less.

If you have an Ubuntu or SUSE boot CD, it has memtest86 as an option (it can't get easier than that). Otherwise, download this small (but excellent!) recovery CD and choose the memtest boot option.

If you have any RAM problems whatsoever, you need to fix this first - it's probably the culprit.

6 - AGP and Your xorg.conf File

There are a couple of settings that could affect your stability issue, primarily if you are using AGP (I am not). I haven't heard that these affect PCIe in any way, but I'm putting this here anyway since it seems to be a common problem among lots of AGP users. From the NVidia README:

Option "NvAGP" "integer"
Configure AGP support. Integer argument can be one of:

Value   Behavior
0       disable AGP
1       use NVIDIA's internal AGP support, if possible
2       use AGPGART, if possible
3       use any AGP support (try AGPGART, then NVIDIA's AGP)

Please note that NVIDIA's internal AGP support cannot work if AGPGART is either statically compiled into your kernel or is built as a module and loaded into your kernel.

Try 0 and see what happens. If it works, then you have AGP issues. You might have to remove AGPGART support from your kernel and use the right NvAGP option.
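To see whether AGPGART is currently loaded as a module, you can look at /proc/modules. Here's a minimal Python sketch of that check. The function names are my own invention, and note the limitation: this only catches AGPGART built as a module. If it is statically compiled into your kernel, it won't show up here and you'd have to check your kernel config instead.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return module names from text in /proc/modules format.

    The module name is the first whitespace-separated field on each line.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def agpgart_is_loaded(proc_modules_text):
    """True if a module named 'agpgart' appears in the listing."""
    return "agpgart" in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    path = Path("/proc/modules")
    if path.exists():
        print("agpgart loaded as a module:", agpgart_is_loaded(path.read_text()))
    else:
        print("/proc/modules not found (not Linux?)")
```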

Also, look for "RenderAccel" and give it the parameter "false":

Option "RenderAccel" "false"

Later you can enable this if you get things working.
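If you keep losing track of which of these options you've set while experimenting, a small script can list them for you. This is only a sketch: it is a naive line-based reader, not a real xorg.conf parser, it lumps together options from every section, and the SAMPLE text is a made-up Device section for illustration. Point it at your own file (often /etc/X11/xorg.conf, but that depends on your distro).

```python
import re

# Matches lines such as:  Option "NvAGP" "0"
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?', re.IGNORECASE)

# Hypothetical example, not taken from a real system.
SAMPLE = '''
Section "Device"
    Identifier "Videocard0"
    Driver     "nvidia"
    Option     "NvAGP" "0"
    Option     "RenderAccel" "false"   # re-enable once things work
    #Option    "NvAGP" "1"
EndSection
'''


def read_options(xorg_conf_text):
    """Collect Option name/value pairs, skipping '#' comments.

    Names are lowercased so lookups don't depend on capitalization.
    """
    options = {}
    for line in xorg_conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing or whole-line comments
        match = OPTION_RE.match(line)
        if match:
            options[match.group(1).lower()] = match.group(2)
    return options


if __name__ == "__main__":
    opts = read_options(SAMPLE)
    for name in ("nvagp", "renderaccel"):
        print(name, "=", opts.get(name, "(not set)"))
```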

7 - Manually Upgrade Drivers

Even though I said you should use the proper installation method depending on your distro, try uninstalling the drivers and getting the newest from NVidia's website. Note: If you have an older NVidia card, you need to use the "Legacy" drivers supplied by your distro (and by NVidia).

Be careful with updating your drivers this way, as you can end up with library problems / improper links that could lead to further problems.

8 - Incompatible Libraries

It has been noted on some distros (Gentoo) that if you install the driver using NVidia's install mechanism instead of the default "right way" (emerge, apt-get, urpmi, whatever) you'll run into problems with old libraries lying around in places they shouldn't - which end up causing conflicts and odd behavior. If you've done this, then I can't really help you - but you should probably start by uninstalling any versions you have installed and manually looking for leftover library bits and removing them. Good luck.
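As a starting point for hunting down leftover library bits, here's a rough Python sketch that looks for same-named libGL files sitting in more than one directory. The directory list and the "libGL" prefix are my assumptions, so adjust them for your distro. It only reports what it finds and deletes nothing; you still have to work out which copy your package manager actually owns before removing anything.

```python
import os
from collections import defaultdict

# Assumed search locations; your distro may use others.
DEFAULT_DIRS = ["/usr/lib", "/usr/lib64", "/usr/local/lib", "/usr/X11R6/lib"]


def find_library_copies(directories, prefix="libGL"):
    """Map each matching shared-library file name to the directories holding it."""
    found = defaultdict(list)
    for directory in directories:
        if not os.path.isdir(directory):
            continue  # silently skip locations that don't exist
        for name in sorted(os.listdir(directory)):
            if name.startswith(prefix) and ".so" in name:
                found[name].append(directory)
    return dict(found)


def duplicated(found):
    """Keep only the file names that appear in more than one directory."""
    return {name: dirs for name, dirs in found.items() if len(dirs) > 1}


if __name__ == "__main__":
    for name, dirs in sorted(duplicated(find_library_copies(DEFAULT_DIRS)).items()):
        print(name, "found in:", ", ".join(dirs))
```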

9 - Reinstall X, Mesa, and Dependencies

Ok, this is a pain to do, but it's one of the things I did. Completely reinstall X and its dependencies - or if it's easier, just reinstall your whole distribution.

10 - Different Distributions?

Ok, perhaps a different distribution will fix your problem. Why? Well, different kernels might work differently with your board. If you don't mind trying out a different distribution, give this a shot. You won't find the real cause of your problem this way, however.

SUSE Linux was the only distro I tried that worked 100% perfectly after an install. It detected my motherboard and my RAID controller, and it informed me that the controller would not be supported in RAID mode (which I already knew, but no other installer chose to tell me). Ubuntu, Mandriva, and Gentoo could give you anywhere from a few to many niggles after installation.

11 - Vanilla Kernel Compilation

This is another thing that NVidia support will ask you to do: go to kernel.org, get the latest official stable release, and use that as your kernel. When you do this, I recommend enabling only the bare minimum you need, plus the options that are required for the NVidia driver (see this page for more info).

12 - Stability / Heat Issues

If you are actually able to use the drivers for 3D applications for a short period of time before the system locks up, then you don't have the same problem as I did, but you might have a stability problem due to overheating. You can monitor the temperature of your GPU in Linux, but I won't go over how to set that up. If available, I would suggest borrowing Windows from a friend, installing it and the NVidia Windows drivers, and then installing the NVidia NTune utility. It should come with a stress test that will notify you of any stability problems. While you're at it, check to see if 3D applications work properly in Windows for you (in my case, they worked flawlessly).

If you do have a heat problem, you probably should return your card. If you can't, then you could attempt to pull off the fan/heatsink contraption and apply fresh thermal paste to the offending chips. Google around for guides on how to do this.

One more thing: cheap power supplies will give you never-ending stability issues. Make sure that you have a big enough capacity power supply. Also, not all power supplies are the same: 500W from one brand will not "be the same" as 500W from another manufacturer. If you have a no-name cheapy power supply with borderline specs for your system, get a newer, better one (I did this unnecessarily - but it was cheaper than getting a new MB or video card).

13 - Maxtor Drives and NForce Incompatibility

The particular model drives I have (Maxtor 6V300F0) have a nasty bug that causes instability with NForce chipset motherboards. Not all drives have this bug; only certain firmware revisions do. There is a fix out there that Maxtor will not give you for some reason. Do a search to find out where to download the fix (I don't have it; I didn't need it, since my firmware revision was new enough). I don't know why Maxtor won't give out the fix. Probably they are afraid of helping a hard drive firmware hacking scene, where people simply change the firmware in drives that are identical to other models and gain hundreds of extra gigabytes. Well, that's my guess. In any case, I don't think I will be buying Maxtor again.

Last But Not Least : Video Card BIOS

None of the above items were in any way related to my problem. In the end, I took my video card (Gainward 7800GT) and flashed the BIOS with one from an eVGA 7800GT.

Initially I flashed with a slightly newer version of the Gainward 7800GT ROM, but that made absolutely no difference. I wrote the possibility of a BIOS problem off at that point. Later, short of buying a new board, I took the leap of faith and flashed with the eVGA BIOS.

So why does this work? What is the problem? Why should the BIOS affect whether or not the NVidia drivers will work on a particular platform (fine on Windows, but not on Linux)? I expect that NVidia won't answer this.

If you have the same problem as I did, you certainly won't get an RMA - your card works fine on Windows. There is no detectable issue with the card. And you are voiding your warranty by flashing it to that of a different vendor - but what other choice is there? Buy a second video card and cross your fingers?

Go to the site http://www.mvktech.net to find the right ROM and flashing utility. Of course, you shouldn't do this - but it was the only solution that worked for me. You very well might destroy your video card. Note: When flashing to a different vendor, you'll have to specify some flags to override the original vendor information.

One thing I found out from this whole debacle was that NVidia does not officially suggest or support any particular Motherboard/Video Card combination on Linux. So if you want to play it safe and buy a combination of hardware that just works - well, don't ask NVidia. If you're a home user like me and can't afford to drop $300 a pop on new video cards just because the one you bought has a non-detectable BIOS issue, well - good luck.

So as a last resort, you might want to try what I did and flash your video card BIOS. But do a little research first and make sure your card is practically the same as the one the BIOS you're flashing was made for (I just made a wild-ass guess). If you flash it with the wrong BIOS, you're gonna end up with a nice decorative circuit board for your Christmas tree.

If anyone else out there had to do this, I'd appreciate hearing the details.
