Ideas on cause of errors?

Krevinek

Evil PPC Tweaker
I have an older 8600 which has run great until 2 days ago. Doing some investigating, it refuses to boot completely into any OS (8, 9, even X). I get large numbers of bus errors during boot, lockups, and when I did boot into single-user mode in OS X I attempted to fsck...

Bus error (attempt 1)
Segmentation Fault (attempt 2)

I have not been able to boot from anything, 2 HDs, 2 different OS install CDs, nothing. Every single boot has ended in a bus error bomb dialog or a lockup.

This means something is going wrong in the hardware, and I am curious whether anyone who knows hardware symptoms better than I do has something to say about this. (Even if the problem is the motherboard, I can repair it, as I have another Mobo and CPU sitting around, but I don't think they are the problem, personally... as I can even get to the desktop before the lockup occurs)

Is anyone else thinking what I am thinking: that a RAM stick is going south? (I am also going to check the settings on the CPU upgrade card to make sure they are OK while I am at it)
 
Krevinek said:
I get large numbers of bus errors during boot, lockups, and when I did boot into single-user mode in OS X I attempted to fsck...

Bus error (attempt 1)
Segmentation Fault (attempt 2)

(Even if the problem is the motherboard, I can repair it, as I have another Mobo and CPU sitting around, but I don't think they are the problem, personally... as I can even get to the desktop before the lockup occurs)

Is anyone else thinking what I am thinking: that a RAM stick is going south?
Sure, swap out the memory first, and always disconnect anything external, leaving only the keyboard, mouse, and monitor.
But if this is not a bad hard drive, the errors look more like a bad CPU (why check your settings, do you change your CPU settings often?) or a bad logic board.
 
Try messing with the memory -- do you have it interleaved? Deinterleave it. Already have it deinterleaved? Interleave it.

Also, it could help to re-seat the RAM. We had a few 7300s and 7600s that would randomly have "Bus Errors" before the extensions loaded, and re-seating the RAM finally solved it. It was funny: if it DID manage to make it past the extensions loading, the computer would run for a while (a day or so) before finally requiring a restart.
 
ElDiabloConCaca said:
Try messing with the memory -- do you have it interleaved? Deinterleave it. Already have it deinterleaved? Interleave it.

Also, it could help to re-seat the RAM. We had a few 7300s and 7600s that would randomly have "Bus Errors" before the extensions loaded, and re-seating the RAM finally solved it. It was funny: if it DID manage to make it past the extensions loading, the computer would run for a while (a day or so) before finally requiring a restart.
From what you said, it sounds like one of my two 128MB RAM sticks isn't seated right. This machine was up 24/7 since last August without any problems at all. Then it froze. It had problems booting like I described, then it suddenly booted and ran for another day without problems (it is an e-mail server as one duty, so I notice if it goes down), then went back to the same behavior.

I will post what my results are.
 
Okay, here are my results:

CPU (Just fine, no settings were out of the ordinary... I checked just in case)
2x 16MB DIMMs (Just fine)
2x 32MB DIMMs (Just fine)
2x 128MB DIMMs (Partial boot, bus error, etc)

So I investigated the 128MB DIMM pair...

1x 128MB DIMM (Just fine)
1x 128MB DIMM (Will not boot at all if this DIMM is in slot A1)

So I am calling up OWC today (I bought it a little over a year ago from them) and getting it replaced. Thanks to you for helping me hone my troubleshooting so I wasn't wasting time testing each and every component. ;)
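
For anyone repeating this kind of elimination test, here is a minimal sketch of the bookkeeping in Python. It is not a real diagnostic: `boots_ok` is a hypothetical callback standing in for "install only this DIMM in this slot and see whether the machine boots cleanly", and the DIMM and slot labels are purely illustrative. It assumes at least one good DIMM and one good slot exist; if everything fails, it cannot tell sticks from slots.

```python
from itertools import product


def isolate_faults(dimms, slots, boots_ok):
    """Try every DIMM alone in every slot and classify what fails.

    boots_ok(dimm, slot) is a hypothetical callback: it should return
    True if the machine boots cleanly with only that DIMM in that slot.
    """
    # Record the outcome of each single-DIMM, single-slot boot attempt.
    results = {(d, s): boots_ok(d, s) for d, s in product(dimms, slots)}

    # A DIMM is suspect if it never boots, no matter which slot it is in.
    bad_dimms = [d for d in dimms if not any(results[(d, s)] for s in slots)]

    # A slot is suspect if nothing boots in it, no matter which DIMM is used.
    bad_slots = [s for s in slots if not any(results[(d, s)] for d in dimms)]

    return bad_dimms, bad_slots


if __name__ == "__main__":
    # Simulated bench (made-up data): one bad stick and one bad slot.
    faulty_dimms, faulty_slots = {"128MB-b"}, {"A1"}

    def simulated_boot(dimm, slot):
        return dimm not in faulty_dimms and slot not in faulty_slots

    print(isolate_faults(["128MB-a", "128MB-b"], ["A1", "B1"], simulated_boot))
```

Testing sticks against more than one slot is the point of the sketch: a DIMM that only fails in one slot suggests the slot, not the stick, is the problem.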
 
Well, I was premature... I cannot get a stable system with any sort of RAM configuration. I don't know if all the RAM decided to die, or the mobo is going bad (as the problems change depending on which RAM is currently installed). Looks like I get to haul out the old mobo and revert this thing to the old 8600/300 configuration and see if that fares any better. If that doesn't work, I start selling it for parts.
 
Figure out where the CUDA button is and use that... It will drain all power out of the motherboard.

It is surprising how often that will cure what otherwise appears to be hardware failure.

It's always worth trying this before messing with RAM chips as the static from your hands can cause the very failure you are worried about.
 
Yes, I know what the CUDA button is, how to use it, and what it does. I have had plenty of time to learn this machine; I just don't have experience troubleshooting a problem this severe, one that could have left me without a desktop machine (which did a lot of serving and grunt work) for over 8 months.

The reason I posted is that the symptoms were so unusual and could have come from multiple sources, so I wanted to confer with you guys and see if you could help point me in the right direction and save me a few hours, which you did.

Anyways, the machine is up and running again, although without the G4 card. The cause of the failure was a logic board fault, mixed with two bad RAM chips. I don't know which fault caused the other, but slot A1 on the logic board was flaky at best and at worst kept the machine from booting at all, even with perfectly good RAM configurations.

So I swapped out the logic board for the original board and processor I had in storage (I bought a G4 CPU and another 8600 logic board because the two combined were cheaper than buying a Sonnet CPU compatible with my 8600/300 logic board), swapped all of the RAM over, and it still wouldn't boot (yes, I did remember to hit the CUDA button after moving everything over to the original board). More troubleshooting revealed that some 6-year-old Kingston RAM was bad, and the machine had no problems booting after that.

So, without the G4 to run Jaguar, I opted for Yellow Dog Linux (which I had my own problems installing, but that is another story). After getting a minimal install in place and installing or compiling packages by hand to bring it to the config I wanted, I once again have my desktop machine.

Thanks to those who gave advice, even if it was in the wrong direction. The only reason I am still hanging onto this relic is that it has served me very well for 7 years (refurbished), and I cannot afford to replace it until I graduate next March with a BS in Computer Engineering and settle into a new job and home.
 