A crash in OS X


My faith in the "robustness" of OS X is seriously shaken.

Here's what happened:

While running Word 98 in Classic on my iMac (revision A, 190 MB RAM), the system suddenly stopped responding to my mouse. Oddly, I could still move the cursor around (it was stuck in the "text I-beam" shape), but nothing responded to clicks -- not even the Dock.

I then hit the "emergency escape" keys -- option-apple-escape, which normally brings up the application monitor window in OS X. Nothing. Power keys -- nothing. Interestingly, the caps lock light still went on and off on my keyboard. Even more interestingly, the system went to sleep after 10 minutes, and woke up again when I touched the keyboard. However, clicks and keystrokes still did nothing.

At this point I had to conclude my Mac was fatally hung, so I used the archaic paper clip hole to reboot. Then my real troubles began -- booting stopped at the "smiling Mac" phase, never to continue. After booting up from an OS 9 CD-ROM, zapping the PRAM with TechTool, and choosing OS 9 with the Startup Disk program, I now have a working computer again. But sad to say, I am unlikely to return to this incarnation of OS X.

My questions: was there some other key combination I could have tried? Is there some other last resort before rebooting? Any comments would be appreciated, and might rescue my failing optimism for this operating system!!

What happened was that the UI ("Aqua", or the window manager) somehow crashed. The core system (the kernel and all of its buddies), however, did not crash. Aqua does crash from time to time, as really any program does. The problem is that command-option-escape and the other escape key sequences seem to be routed through Aqua. So a hung Aqua and a hung OS are virtually indistinguishable.
(John Siracusa's Ars Technica article about the OS X Public Beta makes a lot of good points, including one about the type of crash you saw. You might want to look at it for a discussion of real versus perceived stability.)

I had a crash like this, and when I force-restarted, I thought that the system startup was totally screwed, because it took just about forever to get past the smiley mac phase. But it did work -- I left, and ten minutes later, when I came back from the bathroom, there was the login screen.
If you are unsure about whether the system was truly hung on startup or was just taking its sweet time checking and mounting damaged disks, hold "v" at startup to see the (disconcerting to Mac users) UNIX startup process. I did this and found that the perceived "hang" was just the boot script being fussy and making sure it had totally repaired the boot drive before continuing.
So, how long did you wait before concluding startup was broken?

It is possible (but unlikely) that the hard restart damaged a file on the hard drive that is necessary for startup. (Check for drive/directory structure damage with TechTool from OS 9.) The easy solution is to reinstall OS X, which won't touch any of your preferences, or applications, or really any files that you can even see/access as a user, but it will replace all the behind-the-scenes files with clean copies.

In addition, had you been in a networked environment, the thing to do would be to telnet (or better, SSH) into the "hung" machine. You would then be able to kill the hung processes from the remote session, and the machine would "unfreeze" nicely. Unfortunately, this requires a network environment and a small amount of UNIX skill -- not something that should be required in a final Apple product.
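For anyone who hasn't done this before, here is a rough sketch of the "find it and kill it" step you would run once you are logged in remotely. It is in Python purely for illustration. The helper names `find_pids` and `kill_by_name` are my own, the process name in the usage note is a placeholder rather than the real name of the hung process, and I'm assuming a `ps` that accepts `-axo pid,comm`. Treat it as a sketch under those assumptions, not the one true procedure.

```python
import os
import signal
import subprocess


def find_pids(ps_output, name):
    """Return the PIDs of processes whose command contains `name`.

    `ps_output` is expected to look like the output of `ps -axo pid,comm`:
    a header line, then one "PID COMMAND" pair per line.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and name in parts[1]:
            pids.append(int(parts[0]))
    return pids


def kill_by_name(name, sig=signal.SIGTERM):
    """Send `sig` to every process matching `name` (hypothetical helper).

    Assumes `ps -axo pid,comm` works on this system. Returns the PIDs
    that were signalled.
    """
    result = subprocess.run(
        ["ps", "-axo", "pid,comm"],
        capture_output=True, text=True, check=True,
    )
    signalled = []
    for pid in find_pids(result.stdout, name):
        if pid == os.getpid():
            continue  # never signal ourselves
        os.kill(pid, sig)
        signalled.append(pid)
    return signalled
```

Usage would be something like `kill_by_name("SomeHungApp")`, trying the polite `SIGTERM` first and only passing `signal.SIGKILL` if the process ignores it. You normally need to own the process or be root for the signal to be delivered.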

I hope that your confidence in X isn't totally gone, because it really is a lot more stable (I have only had one Aqua crash in many months) and will pretty much never crash so hard you can't ssh in and fix things, if that's your bag.

zpincus covered pretty much everything I was going to say.

I've seen two Quartz/Aqua/whatever crashes since I got OS X installed a couple of weeks after the beta was out, both of the sort you describe - mouse still moves, monitors go to sleep after the right delay, wake up all right, etc.

I really wished I had another computer then - I would have gotten onto ssh and played around, hopefully found out what the culprit was. I suppose I could have waited till I was at school, and gotten on from there, but then I couldn't have seen what effect each thing I did was having.

Here's an interesting little bug I've noticed, incidentally: after about a week of uptime, the menus start behaving like they did in System 7 (you have to click and drag; clicking once on the menu and then once on the desired item doesn't work). The really peculiar thing is that logging out and then in again doesn't make any difference. That's peculiar because the window manager process belongs to the currently logged-in user, so each time you log in it ought to be starting fresh.

[Edited by scruffy on 12-14-2000 at 09:53 PM]
I don't know how they fix it:

Micr*s*ft even knows how to crash Mac OS X!
It's unbelievable!
Mail it to Apple so they can protect Mac OS X against
crashes from Word, etc.

Thanks zpincus, that helps a lot. I *did* wonder whether I was too impatient with my "smiley Mac" -- perhaps if I had gone to brush my teeth I would have found the same thing :) Instead I waited 2 minutes max.

The "v" option is really handy to know; I'll try that next time. It would be nice (and I guess I'll tell Apple) if the smiley Mac were more informative when it's busy checking the file system. Something along the lines of the OS 9 "Your computer did not shut down properly", perhaps. I suppose Apple decided they didn't need a screen like that because OS X wouldn't crash! :)

I actually could have telnetted in to my Mac; it's at work on a nice fast network, and I even enabled SSH. I just didn't think of that in time. Thanks for the tip.

My optimism in OS X is slowly recovering (it seems to come and go in waves). I really would love a fully operational Unix Mac!!

I'm having issues, which is why I'm here. I have seen the kernel panic more than once, usually with heavy drive / network traffic. I have trouble pinning it down, 'cause that's my normal state. But I've had the presumable Aqua crash, and of course I tried to telnet ... nope. Machine responded to a ping, but I got no login prompt from telnet.
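A ping reply only tells you the network stack is still answering. It says nothing about whether the telnet or SSH daemon can still accept a login. One way to tell the two apart from a second machine is to try opening a TCP connection to the login ports directly. Here is a minimal sketch in Python; the function names `port_accepts` and `check_remote_login` and the example address are made up for illustration, and the only things I'm taking as given are the standard ports (22 for SSH, 23 for telnet).

```python
import socket


def port_accepts(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def check_remote_login(host):
    """Report which of the usual remote-login ports accept connections."""
    return {
        name: port_accepts(host, port)
        for name, port in (("ssh", 22), ("telnet", 23))
    }
```

You would call it as `check_remote_login("192.0.2.10")` with your own machine's address. Note the limit of what this shows: a port that accepts the connection but never produces a prompt points at the daemon (or whatever it is waiting on) being stuck, which this check alone can't distinguish from a healthy one.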

My confidence in the inherent stability is also shaken, along with some other things. ... but yeah, I believe fsck runs repeatedly on your boot drive before you get any visible startup sequence, then it'll run fsck on other drives early in the normal boot sequence, about 1/5 of the way across the thermometer.