Major screwup


A few months back, I had my company transfer hosting of our email services from our ISP to an internal email server -- a G3 running Mac OS X 10.0.3 with sendmail 8.11.3 and qpopper 4. Everything worked great until today, when I upgraded to 10.0.4. Then it stopped working.

As it turns out, the installer changed my / directory to g+w during the upgrade, but I only figured that much out after I mistakenly ran, as root:

chmod -R 755 /

Now, I can no longer add, change, or delete users on the system, and every single user that was on there can no longer log in--bad password.

Any ideas? Our email has been down for half a day, and if I don't figure this out before tonight, I'll be looking for a new job.

I don't want to appear to be too harsh here, but this is a good example of bad change control on a production system, and inadequate backups.

If you have a machine where a day of downtime is going to cost someone's job, then you have to look very hard at having a second machine where you can test the effect of changes you want to make to the system BEFORE you play on the production box.

In any case, you must be able to do a full backup of the machine before you apply any major changes. You then make the changes, which might be patching or updating the operating system, installing software, or whatever. You then work through a plan to test everything. The tests should be run in priority order, based on the business importance of the functions the machine provides. If any of the tests fail, you have three options: spend some time fixing the problem; roll the machine back to its original state and re-test the restored system, if you run out of time or can't fix it; or, if the failure is minor, decide to live with reduced functionality for a while.
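The procedure above (back up, change, test in priority order, roll back and re-test on failure) can be sketched roughly as follows. This is only an illustration: apply_change_safely, run_checks, and the callables passed in are hypothetical names, and the real backup, change, and rollback steps would be whatever tools you actually use on the box.

```python
from typing import Callable, List, Tuple

# (priority, name, test function); a lower number means more business-critical.
Check = Tuple[int, str, Callable[[], bool]]


def run_checks(checks: List[Check]) -> List[str]:
    """Run the checks most-important-first and return the names that fail."""
    ordered = sorted(checks, key=lambda c: c[0])
    return [name for _priority, name, test in ordered if not test()]


def apply_change_safely(
    backup: Callable[[], object],
    change: Callable[[], object],
    rollback: Callable[[], object],
    checks: List[Check],
) -> Tuple[List[str], List[str]]:
    """Back up, apply the change, test, and roll back if anything fails.

    Returns (failures after the change, failures after the rollback).
    Both lists are empty when the change went through cleanly.
    """
    backup()                      # full backup BEFORE touching anything
    change()                      # the patch, upgrade, or install
    failed = run_checks(checks)
    if not failed:
        return [], []
    rollback()                    # restore the original state
    return failed, run_checks(checks)  # re-test the restored system
```

This sketch always takes the rollback option when a check fails; choosing to fix the problem in place or to live with a minor failure is a human decision it does not try to model.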

Having done the chmod command you described, you have hosed the file permissions for the entire operating system, and you will need to do a full re-install to be assured of getting it right again.

A number of the behind-the-scenes UNIX programs will barf if certain files or directories are mode 755, as this is insecure in certain circumstances. In addition, a number of the services running on the machine will be much easier to compromise if every file and directory in the system has this permission.
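It won't help after the fact, but recording the permission bits of a tree before a risky change at least lets you see exactly what a stray chmod altered. Below is a minimal sketch, assuming you only care about mode bits and not ownership; snapshot_modes and changed_modes are made-up helper names. The recorded bits include setuid and setgid, which a numeric chmod 755 clears on files; that may be part of why logins broke here, though nothing in this thread confirms it.

```python
import os
import stat
from typing import Dict, List, Tuple


def snapshot_modes(root: str) -> Dict[str, int]:
    """Map every path under root (and root itself) to its permission bits.

    Uses lstat, so symlinks are recorded as links and not followed.
    """
    modes = {root: stat.S_IMODE(os.lstat(root).st_mode)}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            modes[path] = stat.S_IMODE(os.lstat(path).st_mode)
    return modes


def changed_modes(
    before: Dict[str, int], after: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """List (path, old mode, new mode) for paths whose mode differs."""
    return sorted(
        (path, before[path], after[path])
        for path in before
        if path in after and before[path] != after[path]
    )
```

Take one snapshot before the change and one after, then print the differences with something like oct(old) and oct(new) to get the familiar 600 / 755 notation.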

In short - you will have to pull back to orbit and nuke the site; it's the only way to be sure (i.e. a full rebuild of the box, then applying the OS patches to 10.0.3, or a full restore from backups - but it sounds like you don't have any backups).

Given that 10.0.4 seems to have a few issues (some permissions problems, nuking SSH keys, and Mac Manager administrator problems), you might want to stick with 10.0.3 and wait for, and TEST, 10.0.5.

If 10.0.4 has something that is critical for you, then test what it does on a spare machine and look really really hard, so you are prepared and forearmed with all the little tweaks you may need to apply before you roll a production box.

Have you checked for console messages and/or error logs? Sometimes they are very helpful (other times not...)