# Mac OS X Problems



## John Philip (May 16, 2002)

hi,
I have a problem on a small network with an OSX server:
4 heavy-duty G4 workstations running OS9 and X, 4 medium-duty workstations (G4) running OS9, and an admin sector with 3 Macs running OS9 (all G3s or G4s).
Cat 5e network, a 100Mbit 24-port Asanté switch (with the admin and the medium-duty machines attached), uplinked to a 7-port Asanté Giga switch (with the server and the heavy-duty workstations attached).
I have been getting constant server freezes, without any warnings to the workstations or on the server itself. The server seems to be running even though it has ceased to respond to mouse or keyboard. It freezes from 1 to 6 times a day.
Excerpts from the tail of the server log:
2002-05-16 10:13:54 CEST	Started child "/usr/sbin/slpd" as pid 432.
2002-05-16 10:13:54 CEST	Started child "/usr/sbin/sambadmind" as pid 433.
2002-05-16 10:13:54 CEST	Started child "/usr/sbin/PrintServiceMonitor" as pid 434.
2002-05-16 10:13:54 CEST	Started child "/usr/sbin/serveradmind" as pid 435.
2002-05-16 10:13:54 CEST	Automatic reboot timer enabled.
2002-05-16 10:13:54 CEST	Reaped child process 432 ("/usr/sbin/slpd"); quit with exit status 253.
2002-05-16 10:13:54 CEST	Process "/usr/sbin/slpd" respawning too rapidly!

After which the rest is silence..

Any ideas out there ? Any help would be much appreciated..

Kind regards

John Philip


----------



## sao (May 16, 2002)

It looks like a problem with slpd (the SLP daemon).
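To confirm which service is failing, something like the following could be run over the log. It is only a rough sketch: the regular expressions are written against the seven lines quoted in the first post, so the exact log format elsewhere is an assumption, and `summarize` is a made-up helper name, not part of any Apple tool.

```python
import re
from collections import Counter

# Lines copied from the log excerpt in the first post (tab after the timezone).
SAMPLE_LOG = """\
2002-05-16 10:13:54 CEST\tStarted child "/usr/sbin/slpd" as pid 432.
2002-05-16 10:13:54 CEST\tStarted child "/usr/sbin/sambadmind" as pid 433.
2002-05-16 10:13:54 CEST\tStarted child "/usr/sbin/PrintServiceMonitor" as pid 434.
2002-05-16 10:13:54 CEST\tStarted child "/usr/sbin/serveradmind" as pid 435.
2002-05-16 10:13:54 CEST\tAutomatic reboot timer enabled.
2002-05-16 10:13:54 CEST\tReaped child process 432 ("/usr/sbin/slpd"); quit with exit status 253.
2002-05-16 10:13:54 CEST\tProcess "/usr/sbin/slpd" respawning too rapidly!
"""

# Patterns inferred from the excerpt above (assumption: other entries look the same).
REAPED = re.compile(
    r'Reaped child process (\d+) \("([^"]+)"\); quit with exit status (\d+)\.'
)
RAPID = re.compile(r'Process "([^"]+)" respawning too rapidly!')


def summarize(log_text):
    """Return (failed_exits, rapid_respawns) found in the log text.

    failed_exits   -- list of (path, pid, exit_status) for non-zero exits
    rapid_respawns -- Counter mapping path -> number of "too rapidly" lines
    """
    failed = []
    rapid = Counter()
    for line in log_text.splitlines():
        reaped = REAPED.search(line)
        if reaped and int(reaped.group(3)) != 0:
            failed.append(
                (reaped.group(2), int(reaped.group(1)), int(reaped.group(3)))
            )
        flagged = RAPID.search(line)
        if flagged:
            rapid[flagged.group(1)] += 1
    return failed, rapid


if __name__ == "__main__":
    failed, rapid = summarize(SAMPLE_LOG)
    for path, pid, status in failed:
        print(f"{path} (pid {pid}) exited with status {status}")
    for path, count in rapid.items():
        print(f"{path} flagged as respawning too rapidly {count} time(s)")
```

On the excerpt above, slpd is the only process that both exits with a non-zero status and gets flagged, which is why it is the first suspect.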

 Have you checked the client and router compatibility?
 <<SLP on MacOS 9.0 will use IP multicast. If your network uses routers that are not capable of IP multicast, you will need to upgrade them or set up tunneling.>>
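One way to check whether SLP multicast actually reaches a given machine is to join the multicast group and wait for a packet. The sketch below assumes SLPv2, which uses group 239.255.255.253 on UDP port 427; if the clients speak a different SLP version, the group address would differ. The function names are made up for illustration. Binding port 427 normally needs root, and it may fail if slpd already holds the port on that machine.

```python
import socket
import struct

SLP_GROUP = "239.255.255.253"  # SLPv2 multicast group (assumption: SLPv2 in use)
SLP_PORT = 427                 # registered SLP port; binding usually needs root


def membership_request(group):
    """Build the 8-byte ip_mreq structure for IP_ADD_MEMBERSHIP.

    The second address, 0.0.0.0, lets the OS pick the default interface.
    """
    return struct.pack(
        "4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0")
    )


def wait_for_slp_packet(timeout=30.0):
    """Join the SLP group and return the sender of the first packet seen.

    Returns an (address, port) tuple, or None if nothing arrives in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", SLP_PORT))
        sock.setsockopt(
            socket.IPPROTO_IP,
            socket.IP_ADD_MEMBERSHIP,
            membership_request(SLP_GROUP),
        )
        sock.settimeout(timeout)
        try:
            _data, sender = sock.recvfrom(4096)
            return sender
        except socket.timeout:
            return None
    finally:
        sock.close()
```

Calling `wait_for_slp_packet()` on a machine behind each switch while a client browses the network would show whether multicast traffic crosses the uplink. A `None` result only means nothing was heard within the timeout, not that multicast is definitely blocked.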

 <<If a service stops working almost immediately after being started (the current default is less than 10 seconds), watchdog marks that service as "faulty" and stops trying to restart it. This is a safety measure to prevent a situation in which a failing process restarts indefinitely.>>


 action: 
How to monitor a service and what to do if it stops working. The possible action options are:

*	off - If the specified service is not running, do not start it. If it is running, stop it. 
*	boot - Only start the service when watchdog starts. If it stops after that, it is not restarted. 
*	bootwait - Currently this behaves the same way as boot. 
*	respawn - Watchdog will restart the application if it stops working. 
*	now - Watchdog ignores this entry at startup and will actually change the entry in the configuration file to "off". If you change an entry to "now" and force watchdog to reread its configuration file, it will treat that entry as if it were marked "respawn". 
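To make the rules above concrete, here is a small model of them. This is not watchdog's actual code, only a sketch of the behaviour as described in the quoted text; the names `Action`, `effective_action` and `after_exit` are invented for the example, and the 10-second threshold is taken from the "current default" mentioned above.

```python
from enum import Enum


class Action(Enum):
    OFF = "off"
    BOOT = "boot"
    BOOTWAIT = "bootwait"
    RESPAWN = "respawn"
    NOW = "now"


# "less than 10 seconds" is the default quoted above.
FAULTY_THRESHOLD_SECONDS = 10


def effective_action(action, at_startup):
    """Map a configured action to the behaviour the description implies.

    'now' is ignored at startup (and rewritten to 'off'), but is treated
    like 'respawn' when the configuration is reread later.
    'bootwait' currently behaves the same as 'boot'.
    """
    if action is Action.NOW:
        return Action.OFF if at_startup else Action.RESPAWN
    if action is Action.BOOTWAIT:
        return Action.BOOT
    return action


def after_exit(action, runtime_seconds):
    """Decide what happens when a monitored service stops.

    Returns 'restart', 'faulty' (stop trying), or 'leave' (not restarted).
    """
    if action is not Action.RESPAWN:
        return "leave"
    if runtime_seconds < FAULTY_THRESHOLD_SECONDS:
        return "faulty"
    return "restart"


if __name__ == "__main__":
    # slpd in the log was started and reaped within the same second.
    print(after_exit(Action.RESPAWN, 0))
    print(after_exit(Action.RESPAWN, 3600))
    print(effective_action(Action.NOW, at_startup=True).value)
```

In the log from the first post, slpd was started and reaped in the same second, so under this model it would fall into the "faulty" case rather than being restarted again.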

 Check the etc/slpa.conf file

 Maybe you'll get a better idea of what's wrong.


 Cheers...


----------



## John Philip (May 17, 2002)

Will try some of your good advice.
Get back next week with results...

Thanx again

John Philip


----------



## John Philip (Jun 7, 2002)

After checking the System Profiler, I discovered that the log starts like this:
--
Version of ASP that generated this report = 2.7
(note:this string only shows up in pre-final or debug versions)
--
..and then further down in the list, most of what has to do with networking is designated 'Dev' for developer, and the rest as 'GM', possibly Golden Master.

The problem is that this is the third server on the same site and the third OSX server software pack. So the problem can only stem from Apple's Software Updater..

Get back with a comment from Apple - if they care to give one..

John Philip


----------



## John Philip (Jun 20, 2002)

Well, well.
First Apple (verbally) expressed concern for the matter - and claimed to have analyzed the logs to show 2 (two) problems:
1) A networking protocol problem
and
2) A problem with the file system.
--
After careful consideration, Apple has returned (in writing):
Basically the error is a NetInfo failure. That happens when NI hangs, 
which is when the filesystem does.
They suggest that we try it without both SCSI cards and using an ATA drive... 
There have been a lot of complaints about SCSI hanging and at one point 
Adaptec stated the 39160 and 29160 were not compatible with MacOS X... 
Since then they released newer drivers than the ones we included with OS 
X server. So maybe worth a try to update them...
--
So - Apple has no part in the problem!
However, the reply raises a few other questions:
1) The server acted the same way, when the filesystem was on the internal disk for a period.
2) If the Adaptec drivers included in OSX were at one point 'denounced' by Adaptec, why has Apple not released some sort of technote to this effect?
3) '..maybe worth an update...' is just about as uncommitted an answer as possible.
4) Why should it take local Apple Support, European Apple Support and finally some bigbrain in Cupertino so long to figure it out, if the Adaptec problem has been known to all of them for a while?

The filesystem will probably benefit from an upgrade, but there's no definite indication that this will remedy the problem.

PS.: OSX server version 10.1.3 worked a lot better, with only one breakdown a week, as opposed to the versions on either side of it, which have resulted in breakdowns 3-4 times a day (on a good day). That is also a bit worrying. Do the Adaptec drivers act less faulty on 10.1.3 ? I wonder...

John Philip

sigh, sigh - and shame on Apple

John Philip


----------



## sao (Jun 20, 2002)

John Philip,

 Unfortunately, the Adaptec SCSI drivers have been the source of many problems as they were a little late to release new drivers for MacOS X.

 But Adaptec has long supported the Mac, and I wonder who is to blame in this case.

 Yes, Apple easily could have added a technote to this effect, and more...really.

 But anyhow, now, at least we know.


 Cheers...


----------



## John Philip (Jun 20, 2002)

The essence is that I do not believe in Apple's explanation:

First, the server freezes have been more or less constant - Adaptec or no Adaptec.
The only time there has been a significantly smaller number of server freezes was the period when ver. 10.1.3 was on the machine.
At that point the freezes decreased from 3-4 or more a day to 1-4 times a week.
The Adaptec-controlled filesystem has been switched off and the working data copied to the internal disk - WITHOUT any significant improvement.

Secondly, the Cupertino experts first stalled over the System Profiler readout, stating that most of the components in 10.1.5 were 'Dev.', 'GM' or 'Beta' - and only a small portion of the OS was noted as being 'Finished version'.
In fact they asked where the software had come from - and we were ready to swear that the only source of the software was Apple's Software Updater system - which they did not like at all. They stated they would look into that and get back to us (they did not - of course).

Thirdly, the Cupertino experts got back stating that there were two significant problems: 1) A networking problem and 2) A file system problem.
We already knew the networking problem was there, as the logs showed that the first error almost always came from the networking part of the OS - after which a multitude of errors brought down the whole system.

The conclusion seems to be that Apple chooses to point at the filesystem, and thereby Adaptec - and in doing so avoids any responsibility for its part of the problem.
Again, stating that it might be worth a try to update the drivers seems to me to be a rather diplomatic and unspecific way of diagnosing the problem.

Of course the filesystem and the Adaptec drivers will get their upgrades - but at the moment I am afraid it will not make the problem go away.

I'll get back after having tried this.

JP


----------



## sao (Jun 20, 2002)

John Philip,

 I understand. Good luck and...cross your fingers.

 Let me know.

 Cheers...


----------



## John Philip (Aug 6, 2002)

> _Originally posted by sao _
> *John Philip,
> 
> I understand. Good luck and...cross your fingers.
> ...


----------

