OSX 10.4.11 Authentication Slowness

chadpilkington

We have 3 OSX 10.4.11 servers and 1 OSX ~10.2.9 server. 2 of the 10.4s are PPC, the other is Intel. The 10.2.9 box is PPC of course. On Monday night around midnight the 10.4 servers suddenly started behaving oddly. The server with pop3 accounts started reporting locked pop mailboxes, and clients were getting timeouts while trying to receive mail. The server with a number of ftp clients (pureftpd) started pausing during login, and connections were timing out. On the same server SFTP clients were doing similar things. All three servers were having difficulty opening Server Admin, and when it did open, the services spent most of their time refreshing, including the server hardware details. After several hours of running, the refreshes eventually start to time out and show red caution signs. Logging in to the Terminal application had a similar delay before it would show the command prompt.

When looking at the network traffic there was no significant flooding. The CPUs of the servers were not showing any significant load either.

All the servers seemed to be operating as web servers without any issues. The SMTP portion of the mail was running full steam and the WebObjects applications were processing things at their usual speeds. When clients were able to log in to the ftp they were able to download at their allotted bandwidth; it was just the initial connection that was difficult. Speed benchmarks to the internet were showing close to our full bandwidth available (7 Mb/s on both upload and download).
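To put numbers on "just the initial connection", here is a minimal Python sketch that times the TCP connect separately from the wait for the server's first reply line (FTP servers send a 220 greeting on connect). It assumes Python 3 is available on a client machine; the function name `time_banner` is made up for illustration and the host is whatever you pass on the command line.

```python
import socket
import sys
import time


def time_banner(host, port=21, timeout=30.0):
    """Return (connect_seconds, banner_seconds, banner_text) for one connection."""
    start = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    connected = time.monotonic()
    try:
        # Wait for the first bytes the server sends (the greeting line).
        data = sock.recv(1024)
    finally:
        sock.close()
    done = time.monotonic()
    banner = data.decode("ascii", "replace").strip()
    return connected - start, done - connected, banner


if __name__ == "__main__":
    # Usage: python time_banner.py ftp-host [ftp-host ...]
    for host in sys.argv[1:]:
        connect_s, banner_s, banner = time_banner(host)
        print("%s: connect %.3fs, banner %.3fs, %r"
              % (host, connect_s, banner_s, banner))
```

A fast connect followed by a long wait for the banner would point at something the server does before greeting the client rather than at the network path itself.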

The only change we had made was 4 days prior when we replaced a switch that was flaky.

Nothing I did seemed to solve the problem for long. Rebooting seemed to help some, but the issues kept coming back. Then suddenly, 8 hours later, everything returned to normal.

I came back in this morning and we were having the exact same problems. I replaced the replacement switch with another one. I checked the hard drives. I removed everything but the 4 servers from the network. It didn't seem to help. I was still getting timeouts with Server Admin refresh and odd pauses when logging in to ftp.

Around 8 hours later everything seems to be back to normal.

I thought maybe it was a DNS issue, but any dig requests returned within milliseconds even in the middle of the issues. The reverse DNS lookups of the server IP addresses check out. The logs don't mention any issue with this.
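One caveat on the dig test: dig queries the name servers directly, while services typically resolve names through the system resolver, so the two can behave differently. A minimal sketch of how this could be cross-checked, assuming Python 3 is available; `time_lookup` and `check_host` are names made up for illustration, and the host names come from the command line.

```python
import socket
import sys
import time


def time_lookup(func, arg):
    """Call func(arg) and return (elapsed_seconds, result_or_exception)."""
    start = time.monotonic()
    try:
        result = func(arg)
    except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
        result = exc
    return time.monotonic() - start, result


def check_host(name):
    """Time a forward lookup, then a reverse lookup of the address it returned."""
    fwd_time, addr = time_lookup(socket.gethostbyname, name)
    print("forward %s -> %r in %.3fs" % (name, addr, fwd_time))
    if isinstance(addr, str):
        rev_time, rev = time_lookup(socket.gethostbyaddr, addr)
        print("reverse %s -> %r in %.3fs" % (addr, rev, rev_time))


if __name__ == "__main__":
    # Usage: python check_dns.py host1 [host2 ...]
    for host in sys.argv[1:]:
        check_host(host)
```

Running it against both the servers' own names and a few client addresses during a slow period would show whether resolver calls stall even when dig answers quickly.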

I don't see any major errors in any of the logs. We did seem to be getting a lot of attempts to ssh into the servers as admin and every once in a while as root. I locked down the software firewall on the servers to prevent this attack, but that was at the end of the day and after a reboot, so things were already behaving the way I would expect.
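To gauge how heavy the ssh attempts are, here is a minimal Python sketch that counts failed password attempts per source address and user name. It assumes the usual OpenSSH "Failed password for ... from ..." log lines; the log file path is passed on the command line, and `count_failed_logins` is a name made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumed OpenSSH message format, e.g.
# "sshd[123]: Failed password for invalid user admin from 192.0.2.10 port 4242 ssh2"
FAILED = re.compile(
    r"Failed password for (?:invalid user |illegal user )?(\S+) from (\S+)"
)


def count_failed_logins(lines):
    """Return a Counter keyed by (source_address, user_name)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            user, source = match.groups()
            counts[(source, user)] += 1
    return counts


if __name__ == "__main__":
    # Usage: python count_ssh_failures.py /path/to/logfile
    for path in sys.argv[1:]:
        with open(path) as handle:
            counts = count_failed_logins(handle)
        for (source, user), total in counts.most_common(20):
            print("%6d  %-15s  %s" % (total, source, user))
```

Comparing the busiest hours in that output against the times the slowdowns start and stop would show whether the two are related.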

We only have 20 or so pop3 clients, 10 or so ftp clients and 2 SFTP clients.

We use MySecureShell to lock the home directories of the SFTP clients. On the server that hosts our authoritative DNS (we use our ISP's DNS for lookups) we use Mice and Men with bind 9 behind it. As I said before, we use PureFtpd for FTP. We have some 3rd party java libraries for use in our WebObjects applications, but other than that we have no real additional software on the servers.

My suspicion is that tomorrow I will come in to a nightmare again. If anyone has any ideas on next steps I would love to hear them.

Our fiber media converter connects to a switch which has 3 servers on it and a router. The router has 3 switches connected to it: 2 for office computers and 1 for the internal network cards of the 3 servers. The 4th server has only 1 network card, and the router is set to NAT-map the external IP address of that server to the internal server IP address.
 