***********WARNING*********************
RAID1 mirrors disk blocks - its goal in life is to make sure each disk involved in a single write operation contains the same data. The redundant levels of RAID are not a substitute for backups. Corrupt your data and/or filesystem & RAID1 will perfectly mirror garbage!!
***********WARNING*********************
OK, had to put the important warning out; now the details.
I am very new to this OS, but have been designing RAID systems for UNIX since 1989. I currently am a contract instructor for Veritas RAID, filesystem, and backup products.
If we are using RAID, and doing writes, what happens if the system crashes before all the writes to the multiple disks that comprise your filesystem complete? CORRUPTION of data can happen, so U gotta do backups too.
Take the following info with a grain of salt. Commercial UNIXes have 64-bit UIDs so they need to support a few more concurrent users than the typical Mac system does. RAID10 requires at least 4 disks. RAID5 requires at least 3 disks. RAID1 (mirroring) requires at least 2 disks. RAID1 by itself will be just fine for many of us cuz we don't NORMALLY write lots of files larger than 64KB at the same time. RAID1 can have better read performance; it uses a round-robin technique to alternate read operations between disks. Write performance is about the same unless we are in a heavily loaded & write-intensive environment. Heavily loaded is when disk utilization as measured by #iostat is > 25-30%. Write intensive is when total writes divided by total I/O operations is greater than 25%, or 15% if using NFS version 2.
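Quick & dirty sketch of that write-intensive rule of thumb; the READS/WRITES numbers here are made-up, plug in the counts you actually see from iostat over an interval:
  READS=9000; WRITES=3000                                          # pretend totals for one interval
  echo "write fraction: $(( 100 * WRITES / (READS + WRITES) ))%"   # 25% -- right at the threshold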
A 10,000 RPM 18GB SCSI disk can do about 120 I/O operations per second. At 64 KB per I/O, this comes close enough to 8MB per second I/O bandwidth per disk. Newer larger disks can write more than 64KB per I/O & will have higher bandwidth IF your app sends larger chunks. Otherwise, you are throttled by rotational latency (RPM) & seek time. You might think that with a larger capacity disk, the sectors are closer together so rotational latency SHOULD improve. This is not entirely true - sector 8 is not physically next to sector 9, it is about 4-5 sectors away. This gives the disk drive H/W time to digest sector 8 while rotating to sector 9.
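For the folks who like the back-of-the-envelope math spelled out (same numbers as above, nothing new):
  IOPS=120; IO_KB=64
  echo "$(( IOPS * IO_KB / 1024 ))MB/sec per disk"           # 7.5MB/sec really; the shell rounds down to 7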
Go to http://docs.sun.com & look at the #tunefs command to see how you too can second-guess the disk drive engineers. We will not get this level of performance going thru the filesystem. You can use the #time command & compare copying data with #dd versus #cp on the same size chunk & see the difference between raw disk performance (8MB per second) and filesystem performance moving the same amount of data. This is why the big dogs run Oracle on raw partitions. If you wanna do multiple runs of #cp, you gotta #umount then #mount the filesystem to flush all the file blocks cached in memory.
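Here's roughly what that test looks like on this OS. The device name, mount point & test file are all made-up -- point them at a scratch disk you can afford to trash, and the raw read needs root:
  time dd if=/dev/rdisk1s3 of=/dev/null bs=64k count=16384   # ~1GB straight off the raw slice
  time cp /Volumes/Scratch/1gb-testfile /dev/null            # same amount going thru the filesystem
  umount /Volumes/Scratch                                    # remount between #cp runs so cached
  mount -t hfs /dev/disk1s3 /Volumes/Scratch                 # blocks don't flatter the numbers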
If your application requires more disk bandwidth, you gotta stripe.
The best RAID: Mirror in hardware, stripe in S/W. This is RAID 1+0, AKA RAID10. Runner-up is to do both layers of the 1+0 in S/W. RAID10 tolerates multiple disk failures the best. I had a student from Intel who was combining over 100 disks (yes, they use UNIX when they need to do REAL work).
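On this OS the stripe-in-S/W half would look something like the line below, where disk2 & disk3 are stand-ins for two LUNs your RAID box has already mirrored in H/W. The diskutil syntax has changed across releases (older builds used a createRAID verb), so check your local man page before trusting mine:
  diskutil appleRAID create stripe FastSet JHFS+ disk2 disk3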
The worst RAID: RAID5 in S/W, runner-up RAID5 in H/W. Only tolerates a single disk failure. People make fun of you. Recovering the RAID5 volume after replacing a failed disk (read data, X-OR it, then re-write) can take hours. If you do less than a full-stripe-write, you must first read the entire stripe to properly X-OR the new & existing data; this is done at the sector (512-byte) level. This operation is called READ-MODIFY-WRITE for those of you wanting more gory details. Virtually all my customers on RAID5 were unhappy with performance.
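For those wanting the gory details in miniature, here is the parity math with toy one-byte "sectors" (the hex values are arbitrary):
  D1=0x5A; D2=0x3C; D3=0x77
  P=$(( D1 ^ D2 ^ D3 ))             # parity as originally written
  NEW_D2=0x99                       # a small write that only touches D2
  P=$(( P ^ D2 ^ NEW_D2 ))          # old data & old parity must be READ first -- that's the extra I/O
  printf 'new parity: %#x\n' "$P"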
I am about 99% sure that Apple's Xserve RAID box will do RAID10. With H/W RAID, changing a failed disk & resyncing the RAID (re-mirroring the data in the case of RAID1) is easier. Hot sparing is better in H/W RAID as well. If you must have the biggest & smartest dog, you can get some EMC & have up to 64GB front-end cache on your H/W RAID box. With the leftover change you could get a SAN, Veritas NetBackup with the "server-free-backup" extension, & do your backups directly from device-to-device. The code even runs on the switch.
When striping, never use a stripe size smaller than 64KB unless you can prove via benchmark that a smaller size is better. When striping, it is useful to know the characteristics of the reads/writes your application(s) issue. AKA are we doing random or sequential I/O? If you are doing sequential I/O, your "full-stripe-width" (stripe-unit-size * number of disks) should equal the size of the writes your application issues. If your app issues 256KB I/Os, use 4 disks with a STU of 64KB.
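Same arithmetic as a two-liner if it helps (the 256KB & 4 disks are just the example from above):
  APP_IO_KB=256; DISKS=4
  echo "stripe unit = $(( APP_IO_KB / DISKS ))KB"            # 64KB; full-stripe-width = 4 * 64KB = 256KB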
On SunOS, I would say
#iostat -x, then for each disk look at the amount of data written, divide by the number of write operations to get a fair idea of the I/O size. Unless otherwise stated, the volume of data is expressed in sectors (512 bytes).
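Something like this does the divide for you on Solaris; the field positions drift between releases, so line $3 (w/s) & $5 (kw/s) up against your own header before trusting it:
  iostat -x 5 2 | awk '$1 != "device" && $3 > 0 { printf "%-10s avg write %.0f KB\n", $1, $5/$3 }'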
Since the man page & the iostat command don't agree on this OS, I dunno - man says use #iostat -D.
If you are doing random I/O, use an STU of 64K, and as for number of disks, more is usually better. Just remember that on SCSI for example, more than 4 heavily utilized disks will saturate a single SCSI bus. On old versions of Veritas Volume Manager, we used to say max disks = 8; it wouldn't scale well beyond that. I do not know where MACOSX hits the wall on # of striped disks. On Veritas, the S/W RAID stuff uses kernel memory, so U gotta be careful.
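The 4-disks-per-bus figure is just the bandwidth math again. I'm assuming a 40MB/sec Ultra-Wide SCSI bus & the ~8MB/sec-per-disk number from earlier; swap in whatever bus you actually have:
  DISKS=4; PER_DISK_MB=8; BUS_MB=40
  echo "$(( DISKS * PER_DISK_MB ))MB/sec offered vs ${BUS_MB}MB/sec of bus"   # 32 vs 40 -- close to saturation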
When I explain random I/O to my students, I use the grocery store example; the more checkout lines that are open, the less chance we have of getting in the line where the guy has an outta town check. With random I/O, the more disks we spread across, the less chance we end up waiting on a disk some other process is already in line for.
I will write more later as I figure this stuff out.