The primary functions of a disk array are to increase data availability, to increase total storage capacity, and to provide performance flexibility by selectively spreading data over multiple spindles.
Data Protection - As the number of disks on a system increases, the likelihood of one failing increases. Thus, a disk array should be immune from a single disk drive crash. Disk mirroring (keeping an exact copy of one disk on another) is the simplest scheme, but requires twice the disk capacity (and associated cost). Encoding schemes can be used to reduce the required redundancy to lower ratios.
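The two claims above (more disks means more failures; encoding schemes need less redundancy than mirroring) can be put into rough numbers. The sketch below is only an illustration: it assumes disk failures are independent, and the function names and the 3% per-disk failure probability used in the usage note are hypothetical, not measurements.

```python
def prob_any_failure(n_disks: int, p_single: float) -> float:
    """Probability that at least one of n_disks fails in some period,
    ASSUMING independent failures, each with probability p_single."""
    return 1.0 - (1.0 - p_single) ** n_disks


def usable_fraction_mirrored() -> float:
    """Mirroring stores every block twice, so half the raw capacity is usable."""
    return 0.5


def usable_fraction_single_parity(n_disks: int) -> float:
    """One disk's worth of redundancy shared by n_disks (assumed scheme):
    (n_disks - 1) / n_disks of the raw capacity is usable."""
    if n_disks < 2:
        raise ValueError("need at least two disks")
    return (n_disks - 1) / n_disks
```

For example, prob_any_failure(32, 0.03) is much larger than prob_any_failure(1, 0.03), while usable_fraction_single_parity(8) gives 0.875 against 0.5 for mirroring.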
Storage Capacity is increased by placing many smaller form factor (5.25 and 3.5-inch) drives onto an intelligent controller which makes all the drives appear as one drive to the computer system.
Performance can be increased by spreading data over spindles and performing operations in parallel, which allows multiple drives to work on a single transfer request.
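As a rough illustration of spreading data over spindles, here is a minimal sketch of a round-robin block mapping. The function names and the stripe_unit parameter are assumptions made for this example; real controllers use their own layouts.

```python
from typing import Set, Tuple


def locate_block(logical_block: int, n_disks: int, stripe_unit: int = 1) -> Tuple[int, int]:
    """Map a logical block to (disk index, block number on that disk),
    placing stripe_unit consecutive blocks on one disk before moving on."""
    stripe_number, offset = divmod(logical_block, stripe_unit)
    disk = stripe_number % n_disks      # round-robin across the spindles
    row = stripe_number // n_disks      # how many full passes came before
    return disk, row * stripe_unit + offset


def disks_touched(start: int, length: int, n_disks: int, stripe_unit: int = 1) -> Set[int]:
    """Set of disks that a transfer of `length` blocks starting at `start` uses."""
    return {locate_block(b, n_disks, stripe_unit)[0]
            for b in range(start, start + length)}
```

With four disks, a one-block request touches a single spindle (so several small requests can proceed at once), while an eight-block request touches all four and can be serviced in parallel.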
The original taxonomy of RAID levels was published in the SIGMOD paper by David Patterson, Garth Gibson, and Randy Katz in 1988 (see below). The taxonomy roughly classifies RAID architectures according to the layout of data and parity information on disks. It is NOT gospel and does NOT cover every possible architecture (it has been pointed out here that that would require an N-tuple showing data block addressing, number and types of parity and ECC information, etc.), but when used properly it provides a vocabulary and establishes a framework for discussion.
Raid Level 0 - Striping - Data is segmented and split onto multiple
spindles.
Short Reads - Easily handles multiple simultaneous reads
Long Reads - Single operation can be split and processed in
parallel
Short Writes - Easily handles multiple simultaneous writes
Long Writes - Single operation can be split and processed in
parallel
Redundancy - None
Cost - Good (no extra hardware)
Raid Level 1 - Mirroring - Duplicate data is kept on multiple
spindles
Short Reads - Faster (shorter latency) since
resolution can be from any of multiple disks
Long Reads - Faster since resolution can be from any of
multiple disks (*)
Short Writes - Slower since need to write to multiple disks
Long Writes - Slower since need to write to multiple disks
Redundancy - Excellent
Cost - Expensive - at least double the spindle cost
Raid Level 3 - Data protection disk - mathematical ECC type code
calculated from multiple spindles and stored on another spindle.
Short Reads - Normal speed (i.e. 1x per-spindle rate)
Long Reads - Normal speed
Short Write - Slower due to re-calculating of ECC code
(including reading from other spindles and the ECC write)
Long Write - slightly slower due to ECC writes, but less
reading required than in short writes (**)
Redundancy - Excellent
Cost - only slightly more than no redundancy options
Raid Level 4??? similar to 3, with block striping instead of byte.
Raid Level 5 - Striping plus data protection - stripe data across
multiple spindles (as in RAID Level 0) and have data protection calculations
(as in RAID level 3) but don't put all the calculated figures onto one
spindle, but spread it out.
Short Reads - Normal
Long Reads - Faster due to parallelism
Short Write - Slower due to ECC calculation (including
reading and writing)
Long Write - slightly slower due to ECC writes (**)
Redundancy - Excellent
Cost - only slightly more than no redundancy options
(* should be the same speed as a single spindle)
(** -- should be faster than a single spindle due to parallelism on
write? somebody help me out --rdv)
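The data protection calculation used in levels 3 through 5 can be sketched as follows, assuming the code is simple bytewise XOR parity as in the Berkeley RAID papers. The function names are made up for this example. It shows how a lost block is rebuilt and why a short write costs extra reads: the old data and old parity must be read before the new parity can be written.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: Sequence[bytes], parity: bytes) -> bytes:
    """Recover the one lost data block from the survivors plus parity."""
    return xor_blocks(parity, *surviving)


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Read-modify-write update: the new parity needs only the old data and
    old parity (two reads), not the contents of the other spindles."""
    return xor_blocks(old_parity, old_data, new_data)


# Three data spindles and one parity block for a single stripe.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(*data)
```

A full-stripe (long) write can compute parity from the new data alone, which is why it needs less reading than a short write.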
Benefits of RAID:
High data availability (ie, if a single spindle crashes, no
data is lost)
Increased disk connectivity per system - since multiple
spindles appear as one spindle to the computer system.
Large capacity storage in a small footprint -
Flexibility through intelligent array controllers
Performance enhancements in some circumstances.
Streamed or Streamified RAID??? (SHMO)
A two-dimensional disk array parity scheme was described by Randy Katz, Garth Gibson, and David Patterson (all then with UC Berkeley - Gibson is now a professor at Carnegie Mellon University) at the 1989 IEEE Compcon conference. This method had one parity calculated along the disk strings and another calculated across them. This would increase the mean-time-to-data-loss by more than 10,000 fold. I am not aware of any implementations of this configuration.
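To make the idea of parity in two directions concrete, here is a toy sketch. It is not the method from the Compcon paper: it assumes plain XOR parity along each row and each column of a grid of disks, treats each disk as a single integer, and assumes the parity disks themselves do not fail. All names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Grid = List[List[Optional[int]]]


def xor_all(values: Iterable[int]) -> int:
    result = 0
    for v in values:
        result ^= v
    return result


def make_parity(grid: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Return (row parities, column parities) for a rectangular grid."""
    return [xor_all(row) for row in grid], [xor_all(col) for col in zip(*grid)]


def recover(grid: Grid, row_parity: List[int], col_parity: List[int]) -> Grid:
    """Repair cells marked None wherever a row or column has exactly one
    missing cell, repeating until no further progress is possible."""
    grid = [list(row) for row in grid]
    progress = True
    while progress:
        progress = False
        for r, row in enumerate(grid):
            missing = [c for c, v in enumerate(row) if v is None]
            if len(missing) == 1:
                row[missing[0]] = xor_all(v for v in row if v is not None) ^ row_parity[r]
                progress = True
        for c in range(len(grid[0])):
            column = [grid[r][c] for r in range(len(grid))]
            missing = [r for r, v in enumerate(column) if v is None]
            if len(missing) == 1:
                grid[missing[0]][c] = xor_all(v for v in column if v is not None) ^ col_parity[c]
                progress = True
    return grid
```

Two failures in the same row, which would defeat single parity, are repaired here through the column parity. Four failures forming a rectangle remain unrecoverable in this sketch.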
Storage Technology Corp (STK - Louisville, Colorado) has described a somewhat similar scheme for their long-delayed Iceberg disk array. This would have a regular, orthogonal RAID 5 parity across drives along with a Reed-Solomon encoding on another drive. This is sometimes referred to as RAID 6 or RAID 5+. STK claims their design will allow failure of ANY TWO drives - which is beyond the survival capabilities of standard RAID 5.
A RAID 5 which is 'deep' can survive failures in more than one drive so long as it doesn't lose more than one drive per rank:
HBA1 HBA2 HBA3 HBA4 HBA5 HBA6 HBA7 HBA8
| | | | | | | |
Rank1 Disk1 Disk2 Disk3 Disk4 Disk5 Disk6 Disk7 Disk8
| | | | | | | |
Rank2 Disk9 Disk10 Disk11 Disk12 Disk13 Disk14 Disk15 Disk16
. . . . . - etc.
Rank4 . . . . Disk32
If the above is a RAID 5 then losing drives 5 & 6 will destroy data. If it is a RAID 6 then it will not. Losing drives 3 and 12 will not disable a RAID 5 nor a RAID 6.
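The survival rule for the layout above can be written down as a small check. This is a sketch under stated assumptions: eight disks per rank numbered as in the diagram, RAID 5 tolerating one failure per rank, and RAID 6 modeled as tolerating two failures per rank. The names are hypothetical.

```python
from collections import Counter
from typing import Iterable

DISKS_PER_RANK = 8          # as drawn above: Disk1..Disk8 form Rank1, and so on
RAID5_TOLERANCE = 1         # failures tolerated per rank
RAID6_TOLERANCE = 2         # ASSUMED: two failures tolerated per rank


def rank_of(disk_number: int) -> int:
    """Rank (1-based) that holds a given disk (1-based)."""
    return (disk_number - 1) // DISKS_PER_RANK + 1


def survives(failed_disks: Iterable[int], tolerated_per_rank: int) -> bool:
    """True if no rank has lost more disks than it can tolerate."""
    losses = Counter(rank_of(d) for d in set(failed_disks))
    return all(count <= tolerated_per_rank for count in losses.values())
```

Here survives([5, 6], RAID5_TOLERANCE) is False and survives([5, 6], RAID6_TOLERANCE) is True, while survives([3, 12], RAID5_TOLERANCE) is True because the two failures fall in different ranks.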
But RAID 6 will cost more and may have slower performance for small random writes from having to update more parity data. I think there are clearly ways to mitigate the parity update performance penalty for RAID 6 as well as RAID 5.
--
Dick Wilmot
Editor, Independent RAID Report
(510) 938-7425
RAID-7 is a marketing term created by Storage Computer, Inc. for what others here have described as RAID-4 with a write cache. John O'Brien (RAID7@world.std.com), (their marketing manager?) frequently posts here.
His claims of ~10x improvement on I/O rates for VAXes have been shown to be poorly measured; the change in systems was not simply a RAID-for-modern-disk swap, but included increasing the CPU power by a factor of three and eliminating the HSC and old disk technology. He has also made difficult-to-substantiate claims about the growth and market success of his company relative to competitors. Thus, wise advice would be to take everything Mr. O'Brien says with a grain of salt (not bad advice for dealing with anyone, but especially true for dealing with vendors).
The debate also appears here frequently as to whether or not you really WANT your RAID array doing write caching; Unix file systems may depend on specific ordering of writes and otherwise make assumptions that could leave you in trouble with power or disk failures. If write ordering is preserved, the danger is somewhat mitigated.
That said, some posters here are pleased with their RAID7 arrays, and although comp.arch.storage opinion runs prevalently against Mr. O'Brien himself (and lately his pal Michael Willett who interestingly is quoted here from before he worked for Storage Computer), the possibility exists that the product is worthwhile.
(Berkeley FTP pointers updated, 95/5/11)
A nice collection of RAID papers was published in the Fall, 1991 issue of _CMG Transactions_. A few more appeared in the December, 1992 _CMG Proceedings_ and there are 3 RAID papers in the 1993 International Symposium on Computer Architecture (Published as _Computer Architecture News 21_, #2, May, 1993 by ACM SIGARCH).
(dwilmot@crl.com, Dick Wilmot, Editor, Independent RAID Report)
There is a short RAID FAQ at
ftp.mcs.com (rdv, 96/2/21)
Try contacting the RAID project at the University of California, Berkeley. In the proceedings of the recent IEEE Mass Storage Symposium, Ann Drapeau and Randy Katz have a paper describing the results of some investigations into the use of tape arrays. I think you can find RAID papers, perhaps this one, on anon ftp at ftp.cs.berkeley.edu. Have no address for Ann Drapeau, but Randy Katz is randy@cs.berkeley.edu.
Some of the RAID papers are available via anon ftp from ftp.cs.berkeley.edu:pub/raid/papers
Ann Drapeau's email address is alc@cs.berkeley.edu.
(dm_devaney@pnl.gov, Mike DeVaney) (eklee@cs.berkeley.edu, Edward K. Lee)
>>I am looking for papers or technical papers on RAID...
You could get that lengthy RAID taxonomy research report from Storage Computer, as mentioned recently on these news groups, by emailing them at RAID7@World.std.com. Alternatively, their phone number is 603 880 3005. I do not know if their RAID research report is copyrighted or not.
I believe their executive in charge of RAID activities in Hong Kong would be John Taylor, the former Wang national accounts director. They also put on technical raid seminars which might be of interest to your PhD students, concentrating on performance enhancements over RAID 3/4/5 (somewhat less than an order of magnitude, but I have not reviewed their benchmark data.) The RAID theory discussed is rather interesting.
(MICHAEL.WILLETT@OFFICE.WANG.COM, Michael Willett)
---------
>> I am looking for papers or technical papers on RAID or other multiple disks
>> storage systems. Could somebody give me pointers for them?
Here are some papers that I either have read or am looking for. I don't have copies of this group:

Dishon, Yitzhak; Lui, T.S.; Disk Dual Copy Methods and Their Performance; FTCS-18: Eighteenth International Symposium on Fault-Tolerant Computing, Digest of Papers p 314-318

Gray, J.N. et al., Parity Striping of Disk Arrays: Low Cost Reliable Storage With Acceptable Throughput, 16th International Conference on VLDB (Australia, August 1990)

Katz, R.H.; Patterson, D.A.; Gibson, G.A.; Disk System Architectures for High Performance Computing; Proc. IEEE v 78 n 2 Feb 1990

Muntz, Richard R.; Lui, John C.S.; Performance Analysis of Disk Arrays Under Failure; Proceedings of the 16th International Conference on Very Large Data Bases (VLDB); Dennis Mcleod, Ron Sacks-Davis, Hans Schek (Eds.), Morgan Kaufmann Publishers, Aug 1990 pp 162-173

Ng, Spencer; Some Design Issues of Disk Arrays; Compcon '89: Thirty-Fourth IEEE Computer Society International Conference p 137-142
  Keywords: DISK ARRAYS, STRIPING, SPINDLE SYNCHRONIZATION

Ng, Spencer W.; Improving Disk Performance via Latency Reduction; IEEE Transactions on Computers v 40 1 Jan 1991 p 22-30
  Keywords: LATENCY REDUCTION, ROTATION LATENCY, DISK PERFORMANCE

Reddy, A.L. Narasimha; Banerjee, Prithviraj; Performance Evaluation of Multiple-Disk I/O Systems; Proceedings of the 1989 International Conference on Parallel Processing p 315-318

Here are some good papers on disk arrays with emphasis on RAID:

Chen, Peter M.; Gibson, Garth A.; Katz, Randy H.; Patterson, David A.; Evaluation of Redundant Arrays of Disks Using an Amdahl 5890; 1990 ACM SIGMETRICS Conference on Measurement & Modeling of Computer Systems p 74-85

Chen, Peter M.; Patterson, David A.; Maximizing Performance in a Striped Disk Array; Proceedings of the 17th IEEE Annual International Symposium on Computer Architecture p 322-331

Chen, Shenze; Don Towsley; Performance of a Mirrored Disk in a Real-Time Transaction System; 1991 ACM SIGMETRICS Conference on Measurement & Modeling of Computer Systems p 198-207

Chervenak, Ann L.; Katz, Randy H.; Performance of a Disk Array Prototype; ACM SIGMETRICS 1991 Conference Proceedings p 188-197

Menon, J.; Mattson, R.L. and Spencer, N.; Distributed Sparing for Improved Performance of Disk Arrays; IBM Research Report RJ 7943 (Jan. 1991)

Patterson, David A.; Chen, Peter; Gibson, Garth; Katz, Randy H.; Introduction of Redundant Arrays of Inexpensive Disks (RAID); Compcon 1989: Thirty-Fourth IEEE Computer Society International Conference p 112-117

Schulze, Martin; Gibson, Garth; Katz, Randy; Patterson, David A.; How Reliable is a RAID; Compcon '89: Thirty-Fourth IEEE Computer Society International Conference p 118-123
(danj@hub.parallan.com, Dan Jones)
--------
>>I am looking for papers or technical papers on RAID...
A good set of the Berkeley papers are available via anonymous FTP. If I remember, the machine was ftp.cs.berkeley.edu. Also, an archie search on "RAID" would probably turn up a nice on-line collection of information. (sorry, not at an Internet site to check this right now...)
(buck@siswat.hou.tx.us , Lester Buck)
Further Information:
%A David A. Patterson
%A Garth Gibson
%A Randy H. Katz
%T A case for redundant arrays of inexpensive disks (RAID)
%C Proc. SIGMOD.
%c Chicago, Illinois
%D 1--3 June 1988
%P 109-116
%k RAID, disk striping, reliability, availability, performance
%k disk arrays, SCSI, hardware failures, MTTR, MTBF
%k secondary storage
%L Jacobson has a copy
%x Increasing the performance of CPUs and memories will be
%x squandered if not matched by a similar performance increase in
%x I/O. While the capacity of Single Large Expensive Disks (SLED)
%x has grown rapidly, the performance improvement of SLED has been
%x modest. Redundant Arrays of Inexpensive Disks (RAID), based
%x on the magnetic disk technology developed for personal
%x computers, offers an attractive alternative to SLED, promising
%x improvements of an order of magnitude in performance,
%x reliability, power consumption, and scalability. This paper
%x introduces five levels of RAIDs, giving their relative
%x cost/performance, and compares RAID to an IBM 3380 and a
%x Fujitsu Super Eagle.
(tage@cs.utwente.nl)
Address: 11211 E Arapahoe Rd., Suite 200, Englewood, CO 80112
Phone: 303/799-9292, Fax: 303/799-9297
Sun Microsystems has a new Fibre Channel array that does RAID 0, 1,
and 5. See WWW.Sun.Com under the products descriptions.
(rdv,94/8/8)
See www.clariion.com. A division of Data General. Mostly big
systems, I believe.
Targeted at Unix systems -- Sun, HP, SGI, etc. See
www.baydel.com. Fairly big vendor, I'm told. (rdv, 97/3/18)
The RAIDBook, a 100+ page tutorial on RAID technology and the RAID Advisory Board, is available from Technology Forums, LTD, of 6931 Glenview Lane, Lino Lakes, MN 55014-1296.
Contact Joe Molina, President of Technology Forums at
I've read it; it's decent but a little repetitive. Defines many
low-level terms of interest only to those who need to know the
internals. (rdv,95/2/7)
Silicon Graphics provides software striping of SCSI disks; thus your
host can effectively act as a RAID controller, providing flexibility
and probably reduced price, possibly with a performance penalty in the
form of increased CPU overhead. However, it probably means that it can
spread the I/O load over multiple I/O controllers.
(similar features in other systems? SHMO --rdv)
RAID0 is in late beta under Linux. (evesg@etlcom3.etl.go.jp (Gjoen
Stein), 95/10/6)
sdsadmin on the HP 7xx line does raid 0 striping and works well.
this is also apparently possible on the 8xx machines using LVM.
sdsadmin is due to disappear with hpux 10, replaced by LVM.
I believe the Advanced FS on Alphas can also do raid 0.
(mark hahn, hahn@neurocog.lrdc.pitt.edu, 94/11/17)
ATTO Technology has ExpressStripe, which does software striping on a
Mac.
Cyranix
RAID vendors come and go quickly, OEM each other's equipment, change
names, and engage in other activities that seem aimed at simply obscuring
the market. No list like this could be complete and up to date for long;
I'll gladly take updates.
See
Other reviews are available at
The November '94 issue of _Advanced Imaging_ has a big article on
storage, primarily RAID arrays, with a pretty comprehensive list. This
table is distilled from that. Most of the info appears to be from the
vendors themselves. Almost all of these are fast/wide SCSI; a few are
Fibre Channel, NuBus, PCI or HiPPI (usually with IPI-3 command set).
Most of these vendors have more than one model; only a few are listed
here. (rdv,95/1/18)
Most of these have some web presence; a Lycos search would turn up
their sites.
email me at
rdv@isi.edu
Copyright 1996 Rod Van Meter
7.10. Software Striping {Brief}
www.cyranex.com makes EZRAID PRO (RAID 0,1,4,5) for
OS/2. Voice: +1 613 738 3864 Fax: +1 613 738 3871
7.11. RAID Vendors
www.disktrend.com for one good list of RAID vendors, and
www.sresearch.com for another.
techweb.cmp.com,
techweb.cmp.com and
www.byte.com. (steven@nijenrode.nl
(Steven Hessing), 1996/3/30)
PC = Personal Computer (IBM compatible)
MC = Macintosh
PS = PC Server (Netware, NT et al)
NT = Windows NT
UX = Unix (generic)
PU = Personal Unix
WU = Workstation Unix & workstation servers
MF = mainframe
MI = minicomputer (AS/400)
SU = Supercomputer
FC = Fibre Channel interface (usually SCSI command protocol)
Maker                    Model              RAID Levels   Uses
-------------------------------------------------------------------------
AC Technology            Concorde           0,3,5         WU
ADJFILE Systems          Cougar, Tiger      0,1,3,5       ??
ANDATACO                 GigaRAID           0,1,3,5       UX,NT
AT&T Global Information  Series 3           ??            WU,PS,PC
  Systems -- NCR
BusLogic                 DA-x988            0,1,3,5       PC,PU,PS (PCI)
Canary Communications    IDA3500            0,1,3,4,5     ??
Ciprico                  6800 Real-Time     ??            ??
                         RAID Array
Cybernetics              Xtreme             0             ??
DEC                      StorageWorks       0,1,5         ??
                         RAID Array 210
Distributed Processing   SmartRAID          0,1,5         PC,PU,PS
  Technology
DynaTek Automation       AddARRAY           0,1           ??
  Systems
Fujitsu Comp. Prod.      DynaRAID           ??            ??
  America
FWB, Inc.                SledgHammer*FT     5             MC
IBM Storage Systems      7137 Disk Array    0,5           WU
Legacy Storage Systems   SmartArray         ??            PC (PCI)
Maximum Strategy         Gen5 Storage       0,1,3,5       SU (HiPPI,FC)
                         Server
Mega Driver Systems      MR & MK Series     0,3,5         PC,PU,PS,MC,WU
MicroNet Technology      RAIDbank Plus      0,1,5         PC,PS,MC,PU?
Micropolis               RAIDION,GANDIVA    ??            PC,MC,PS,PU,WU
Microtech Int'l          XLerator           0,1           MC
Mylex                    DAC960S            0,1,5,6?,7?   ??
Procom Technology        LANForce-5         0,1,3,5       MC,??
Raidtec                  FlexArray IX       0,1,3,5       ??
Recognition Concepts     RDR series         ??            ??
Storage Computer         RAID 7             7?(4?)        ??
Storage Concepts         Concept 910        ??            ??
Storage Tek              Iceberg            ??            MF
XL/Datacomp              9638               5             MI,WU