Friday, January 30, 2009

Disk Drive Types

Intro
I am writing this because of the number of questions I get from SMB (Small to Medium Business) customers about why there are so many different types of disks. One even asked me, "Why doesn't Microsoft standardize on a disk type?" (Please do not take offense at the example if you are reading this.)
Preface
The explanations here are provided as an overview and are intended to be fairly simplistic so that a wider audience can understand the differences in disk drives. Storage design and architecture can be an extremely complex topic. Many factors go into designing a corporate storage infrastructure, including but not limited to desired performance, footprint (space consumed), uptime, budget and disaster recovery.
Microsoft
First I will tackle the Microsoft question. Microsoft is a software company, not a hardware company (I know they do make some hardware, but it is mainly for the home consumer market). Since they are the 800-pound gorilla in the operating system arena they do have some input into disks, but they do not set the standards. They take advantage of what is available in the market and work with storage manufacturers to ensure hardware works with their operating systems (i.e. the Windows family). The big boys in storage are Hitachi, Seagate, Western Digital, Toshiba and IBM.
Terms Commonly Used with Disk Drives
  • Platter - this refers to the physical platter inside the disk enclosure. All data is read from and written to the disk platter.
  • RPM - this refers to the revolutions per minute at which the disk platter spins. The faster the platter spins, the sooner data passes under the disk head and the faster information can be read from and written to the disk. Common disk speeds range from 5,400 to 15,000 RPM.
  • Cache - this refers to the buffer that sits between a relatively fast area of the computer (memory or RAM) and a relatively slow one (the disk platter). A larger cache helps speed up a computer by buffering information, so the computer can continue writing data instead of waiting for each piece to be written to the platter. Think of it a little like a funnel with a large reservoir, allowing you to pour liquid into a container continuously as opposed to a little at a time.
  • Interface - this refers to the type of connector used to physically connect a disk to the computer. Most disk types are named after their interface type.
  • NAS - Network Attached Storage, meaning the data is accessed via a network sharing mechanism like a Windows share (SMB) or NFS.
  • SAN - Storage Area Network. This is a high-end array of disks, reached over a dedicated storage network and managed from a central set of software. SANs offer greater management capabilities and superior performance over other types of storage. This gives businesses greater control over their data and terrific disaster recovery capabilities.
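To put a number on the RPM term above: before the head can read a piece of data, the platter has to rotate that data under it, and on average that takes half a revolution. The short Python sketch below works out that average wait for the common spindle speeds. The function name is my own, and this only covers rotational latency, not seek time or transfer time, so treat it as a rough illustration rather than a benchmark.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait, in milliseconds, for data to rotate under the disk head."""
    seconds_per_revolution = 60.0 / rpm
    # On average the platter has to turn half a revolution before
    # the requested data arrives under the head.
    return seconds_per_revolution / 2 * 1000


if __name__ == "__main__":
    for rpm in (5400, 7200, 10000, 15000):
        latency = average_rotational_latency_ms(rpm)
        print(f"{rpm:>6} RPM: {latency:.2f} ms average rotational latency")
```

A 15,000 RPM disk waits roughly 2 ms on average, while a 5,400 RPM disk waits well over 5 ms, which is a big part of why the faster spindles cost more.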
Older Disk Types
Just as every other technology changes, and sometimes improves, over time, so do disk drives. Take the TV industry as an example: plasma, projection, LCD, etc. For many years there were basically two major disk types, both of which are being phased out in favor of newer, smaller and faster disks.
  • IDE (used for desktops and low end servers)
  • SCSI - Small Computer System Interface (used for servers and high end desktops or workstations)
IDE was used where performance and MTBF (mean time between failures) were not as critical as the purchase cost: desktops and laptops. Max RPMs are generally limited to 7200 (I have read about 10k but have never personally seen one).
SCSI was used primarily where performance and MTBF were more important than cost: servers, SANs and workstations. Max RPMs run as high as 15k.
Newer Standards
Although there are more disk types than I am listing, I am only going to cover the four major players:
  • SATA 
  • Fibre Channel
  • SAS 
  • Solid State
SATA is emerging as the de facto standard for desktops and laptops. There are different classes of SATA drives, and some offer better MTBF and performance than others. The ones used in desktops are typically part of the lower end classes and seem to be giving SATA a bad rap in the computer industry. The maximum speed for SATA is 10k RPM, but typically people will use a 7200 or 5400 RPM drive. There are several reasons SATA is becoming a standard, and some of them are not obvious to anyone outside the hardware industry.
  • Transfer speeds are much faster than IDE (meaning data can be read from and written to the disk more quickly via the interface cable connecting the drive to the computer itself)
  • Because IDE is going away you will see larger capacity disk drives in SATA than in IDE
  • Because SATA uses a much smaller cable and connector than IDE it consumes a smaller total footprint, allowing better airflow throughout the machine (this benefits the performance and MTBF of the machine as a whole).
  • Easier to install, because of the lack of the jumper settings that were typical of IDE drives.
  • I have read they have a higher MTBF than IDE, but this is not evident from what I have seen.
  • Newer SATA drives can offer greater cache sizes than IDE
Fibre Channel (FC) is and has been the standard in servers and SANs for quite some time. FC drives are typically more reliable than SATA and IDE (I don't have enough experience with SAS to say yet). FC drives also offer much higher speeds than SATA drives, not only from the links used to connect the drives but also through the RPMs they deliver. Although SATA drives have been built to sustain 15k, they are rare, and I would expect the MTBF for those drives to be much lower than for FC or SAS drives. Fibre Channel links can run over optical fiber (copper is also used over short distances), using light to transfer data between devices and allowing for very fast transfers. An externally connected FC array of disks will typically provide much better performance than an internally connected SATA disk. FC is firmly entrenched in the majority of enterprise SANs, and many enterprises have invested an enormous amount of money in their SAN infrastructure, which means FC will be around for the foreseeable future. (See topic below comparing SAS to FC)
SAS (Serial Attached SCSI) is the new kid on the block for mid to high end disk drives. SAS is quickly becoming the disk type of choice for servers, high end workstations, NAS devices and mid range SANs. Many experts believe that SAS will eventually take over the number one spot in SAN disks as well, but there are currently limiting factors that will prevent that from happening in the near future. (See topic below comparing SAS to FC). The maximum speed for SAS is currently 15k RPM.
Solid State Drives (SSD) are the latest rage and rightly so. They have no moving parts, which means the MTBF should be extremely high; it also means they make no noise. Although a lot is being said about the new SSDs, the idea is not new. Solid state storage emerged as auxiliary memory units during the era of vacuum tube computers, where it was replaced by cheaper drum storage units. Cray, Amdahl and IBM used it as far back as the 70's for specialty computers, but it was very expensive and rarely used. SSDs have extremely fast reads. Their transfer rates top out at 3 Gbit/s, no higher than SAS, but they deliver better performance because there is no spinning disk to wait on. Currently SSDs are still expensive to implement and the capacities are typically small. Toshiba will begin manufacturing a 512 GB SSD this quarter (Q1 2009). IBM has announced it tested a 4 TB SSD, but not much is known about it. For now SSD will continue to be a niche device for situations where speed is of the utmost importance.

SAS vs. FC
Since the big question for the enterprise seems to be "Do I go with FC or SAS?", I will list some of the differences.
  • SAS spindle speeds are comparable to FC
  • SAS transfer rates are slower: FC offers 4 Gbit/s while SAS currently offers only 3 Gbit/s.
  • SAS is up to 50% cheaper to manufacture than FC, meaning a lower entry cost
  • SAS drives can be smaller (2.5" vs 3.5") and manufacturers claim that the smaller units consume less power.
  • FC is more mature
  • SAS is limited in the number of host ports it can connect, making it difficult to use in larger SAN infrastructures. There are vendors making expanders to overcome this, supporting thousands of connected devices.
  • SAS is limited to a cable run of about 8 meters (roughly 26 ft) from its controller. A large five-frame SAN will have much longer runs between the backplanes, making SAS a non-player in that market for the time being. Future development will probably overcome this limitation though.
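To make the 4 Gbit/s versus 3 Gbit/s comparison in the list above a bit more concrete, the Python sketch below converts a link's line rate into an approximate payload rate in megabytes per second. It assumes the nominal rates quoted above and the 8b/10b encoding used by links of this generation (10 bits on the wire for every 8 bits of data), and it ignores protocol overhead, so real world numbers will be somewhat lower. The function name is made up for the example.

```python
def usable_megabytes_per_second(line_rate_gbits: float) -> float:
    """Approximate payload rate (MB/s) of an 8b/10b-encoded serial link."""
    bits_per_second = line_rate_gbits * 1_000_000_000
    # 8b/10b encoding puts 10 bits on the wire for every 8 bits of data,
    # so only 80% of the line rate carries payload.
    payload_bits_per_second = bits_per_second * 8 / 10
    # 8 bits per byte, 1,000,000 bytes per megabyte.
    return payload_bits_per_second / 8 / 1_000_000


if __name__ == "__main__":
    for name, rate in (("SAS", 3.0), ("FC", 4.0)):
        print(f"{name}: {rate} Gbit/s is roughly {usable_megabytes_per_second(rate):.0f} MB/s")
```

Under those assumptions SAS works out to roughly 300 MB/s per link and FC to roughly 400 MB/s, which is the same 3 to 4 ratio stated in the list.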
Summary
For desktop and home consumers SATA will continue to be the disk type of choice. Performance oriented individuals will start to use SAS and SSD in niche performance markets. For the SMB market SAS will be the disk type of choice for servers, NAS and SANs. Manufacturers have a large investment in FC and will likely continue to push FC as the disk type of choice for large enterprise SANs, and for the meantime keep the advancements of SAS slightly behind FC, much the same way IBM keeps the P series development slightly behind the mainframe (there is more money to be made on FC for the time being). Consumers will likely force the manufacturers' hands over time as smaller players in the storage arena continue to advance the development of SAS. As the smaller players start to consume market space, the larger guys like Hitachi and EMC will go balls to the wall to reclaim that space and provide SAS capabilities in the enterprise arrays.

I hope this was of use to some people trying to understand the number of choices in disk technology.
