Drive capacities continue to increase: 3TB drives are becoming commonplace, and 4TB SATA drives have just been released. These high-capacity drives represent fantastic value in $/GB terms, but that value comes at another cost that is often not well understood.
Let’s take five high-capacity 2TB drives and combine them into a typical RAID5 array (4 data, 1 parity) to give us around 7.5 TB of usable space. Because of its large size, in a SAN this array is usually divided into multiple LUNs, which are assigned to multiple hosts. Now you have multiple servers with different I/O patterns accessing a small number of drives. This randomises the disk I/O, and random I/O is the number one performance enemy. In virtualised environments like vSphere the problem is compounded, because multiple guest VMs now access a single LUN via a datastore. When fully broken down, you may find you have dozens of servers accessing the single array. This example array can only service around 100 random IOPS (50/50 read/write ratio). Shared amongst 20 servers, that is about 5 IOPS each, so there is not much I/O to go around.
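The arithmetic behind this example can be sketched in a few lines of Python. This is a rough illustration, not a sizing tool: the array figures (100 IOPS, 7.5 TB, 20 servers) come from the example above, while the function names, the per-drive IOPS value and the RAID5 write penalty of 4 are assumptions based on the commonly quoted rule of thumb. Real arrays with cache and varying block sizes will behave differently.

```python
# Back-of-envelope numbers for the example array. All names are illustrative.

def raid_effective_iops(drives, iops_per_drive, read_fraction, write_penalty=4):
    """Rule-of-thumb random IOPS a RAID group can deliver to hosts.

    write_penalty=4 is the commonly quoted figure for RAID5 small random
    writes (read data, read parity, write data, write parity). It is an
    assumption here, not a measured value.
    """
    raw_iops = drives * iops_per_drive
    write_fraction = 1.0 - read_fraction
    # Each host read costs 1 back-end I/O, each host write costs write_penalty.
    return raw_iops / (read_fraction + write_fraction * write_penalty)


def iops_per_server(array_iops, servers):
    """Average share of random IOPS if the array is split evenly."""
    return array_iops / servers


def iops_per_usable_tb(array_iops, usable_tb):
    """Random IOPS available for each usable TB of the array."""
    return array_iops / usable_tb


# Figures taken from the example in the text.
ARRAY_IOPS = 100   # random IOPS at a 50/50 read/write ratio
USABLE_TB = 7.5    # usable capacity of the 4+1 RAID5 array
SERVERS = 20       # servers sharing the array

if __name__ == "__main__":
    print("IOPS per server:", iops_per_server(ARRAY_IOPS, SERVERS))
    print("IOPS per usable TB:", round(iops_per_usable_tb(ARRAY_IOPS, USABLE_TB), 1))
    # Assumed 50 random IOPS per drive, purely to show how the rule of thumb
    # is applied; substitute a figure for your own drives.
    print("Rule-of-thumb array IOPS:", raid_effective_iops(5, 50, 0.5))
```

Plugging in your own drive count, per-drive IOPS and read/write mix gives a first estimate of how much random I/O each server can expect before the array is carved into LUNs.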
The array example above highlights a problem we call high I/O density. We’ve seen numerous customers try to address space issues only to find they have inherited a new performance problem. These problems can be avoided with careful planning. The choice of disks, their layout and the application usage patterns all need to be well understood to strike a good balance between performance and space. What runs well now may be close to the tipping point of slow response times, which can be difficult to resolve later. With so many variables affecting performance, it’s important to get expert advice early to avoid problems later.