How to do rough Storage Array IOPS estimates

This post tries to get around the mystery factor that cache and Storage Array algorithms introduce, and to help you calculate how many disks your Storage Array needs to deliver a stable average number of disk operations per second (IOPS).

Face it, it is very hard to estimate performance, and IOPS in particular. Though throughput may be high with sequential patterns, Storage Arrays face a different challenge when it comes to random IOPS. In my personal experience, array-vendor sales people tend to be over-optimistic about the maximum IOPS their Array can produce. Even if their Array really can reach a certain high maximum IOPS value with 8KB blocks, that does not mean you will reach it in your environment.

Why IOPS matter

A lot of factors can affect your Storage Array performance. The first typical factor is the very random traffic and high output pattern of databases. It is no wonder this is usually the first use-case for SSD. Online Transaction Processing (OLTP) workloads can be a source of stress for Arrays: their verified writes (write the data, then read it back) double the IOPS, and they demand high speed.

Server Virtualization is also a big contender, since it produces the “IO blender effect”. Finally, Exchange is also a mainstream contender for high IOPS, though since Exchange 2010 its architecture has changed the paradigm for storing data in Arrays.

These are just a few simple and common examples, out of many, where IOPS can be even more critical than throughput. This is where your disk count becomes a critical factor, coming to the rescue when that terabyte of Storage Array cache is lost and desperately crying out for help.

Monkey-math 

So here is some very simplistic grocery-style math, which can be very useful to quickly estimate how many disks you need in that new EMC/NetApp/Hitachi/HP/… Array.

First of all, IOPS vary according to the disk technology you use. So for the Back-end, these are the average per-disk numbers I consider:

  • SSD – 2500 IOPS
  • 15k HDD – 200 IOPS
  • 10k HDD – 150 IOPS
  • 7.2k HDD – 75 IOPS

Total Front-End IOPS = C + B , where:

C stands for the total number of read IOPS served from cache (successful cache hits), and B for the total IOPS you can extract from your disk back-end (reads + writes). Their formulas are:

C = (Total Front-End IOPS) * %Read-pattern * %Cache-hit

B = (Theoretical Raw Sum of Back-end Disk IOPS) * %Read-pattern + (Theoretical Raw Sum of Back-end Disk IOPS)/(RAID-factor) * %Write-pattern

C is the big question mark on every array. It depends essentially on the amount of Cache the Array has, on the efficiency of its algorithms and code, and in some cases, such as the EMC VNX, on helper technologies such as FAST Cache. This is where your biggest margin of error lies. I personally use cache-hit values between 10% and a maximum of 50%, which is quite a big range, I know.
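To get a feeling for how much that margin of error matters, here is a minimal Python sketch. It assumes, as in the worked example of this post, that C is the total Front-End IOPS multiplied by the read share and by the cache-hit share. The function name and the sample workload (20.000 IOPS, 60% reads) are only illustrative.

```python
def cache_hit_iops(total_front_end_iops, read_fraction, cache_hit_fraction):
    """Read IOPS served from cache: total IOPS * read share * cache-hit share."""
    return total_front_end_iops * read_fraction * cache_hit_fraction


if __name__ == "__main__":
    total = 20_000        # illustrative Front-End IOPS target
    read_fraction = 0.6   # illustrative 60/40 read/write pattern

    # Sweep the cache-hit assumption across the 10%..50% range.
    for hit in (0.10, 0.30, 0.50):
        c = cache_hit_iops(total, read_fraction, hit)
        print(f"cache-hit {hit:.0%}: C = {c:.0f} IOPS, "
              f"back-end must deliver {total - c:.0f} IOPS")
```

Moving the cache-hit assumption from 10% to 50% shifts several thousand IOPS between cache and disks, so it is worth running the estimate at both ends of the range.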

As for B, you have to take into consideration the penalty that RAID introduces:

  • RAID 10 has a 2 IO Back-end penalty: for every write operation you will have one additional write for the data copy. Thus you have to halve the Back-End write IOPS in order to get the true Front-End IOPS
  • RAID 5 has a 4 IO Back-end penalty: for every write operation, you have 2 reads (old data + old parity) plus 2 writes (new data + new parity)
  • RAID 6 has a 6 IO Back-end penalty: for every write operation, you have 3 reads (old data + both parity blocks) plus 3 writes (new data + both parity blocks)
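The B formula with these penalties can be sketched in a few lines of Python. This is just my grocery-style formula written as code, not a vendor sizing tool; the dictionary and function names are made up for illustration, and the 100-disk example is hypothetical.

```python
# Back-end write penalty per RAID level, as listed above.
RAID_WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}


def backend_iops(disk_count, iops_per_disk, read_fraction, raid_level):
    """B = raw * %read + (raw / RAID-factor) * %write."""
    raw = disk_count * iops_per_disk          # theoretical raw sum of disk IOPS
    write_fraction = 1.0 - read_fraction
    penalty = RAID_WRITE_PENALTY[raid_level]
    return raw * read_fraction + (raw / penalty) * write_fraction


if __name__ == "__main__":
    # Hypothetical: 100 x 15k HDD (200 IOPS each), 60/40 read/write.
    for level in RAID_WRITE_PENALTY:
        print(level, round(backend_iops(100, 200, 0.6, level)))
```

With the same 100 disks, the estimate drops from RAID 10 to RAID 5 to RAID 6 purely because of the write penalty, which is why the read/write ratio matters so much in this math.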

Say I was offered a mid-range array with two Controllers, and I want to get about 20.000 IOPS out of 15k SAS HDDs. How many disks would I need?

First the assumptions:

  • About 30% of average cache-hit success on reads (which means 70% of reads will go Back-end)
  • Using RAID 5
  • Using 15k HDD, so about 200 IOPS per disk
  • 60/40 % Read/Write pattern

Out of these 20.000 total Front-End IOPS, the share served by cache hits will be:

C = 20.000* %Read * %Cache-hit = 20.000 * 0,6 * 0,3 = 3.600

Theoretical Raw Sum of Back-end Disk IOPS = N * 200

Thus, to arrive at the total number of disks needed:

20.000 – 3.600 = (N*200)*0,6 + (N*200/4) *0,4

That is 16.400 = 140 * N, thus N = 117,14 Disks.

So about 118 disks.
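If you want to replay this with your own numbers, the same arithmetic fits in a small Python function. It is only a sketch of the rough math above (the function and parameter names are mine), so treat the result as a ballpark, not a sizing guarantee.

```python
import math


def disks_needed(target_iops, read_fraction, cache_hit_fraction,
                 iops_per_disk, raid_penalty):
    """Solve target - C = (N*d)*%read + (N*d/penalty)*%write for N."""
    write_fraction = 1.0 - read_fraction
    # C: read IOPS absorbed by cache hits.
    cache_iops = target_iops * read_fraction * cache_hit_fraction
    # What the disk back-end still has to deliver.
    backend_iops = target_iops - cache_iops
    # Front-End IOPS contributed by one disk under this read/write mix.
    per_disk = iops_per_disk * (read_fraction + write_fraction / raid_penalty)
    return backend_iops / per_disk


if __name__ == "__main__":
    # Assumptions from the example: 20.000 IOPS, 60/40, 30% cache hit,
    # 15k HDD at 200 IOPS, RAID 5 (penalty 4).
    n = disks_needed(20_000, 0.6, 0.3, 200, 4)
    print(f"N = {n:.2f}, rounded up: {math.ceil(n)} disks")
```

Swapping the penalty to 2 (RAID 10) or 6 (RAID 6), or the per-disk figure to one of the other disk types, shows quickly how sensitive the disk count is to each assumption.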

 

Hope this helped.