Software Defined Storage – marketing baloney or technical breakthrough?

If you’re as suspicious as Wikipedia and I are about the new marketing buzzwords that have crossed over from the Networking world into Storage terminology, then this might be a post for you.

OK, so I tried a sort of reverse-engineering approach when investigating Software Defined Storage (SDS). I tried to figure out how SDN would materialize in a Storage world, and only then did I check what vendors are saying.

Here it goes. SDN’s architecture decouples the operational control plane from a distributed architecture where each Networking box holds its own, and centralizes it in a single device called the SDN Controller (for the sake of simplicity, I will not consider HA concerns or scalability details, as those are specifics of a solution, not of a model). The goal is to expose a Northbound interface against which customized code can be run, whether by an Administrator or by the application’s provider, to instantly change the Network’s behavior. This allows swift changes to take place in the Network, populating new forwarding rules “on the fly“.

Now, the way I would like to see roughly the same ideas map into the Storage world revolves around the following basic characteristics:

  1. Having a centralized Control Plane (consisting of either a single controller or several), with a Northbound API against which I can run my own scripts to customize Storage configurations and behavior (see the sketch after this list). The controller does not comprise a data plane – that stays in the Storage Arrays.
  2. Applications being able to request customized Service Levels from the Control Plane, and to change those dynamically.
  3. Automatic orchestration and Provisioning of Storage
  4. Ability to react fast to storage changes, such as failures
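To make point 1 concrete, here is a minimal, purely hypothetical sketch of what scripting against such a Northbound API could look like. The controller endpoint, URL path, and payload fields are all my own assumptions for illustration – no standard SDS API is implied.

```python
import json
import urllib.request

# Hypothetical SDS controller endpoint – an assumption for illustration only.
CONTROLLER = "http://sds-controller.example.local:8080/api/v1"

def request_service_level(volume_id: str, iops_target: int, redundancy: str) -> dict:
    """Ask the (hypothetical) control plane to apply a Service Level to a
    volume; the data plane inside the Arrays keeps doing the actual IO."""
    payload = json.dumps({
        "volume": volume_id,
        "sla": {"iops": iops_target, "redundancy": redundancy},
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{CONTROLLER}/service-levels",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Point 2 in action: an application dynamically raising its own Service Level.
# request_service_level("vol-042", iops_target=5000, redundancy="mirror")
```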

Now, when you talk about Networking devices, one of the advantages of decoupling the Control Plane from all the switches in the Network is having stupid or thin switches – and consequently cheaper ones. These minimalistic (dumb) switches would simply let their Controller populate their FIB table (whether via OpenFlow or another protocol), supporting only a few more basic protocols related to link-layer control and negotiation.

However, when you try to do the same with Storage Arrays, the concept gets a little more complicated. You need to worry about data redundancy (not just box redundancy for service continuity), as well as performance. So the only way you can treat Storage Arrays as stupid devices is to add another layer between Arrays and Hosts, where you centralize IO – in other words, a Virtualization Layer. Otherwise, your SDS Controller would just be an orchestration layer for configuration, and we’ve already got a buzzword for that: Cloud.

By having a Virtualization layer in between, you can now start mirroring data across different Arrays, locally or from a DR perspective, thus being able to control data redundancy outside your array. You also get better control of your Storage Service Level, being able to stripe a LUN across different Tiers of Storage (SSD, 15k SAS, 10k SAS, 7.2k NL SAS) in different Arrays, transparently to the host. Please keep in mind that this is all theoretical babble so far; I’m not saying this should be implemented in production in real-life scenarios. I’m just wandering around the concept.

So, besides having a centralized Control Plane, another necessity emerges: you need a virtualization layer in between your Storage Arrays and Hosts. You might (correctly) be thinking: we already have that from various vendors. So the next question is: are we there yet? Meaning, is this already an astonishing breakthrough? The answer must be no. This is the same vision as a Federated Storage environment, which isn’t new at all. Take Veritas Volume Manager, or VMware VMFS.

Wikipedia states that SDS could “include any or all of the following non-compulsory features:

  • automation with policy-driven storage provisioning – with SLAs replacing technology details
  • virtual volumes – allowing a more transparent mapping between large volumes and the VM disk images within them, to allow better performance and data management optimizations
  • commodity hardware with storage logic abstracted into a software layer
  • programmability – management interfaces that span traditional storage array products, as a particular definition of separating “control plane” from “data plane”
  • abstraction of the logical storage services and capabilities from the underlying physical storage systems, including techniques such as in-band storage virtualization
  • scale-out architecture”

VMware had already pitched its Software Defined Datacenter vision at VMworld 2012, having bought startups that help sustain such marketing claims, such as Virsto for SDS and Nicira for SDN.

But Hardware Vendors are also embracing the Marketing hype. NetApp announced SDS with Data ONTAP Edge and Clustered Data ONTAP. The way I view it, both solutions consist of using a virtualization layer with a common OS: one by using a simple VSA running NetApp’s WAFL-based OS, which presents Storage back to VMs and Servers; the other by using a Gateway (V-Series) to virtualize third-party Arrays. This is simply virtualization, still quite far away from a true SDS concept.

IBM is announcing the same, with a VSA.

HP is also leveraging its LeftHand VSA for Block Storage, as well as a newly announced VSA for Backup to Disk – StoreOnce VM. Again, same drill.

Now, EMC looks to me (in marketing terms at least) like the Storage player that got the concept best. It was announced that EMC will soon launch its Software Defined Storage controller – ViPR. Here is its “Datasheet“.

In conclusion: in my opinion SDS is still far, far away (technically speaking) from the SDN developments, so, as usual, renew your ACLs for this new marketing hype.

How to do rough Storage Array IOPS estimates

This post is dedicated to getting around the mystery factor that Cache and Storage Array algorithms introduce, and to helping you calculate how many disks you should have inside your Storage Array to produce a stable average number of disk operations per second – IOPS.

Face it: it is very hard to estimate performance – more specifically, IOPS. Though throughput may be high in sequential patterns, Storage Arrays face a different challenge when it comes to random IOPS. And from my personal experience, Array-vendor Sales people tend to be over-optimistic when it comes to the maximum IOPS their Array can produce. Even though their Array might actually be able to achieve a certain high maximum IOPS figure with 8KB blocks, that does not mean you will get it in your environment.

Why IOPS matter

A lot of factors can affect your Storage Array performance. The first typical factor is the very random traffic and high-output pattern of databases. It is no wonder this is the usual first use case for SSD. Online Transaction Processing (OLTP) workloads, which double IOPS by using verified writes (write, then read back the data), and which demand high speed, can be a real source of stress for Arrays.

Server Virtualization is also a big contender, producing the “IO blender effect“. Finally, Exchange is also a mainstream contender for high IOPS, though since Microsoft Exchange 2010 the architecture changed the paradigm for storing data in Arrays.

These are just a few simple and common examples of the many cases where IOPS can be even more critical than throughput. This is where your disk count can become a critical factor, coming to the rescue when that terabyte of Storage Array cache is exhausted and desperately crying out for help.


So here is some very simplistic, grocery-style math, which can be very useful to quickly estimate how many disks you need in that new EMC/NetApp/Hitachi/HP/… Array.

First of all, IOPS vary according to the disk technology you use. So in terms of Back-end, these are the average per-disk numbers I consider:

  • SSD – 2500 IOPS
  • 15k HDD – 200 IOPS
  • 10k HDD – 150 IOPS
  • 7.2k HDD – 75 IOPS

Total Front-End IOPS = C + B, where:

C stands for the total number of successful Cache-hit IOPS on reads, and B for the total IOPS you can extract from your disk Back-end (reads + writes). Their formulas are:

C = Total Front-End IOPS * %Read-pattern * %Cache-hit

B = (Theoretical Raw Sum of Back-end Disk IOPS) * %Read-pattern + (Theoretical Raw Sum of Back-end Disk IOPS)/(RAID-factor) * %Write-pattern

C is the big exclamation mark on every array. It depends essentially on the amount of Cache the Array has, on the efficiency of its algorithms and code, and in some cases, such as the EMC VNX, on the use of helper technologies such as FAST Cache. This is where your biggest margin of error lies. I personally use values between 10% and 50% maximum efficiency, which is quite a big range, I know.

As for B, you have to take into consideration the penalty that RAID introduces (see the sketch after this list):

  • RAID 10 has a 2 IO Back-end penalty: for every front-end write operation you will have one additional write for the data copy. Thus you have to halve the Back-End IOPS devoted to writes in order to get the true Front-End write IOPS
  • RAID 5 has a 4 IO Back-end penalty: for every write operation, you have 2 reads (read old data + old parity) plus 2 writes (write new data + new parity)
  • RAID 6 has a 6 IO Back-end penalty: for every write operation, you have 3 reads (read old data + both old parities) plus 3 writes (write new data + both new parities)
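Putting the per-disk averages, the RAID penalties, and the C and B formulas above into code, here is a minimal sketch. All names are my own, and the model is exactly the grocery-style math above, nothing more:

```python
# Rough Front-End IOPS estimate, following the simplistic model above.
DISK_IOPS = {"ssd": 2500, "15k": 200, "10k": 150, "7.2k": 75}  # per-disk averages
RAID_WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}     # back-end IOs per write

def frontend_iops(n_disks: int, disk_type: str, raid: str,
                  read_ratio: float, cache_hit: float) -> float:
    """Total Front-End IOPS = C + B for a given disk count."""
    raw = n_disks * DISK_IOPS[disk_type]  # theoretical raw sum of back-end IOPS
    # B: back-end capacity split across the read/write pattern,
    # with the write share divided by the RAID penalty factor.
    b = raw * read_ratio + (raw / RAID_WRITE_PENALTY[raid]) * (1 - read_ratio)
    # C = Total * %Read * %Cache-hit, so Total - Total*%Read*%Cache-hit = B,
    # hence Total = B / (1 - %Read * %Cache-hit).
    return b / (1 - read_ratio * cache_hit)
```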

Say I was offered a mid-range array with two Controllers, and I want to get about 20,000 IOPS out of 15k SAS HDDs. How many disks would I need?

First the assumptions:

  • About 30% average cache-hit success on reads (which means 70% of reads will go to the Back-end)
  • Using RAID 5
  • Using 15k HDD, so about 200 IOPS per disk
  • 60/40% Read/Write pattern

Out of these 20,000 total Front-End IOPS, the Cache-hit portion will be:

C = 20,000 * %Read * %Cache-hit = 20,000 * 0.6 * 0.3 = 3,600

Theoretical Raw Sum of Back-end Disk IOPS = N * 200

Thus, to arrive at the total number of disks needed:

20,000 – 3,600 = (N*200)*0.6 + (N*200/4)*0.4

Thus N = 117.14 Disks.

So about 118 disks.
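The same arithmetic as a quick, self-contained script, solving for N directly under the assumptions above:

```python
import math

target = 20_000             # desired total Front-End IOPS
read, hit = 0.6, 0.3        # 60/40 read/write pattern, 30% cache-hit on reads
per_disk, penalty = 200, 4  # 15k HDD average IOPS, RAID 5 write penalty

b_needed = target * (1 - read * hit)                            # 20,000 - 3,600 = 16,400
per_disk_effective = per_disk * read + (per_disk / penalty) * (1 - read)  # 140
n = b_needed / per_disk_effective                               # 117.14...
print(n, "->", math.ceil(n), "disks")                           # 117.14 -> 118 disks
```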


Hope this helped.

EMC acquires ScaleIO

EMC acquired the Storage startup ScaleIO for a reported $200M–$300M.

ScaleIO is a Palo Alto-based startup that competes with Amazon AWS, more specifically with its Elastic Block Storage (EBS). It uses a grid-computing architecture, where each computing node has local disks and the ScaleIO Software. The Software creates a Virtual SAN out of the local disks, thus providing a highly parallel SAN of computing/storage nodes while maintaining Enterprise HA requirements.

The ScaleIO Software is allegedly a lightweight piece of Software, and runs alongside other applications, such as DBs and hypervisors. It works with all leading Linux distributions and hypervisors, and offers additional features such as encryption at rest and quality of service (QoS) for performance.

Here’s ScaleIO’s own competitive smackdown:

ScaleIO vs Amazon

VMware SDS (?) – Virsto VSA Architecture

Though there are still limited technical resources available for an in-depth dive, those already available provide an overall picture.

Virsto positions itself as a Software Defined Storage (SDS) product, using Storage virtualization technology. Well, besides it being 100% Software, I see a huge gap between the Networking Software-Defined concept and the SDS side. I do imagine VMware’s marketing pushing them to step up the SDS messaging, in line with its Software Defined Datacenter vision. However, Virsto failed to make me disagree with Wikipedia’s SDS definition.

Leaving the SDS story aside, there’s still a story to tell. Not surprisingly, their main selling point is no different from the usual Storage vendors’: performance. They built their technical marketing around solving what they call the “VM I/O blender” issue (a well-chosen term, I have to recognize that). The VM I/O blender effect derives from several VMs concurrently disputing IOPS from the same Array, thus creating a largely randomized pattern. (By the way: this randomized pattern is one of the reasons why you should always look out for Storage vendors claiming large Cache-hit percentages, and always double-check the theoretical IO capacity on the disk side.)

VM IO Blender Virsto

How is Virsto’s Solution Architected?

Virsto uses a distributed, clustered Virtual Storage Appliance (VSA) with a parallel computing architecture (up to 32 nodes on the same VMware cluster, from what I could check of the latest release). You install a VSA on each host, which (from what I could understand) serves as a target for your physical Storage Arrays. The Virsto VSA then presents storage back to your VMs as NFS datastores, allowing it to control all VM IO while supporting HA, FT, VM and Storage vMotion, and DRS. Here is an overview of the architecture.

Virsto VMware Architecture

As usual, one of the VSAs serves as a Master, and the other VSAs on different nodes (i.e. VMware Hosts) act as slaves. Each VSA has its own log file (the vLog file), which serves as the central piece of the whole architecture, as we will see next. They support heterogeneous Block Storage, so yes, you are able to virtualize Arrays from different vendors with it (although in practice that might not be the best solution at all).

Virsto Architecture

You can use different Arrays from different providers, and tier those volumes across up to four (4) storage tiers (such as SSD, 15k rpm, 10k rpm, and 7.2k rpm). After aggregation on the VSA, Virsto vDisks can be presented in many formats, namely: iSCSI LUN, vmdk, VHD, vVol, or KVM file. So yes, Virsto VSA also works on top of Hyper-V in the same way. Here is an interesting Hyper-V demo.

So, to solve the VM I/O blender effect, every write IO from each VM is captured and logged into a local log file (the Virsto vLog). Virsto acknowledges the write back to the VM immediately after the commit to the log file, and then asynchronously writes back to your Storage Array in a sequential manner. Having data written sequentially has the advantage of keeping blocks organized in a contiguous fashion, thus enhancing read performance. Their claim is 20-30% better read performance.
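As a toy illustration of that log-then-flush pattern (this is not Virsto’s actual code, just the generic write-back logging idea, with all names invented for the example):

```python
from collections import deque

class ToyWriteLog:
    """Illustrative only: acknowledge writes after appending them to a
    sequential log, and destage them to the backing store later."""

    def __init__(self, backing_store: dict):
        self.log = deque()               # stands in for the vLog (ideally on SSD)
        self.backing_store = backing_store

    def write(self, block: int, data: bytes) -> str:
        self.log.append((block, data))   # sequential append is cheap
        return "ack"                     # the VM gets its acknowledgement now

    def destage(self) -> None:
        # Later, asynchronously drain the log to the array in one pass,
        # instead of hitting it with every random VM write as it happens.
        while self.log:
            block, data = self.log.popleft()
            self.backing_store[block] = data

array: dict = {}
vlog = ToyWriteLog(array)
vlog.write(7, b"random VM write")   # -> "ack" immediately
vlog.write(3, b"another one")
vlog.destage()                      # array now holds both blocks
```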

So, as a performance sizing best practice, they do recommend using SSD storage for the vLog. Not mandatory, but a serious performance booster. Yes please.

The claim is that, through the logging architecture, they are able to do both thin provisioning and Snapshots without any performance degradation.

As a result, they compare Virsto vmdks to thick vmdks in terms of performance, and to linked clones in terms of space efficiency. If you check Virsto’s demo on their site, you’ll see they claim better performance than thick eager-zeroed vmdks, even with thin-provisioned Virsto vmdk disks. Note that all Virsto vmdks are thin.

Finally, how do they guarantee HA of the vLog? Well, from what I could understand, one major difference from VMware’s own VSA is that Virsto’s VSA simply leverages already existing shared storage from your SAN. So it will not create the shared storage environment; from what I understood, it stands on Shared Storage. When I say “it“, please take into consideration only the vLog. I see no requirement to have the VSA itself on top of Shared Storage as well.

Data consistency is achieved by replaying the log file of the VSA that was on the failing Host. VMware’s VSA, on the contrary, allows you to aggregate the local non-shared disks of each of your VMware Hosts and present them back via NFS to your VMs or Physical Hosts. It does this while still providing HA (one of its main purposes) by copying data across clustered nodes in a “Network RAID“, providing continuous operations even on Host failure.

Some doubts that I was not able to clarify:

  • What are the minimum vHardware recommendations for each VSA?
  • What is the expected Virsto VSA’s performance hit on the Host?
  • What is the maximum recommended number of VSAs clustered together?

VMware’s VSA vs Virsto VSA

So, as a side note, I do not think it is fair to claim the near death of VMware’s own VSA, even if Virsto is indeed able to sustain all its technical claims. Virsto has a different architecture, and can only be positioned in more complex IT environments where you already have different Array technologies and struggle with performance vs. cost.

VMware’s VSA is positioned for SMB customers with a limited number of Hosts, and is mainly intended to provide a Shared Storage environment without shared storage hardware. So: different stories, different ends.