VMware SDS (?) – Virsto VSA Architecture

Although the technical material available is still too limited for an in-depth dive, what is out there already gives an overall picture.

Virsto positions itself as a Software Defined Storage (SDS) product, using storage virtualization technology. Well, besides being 100% software, I see a huge gap between the Software-Defined Networking concept and the SDS side. I can imagine VMware’s marketing pushing them to step up the SDS message to fit the Software Defined Datacenter vision. However, Virsto failed to make me disagree with Wikipedia’s SDS definition.

Leaving the SDS story aside, there’s still a story to tell. Not surprisingly, their main selling point is no different from the usual storage vendors: performance. They built their technical marketing around solving what they call the “VM I/O blender” issue (a well-chosen term, I have to admit). The VM I/O blender effect derives from having several VMs concurrently competing for IOPS on the same array, which turns the combined workload into a large randomized pattern. (By the way: this randomized pattern is one of the reasons why you should always be wary of storage vendors claiming large cache-hit percentages, and always double-check the theoretical IO capacity on the disk side.)
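To make the blender effect a bit more tangible, here is a minimal sketch in Python. It is only an illustration under my own assumptions: four hypothetical VMs each read a contiguous run of blocks, and the array is assumed to receive their requests in strict round-robin order (real interleaving is messier). The function names are made up for this example and have nothing to do with Virsto's code.

```python
from itertools import zip_longest


def vm_stream(start_lba, length):
    """One VM issuing a purely sequential run of logical block addresses."""
    return list(range(start_lba, start_lba + length))


def blend(streams):
    """Interleave the per-VM streams round-robin (an assumed arrival order)."""
    merged = []
    for group in zip_longest(*streams):
        merged.extend(lba for lba in group if lba is not None)
    return merged


def sequential_fraction(lbas):
    """Fraction of requests that hit the block right after the previous one."""
    if len(lbas) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + 1)
    return hits / (len(lbas) - 1)


# Four hypothetical VMs, each reading 100 contiguous blocks from its own region.
streams = [vm_stream(vm * 10_000, 100) for vm in range(4)]
blended = blend(streams)

print(sequential_fraction(streams[0]))  # 1.0: each VM alone is fully sequential
print(sequential_fraction(blended))     # 0.0: the array sees no back-to-back blocks
```

Each VM on its own is perfectly sequential, yet the stream that reaches the array has no two adjacent blocks in a row, which is exactly the pattern that hurts spinning disks and read-ahead caches.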

[Figure: VM I/O blender (Virsto)]

How do you Architect Virsto’s Solution

Virsto uses a distributed, clustered Virtual Storage Appliance (VSA) with a parallel computing architecture (up to 32 nodes in the same VMware cluster, as far as I could check). You install a VSA on each host, which (from what I could understand) serves as the target for your physical storage arrays. The Virsto VSA then presents storage back to your VMs just like NFS datastores, which lets it control all VM IO while still supporting HA, FT, VM and Storage vMotion, and DRS. Here is an overview of the architecture.

[Figure: Virsto VMware architecture]

As usual, one of the VSAs serves as the master, and the VSAs on the other nodes (i.e. VMware hosts) act as slaves. Each VSA has its own log file (the vLog), which is the central piece of the whole architecture, as we will see next. They support heterogeneous block storage, so yes, you are able to virtualize arrays from different vendors with it (although in practice that might not be the best solution at all).

[Figure: Virsto architecture]

You can use arrays from different vendors and spread those volumes across up to four (4) storage tiers (such as SSD, 15k rpm, 10k rpm, and 7.2k rpm). After aggregation on the VSA, Virsto vDisks can be presented in several formats, namely: iSCSI LUN, vmdk, VHD, vVol, or KVM file. So yes, Virsto VSA also works on top of Hyper-V the same way. Here is an interesting Hyper-V demo.

So, to solve the VM I/O blender effect, every write IO from each VM is captured and logged into a local log file (the Virsto vLog). Virsto acknowledges the write back to the VM immediately after the commit to the log file, and then asynchronously writes the data back to your storage array in a sequential manner. Writing data sequentially has the advantage of leaving blocks organized in a contiguous fashion, which improves read performance. Their claim is 20-30% better read performance.
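The write path described above can be pictured with a toy model. This is a sketch of the general log-then-destage idea, not Virsto's implementation: the `WriteLog` class and its methods are hypothetical, the "array" is just a dictionary, and I am assuming that "sequential" means flushing in ascending block order.

```python
class WriteLog:
    """Toy model of a per-host write log in front of a slower backing array."""

    def __init__(self):
        self.log = []           # append-only (lba, data) entries, arrival order
        self.array = {}         # stands in for the backing storage array
        self.array_writes = []  # order in which blocks actually reach the array

    def write(self, lba, data):
        """Append to the log and acknowledge; the array is not touched yet."""
        self.log.append((lba, data))
        return "ack"

    def read(self, lba):
        """Serve the newest logged copy if there is one, else go to the array."""
        for logged_lba, data in reversed(self.log):
            if logged_lba == lba:
                return data
        return self.array.get(lba)

    def destage(self):
        """Flush the log to the array in ascending LBA order, then clear it."""
        latest = {}
        for lba, data in self.log:  # later entries overwrite earlier ones
            latest[lba] = data
        for lba in sorted(latest):
            self.array[lba] = latest[lba]
            self.array_writes.append(lba)
        self.log.clear()


vlog = WriteLog()
# Blended, random-looking arrival order; block 12 is written twice.
for i, lba in enumerate([907, 12, 455, 13, 906, 12]):
    vlog.write(lba, f"v{i}")

vlog.destage()
print(vlog.array_writes)  # [12, 13, 455, 906, 907]
print(vlog.array[12])     # v5 (the later write to block 12 wins)
```

The VM only ever waits for the append to the log, which is why the speed of the device holding the log matters so much, while the array receives one ordered pass instead of six scattered writes.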

So, as a performance sizing best practice, they recommend using SSD storage for the vLog. It is not mandatory, but it is a serious performance booster. Yes please.

The claim is that, thanks to the logging architecture, they are able to do both thin provisioning and snapshots without any performance degradation.

As a result, they compare Virsto vmdks to thick vmdks in terms of performance, and to linked clones in terms of space efficiency. If you check Virsto’s demo on their site, you’ll see that they claim to have better performance than thick eager-zeroed vmdks, even with thin-provisioned Virsto vmdk disks. Note that all Virsto vmdks are thin.

Finally, how do they guarantee HA of the vLog? Well, from what I could understand, one major difference from VMware’s own VSA is that Virsto’s VSA simply leverages the shared storage you already have on your SAN. So it does not create the shared storage environment; from what I understood, it sits on top of shared storage. When I say “it”, I mean only the vLog: I see no requirement for the VSA itself to sit on shared storage as well.

Data consistency is achieved by redoing (replaying) the log file of the VSA that was on the failing host. VMware’s VSA, on the contrary, allows you to aggregate the local, non-shared disks of each of your VMware hosts and present them back via NFS to your VMs or physical hosts. It does this while still providing HA (one of its main purposes) by copying data across clustered nodes in a “Network RAID”, providing continuous operations even on host failure.
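Going back to the Virsto side for a moment, the "redo the log" step can be pictured as follows. This is only a sketch of generic log replay under my own assumptions: the `replay` function and the sample data are hypothetical, and I am assuming the surviving node can read the failed host's log because it lives on shared storage.

```python
def replay(log_entries, array):
    """Re-apply logged writes in their original order; returns the count applied."""
    applied = 0
    for lba, data in log_entries:
        array[lba] = data  # plain overwrite, so replaying twice is harmless
        applied += 1
    return applied


# Hypothetical state when a host fails: three writes were acknowledged and
# logged, but only the first one had already reached the array.
shared_log = [(100, "a"), (101, "b"), (100, "c")]
array = {100: "a"}

replay(shared_log, array)
print(array)  # {100: 'c', 101: 'b'}
```

Because every acknowledged write is in the log, replaying it in order brings the array to the state the VMs were told they had, and running the replay a second time changes nothing.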

Some doubts that I was not able to clarify:

  • What are the minimum vHardware recommendations for each VSA?
  • What is the expected Virsto VSA’s performance hit on the Host?
  • What is the maximum recommended number of VSAs clustered together?

VMware’s VSA vs Virsto VSA

So, as a side note, I do not think it is fair to claim the near death of VMware’s own VSA, even if Virsto is indeed able to back up all its technical claims. Virsto has a different architecture, and can only be positioned in more complex IT environments where you already have different array technologies and struggle with performance vs cost.

VMware’s VSA is positioned for SMB customers with a limited number of hosts, and is mainly intended to provide a shared storage environment without shared storage. So different stories, different ends.
