Bye Bye VMFS
At its simplest, VVol removes the layer of abstraction that exists between a virtual machine and its data. As one of my colleagues has said, it’s akin to providing a raw-device-mapped LUN for every VMDK file without the operational implications of RDM.
It is important to understand that the layer of abstraction being removed is the VMFS filesystem, which has been the middleman between the virtual machine and its data. Removing it has many implications.
For starters, it means there is no local filesystem instantiation in vSphere. This has an impact on snapshots, for example, which now cannot be completed without hardware offload.
Why? If you think about it, there is nowhere “local” where vSphere can store the snapshot. There are also some neat benefits to snapshots from a placement, fault isolation and containment perspective. I’ll cover them in later posts, but the concept of a datastore filling up and stopping all I/O to all resident VMs will become a thing of the past.
As I mentioned in my last post, the traditional approach has had operational consequences, such as the way mapped blocks (used space on disk) are managed separately by the hypervisor and the storage array. This is why there have been operational problems with the traditional LUN presentation approach to virtual infrastructure.
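To make the mapped-blocks point concrete, here is a minimal sketch of how the two views of used space can drift apart. It is a toy model, not anything from vSphere or an array: the `ThinLun` and `Filesystem` classes and the file names are made up for illustration, and it assumes a thin-provisioned LUN where the array only learns about freed blocks when it is explicitly told to unmap them.

```python
class ThinLun:
    """Toy model of a thin-provisioned LUN as the storage array sees it."""

    def __init__(self):
        self.mapped = set()  # blocks the array believes are in use

    def write(self, block):
        self.mapped.add(block)

    def unmap(self, blocks):
        # Explicit space reclamation: the only way the array learns
        # that blocks are no longer needed.
        self.mapped -= set(blocks)


class Filesystem:
    """Toy model of a hypervisor filesystem layered on top of the LUN."""

    def __init__(self, lun):
        self.lun = lun
        self.files = {}  # file name -> set of blocks it occupies

    def create(self, name, blocks):
        self.files[name] = set(blocks)
        for block in self.files[name]:
            self.lun.write(block)

    def delete(self, name):
        # Frees the blocks in the filesystem's own metadata only.
        # Nothing is sent to the array here.
        return self.files.pop(name)

    def used(self):
        return sum(len(blocks) for blocks in self.files.values())


lun = ThinLun()
fs = Filesystem(lun)
fs.create("vm1.vmdk", range(0, 100))
fs.create("vm2.vmdk", range(100, 150))

freed = fs.delete("vm1.vmdk")
hypervisor_used = fs.used()        # what the hypervisor reports
array_mapped = len(lun.mapped)     # what the array reports

lun.unmap(freed)                   # reclamation brings the views back in line
array_after_reclaim = len(lun.mapped)
```

After the delete, the two layers disagree about how much space is in use until the reclamation step runs, which is the kind of separate bookkeeping described above.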
Consider the diagram:
This approach relies on either the predictive schema (discrete datastore for individual apps, VMs, or VMDKs) or the adaptive schema where many VMs share the same datastore and if a problem arises you move the individual VMDKs to another datastore. So most folks “suck it and see” in the adaptive schema.
That’s not to say you can’t empirically size datastores. You can, but most people don’t know how or don’t do it. And many vendor engineers don't know how either.
Some of the operational challenges today are:
- If you want to use the fastest application consistent backup method (hardware snapshot), the snapshot has to snap the entire datastore.
- If the entire contents are being backed up, capturing all the objects can take a long time, which lengthens the period of exposure.
- Creation of these snapshots involves coordinating the point in time a datastore and LUN are snapped to ensure consistency.
- This is not a simple engineering problem and can lead to problems when it stops working. It is also highly dependent on a proven interoperability configuration, which is tricky in itself.
- People feel that they have to adopt a one-size-fits-all approach
- Set all datastores the same size for simplicity
- Present datastores to all clusters (despite this not being good practice)
- The VMware consumer has no visibility regarding the capabilities of the datastore and is entirely reliant on the storage administrator to ensure the correct datastore is presented etc.
- The consumer also has no idea where the datastores are. Wouldn’t it be great if this metadata was exposed?
- Typically more datastores are pre-allocated than are required, normally to multiple clusters “just in case”. This is not a good idea and leads to too large a fault domain.
Now consider the removal of VMFS from the picture. Now we map every VM disk (including config files and swap files) to a separate object on the storage.
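The difference in granularity can be sketched in a few lines. This is only an illustration under stated assumptions: the VMDK names and sizes are invented, and the two functions are hypothetical helpers that model what a hardware snapshot has to capture in each case, not calls from any VMware or array API.

```python
# Hypothetical inventory of one shared datastore: VMDK name -> size in GB.
datastore = {
    "sql01.vmdk": 200,
    "web01.vmdk": 40,
    "web02.vmdk": 40,
    "file01.vmdk": 500,
}


def lun_snapshot_scope(vmdks, target):
    """LUN-backed datastore: the array can only snap the whole LUN,
    so every resident VMDK is captured along with the target."""
    if target not in vmdks:
        raise KeyError(target)
    return dict(vmdks)


def per_object_snapshot_scope(vmdks, target):
    """One storage object per VM disk: only the target is captured."""
    return {target: vmdks[target]}


lun_scope = lun_snapshot_scope(datastore, "sql01.vmdk")
object_scope = per_object_snapshot_scope(datastore, "sql01.vmdk")

lun_gb = sum(lun_scope.values())        # everything on the datastore
object_gb = sum(object_scope.values())  # just the disk we care about
```

With these made-up numbers, protecting one 200 GB disk in the LUN model drags the other three disks into the snapshot, while the per-object model touches only the disk in question.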
ASIDE: One of my NAS colleagues quite funnily refers to block storage finally being able to replicate the same simple features of NAS. It’s not quite that simple but he has a point. Later on I'll show how we still think we have the same structure (folders) on the storage due to the way the Web Client abstracts the Virtual Machine objects. This is quite interesting too!
Now we have an improved operational picture:
- Data services are provided at virtual machine disk level.
- No longer do we need to compromise on making everything the same for the sake of administrative overhead.
- So no more “Let’s make all datastores 1TB or 2TB ...”
- Abstraction of back-end storage, pools and containers from a virtual machine.
- Now we just place a disk somewhere like “SQL Server Production” or “Dublin Business Critical”. You don't need to know where your data resides. Why would you care? Once your requirements are met, let VM storage profiles help with data placement.
- Use VM storage profiles as the logical container in which virtual machines are stored.
- This is a paradigm shift. What happens now when you change the location of where a VM disk is stored? We will address that question later, but it’s an interesting one and you should consider it!
In the next post I will describe the VASA provider and Protocol Endpoints in more detail. Until then, think about some of these ideas!
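The placement idea in the list above can be sketched as a simple capability match. To be clear, this is not how VASA or vSphere implements it: the `Container` and `StorageProfile` types, the pool names and the capability strings are all assumptions made up for this example. The only thing taken from the post is the idea that a disk is placed by profile name and the requirements decide where it can live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """Hypothetical back-end storage container and what it advertises."""
    name: str
    capabilities: frozenset


@dataclass(frozen=True)
class StorageProfile:
    """Hypothetical VM storage profile: a named set of requirements."""
    name: str
    required: frozenset


def compatible_containers(profile, containers):
    """Return the names of containers that satisfy every requirement."""
    return [c.name for c in containers if profile.required <= c.capabilities]


# Invented containers and capability labels, for illustration only.
containers = [
    Container("gold-pool", frozenset({"flash", "replication", "snapshots"})),
    Container("silver-pool", frozenset({"replication", "snapshots"})),
    Container("bronze-pool", frozenset({"snapshots"})),
]

sql_prod = StorageProfile("SQL Server Production",
                          frozenset({"flash", "replication"}))
dublin_bc = StorageProfile("Dublin Business Critical",
                           frozenset({"replication", "snapshots"}))

sql_targets = compatible_containers(sql_prod, containers)
dublin_targets = compatible_containers(dublin_bc, containers)
```

The consumer only ever names the profile. Which pool ends up holding the disk falls out of the match, so the back-end layout can change without the VM owner needing to know.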