Link to Introduction: A first peek into the new IBM VDI Reference Architectures (IBM VDI RA Part 1)
It is well understood that a key inhibitor to VDI solutions (alongside general complexity, technical limitations and migration effort) is the upfront capital cost. As I’ve been leading this project architecturally, I want to elaborate a little on the importance of our storage design approach.
Just this month, two of my (larger financial) customers estimated that 40% and 50%, respectively, of their projected VDI project CAPEX would go to enterprise storage, primarily because of the specific IOPS requirements of VDI.
So let’s have a closer look at the storage approach for our RA … As you can tell from the above, our desired (but of course not only) use case is the pooled “non-persistent/stateless” desktop, which lets users connect to a new/different desktop image every time they log in while keeping aspects of the user experience persistent (profiles).
This allows the use of local storage instead of shared storage, as no user-associated data resides persistently in the image. In the (unlikely) case of a host failure, users can simply reconnect to a desktop hosted on another system without the need for e.g. VMware HA.
I will not discuss each of the architectural approaches in detail again here (e.g. persistent vs. non-persistent, positioning VDI vs. SBC etc.), and I ask for forgiveness for brushing over important alternatives discussed in previous articles on this blog.
Importance of in-depth performance analysis
As stated in the overview, the key design principle of our RA approach is to radically reduce the cost of VDI by utilizing local SSD storage instead of shared storage where possible.
Without going into great detail (see the final publication for details), I want to assure you that we have performed extensive analysis, particularly of the storage-related aspects, in order to validate this approach. I ensured we measured and documented all IOPS performance aspects and monitored latency on all storage components. The collected data not only validates the local SSD architecture but also gives us unmatched insight into the IOPS distribution and allows us to create sizing models for local and shared storage approaches, which we will feed into new sizing tools. The above example illustrates the detail of the storage-related data collected for every test (IOPS and latency measurements of a single test on each storage tier).
So, local instead of shared storage for VDI …
This approach is not new but unfortunately still not promoted widely enough.
Ok, allow me to be slightly controversial … review the majority of VDI reference architecture documents out there yourself and you’ll see that they are primarily created with/by major storage vendors … now, would it be in your interest to promote local storage if your goal is to sell enterprise SAN/NAS? I’ll let you be the judge …
And yes you could argue “what about you IBM – you are a storage vendor, no?” – let me say that common sense does sometimes prevail 😉
So why have we decided to make the ‘local storage’ architecture the core of our approach?
- To gain maximum return on investment, non-persistent/stateless virtual desktops should be the default approach in any VDI deployment (reduce storage requirements, minimise size and number of images, enable pooled images etc.). To be blunt – if one argues “I can’t do it with non-persistent, I need all to be dedicated desktops” then VDI is most likely the wrong approach anyway (e.g. high-end user requirements across the board) or the capability of “stateless” is misunderstood.
- Cost: Shifting IOPS data from shared storage to local storage allows you to significantly reduce cost – get a quote from any storage vendor for the same capacity/performance configured on their Enterprise SAN/NAS and compare it with the equivalent local storage cost and you will get a feel for the massive delta!
- Building blocks (servers) with local storage allow for simple linear capacity scaling – add another system and you get a linear capacity gain, with no complexity in estimating the impact on shared storage.
- So please approach VDI with non-persistent and treat persistent as “exception”! Local storage goes architecturally hand-in-hand with stateless desktops.
Of course the reality is that there will be “exceptions” in most deployments and we will absolutely cater for those, but let’s be clear: a deployment with 100% persistent desktops has little chance of (financial) success.
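To make the linear-scaling point concrete, here is a minimal sizing sketch. It is not our sizing tool: the function names, the per-host density and the idea of keeping one spare host (so users on a failed host have somewhere to reconnect to) are assumptions chosen purely for illustration, not measured RA results.

```python
import math


def hosts_needed(total_desktops: int, desktops_per_host: int, spare_hosts: int = 1) -> int:
    """Identical local-SSD building blocks needed for a pool of stateless desktops.

    desktops_per_host is a placeholder density, not a measured value.
    spare_hosts is an assumed allowance for host failures.
    """
    if total_desktops < 0 or desktops_per_host <= 0 or spare_hosts < 0:
        raise ValueError("counts must be non-negative and density must be positive")
    return math.ceil(total_desktops / desktops_per_host) + spare_hosts


def pool_capacity(hosts: int, desktops_per_host: int, spare_hosts: int = 1) -> int:
    """Desktops the pool can serve while still holding back the spare hosts."""
    return max(hosts - spare_hosts, 0) * desktops_per_host


if __name__ == "__main__":
    density = 120  # hypothetical desktops per host
    hosts = hosts_needed(1000, density)
    print(hosts, pool_capacity(hosts, density))
    # Each extra building block adds exactly one host's worth of desktops.
    print(pool_capacity(hosts + 1, density) - pool_capacity(hosts, density))
```

Because every block carries its own IOPS capacity on local SSD, the only inputs are the desktop count and the per-host density. A shared storage design would need an extra step to check whether the array can absorb the added load.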
So what about those “exceptions”?
We all know that in most deployments you will be asked to provide persistent desktops. I’m sure I made clear that you should validate any “demands” for persistent desktops – do NOT assume that the requestor has already done that! However, if persistent desktops are indeed required, our architecture will provide a hybrid of persistent and non-persistent desktops with the same building blocks (local SSD removed) simply through the addition of external storage, win/win.
One more comment – you will probably be familiar with interesting third-party ‘SAN caching/optimization’ appliances like Atlantis’ ILIO (with their latest diskless feature) that try to address the storage cost issue. We have tested Atlantis in the past and seen very efficient offload, so I am absolutely not discounting solutions like that. There is a place for them, specifically if primarily persistent desktops are required, and we have worked with e.g. Atlantis in the past to provide ILIO-based solutions.
So why have we not included SAN caching appliances (at least at this stage)?
- I’m a believer in simplicity – most VDI solutions today are clearly already too complex and non-integrated.
Introducing an additional layer (of third-party components) should only be done if the return absolutely justifies it.
In my experience, the additional (licensing) cost and the additional support layer make the simple local SSD storage approach the preferred model where appropriate.
SSD costs are decreasing rapidly while capacity and durability go up, so SSD is arguably becoming a primary storage technology.
- Most major VDI vendors are increasingly integrating caching algorithms; you will be familiar with e.g. XenServer’s IntelliCache, VMware’s announced Storage Accelerator (CBRC) and Verde’s Storage Optimiser.
No, they are functionally not identical to e.g. ILIO (too big a topic to go into detail) – they address the issue in varying ways and primarily only the “read” IOPS, which is great for ‘boot storms’ but less so for the majority of “working state” IOPS (which are writes). However, they are/will be vendor-integrated, provide at least a subset of the functionality natively as part of the product and are fully compatible with our local SSD approach, and I personally prefer to have one throat to choke in case of any issues.
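A rough way to see why read-only caching helps boot storms more than the working state is to model what still reaches the back-end storage. This is a simplified sketch, not how any of the products above work internally. The function name, the write fractions and the cache hit rate are assumed values for illustration only, not figures from our tests.

```python
def backend_iops(total_iops: float, write_fraction: float, read_hit_rate: float) -> float:
    """IOPS still hitting back-end storage behind a cache that only absorbs reads.

    Assumption: every write passes through untouched, and a fixed share of
    reads (read_hit_rate) is served from the cache.
    """
    for name, value in (("write_fraction", write_fraction), ("read_hit_rate", read_hit_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if total_iops < 0:
        raise ValueError("total_iops must be non-negative")
    writes = total_iops * write_fraction
    reads = total_iops - writes
    return writes + reads * (1.0 - read_hit_rate)


if __name__ == "__main__":
    # Hypothetical boot storm: mostly reads, so the cache removes most of the load.
    print(round(backend_iops(1000, write_fraction=0.1, read_hit_rate=0.9)))
    # Hypothetical working state: mostly writes, so most of the load remains.
    print(round(backend_iops(1000, write_fraction=0.8, read_hit_rate=0.9)))
```

With these illustrative inputs the read-heavy case leaves 100 writes plus 90 read misses on the back end, while the write-heavy case leaves 800 writes plus 20 read misses. The same cache gives a much smaller reduction once writes dominate.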
I suppose the summary of this is that I have yet to see a SAN or SAN optimization appliance based building block that will flat out beat “local SSD” on price and simplicity for non-persistent desktops …
Again, let me be clear, there is no “one-size-fits-all” approach and I am by no means implying that there won’t be cases where primarily persistent desktops, SAN+SAN optimization appliances or of course Terminal Services-like solutions are appropriate (or in TS’s case potentially even more appropriate) – I have made my view absolutely clear on this before.
“So what about shared storage then, are you telling me I can get away without it completely?”
Ehhm, I’d be a fool to claim that … so let me be clear. There are types of data even in a non-persistent desktop environment that you need to keep available from any host/desktop and that therefore need to be hosted on shared storage … primarily the bits that give the user the feeling of persistency, i.e. the user profile (desktop settings etc.) and any persistent user data (stored documents etc.).
This is nothing new and has been achieved through various methods like roaming profiles, folder redirection etc. for ages, and is increasingly enhanced through product features like VMware Personas, Citrix Personal vDisk and Microsoft’s UE-V.
Bottom line is that you will need some shared storage …
Our architectural approach on this is clear and should (hopefully) make you happy …
- As explained above – we absolutely minimize shared storage requirements by placing the heaviest load on local SSD and only use shared storage for persistent user data and profiles (these typically already sit on shared storage for your physical desktop environment – so most likely no additional investment at all).
- We understand that most of you already have a preferred storage platform – continue to use your own if you want to – our building block systems provide IP, FCoE and iSCSI based storage capabilities.
- In order to further minimize shared storage cost and allow maximum flexibility we suggest a file-based (not block-based) storage system (NFS or CIFS) – again, this will depend on your environment.
In the next post we’ll move on to share more of the preliminary results – continuing with “enabling 3D and Aero capabilities in View 5 – impact on user density and step-by-step instructions” – coming soon …