How Microsoft Supports Data Deduplication for VDI

I have been in some rather lengthy discussions lately regarding the data deduplication enhancements in Windows Server 2012 R2.  Given the nature of what I do, these discussions have focused on using the deduplication feature for VDI workloads.  The most common question has been what exactly is supported, and I hope to shed some light on that in this article.  If you are unfamiliar with the data deduplication feature you can read more about it here.

Before Windows Server 2012 R2 we did not have the ability to optimize live (open) VHD files.  The ability to do so is the main enhancement that allows deduplication to be extended to VDI workloads, and more specifically to the CSV volumes where the VHD files live.  There are some pretty specific requirements the architecture must meet to be supported.  The main requirement calls for the storage and compute nodes to be remotely connected as shown below.

(credit to Matthias Wollnik at Microsoft for the image)

More specifically, it seems that the compute node must be connected to the storage node via a Windows Server 2012 R2 file server and leverage SMB 3.0 before the deduplication of open files is supported.  There do not appear to be any technical limitations that will prevent you from doing this another way, but in order for Microsoft to guarantee the performance and officially support the configuration, you must design the architecture as I have mentioned above.  That definitely will not prevent us from testing a few other scenarios in our lab.
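
To make the support conditions above concrete, here is a minimal Python sketch that encodes them as a simple check.  It is only my reading of the requirement, not an official Microsoft tool or API: the VdiDedupLayout class, its field names, and the is_officially_supported function are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class VdiDedupLayout:
    """Hypothetical description of a VDI storage layout (illustrative only)."""
    storage_os: str          # OS on the file server that hosts the VHD files
    compute_is_remote: bool  # True when the Hyper-V hosts are separate from the storage node
    protocol: str            # protocol the compute node uses to reach the VHD files


def is_officially_supported(layout: VdiDedupLayout) -> bool:
    """Return True only if all conditions described in the article hold.

    Assumption: the three checks below are this article's summary of the
    requirement, not an authoritative support matrix.
    """
    return (
        layout.storage_os == "Windows Server 2012 R2"
        and layout.compute_is_remote
        and layout.protocol == "SMB 3.0"
    )


# Separate compute and storage nodes, connected over SMB 3.0.
remote_layout = VdiDedupLayout("Windows Server 2012 R2", True, "SMB 3.0")

# Compute and storage on the same box: may work in a lab, but not the supported design.
lab_layout = VdiDedupLayout("Windows Server 2012 R2", False, "local")

print(is_officially_supported(remote_layout))
print(is_officially_supported(lab_layout))
```

The second layout is the kind of thing we might still try in the lab; the check simply flags that it falls outside the design described above.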

Check out the following TechNet articles for more information about Windows Server 2012 R2 and data deduplication with VDI:
  1. Plan to Deploy Data Deduplication
  2. Extending Data Deduplication to new workloads in Windows Server 2012 R2
  3. Deploying Data Deduplication for VDI storage in Windows Server 2012 R2
