Compression, deduplication and ReFS
Compared with NTFS, ReFS (Resilient File System) is clearly designed for scale and built to store large amounts of data reliably. "Large" here does not mean terabytes but petabytes and beyond. NTFS was not designed to scale to that level. Given that a CHKDSK run to identify and fix corruption on a very large volume formatted as NTFS could take days or weeks, ReFS seems the natural fit for these larger environments, where the integrity of the file system is continuously verified.
So what surprises me is that ReFS does not support compression or deduplication to reduce the storage footprint of common data. NTFS supports file compression, and with Windows Server 2012, NTFS volumes can also be configured for deduplication. I see a common use case for ReFS as a repository for virtual machine images or other large files that could share many common blocks and, especially in the case of virtual machines, large ranges of zero (null) blocks. Deduplication is perfect in this case, as all the zero blocks are effectively compacted into a single block.
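To make the zero-block argument concrete, here is a minimal sketch of fixed-size block deduplication. It is only an illustration of the general idea, not how NTFS deduplication in Windows Server 2012 is actually implemented; the 4 KB block size, the use of SHA-256, and the names `dedup_blocks` and `rebuild` are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def dedup_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy of each distinct block."""
    store = {}   # digest -> block contents (the single stored copy)
    recipe = []  # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)  # store the block only the first time it is seen
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


# A toy "virtual machine image": mostly zero blocks with two blocks of real data.
image = (
    b"\x00" * (BLOCK_SIZE * 100)
    + b"A" * BLOCK_SIZE
    + b"\x00" * (BLOCK_SIZE * 50)
    + b"B" * BLOCK_SIZE
)

store, recipe = dedup_blocks(image)
print(f"logical blocks: {len(recipe)}, stored blocks: {len(store)}")
```

In this toy image, all 150 zero blocks map to a single stored block, which is the effect described above.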
ReFS, like NTFS, supports a pluggable file system architecture and will allow third-party products to layer on top of it to offer functionality such as deduplication. It is also true that deduplication is resource intensive, but given that ReFS already computes checksums for blocks to ensure integrity and resilience, I was surprised to find this feature missing. It may well be in the offing sometime soon; let’s wait and watch.
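As a rough sketch of why existing checksums seem like a natural starting point, the snippet below uses a per-block checksum to find candidate duplicates and then confirms each match with a byte-for-byte comparison, since an integrity checksum alone does not prove that two blocks are identical. The use of `zlib.crc32` as a stand-in for whatever checksum the file system computes, and the function name `find_duplicates`, are assumptions for illustration only.

```python
import zlib


def find_duplicates(blocks):
    """Map the index of each duplicate block to the index of its first identical copy."""
    by_checksum = {}   # checksum -> indices of distinct blocks that share this checksum
    duplicate_of = {}  # index of a duplicate -> index of the earlier identical block
    for i, block in enumerate(blocks):
        checksum = zlib.crc32(block)  # stand-in for an already-computed integrity checksum
        candidates = by_checksum.setdefault(checksum, [])
        for j in candidates:
            # Checksums can collide, so confirm with a full comparison.
            if blocks[j] == block:
                duplicate_of[i] = j
                break
        else:
            candidates.append(i)  # no identical block seen yet; remember this one
    return duplicate_of


blocks = [b"\x00" * 4096, b"vm-data" * 10, b"\x00" * 4096, b"vm-data" * 10, b"other"]
dups = find_duplicates(blocks)
print(dups)
```

The checksum lookup keeps the expensive full comparison limited to blocks that are already likely to match, which is where reusing integrity checksums could reduce some of the cost of deduplication.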