Backup and Deduplication Concepts: Encryption and Physical Storage Density and Pricing

The next post in a series on customer-focused issues associated with deduplication and backup.


Encryption is the process of transforming data to make it unreadable to anyone except those who possess a key that will decrypt that data. In order to make the data unreadable, encryption algorithms attempt to eliminate any discernible patterns in the underlying data. Since compression and deduplication both tend to work on patterns of data, both are negatively impacted by encryption.
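This effect is easy to demonstrate. A minimal sketch, assuming that ciphertext is statistically indistinguishable from random bytes (so `os.urandom` can stand in for encrypted output), compares how well a compressor does on repetitive plaintext versus encrypted-looking data:

```python
import os
import zlib

# Repetitive plaintext, typical of backup streams with many near-identical records.
plaintext = b"backup record 0001\n" * 1000

# Stand-in for ciphertext: encryption is designed to remove discernible
# patterns, so its output looks like random bytes to a compressor.
ciphertext_like = os.urandom(len(plaintext))

compressed_plain = zlib.compress(plaintext)
compressed_cipher = zlib.compress(ciphertext_like)

print(len(plaintext), len(compressed_plain))         # large reduction
print(len(ciphertext_like), len(compressed_cipher))  # little or no reduction
```

The repetitive data shrinks dramatically, while the random-looking data does not compress at all (the output is typically slightly larger than the input due to format overhead). Deduplication, which likewise depends on recurring patterns, is defeated in the same way.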

Physical Storage Density and Pricing

The rapid increase in available physical storage density and the rapid decline in price per terabyte negatively impact deduplication. In this sense, increasing physical storage density and decreasing price per terabyte are analogous to encryption – both make conventional deduplication more difficult. The reason is that conventional deduplication requires many processors and processor cores and a great deal of physical memory in order to efficiently map and process each segment of information (typically called a "block"). As the size of the physical storage grows, so do the number of processors and the amount of memory required for deduplication.

The technique that most deduplication device vendors use is to keep the amount of disk space low on a per-device basis and to recommend multiple devices in order to scale their aggregate storage capacity higher. This technique works effectively for avoiding forklift upgrades; however, it is costly on a per-device basis because of the additional capital expenditure as well as the operational expenditure associated with managing multiple devices. In addition, if you go this route, you want to make sure you understand federated storage and local versus global deduplication (both will be discussed later in this series).
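Why local versus global deduplication matters in a multi-device setup can be sketched with hypothetical data: when the same blocks are spread across two devices, a single global index deduplicates them fully, but each device's local index keeps its own copy of every shared block.

```python
import hashlib

# Hypothetical workload: 200 blocks drawn from only 50 distinct values,
# split evenly across two deduplication devices.
blocks = [f"block-{i % 50}".encode() for i in range(200)]
device_a, device_b = blocks[:100], blocks[100:]

def unique_count(chunks):
    """Blocks actually stored after dedup: one copy per distinct fingerprint."""
    return len({hashlib.sha256(c).digest() for c in chunks})

# Local dedup: each device indexes only its own blocks.
local_total = unique_count(device_a) + unique_count(device_b)
# Global dedup: one index spans both devices.
global_total = unique_count(device_a + device_b)

print(local_total, global_total)  # → 100 50
```

In this contrived case local deduplication stores twice as many blocks as global deduplication, because neither device can see the duplicates held by the other.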