One of the interesting things about being a vendor of backup appliances is that your focus has to be broader than software alone. One topic that has been discussed quite a bit lately is the hints that 3TB SAS and SATA drives are coming from Seagate this year. In and of itself, that would be interesting only because it suggests that disk drive capacity has resumed its upward trajectory after a bit of a pause between 1.5TB and 2TB. As a practical matter, I’m always interested in the projected price per gigabyte: the cross-over point at which the higher capacity drives become cheaper per gigabyte than the lower capacity drives is one basis for affordability when it comes to backup appliances.
Designers of dedicated deduplication devices count on two things that run counter to the announcement of these drives: limited overall disk capacity, because of the memory mapping requirements of many deduplication algorithms, and lower capacity drives spread across multiple spindles. Multiple spindles are an advantage because this type of layout increases the number of I/O operations the system can service during non-sequential disk activity, which is the type of activity you see in many deduplication systems. It will be interesting to see the impact of these larger drives on the margins of the deduplication device vendors. My prediction is that they’ll have to reduce their pricing, and that the overall impact on them will be strongly negative.
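To make the spindle argument concrete, here is a rough back-of-the-envelope sketch. The 24TB raw capacity and the figure of 80 random I/O operations per second per drive are illustrative assumptions only, not measurements of any real drive or appliance, and the function name is hypothetical. RAID overhead and caching are ignored.

```python
import math

def spindle_comparison(raw_capacity_tb, drive_tb, iops_per_drive):
    """Return (drive_count, aggregate_random_iops) for one drive size.

    Assumes random I/O capability scales linearly with the number of
    spindles, which is a simplification.
    """
    drives = math.ceil(raw_capacity_tb / drive_tb)
    return drives, drives * iops_per_drive

# Assumed per-drive random IOPS; a placeholder, not a vendor figure.
ASSUMED_IOPS_PER_DRIVE = 80

small_drives = spindle_comparison(24, 1.5, ASSUMED_IOPS_PER_DRIVE)
large_drives = spindle_comparison(24, 3.0, ASSUMED_IOPS_PER_DRIVE)

print("1.5TB drives:", small_drives)  # (16, 1280)
print("3TB drives:  ", large_drives)  # (8, 640)
```

Under these assumptions, the same raw capacity built from 3TB drives has half the spindles and so roughly half the random I/O capability, which is why larger drives are awkward for a design that depends on many spindles.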
Many people predicted margin pressure when software-only backup vendors came out with their own versions of deduplication. While there certainly appears to have been some degree of impact, I believe it has been muted by the fact that software-only deduplication typically doesn’t deliver very good data reduction ratios, given the hardware limitations of the typical backup server.
However, 3TB drives change the picture, particularly if they are competitive on a price per gigabyte basis. In effect, users who move from 1.5TB drives to 3TB drives see a 2X increase in retention (in the case of a backup appliance) with absolutely no impact on ingest rates or recovery times, and with no need for increased spend on processor, memory, or I/O subsystems.
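The retention arithmetic can be sketched in a few lines. The eight drive bays and the 0.5TB of backup data stored per day are made-up numbers for illustration, and the function name is hypothetical; the only point is that retention scales directly with drive capacity when everything else in the appliance stays the same.

```python
def retention_days(drive_bays, drive_tb, stored_tb_per_day):
    """Days of backups that fit on the appliance.

    Simplification: raw capacity is treated as usable capacity, and the
    amount of data stored per day is assumed to be constant.
    """
    return (drive_bays * drive_tb) / stored_tb_per_day

# Hypothetical appliance: same chassis, same daily backup volume.
before = retention_days(drive_bays=8, drive_tb=1.5, stored_tb_per_day=0.5)
after = retention_days(drive_bays=8, drive_tb=3.0, stored_tb_per_day=0.5)

print(before, after, after / before)  # 24.0 48.0 2.0
```

Nothing about the processor, memory, or I/O subsystem appears in the calculation, which is the sense in which the extra retention comes without additional spend on those components.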