Tributary’s View on De-Duplication
By Shawn Sabanayagam, CEO and Glenn Garrahan, Director HPE Business, Tributary Systems
Succinctly put, disk de-duplication is old technology. It was developed and brought to market in the early 2000s, when disk capacities were relatively low and HDD prices were relatively high, resulting in a cost per TB of stored data significantly greater than today's.
De-duplication, as implemented by the major manufacturers of de-duplication products, has had major problems in real-world usage. While it effectively eliminates multiple copies of files and disk blocks from being stored, it is only effective if the customer performs a “full backup” from a backup application such as Commvault, NetBackup, TSM / Spectrum Protect, Veeam etc. If a full backup is not performed nightly, the de-duplication ratio goes down significantly! Additionally, de-duplication manufacturers generally charge a premium (4 to 7 times regular disk storage prices for every TB of raw disk capacity) on the assumption that their product can achieve at least a 10:1 de-duplication ratio. On highly repetitive data such as databases, when repeated full backups are performed, manufacturers have sometimes claimed or quoted de-duplication ratios of 18:1 or higher! However, large amounts of certain customer data types, for example medical images in PACS (picture archiving and communication system) applications, or geological data and images at Oil & Gas and related service companies, are unique and will not compress or dedup. Customers with such data are paying a large premium for de-duplication disk hardware compared to regular, simple disk space.
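The pricing argument above comes down to simple arithmetic, which can be sketched as follows. This is only an illustration: the function name `cost_per_backup_tb` and the plain-disk price of 100 per TB are hypothetical, while the 4x to 7x premium and the 10:1 ratio are the figures quoted above.

```python
def cost_per_backup_tb(plain_price_per_tb: float, premium: float, dedup_ratio: float) -> float:
    """Effective cost of holding 1 TB of backup data on a dedup appliance.

    plain_price_per_tb: price of 1 TB of regular raw disk (hypothetical figure)
    premium: appliance price multiple per raw TB (4x to 7x, as quoted above)
    dedup_ratio: backup TB held per raw TB consumed (10 means 10:1)
    """
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return plain_price_per_tb * premium / dedup_ratio


PLAIN_PRICE = 100.0  # hypothetical price per raw TB of regular disk

if __name__ == "__main__":
    for premium in (4, 7):
        # 10:1 is the ratio vendors assume; 1:1 models unique data that will not dedup
        for ratio in (10, 1):
            cost = cost_per_backup_tb(PLAIN_PRICE, premium, ratio)
            print(f"premium {premium}x, ratio {ratio}:1 -> {cost:.0f} per backup TB "
                  f"(plain disk: {PLAIN_PRICE:.0f})")
```

Under these assumptions the appliance costs 40 to 70 per backup TB at 10:1, against 100 for plain disk, but 400 to 700 per backup TB when the data is unique and the ratio falls to 1:1. The break-even point is a de-duplication ratio equal to the price premium.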
In addition, de-duplication subsystems were designed to replace magnetic tape and other storage methodologies, not complement them. By extension, the de-duplication device is the final resting place for data until it is expired by the Backup Management Application. Data portability is virtually non-existent with de-duplication: it was not designed for tiering or for sending data to a lower cost archival medium like tape, object storage or a public cloud (object storage using the S3 protocol). The ability to move data out of the de-duplication storage subsystem was added much later, after the lack of this fundamental capability generated customer complaints. Even then, if data needs to be moved out of the de-duplication subsystem, the product has to fully rehydrate the data, which is a painfully slow process, and in most cases third party software must be employed to move and track the data as it gets written to tape. Certain manufacturers have created a process by which data can be transferred somewhat more efficiently if the customer agrees to total supplier lock-in, using only that particular vendor’s cloud product. Many of these cloud products have repeatedly proven to be immature and suboptimal in a marketplace with much better choices. And, of course, there have been some high profile failures with these proprietary cloud products.
Finally, all de-duplication disk solutions go through a process called “garbage cleaning” to redirect links to new locations as the product optimizes space on its dedup disk. This process can take anywhere from minutes to many hours, depending on the size of the de-duplication subsystem at the customer site and the size of the backups. During garbage cleaning, no backups or restores can be done and the system is effectively down. Many customers have complained about this publicly. The de-duplication methodology also degrades backup and restore performance even when the dedup disk space is only 55 percent full.
Enter Tributary Systems’ Storage Director:
Storage Director (SD), on the other hand, has none of the above-mentioned issues. It was designed and built from the start as a software-defined, ultra-high performance, policy-based and tiered backup, archive and cloud vaulting solution. It uses two methods to reduce the incoming backup data at ingestion. First, it works with Commvault, NetBackup, TSM/Spectrum Protect, Veeam, Data Protector, Networker and all other open Backup Management Applications to do “incrementals forever” or periodic incremental backups. Second, SD employs hardware-enabled compression and AES 256-bit encryption, together with a backup software based compression algorithm, to reduce and secure the backup data upon ingestion. The combination of these two methodologies achieves the same data reduction as de-duplication. If a customer then adds Hitachi Content Platform (HCP) and its Geo Dispersal, the combined SD+HCP solution can achieve data reduction superior to a de-duplication storage subsystem. Storage Director plus Hitachi Content Platform benefits include:
- 2-3 times faster backups per node than high-end, very expensive de-duplication solutions
- 4-5 times faster restores per node than these same de-duplication solutions, because no rehydration of data is needed
- Built-in policy-based tiering to HCP, public cloud, tape, NAS etc., as well as replication to SD nodes at a remote site if needed, all at no additional cost to the customer
- Uses industry-standard hardware and disk storage for cache, including Hitachi G350 and G600, Supermicro, HPE and IBM disk solutions
- No garbage cleaning and hence 24x7 availability for backups and restores
- SD comes with a price guarantee: it will cost less than de-duplication subsystems for any given customer backup requirement or data backup footprint. The guarantee includes free additional SD licenses if SD does not reduce data as described at purchase
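The two data-reduction methods described above (incrementals forever, then compression at ingestion) can be illustrated with a rough calculation. This is a sketch under stated assumptions, not a description of how Storage Director computes anything: the function names, the 2 percent daily change rate and the 2:1 compression ratio are all hypothetical, and real values vary widely by data type.

```python
def protected_tb_nightly_full(full_tb: float, nights: int) -> float:
    """Backup data sent when a full backup is taken every night."""
    return full_tb * nights


def ingested_tb_incremental_forever(full_tb: float, nights: int,
                                    daily_change_rate: float) -> float:
    """One initial full backup, then only the changed data on each later night."""
    return full_tb + full_tb * daily_change_rate * (nights - 1)


def stored_tb(ingested_tb: float, compression_ratio: float) -> float:
    """Capacity consumed after compression (2.0 means 2:1)."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return ingested_tb / compression_ratio


if __name__ == "__main__":
    # All inputs below are hypothetical, chosen only to show the arithmetic.
    full_tb, nights, change_rate, compression = 10.0, 30, 0.02, 2.0
    fulls = protected_tb_nightly_full(full_tb, nights)
    ingested = ingested_tb_incremental_forever(full_tb, nights, change_rate)
    stored = stored_tb(ingested, compression)
    print(f"nightly fulls send {fulls:.1f} TB over {nights} nights")
    print(f"incrementals forever ingests {ingested:.1f} TB and stores {stored:.1f} TB")
```

With these illustrative inputs, 30 nightly fulls amount to 300 TB sent to the backup target, while incrementals forever ingests about 15.8 TB and stores about 7.9 TB after compression. Data that changes heavily or does not compress would narrow that gap considerably.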
Storage Director is currently in production in datacenter environments at four of the 20 largest (Fortune 20) corporations in the US, and has been for 10 years or more. References are available.
As always, if you’d like additional information on Storage Director, visit www.tributary.com, or contact our Sales Director, Matt Allen, at 817-786-3066 (office) or 713-492-7434 (cell). Matt can also be reached at email@example.com
Storage Director allows NonStop professionals to “augment what they have and use it in creative new ways!”