Data deduplication with adaptive erasure code redundancy (US20160344413A1)
Citation
Wideman, R., Arslan, Ş., Lee, J., & Goker, T. (November 24, 2016). (Patent). Data deduplication with adaptive erasure code redundancy. Pub. No: US20160344413A1. Quantum Corporation, San Jose, CA (USA). pp. 1-20
Abstract
Example apparatus and methods combine erasure coding with data deduplication to simultaneously reduce the overall redundancy in data while increasing the redundancy of unique data. In one embodiment, an efficient representation of a data set is produced by deduplication. The efficient representation reduces duplicate data in the data set. Redundancy is then added back into the data set using erasure coding. The redundancy that is added back in adds protection to the unique data associated with the efficient representation. How much redundancy is added back in and what type of redundancy is added back in may be controlled based on an attribute (e.g., value, reference count, symbol size, number of symbols) of the unique data. Decisions concerning how much and what type of redundancy to add back in may be adapted over time based, for example, on observations of the efficiency of the overall system.
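The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the fixed chunk size, the SHA-256 fingerprints, and the `parity_symbols` policy (parity growing with the logarithm of the reference count, up to a cap) are all assumptions made for illustration, and the erasure encoding itself is omitted. The sketch only shows deduplication producing reference counts, and those counts driving how much redundancy would be added back per unique chunk.

```python
import hashlib
from collections import Counter

CHUNK_SIZE = 4  # hypothetical fixed chunk size in bytes, tiny for illustration


def deduplicate(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and keep one copy of each unique chunk."""
    store = {}              # fingerprint -> unique chunk bytes
    ref_counts = Counter()  # fingerprint -> number of times the chunk is referenced
    recipe = []             # ordered fingerprints needed to rebuild the data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fp, chunk)
        ref_counts[fp] += 1
        recipe.append(fp)
    return store, ref_counts, recipe


def parity_symbols(ref_count: int, base: int = 2, cap: int = 8) -> int:
    """Hypothetical policy: more references means more parity symbols, up to a cap."""
    return min(cap, base + ref_count.bit_length() - 1)


def plan_redundancy(ref_counts):
    """Map each unique chunk to the number of parity symbols it would receive."""
    return {fp: parity_symbols(n) for fp, n in ref_counts.items()}


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the unique chunks and the recipe."""
    return b"".join(store[fp] for fp in recipe)


data = b"AAAABBBBAAAAAAAACCCC"
store, ref_counts, recipe = deduplicate(data)
plan = plan_redundancy(ref_counts)
```

In this example the chunk `AAAA` is referenced three times and so is assigned more parity than the singly referenced chunks, reflecting the abstract's point that losing a heavily shared unique chunk damages more of the data set. An adaptive system would adjust `base` and `cap` over time from observed efficiency; that feedback loop is not modelled here.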