Product : Datrium, DVX [HCI]/, x86
Feature : Disk Failure Protection, Reads/Writes, Data Availability
Content Owner:  Herman Rutten
Data Nodes: Erasure Coding (N+2)
Compute Nodes: Peer Cache Data Access
Data Nodes: Datrium DVX has always-on erasure coding that protects against two concurrent disk failures in the storage pool. The probability of two drives failing at the same time is low. However, once one drive has failed, the probability of hitting a latent sector error (LSE, an uncorrectable read error) during the rebuild is comparatively high. That is the main reason to tolerate double-disk failures. The Datrium DVX system architecture is such that erasure coding does not come with performance tradeoffs.
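The reasoning about rebuild-time read errors can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the drive sizes and the error rate of one unrecoverable error per 10^15 bits are assumptions chosen for the example, not Datrium figures. It treats bit errors as independent, which is a simplification.

```python
import math


def p_read_error_during_rebuild(bytes_read: float, bit_error_rate: float) -> float:
    """Probability of at least one unrecoverable read error while reading
    `bytes_read` bytes, assuming independent errors at `bit_error_rate` per bit.

    Uses log1p/expm1 so the result stays accurate for very small rates.
    """
    if not 0.0 <= bit_error_rate < 1.0:
        raise ValueError("bit_error_rate must be in [0, 1)")
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way
    return -math.expm1(bits * math.log1p(-bit_error_rate))


# Hypothetical rebuild that has to read 10 surviving drives of 4 TB each.
surviving_bytes = 10 * 4e12
print(p_read_error_during_rebuild(surviving_bytes, 1e-15))
```

Under these assumed numbers the probability comes out at roughly 0.27, far higher than the chance of a second whole-drive failure in the same window, which is why a second level of redundancy is useful during rebuilds.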

Datrium DVX software on each Compute Node appends data into the storage pool in the form of stripes. Each stripe is erasure coded into chunks, which are then distributed to the drives according to a canonical set of stripe layouts. The canonical set of stripe layouts is carefully constructed to spread data redundancy, customer workload, drive rebuild load and space usage uniformly across all the drives. This form of declustering, or wide striping, spreads both redundancy data and spare capacity uniformly across all drives in all Data Nodes of the DVX. Rebuild performance, that is, the speed at which redundancy is restored after a drive failure, increases linearly as additional Data Nodes are added.
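The idea of a fixed set of stripe layouts can be sketched with a toy placement scheme. This is not Datrium's actual layout set, which is not described here; the function names and the simple rotation rule are assumptions for illustration. Rotation balances space usage only. Balancing rebuild load as well requires a more careful construction than this one.

```python
from collections import Counter


def rotated_layouts(n_drives: int, width: int) -> list[list[int]]:
    """Return n_drives layouts; layout `start` places the `width` chunks of a
    stripe on consecutive drives beginning at drive `start` (wrapping around)."""
    if width > n_drives:
        raise ValueError("a stripe cannot be wider than the number of drives")
    return [[(start + i) % n_drives for i in range(width)]
            for start in range(n_drives)]


def chunks_per_drive(layouts: list[list[int]]) -> Counter:
    """Count how many chunks land on each drive if every layout is used once."""
    return Counter(drive for layout in layouts for drive in layout)


# Hypothetical pool: 12 drives, stripes of 8 data + 2 parity chunks.
layouts = rotated_layouts(n_drives=12, width=10)
print(chunks_per_drive(layouts))
```

Each layout uses distinct drives, so no stripe loses more than one chunk when a single drive fails, and cycling through all layouts puts the same number of chunks on every drive.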

Erasure Coding (EC) is a general scheme for encoding data by partitioning it into fragments and augmenting them with parity information, so that the data can be recovered when fragments are lost. EC is far more capacity-efficient than mirroring all data, and the difference becomes more apparent as the storage node configuration scales out.
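The efficiency difference can be shown with simple arithmetic. The sketch below compares an N+2 erasure code with three-way mirroring, since both tolerate two failures. The stripe width of 8 data chunks and the 100 TB figure are assumptions for the example; only the "+2" comes from the description above.

```python
def usable_fraction_ec(data_chunks: int, parity_chunks: int = 2) -> float:
    """Fraction of raw capacity that holds user data under N+parity erasure coding."""
    return data_chunks / (data_chunks + parity_chunks)


def usable_fraction_mirror(copies: int) -> float:
    """Fraction of raw capacity that holds user data when every block is stored `copies` times."""
    return 1 / copies


def raw_needed(usable_tb: float, usable_fraction: float) -> float:
    """Raw capacity required to provide `usable_tb` of usable space."""
    return usable_tb / usable_fraction


# Both schemes below survive the loss of any two drives.
ec = usable_fraction_ec(data_chunks=8)        # hypothetical 8+2 stripe
mirror = usable_fraction_mirror(copies=3)     # three-way mirror
print(raw_needed(100, ec), raw_needed(100, mirror))
```

With these assumed values, 100 TB of usable space needs about 125 TB of raw capacity with 8+2 erasure coding and about 300 TB with three-way mirroring.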

Compute Nodes: In the Datrium architecture, all Compute Nodes are completely stateless. That means a full server failure will never result in data loss or require any sort of rebuild activity. For partial host failures, Datrium DVX 4.0 introduced Peer Cache Data Access. In the case of total flash failure on a host, that host can access data from any other host in the Datrium DVX Compute cluster that is configured to share its flash with other hosts.
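The read path implied by Peer Cache Data Access can be sketched as a simple fallback chain. This is a conceptual model, not Datrium code: the `Host` class, the `read_block` function and the final fallback to the durable pool on the Data Nodes are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    shares_flash: bool = True   # configured to serve its flash to peers
    flash_ok: bool = True       # False models a total flash failure
    cache: dict = field(default_factory=dict)


def read_block(host: Host, peers: list[Host], block_id: str, durable_pool: dict):
    """Return (data, source). Try local flash, then sharing peers, then the durable pool."""
    if host.flash_ok and block_id in host.cache:
        return host.cache[block_id], f"local:{host.name}"
    for peer in peers:
        # Only peers that opted in to sharing and have healthy flash can serve reads.
        if peer.shares_flash and peer.flash_ok and block_id in peer.cache:
            return peer.cache[block_id], f"peer:{peer.name}"
    # Compute Nodes are stateless, so the durable copy is always in the pool.
    return durable_pool[block_id], "data-node"


pool = {"blk1": b"payload", "blk2": b"other"}
failed = Host("a", flash_ok=False)
peers = [Host("b", shares_flash=False, cache={"blk1": b"payload"}),
         Host("c", cache={"blk1": b"payload"})]
print(read_block(failed, peers, "blk1", pool)[1])
```

In this example host "a" has lost its flash, host "b" holds the block but does not share, so the read is served by host "c". A block cached nowhere would come from the durable pool instead.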