Wednesday, 15 August 2012

Deduplication


Deduplication Overview -- NetApp Storage Efficiency:



Deduplication is a technique that reduces space consumption by discarding duplicate blocks.

When SIS (Single Instance Storage) runs on a volume, it scans every block, assigns each one a digital signature, and then compares all of the signatures. Blocks with the same signature are assumed to contain the same data, so only one copy is kept and the remaining blocks are discarded.

The inodes are then adjusted so that all of the logical blocks point to the single retained block.
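Conceptually, the process looks something like the sketch below. This is purely illustrative (the real implementation keeps its fingerprints in metafiles on the volume rather than in an in-memory table, as the sis output later in this post hints), but it shows the core idea of keeping one physical copy per signature:

import hashlib

BLOCK_SIZE = 4096  # the natural WAFL block size

def deduplicate(path):
    physical = {}   # signature -> index of the one physical copy we keep
    block_map = []  # logical block number -> physical block number (the "inode pointers")
    store = []      # the unique blocks actually stored on disk
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            sig = hashlib.sha256(block).digest()  # the digital signature
            if sig not in physical:               # first occurrence: keep it
                physical[sig] = len(store)
                store.append(block)
            block_map.append(physical[sig])       # duplicates point at the same copy
    return block_map, store

Note that every logical block still has its own pointer in block_map; only the physical copies are collapsed. That distinction matters later when we look at what happens in cache.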


Deduplication – The NetApp Approach


The goal of the testing here is to compare storage performance of a data set before and after deduplication. Sometimes capacity is the only factor, but sometimes performance matters. The test is random 4KB reads against a 100GB file. The 100GB file represents significantly more data than the test system can fit into its 16GB read cache. I am using 4KB because that is the natural block size for NetApp.
To maximize the observability of the results in this deduplication test, the 100GB file is completely full of duplicate data. For those who are interested, the data was created by doing a dd from /dev/zero. It does not get any more redundant than that. I am not suggesting this is representative of a real world deduplication scenario. It is simply the easiest way to observe the effect deduplication has on other aspects of the system.
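For reference, the read workload can be approximated with a short script like this one. The mount point and file name are assumptions for illustration, not details from the original test; in the real test the client's cache was disabled, so every read went to the filer:

import os
import random

BLOCK_SIZE = 4096                        # match WAFL's 4KB block size
PATH = "/mnt/test_vol/testfile"          # hypothetical NFS mount of the 100GB file

def random_reads(path, count=1_000_000):
    # Issue random, aligned 4KB reads across the whole file.
    blocks = os.path.getsize(path) // BLOCK_SIZE
    with open(path, "rb") as f:
        for _ in range(count):
            f.seek(random.randrange(blocks) * BLOCK_SIZE)
            f.read(BLOCK_SIZE)

random_reads(PATH)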
This is the output from sysstat -x during the first test. The data is being transferred over NFS and the client system has caching disabled, so all reads are going to the storage device. (The command output below is truncated to the right, but the important data is all there.)

Random 4KB reads from a 100GB file – pre-deduplication:
 CPU   NFS  CIFS  HTTP   Total    Net kB/s   Disk kB/s     Tape kB/s Cache Cache  CP   CP Disk    FCP iSCSI   FCP  kB/s iSCSI  kB/s
                                  in   out   read  write  read write   age   hit time  ty util                 in   out    in   out
 19%  6572     0     0    6579  1423 27901  23104     11     0     0     7   16%   0%  - 100%      0     7     0     0     0     0
 19%  6542     0     0    6549  1367 27812  23265    726     0     0     7   17%   5%  T 100%      0     7     0     0     0     0
 19%  6550     0     0    6559  1305 27839  23146     11     0     0     7   15%   0%  - 100%      0     9     0     0     0     0
 19%  6569     0     0    6576  1362 27856  23247    442     0     0     7   16%   4%  T 100%      0     7     0     0     0     0
 19%  6484     0     0    6491  1357 27527  22870      6     0     0     7   16%   0%  - 100%      0     7     0     0     0     0
 19%  6500     0     0    6509  1300 27635  23102    442     0     0     7   17%   9%  T 100%      0     9     0     0     0     0

The system is delivering an average of 6536 NFS operations per second. The cache hit rate hovers around 16-17%. As you can see, the working set does not fit in primary cache. This makes sense: the 3170 has 16GB of primary cache and we are randomly reading from a 100GB file. Ideally, we would expect a 16% cache hit rate (16GB cache / 100GB working set), and we are very close. The disks are running at 100% utilization and are clearly the bottleneck in this scenario; the spindles are delivering as many operations as they are capable of. So what happens if we deduplicate this data?
First, we need to activate deduplication (a_sis in NetApp vocabulary) on the test volume and deduplicate the test data. (Before deduplication became the official buzzword, NetApp referred to their technology as Advanced Single Instance Storage.)
fas3170-a> sis on /vol/test_vol
SIS for "/vol/test_vol" is enabled.
Already existing data could be processed by running "sis start -s /vol/test_vol".
fas3170-a> sis start -s /vol/test_vol
The file system will be scanned to process existing data in /vol/test_vol.
This operation may initialize related existing metafiles.
Are you sure you want to proceed (y/n)? y
The SIS operation for "/vol/test_vol" is started.
fas3170-a> sis status
Path                           State      Status     Progress
/vol/test_vol                  Enabled    Initializing Initializing for 00:00:04
fas3170-a> df -s
Filesystem                used      saved       %saved
/vol/test_vol/         2277560  279778352          99%
fas3170-a>

There are a few other files on the test volume that contain random data, but the space used on the physical volume has been reduced by over 99%. This means our 100GB file now consumes less than 1GB on disk. So, let's do some reads from the same file and see what has changed.
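As a side note, the %saved column appears to be the saved space expressed as a fraction of what the volume would consume without deduplication; plugging in the df -s numbers above reproduces the reported figure:

used_kb, saved_kb = 2277560, 279778352           # from df -s above
pct_saved = saved_kb / (used_kb + saved_kb) * 100
print(f"{pct_saved:.1f}% saved")                 # ~99.2%, shown as 99%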
Random 4KB reads from a 100GB file – post-deduplication:
 CPU   NFS  CIFS  HTTP   Total    Net kB/s   Disk kB/s     Tape kB/s Cache Cache  CP   CP Disk    FCP iSCSI   FCP  kB/s iSCSI  kB/s
                                  in   out   read  write  read write   age   hit time  ty util                 in   out    in   out
 93% 96766     0     0   96773 17674 409570    466     11     0     0    35s  53%   0% -    6%      0     7     0     0     0     0
 93% 97949     0     0   97958 17821 413990    578    764     0     0    35s  53%   8% T    7%      0     9     0     0     0     0
 93% 99199     0     0   99206 18071 419544    280      6     0     0    34s  53%   0% -    4%      0     7     0     0     0     0
 93% 98587     0     0   98594 17941 416948    565    445     0     0    36s  53%   6% T    6%      0     7     0     0     0     0
 93% 98063     0     0   98072 17924 414712    398     11     0     0    35s  53%   0% -    5%      0     9     0     0     0     0
 93% 96568     0     0   96575 17590 408539    755    502     0     0    35s  53%   8% T    7%      0     7     0     0     0     0

There has been a noticeable increase in NFS operations. The system has gone from delivering 6536 NFS ops to delivering 96,850 NFS ops. That is nearly a fifteen-fold increase in delivered operations. The CPU utilization has gone up roughly 4.9x. The disk reads have dropped to almost 0 and the system is serving out over 400MB/s. This is a clear indication that the operations are being serviced from cache instead of from disk. It is also worth noting that the average latency, as measured from the host, has dropped by over 80%. The improvement in latency is not surprising given that the requests are no longer being serviced from disk.
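The arithmetic is easy to check against the sysstat output:

pre_ops, post_ops = 6536, 96850      # average NFS ops/s, before and after
print(post_ops / pre_ops)            # ~14.8, nearly fifteen-fold
print(93 / 19)                       # ~4.9x the CPU utilization
print(post_ops * 4 / 1024)           # ~378 MB/s of 4KB payload, consistent
                                     # with the ~400MB/s of net out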
The cache age has dropped down to 35 seconds. Cache age is the average age of the blocks that are being evicted from cache to make space for new blocks. The test had been running for over an hour when this data was captured, so this is not due to the load ramping. This suggests that even though we are accessing a small number of disk blocks, the system is evicting blocks from cache. I suspect this is because the system is not truly deduplicating cache. Instead, it appears that each logical file block is taking up space in cache even though they refer to the same physical disk block. One potential explanation for this is that NetApp is eliminating the disk read by reading the duplicate block from cache instead of disk. I am not sure how to validate this through the available system stats, but I believe it explains the behavior. It explains why the NFS ops have gone up, the disk ops have gone down, and the cache age has gone down to 35 seconds. While it would be preferable to store only a single copy of the logical block in cache, this is better than reading all of the blocks from disk.
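A back-of-the-envelope calculation supports this theory. If each logical block read occupies its own 4KB of cache, the 16GB cache should turn over on roughly the timescale sysstat is reporting:

cache_kb = 16 * 1024 * 1024          # 16GB of primary cache, in KB
ops, block_kb = 96850, 4             # observed NFS ops/s, 4KB per read
print(cache_kb / (ops * block_kb))   # ~43 seconds to fill the cache once,
                                     # the same order as the observed 35s age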
The cache hit percentage is a bit of a puzzle here. It is stable at 53% and I am not sure how to explain that. The system is delivering more than 53% of the read operations from cache. The very small number of disk reads shows that. Maybe someone from NetApp will chime in and give us some details on how that number is derived.
This testing was done on Data ONTAP 7.3.1 (or more specifically 7.3.1.1L1P1). I tried to replicate the results on versions of Data ONTAP prior to 7.3.1 without success. In older versions, the performance of the deduplicated volume is very similar to the original volume. It appears that reads for logically different blocks that point to the same physical block go to disk prior to 7.3.1.


