How memory works in ONTAP: NVLOG and Write-Through (Part 2)

NVRAM/NVMEM & Write-Through

It is important to note that NVRAM/NVMEM technology is widely used in many storage systems, but NetApp ONTAP uses it for NVLOGs (hardware-assisted journaling), while others use it as a block device for the write cache (at the disk-driver level, as a disk cache), and that simple fact makes a real difference in storage architecture. Because ONTAP builds its architecture around NVLOGs, the system does not have to switch into Write-Through mode when one controller in an HA pair dies.
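
To make that distinction concrete, here is a minimal, purely illustrative Python sketch (the class and method names are my own, not ONTAP internals or any vendor's API): a journal-style NVLOG keeps the original write requests untouched, while a block-cache approach stores modified blocks that the controller is still reworking.

```python
# Illustrative sketch only: names and structures are invented for this article,
# they are not ONTAP internals or any vendor's API.

class NvlogJournal:
    """Journal-style NVRAM usage: keep the client's write requests as-is."""
    def __init__(self):
        self.entries = []          # original, unmodified write requests

    def log_write(self, offset, payload):
        # The request is appended untouched; nothing here ever rewrites it,
        # so the journal can always be replayed after a crash.
        self.entries.append((offset, bytes(payload)))


class BlockWriteCache:
    """Block-device-style NVRAM usage: keep modified blocks in the cache."""
    def __init__(self):
        self.dirty_blocks = {}     # block number -> data being reworked

    def cache_write(self, block_no, data):
        # The cached copy is the *working* copy: RAID/optimization code may
        # update it in place, so a crash mid-update can leave it inconsistent.
        self.dirty_blocks[block_no] = bytearray(data)
```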

That simple statement is hard to explain in simple words, but let me try. Write-Through is a storage system mode in which the write buffer is not used and all writes go straight to the disks. In other words, it disables the write buffer, which is a bad idea for many reasons: all the optimizations, all the magic, all the intellectual work on your data happens in the write buffers, so disabling them always hurts.

For example, consider the problems you would experience with a storage system in Write-Through mode on HDDs. HDDs are significantly slower than memory, so in the write buffer you can take random operations, glue them together in memory, and later destage them sequentially to the HDDs as one big chunk of data in a single operation, which is much easier for HDDs to process. The memory cache is basically used to "trick" your hosts: the acknowledgment is sent before the data is actually placed on disk, and that is how performance improves. In the case of flash media, you can organize the data so it is written in a way that does not wear out the memory cells. Memory is also very often the place where checksums for RAID (or other types of data protection) are prepared. So, bottom line: Write-Through is terrible for storage system performance, and all storage vendors try to avoid that scenario in their systems. When might a storage system architecture need to switch to Write-Through? When it can no longer be certain that the write cache will protect your data. The simplest example is when the battery backing your write cache is dead.
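
As a rough illustration of why write buffering matters, here is a minimal Python sketch (invented names, not any vendor's code): a write-back buffer acknowledges writes immediately, coalesces random writes in memory, and destages them to the backend in one sequential pass, whereas in Write-Through mode every write must hit the backend before the acknowledgment.

```python
# Minimal write-back vs write-through illustration; names and the "backend"
# are invented for this article and do not represent any real storage stack.

class WriteBackBuffer:
    def __init__(self, backend):
        self.backend = backend     # object with a write(offset, data) method
        self.pending = {}          # offset -> data, kept in memory only

    def write(self, offset, data):
        self.pending[offset] = data
        return "ACK"               # host is acknowledged before data hits disk

    def destage(self):
        # Sort the accumulated random writes by offset and push them out in
        # one sequential pass, far cheaper for an HDD than scattered seeks.
        for offset in sorted(self.pending):
            self.backend.write(offset, self.pending[offset])
        self.pending.clear()


class WriteThrough:
    def __init__(self, backend):
        self.backend = backend

    def write(self, offset, data):
        self.backend.write(offset, data)   # must wait for the disk every time
        return "ACK"
```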

Let’s examine another, more complex scenario: what if you have an HA pair and the battery dies on only one controller? Well, since all A-brand storage systems do HA, your writes should be protected. But what happens if your HA pair loses one controller and the survivor still has a working battery? Many of you might think, following the logic described above, that your storage system will not switch to Write-Through, right? The answer is "it depends." In the ONTAP world, NVLOGs are used only for data protection purposes, in a dedicated NVRAM/NVMEM device, and the data is always kept there exactly as it arrived, in unchanged form, with no architectural ability to modify it. The only operations the architecture allows are writing data into an empty half of the NVLOG and, when needed, clearing all the NVLOGs from the other half. With this architecture there is no need to switch an ONTAP system to Write-Through even when only a single controller is running. In all other architectures, even though they also use NVRAM/NVMEM, all the data is stored in one place.
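
The two-half NVLOG layout the paragraph describes can be sketched like this (again, invented names; a simplification of the idea, not actual ONTAP code): incoming writes land in the active half, and when it is time to destage, the halves are swapped so new writes keep flowing into the empty half while the full one is flushed and only then cleared.

```python
# Simplified two-half journal, invented for this article to illustrate the
# idea described above; it is not ONTAP source code.

class TwoHalfNvlog:
    def __init__(self):
        self.halves = [[], []]     # two independent journal halves
        self.active = 0            # index of the half currently taking writes

    def log_write(self, request):
        # Requests are stored unchanged, so they can always be replayed.
        self.halves[self.active].append(request)

    def start_destage(self):
        # Switch new writes to the empty half; return the full half for flushing.
        full = self.active
        self.active = 1 - self.active
        return full

    def finish_destage(self, half_index):
        # Only after the data is safely on disk is the old half cleared.
        self.halves[half_index].clear()
```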

Both ONTAP and other vendors' systems use memory for data optimization, in other words, they change the data from its original state. And changing your data is a big threat to data consistency once only one controller has survived, even if the battery on that controller is functioning properly. That is why all the other storage systems have to switch to Write-Through: there is no way to guarantee your data will not be corrupted in the middle of optimization, especially after it has been (half-)processed by the RAID algorithm, if the surviving node then reboots unexpectedly. Therefore all the other platforms and systems, all the NetApp AFF/FAS competitors I know of, switch to Write-Through mode once only one node is left. There are obviously some tricks: some vendors let you disable Write-Through when you get into such a situation, but of course it is not recommended; they simply give you the ability to make a bad choice on your own, one that will lead to data corruption across the entire storage system the moment the surviving node reboots unexpectedly. Another example is HPE 3Par: in a 4-node configuration, if you lose only one controller the system continues to function normally, but once you lose 50% of your nodes it again switches to Write-Through, and the same happens in a 2-node configuration.

Thanks to the fact that ONTAP stores data in NVLOGs as it arrived, in unchanged form, it is possible to roll back an earlier unfinished transaction whose data had already been half-processed by RAID, restore the data from the NVLOGs back into the MBUF, and finish that transaction. Each transaction that writes new data from the MBUF to the disk drives is executed as part of a system snapshot called a CP (Consistency Point). Each transaction can easily be rolled back: after the single surviving controller boots, it restores the data from the NVLOGs into the MBUF, processes it with RAID in memory again, rolls back the last unfinished CP, and writes the data to disk. This allows ONTAP systems to always stay consistent (from the storage system's perspective) and never switch to Write-Through mode.
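
A very rough sketch of that recovery path, under the same caveats as the earlier examples (invented names and helper methods, not ONTAP internals): on boot, the survivor replays the unmodified NVLOG entries into an in-memory buffer, discards the half-finished CP, and runs a fresh CP from the replayed data.

```python
# Rough recovery sketch invented for this article; not ONTAP code. The `raid`
# object and its methods are hypothetical placeholders.

def recover_after_failover(nvlog_entries, raid):
    """Rebuild the in-memory buffer from NVLOG and redo the last CP."""
    mbuf = {}

    # 1. Replay the journal: entries are unchanged client writes, so replaying
    #    them reproduces the state that was in MBUF before the crash.
    for offset, payload in nvlog_entries:
        mbuf[offset] = payload

    # 2. Throw away the half-finished Consistency Point on disk...
    raid.rollback_unfinished_cp()

    # 3. ...then run the RAID/checksum processing again on the replayed data
    #    and write it out as a new, complete CP.
    raid.write_cp(mbuf)
```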

Continue to read

How ONTAP cluster works?

Zoning for ONTAP Cluster

Disclaimer

Please note that in this article I describe my own understanding of the internal organization of system memory in ONTAP systems. Therefore, this information might be outdated, or I simply might be wrong in some aspects and details. I will greatly appreciate any contribution that makes this article better; please leave your ideas and suggestions about this topic in the comments below.
