What is NetApp ASA?

ASA stands for All-flash SAN Array. ASA is based on the low-end & high-end AFF systems running ONTAP.

The ONTAP architecture in ASA systems remains the same, with no changes. The only difference is in how the storage is accessed over SAN protocols.

In (non-ASA) Unified ONTAP systems, SAN protocols like FC and iSCSI use ALUA, which stands for Asymmetric Logical Unit Access; this type of connection is called active/active but uses "active optimized" and "active non-optimized" paths. NVMe ANA works similarly to how ALUA works for SCSI-based protocols. With both ANA and ALUA, if one storage controller fails, the host waits for a timeout before it switches to the active non-optimized path, which works perfectly fine. See more in the sections "Share-nothing architecture" and "Network access" in the series of articles "How ONTAP Cluster works".
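
To make the path behavior concrete, here is a minimal, purely illustrative Python sketch (not NetApp code; the path names and states are invented) of how a host's multipathing layer picks a path under ALUA/ANA rules:

```python
# Conceptual sketch: how a host multipath layer treats ALUA/ANA path states.
# Active/optimized paths are preferred; only after the host's timeout does
# I/O fail over to an active/non-optimized path on the partner controller.
from dataclasses import dataclass

@dataclass
class Path:
    name: str
    state: str  # "active_optimized" | "active_non_optimized" | "down"

def pick_path(paths: list[Path]) -> Path:
    """Return the path a host would send I/O to under ALUA/ANA rules."""
    for preferred in ("active_optimized", "active_non_optimized"):
        candidates = [p for p in paths if p.state == preferred]
        if candidates:
            return candidates[0]
    raise RuntimeError("no usable paths: host I/O stalls until a path returns")

paths = [Path("ctrl-A:lif1", "active_optimized"),
         Path("ctrl-B:lif1", "active_non_optimized")]
print(pick_path(paths).name)   # ctrl-A:lif1 (optimized path on the owning node)

paths[0].state = "down"        # controller A fails; after the ALUA/ANA timeout
print(pick_path(paths).name)   # the host switches to ctrl-B:lif1 (non-optimized)
```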

But there are some customers who were:

  1. Used to the idea of symmetric active/active connectivity
  2. Looking for a product that will generate fewer notifications to the host in the event of a path loss

NetApp listened to its customers, evaluated both requests, and delivered ASA products that provide the solution they had been looking for.

Video with Skip Shapiro about ASA:

New NetApp platform & Licensing improvements in ONTAP 9.6 (Part 1)

A320

The all-flash A320, a new 2U platform, was introduced; here are a few important details about this new AFF system:

  • From the performance point of view, the most notable figure is ~100 microseconds of latency on an SQL SLOB workload. If true, that is a notable improvement: previously we’ve seen only sub-1-millisecond (1,000 microseconds) latency, so the new latency is several times (in the best-case scenario ~10 times) lower
    • About 20% better IOPS performance than the A300
  • NVDIMM instead of traditional NVRAM in high-end/mid-range platforms. This is the second NetApp AFF platform, after the A800, to adopt NVDIMM instead of PCIe-based NVRAM. Strictly speaking, NVDIMM has been around in entry-level FAS/AFF systems for an extended period of time, but only because of the lack of PCIe slots & space in those controllers
  • No disk drives in the controller chassis
  • No RoCE support for hosts. Yet
  • End-to-end NVMe
  • Rumors from Insight 2018 about new disk shelves were confirmed:
    • NS224 directly connected over RoCE
    • 2 disk shelves maximum
    • 1.9 TB, 3.6 TB, and 7.6 TB drives supported
    • With an upcoming ONTAP release, disk shelves connected to controllers over a switch will be supported, and thus more than just two disk shelves
  • Not very important to customers, but an interesting update from an engineering perspective: in this new platform, HA and cluster interconnect connectivity are now combined, unlike in any previous appliance.
  • 8x Onboard 100 GbE ports per controller:
    • 2 for cluster interconnect (and HA)
    • 2 for the first disk shelf and optionally another 2 for the second disk shelf
    • which leaves 2 or 4 100 GbE ports for host connections
  • 2 optional PCIe cards per controller with the following ports:
    • FC 16/32 Gb ports
    • RoCE capable 100/40 GbE
    • RoCE capable 25 GbE
    • Or 10GBASE-T ports

Entry Level Systems

The previously released A220 system is now available with 10GBASE-T ports, thanks to the increasing popularity of 10GBASE-T switches.

MCC IP for low-end platforms

MCC IP becomes available for low-end platforms, the A220 & FAS2750 (though not the FAS2720), in ONTAP 9.6, and it requires a 4-node configuration (as do all MCC IP configs). The new features were designed to reduce the cost of such small configurations.

  • All AFF systems with MCC IP support partitioning, including the A220
  • Entry-level systems do not require special iWARP cards/ports like the other storage systems
  • Mixing MCC IP & other traffic is allowed (with all the MCC IP configs?)
    • NetApp wants to ensure customers get a great experience with its solutions, so there are requirements your switch must meet to be qualified for such an MCC IP configuration and to maintain high performance.

Brief history of MCC IP:

  • In ONTAP 9.5, the mid-range FAS8200 & A300 platforms added support for MCC IP
  • In ONTAP 9.4, MCC IP became available on the high-end A800
  • MCC IP was initially introduced in ONTAP 9.3 for the high-end A700 & FAS9000 systems.

New Cluster Switches

Two new port-dense switches from Cisco and Broadcom were introduced, with 48x 10/25 GbE SFP ports and a few 40 GbE or 100 GbE QSFP ports. You can use the same switches for MCC IP. The Broadcom-based BES-53248 will replace the CN1610.

And there is the new Cisco Nexus 92300YC, which is 1.2U in height.

NVMe

New OSes are supported for NVMe with ONTAP 9.6: Oracle Linux, VMware 6.7, and Windows Server 2012/2016. Previously, ONTAP 9.5 supported SUSE Linux 15 and Red Hat Enterprise Linux 7.5/7.6; Red Hat still doesn’t have ANA support. There is a new FlexPod config with an A800 connected over FC-NVMe to SUSE Linux. Volume move is now available with NVMe namespaces.

NVMe protocol becomes free. Again

In ONTAP 9.6 the NVMe protocol becomes free. It was free when first introduced in 9.4 without ANA (the NVMe analog of SAN ALUA multipathing), and then it became a paid feature in 9.5.

SnapMirror Synchronous licensing adjusted

In 9.6 licensing is simplified: SM-S is included in the Premium Bundle. NetApp introduced SM-S in ONTAP 9.5 and previously licensed it per TB. If you are not going to use a secondary system as the source for another system, SM-S does not need a license on the secondary system.

New services

  • SupportEdge Prestige
  • Basic, Standard and Advanced Deployment options
  • Managed Upgrade Service


ONTAP improvements in version 9.6 (Part 2)

Starting with ONTAP 9.6, all releases are long-term support (LTS) releases. Network auto-discovery from a computer is available for cluster setup, so there is no need to connect to the console to set up an IP address. All bug fixes are available in P-releases (9.xPy), where “x” is the minor ONTAP version and “y” is the P-version with a batch of bug fixes. P-releases will come out every 4 weeks.

New OnCommand System Manager based on APIs

First, System Manager no longer carries the OnCommand last name; now it is ONTAP System Manager. ONTAP System Manager shows the failed disk position in a disk shelf and the network topology. Like some other all-flash vendors, the new dashboard shows storage efficiency as a single number, which includes clones and snapshots, but you can still find information separately for each efficiency mechanism.

Two system managers available simultaneously for ONTAP 9.6:

  • The old one
  • The new API-based one
    • Press the “Try the new experience” button in the “old” System Manager

NetApp will base System Manager and all new Ansible modules on REST APIs only, which means NetApp is taking this rather seriously. With ONTAP 9.6, NetApp brought proprietary ZAPI functionality to REST API access for cluster management (see more here & here). ONTAP System Manager shows the list of ONTAP REST APIs invoked for the operations it performs, which lets you understand how it works and use the APIs on a day-to-day basis. REST APIs are available through the System Manager web interface at https://ONTAP_ClusterIP_or_Name/docs/API; the page includes:

  • A “Try it out” feature
  • API token generation to authorize external use
  • And built-in documentation with examples.
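
For example, here is a minimal sketch of calling the cluster endpoint with Python’s requests library; the cluster address and credentials are placeholders, and certificate verification is disabled only because lab clusters typically have self-signed certificates:

```python
# Minimal sketch: query the ONTAP 9.6 REST API over HTTPS with basic auth.
import requests

CLUSTER = "https://ONTAP_ClusterIP_or_Name"   # same host that serves /docs/API
AUTH = ("admin", "password")                  # cluster management credentials

r = requests.get(f"{CLUSTER}/api/cluster", auth=AUTH, verify=False)
r.raise_for_status()
info = r.json()
print(info["name"], info["version"]["full"])  # cluster name and ONTAP version
```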

The list of cluster management areas available through REST APIs in ONTAP 9.6:

  • Cloud (object storage) targets
  • Cluster, nodes, jobs and cluster software
  • Physical and logical network
  • Storage virtual machines
  • SVM name services such as LDAP, NIS, and DNS
  • Resources of the storage area network (SAN)
  • Resources of Non-Volatile Memory Express.

APIs will help service providers and companies with many deployed ONTAP instances manage them in an automated fashion. System Manager now saves historical performance info, whereas before 9.6 you could see only data collected from the moment you opened the statistics window, and the statistics were lost once you closed it. See the ONTAP guide for developers.

Automation is the big thing now

All new Ansible modules will use only REST APIs. A Python SDK will be available soon, as will SDKs for some other languages.
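
As an illustration of the kind of call these modules make under the hood, here is a hedged sketch of creating a FlexVol through the same REST API with plain Python; the volume, SVM, and aggregate names are placeholders, and the payload follows the documented /api/storage/volumes schema:

```python
# Sketch: create a volume via POST /api/storage/volumes (what a REST-based
# Ansible volume module effectively does). Names and sizes are placeholders.
import requests

CLUSTER = "https://ONTAP_ClusterIP_or_Name"
AUTH = ("admin", "password")

payload = {
    "name": "vol_demo",
    "svm": {"name": "svm1"},
    "aggregates": [{"name": "aggr1"}],
    "size": 100 * 1024**3,          # 100 GiB, expressed in bytes
}
r = requests.post(f"{CLUSTER}/api/storage/volumes", json=payload,
                  auth=AUTH, verify=False)
r.raise_for_status()
print(r.json())                     # returns a job reference; poll /api/cluster/jobs
```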

OCUM now AUM

OnCommand Unified Manager is renamed to ActiveIQ Unified Manager. The renaming shows that Unified Manager is going to work more tightly with ActiveIQ in the NetApp cloud.

  • In this tandem, Unified Manager provides detailed, real-time analytics and simplifies key performance indicators and metrics so that IT generalists can understand what’s going on; it allows you to troubleshoot and to automate and customize monitoring and management
  • ActiveIQ, meanwhile, is a cloud-based intelligence engine that provides predictive analytics and actionable intelligence and gives recommendations to protect and optimize your NetApp environment.

Unified Manager 9.6 provides REST APIs and not only proactively identifies risks but, most importantly, now provides remediation recommendations. It also gives recommendations to optimize workload performance and storage resource utilization:

  • Pattern recognition eliminates manual efforts
  • QoS monitoring and management
  • Real-time events and maps of key components
  • Built-in analytics for storage performance optimizations

SnapMirror

SnapMirror Synchronous (SM-S) does not have automatic switchover yet, unlike MetroCluster (MCC), and this is the key difference that still keeps SM-S a DR solution rather than an HA one.

  • New configuration supported: SM-S and then cascade SnapMirror Async (SM-A)
  • Automatic TLS encryption over the wire between ONTAP 9.6 and higher systems
  • Workloads that have excessive file creation, directory creation, file permission changes, or directory permission changes (referred to as high-metadata workloads) are now suitable for SM-S
  • SM-S now supports additional protocols:
    • SMB v2 & SMB v3
    • NFS v4
  • SM-S now supports qtrees & FPolicy.

FlexGroup

Nearly all the important FlexGroup limitations compared to FlexVols are now removed:

  • SMB Continuous Availability (CA) support allows running MS SQL & Hyper-V on FlexGroup
  • Elastic sizing (constituent volume auto-size) & FlexGroup resize
    • If one constituent runs out of space, the system automatically takes space from other constituent volumes and provisions it to the one that needs it the most; see the sketch after this list. Previously this could result in an out-of-space error even while space was still available in other constituent volumes. It does mean you are probably running short on space, though, so it might be a good time to add some more 😉
  • FlexGroup on MCC (FC & IP)
  • FlexGroup rename & resize in GUI & CLI
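
Here is a toy model of the elastic sizing idea mentioned above; it is purely conceptual (not ONTAP internals): free space is borrowed from the constituents that can spare it and given to the one about to run out.

```python
# Conceptual illustration of FlexGroup elastic sizing: instead of failing a
# write when one constituent fills up, borrow free space from the members
# that have the most of it.
def rebalance(constituents: dict[str, int], needy: str, needed: int) -> None:
    """constituents maps member name -> free GiB; shift 'needed' GiB of free
    space toward the constituent that is about to run out."""
    donors = sorted((n for n in constituents if n != needy),
                    key=lambda n: constituents[n], reverse=True)
    for donor in donors:
        give = min(needed, constituents[donor] // 2)  # don't drain a donor dry
        constituents[donor] -= give
        constituents[needy] += give
        needed -= give
        if needed <= 0:
            return
    raise RuntimeError("the FlexGroup itself is out of space: time to grow it")

free = {"member_0001": 10, "member_0002": 400, "member_0003": 250}
rebalance(free, "member_0001", 120)
print(free)   # space was borrowed from member_0002, the largest donor
```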

FabricPool

Alibaba and Google Cloud object storage are now supported for FabricPool, and in the GUI you can now see the cloud latency of a volume.

Another piece of news that is exciting for me is the new “All” policy in FabricPool. It excites me because I was one of those who insisted many times that writing through directly to the cold tier is a must-have feature for secondary systems. The whole idea of joining SnapMirror & FabricPool on the secondary system was about space savings, so the secondary system can also be all-flash but with many times less space for the hot tier. We should use the secondary system in the role of DR, not as backup, because who wants to pay for a backup system as if it were flash, right? And if it is a DR system, it is assumed the secondary system might someday become primary; once you try to run production on the secondary, you most probably will not have enough space on that system for the hot tier, which means your DR no longer works. Now that we have this new “All” policy, the idea of joining FabricPool with SnapMirror while getting space savings and a fully functional DR is going to work.

This new “All” policy replaces the “backup” policy in ONTAP 9.6, and you can apply it on primary storage, while the backup policy was available only on a SnapMirror secondary storage system. With the All policy enabled, all data written to a FabricPool-enabled volume is written directly to object storage, while metadata remains on the performance tier of the storage system.
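
If, as the REST docs suggest, the volume resource exposes the tiering policy, switching a volume to the new policy could look like the sketch below; the host, credentials, and the volume name vol_dr are placeholders:

```python
# Sketch: switch a FabricPool-enabled volume to the "all" tiering policy over
# REST, assuming the documented tiering.policy field of /api/storage/volumes.
import requests

CLUSTER = "https://ONTAP_ClusterIP_or_Name"
AUTH = ("admin", "password")

# look up the volume's UUID by name
vols = requests.get(f"{CLUSTER}/api/storage/volumes",
                    params={"name": "vol_dr"}, auth=AUTH, verify=False).json()
uuid = vols["records"][0]["uuid"]

# apply the "all" policy: new data goes straight to the object tier
r = requests.patch(f"{CLUSTER}/api/storage/volumes/{uuid}",
                   json={"tiering": {"policy": "all"}}, auth=AUTH, verify=False)
r.raise_for_status()
```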

SVM-DR is now supported with FabricPool too.

No more fixed ratio of max object storage to hot tier in FabricPool

FabricPool is a technology for tiering cold data to object storage, either in the cloud or on-prem, while hot data remains on flash media. When I say hot “data,” I mean data and metadata, where metadata is ALWAYS HOT, i.e. it always stays on flash. Metadata is stored in the inode structure, which is the source of WAFL black magic. Since FabricPool was introduced in ONTAP and until 9.5, NetApp assumed that the hot tier (and in this context they were mostly thinking not about hot data itself but rather about metadata inodes) would always need at least 5% on-prem, which means a 1:20 ratio of hot tier to object storage. However, it turns out that is not always the case, and most customers do not need that much space for metadata, so NetApp rethought it, removed the hard-coded 1:20 ratio, and instead introduced a 98% aggregate consumption model, which gives more flexibility. For instance, if the storage needs only 2% for metadata, we can have a 1:50 ratio; this of course will be the case only in low-file-count environments & SAN. That means that with an 800 TiB aggregate, you can store about 39,200 TiB (~39.2 PB) in cold object storage.
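
Here is the back-of-the-envelope math behind that example, assuming (as above) that metadata takes 2% of the total dataset and must stay on flash:

```python
# Assumption from the example above: metadata is 2% of the total dataset and
# must stay on the flash tier; everything else can be tiered to object storage.
aggregate_tib = 800                  # on-prem flash (hot tier) capacity
metadata_fraction = 0.02             # low-file-count / SAN environments

total_dataset_tib = aggregate_tib / metadata_fraction   # 1:50 ratio -> 40,000 TiB
cold_tib = total_dataset_tib - aggregate_tib            # 39,200 TiB cold data
print(f"total {total_dataset_tib:,.0f} TiB, of which {cold_tib:,.0f} TiB "
      f"(~{cold_tib / 1000:.1f} PB) lives in cold object storage")
```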

Additional:

  • Aggregate-level encryption (NAE) helps cross-volume deduplication gain savings
  • Multi-tenant key management allows managing encryption keys within an SVM; only external key managers are supported, and previously this was available only at the cluster admin level. That is great news for service providers. Requires a key manager license on ONTAP
  • Premium XL licenses for ONTAP Select allow consuming more CPU & memory for ONTAP, which results in approximately 2x more performance
  • NetApp supports the 8000 series and 2500 series with ONTAP 9.6
  • Automatic Inactive Data Reporting for SSD aggregates
  • MetroCluster switchover and switchback operations from GUI
  • Trace File Access in the GUI allows tracing files on NAS accessed by users
  • Encrypted SnapMirror by default: Primary & Secondary 9.6 or newer
  • FlexCache volumes now managed through GUI: create, edit, view, and delete
  • DP_Optimized (DPO) license: increases the max number of FlexVols on a system
  • QoS minimum for ONTAP Select Premium (All-Flash)
  • QoS max available for namespaces
  • NVMe drives with encryption, which, unlike NSE drives, you can mix in a system
  • FlashCache with Cloud Volumes ONTAP (CVO)
  • Cryptographic Data Sanitization
  • Volume move now available with NVMe namespaces.

The SMB 3.0 CA witness protocol is implemented by using a node’s HA (SFO) partner LIF, which improves switchover time.

If two FabricPool aggregates share a single S3 bucket, volume migration between them will not rehydrate data and will move only the hot tier.

We expect 9.6RC1 around the second half of May 2019, with GA coming about six weeks later.


Disclaimer

All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. No one is sponsoring this article.