Greetings friends, today I bring you a new post about Grafana. This one is special for me: I have been using NetApp for a long time, but I had never stopped to think about how to monitor this fantastic hardware in detail.
While testing the latest release, I realized that all the APIs the system uses for any action are exposed in a very simple way, so I decided it was time to give it its own entry in this series of yours, Looking for the Perfect Dashboard.
Dashboard for NetApp ONTAP
When we finish this entry, we will have a Dashboard similar to this one, which lets you visualize the following:
Dashboard – Summary.
This is the first NetApp dashboard of the series; I intend to create more covering other details, such as SVMs. It contains:
- ONTAP Cluster Overview – Cluster Name, Version and Management IP.
- ONTAP Cluster metrics – Three graphs that are similar to the ones ONTAP displays, with Latency, IOPS and Throughput.
- ONTAP Cluster Aggregate Storage – Table with aggregate space detail.
- ONTAP Cluster SVM – A very complete table showing all our Storage VMs.
- ONTAP Volumes – A very complete table showing all our Volumes.
- ONTAP Cluster LUN – A very complete table showing all our LUNs.
- ONTAP Shares – A very complete table showing all our Shares.
Topology with all the logical components
This entry is similar to the previous ones: we will use a shell script that collects NetApp ONTAP metrics through the RESTful API, combined with InfluxDB. The design would look something like this:
As we can see, the shell script downloads the metrics from NetApp ONTAP using the RESTful API and sends all the data to InfluxDB, from where we can view them comfortably with Grafana.
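To make the flow a bit more concrete, here is a minimal sketch of the three steps. It is not the content of netapp_ontap.sh: the function names (fetch_cluster, escape_tag, build_line, write_line) are made up for illustration, and it assumes the ONTAP /api/cluster endpoint and the InfluxDB 1.x /write endpoint. Parsing the JSON returned by ONTAP is left out, and only the formatting step runs when you execute it as is:

```shell
#!/usr/bin/env bash
# Sketch of the flow: ONTAP REST API -> line protocol -> InfluxDB 1.x.
# Function names are hypothetical; this is not the content of netapp_ontap.sh.

netappRestServer="YOURONTAPSERVER"
netappAuth=$(printf '%s' "YOURONTAPUSER:YOURONTAPPASSWORD" | base64)
netappInfluxDBURL="http://YOURINFLUXSERVERIP"
netappInfluxDBPort="8086"
netappInfluxDB="telegraf"

# Step 1: download the cluster details as JSON from ONTAP.
# -k skips TLS verification, acceptable only for self-signed lab certificates.
fetch_cluster() {
  curl -s -k -H "Authorization: Basic $netappAuth" \
    "https://$netappRestServer/api/cluster"
}

# Spaces, commas and equals signs in tag values need a leading backslash.
escape_tag() {
  printf '%s' "$1" | sed 's/[ ,=]/\\&/g'
}

# Step 2: turn the values into one line of InfluxDB line protocol.
# Arguments: cluster name, version string, major version, minor version.
build_line() {
  printf 'netapp_cluster_overview,clustername=%s,clusterversion=%s versionmajor=%s,versionminor=%s\n' \
    "$(escape_tag "$1")" "$(escape_tag "$2")" "$3" "$4"
}

# Step 3: send the line to the InfluxDB 1.x write endpoint.
write_line() {
  curl -s -i -XPOST \
    "$netappInfluxDBURL:$netappInfluxDBPort/write?db=$netappInfluxDB" \
    --data-binary "$1"
}

# Dry run with made-up values: only step 2 is executed here.
build_line "cluster 1" "9.8" 9 8
```

A successful write in step 3 answers with HTTP 204 No Content, which is the status to look for when the real script runs.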
Download and configure the netapp_ontap.sh script
We have almost everything ready. The last step is the script that makes all this work; we will download the latest version from the GitHub repository:
This shell script can be downloaded and run from the Telegraf server, the InfluxDB server, or any other Linux machine. We will have to edit the configuration parameters:
netappInfluxDBURL="http://YOURINFLUXSERVERIP" #Your InfluxDB Server, http://FQDN or https://FQDN if using SSL
netappInfluxDBPort="8086" #Default Port
netappInfluxDB="telegraf" #Default Database
netappInfluxDBUser="USER" #User for Database
netappInfluxDBPassword='PASSWORD' #Password for Database
#Endpoint URL for login action
netappUsername="YOURONTAPUSER" #Your username with privileges to log in to ONTAP
netappPassword='YOURONTAPPASSWORD'
netappAuth=$(echo -ne "$netappUsername:$netappPassword" | base64); #Base64 of user:password for HTTP Basic authentication
netappRestServer="YOURONTAPSERVER"
netappMetrics="20" #Samples come in intervals of 15 seconds, so 20 equals the metrics of the last 5 minutes. If you run the script every 5 minutes, leave it like this; if not, change it accordingly.
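Two of these parameters hide a little arithmetic and encoding, so here is a small sketch to play with. The helper names samples_for_interval and basic_auth are mine, not part of the script, and the admin/secret credentials are obviously made up:

```shell
#!/usr/bin/env bash
# Number of 15-second samples needed to cover a cron interval given in minutes.
# This is the value you would put in netappMetrics.
samples_for_interval() {
  echo $(( $1 * 60 / 15 ))
}

# Base64 value for an HTTP Basic Authorization header, built the same way
# as netappAuth (user:password, no trailing newline).
basic_auth() {
  printf '%s' "$1:$2" | base64
}

samples_for_interval 5          # prints 20
samples_for_interval 10         # prints 40
basic_auth "admin" "secret"     # prints YWRtaW46c2VjcmV0
```

So if you decide to run the script every 10 minutes instead of every 5, netappMetrics would become 40.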
Once the changes are done, make the script executable with chmod:
chmod +x netapp_ontap.sh
We run it, and the output of the command should look something like the following, with no errors:
Writing netapp_SVM_overview to InfluxDB
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: bf99d74a-95a5-11eb-a1c2-0050569017a8
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.4
X-Request-Id: bf99d74a-95a5-11eb-a1c2-0050569017a8
Date: Mon, 05 Apr 2021 00:27:48 GMT
Writing netapp_LUN_overview to InfluxDB
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: bfe3d80e-95a5-11eb-a1c3-0050569017a8
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.4
X-Request-Id: bfe3d80e-95a5-11eb-a1c3-0050569017a8
Date: Mon, 05 Apr 2021 00:27:48 GMT
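If you would rather not read the whole output by eye, a small helper can count the writes that did not come back as 204. This is only a sketch: count_failed_writes is a name I made up, and the sample input below is invented to show one good and one bad write:

```shell
#!/usr/bin/env bash
# Reads script output on stdin and prints how many HTTP status lines
# are something other than 204 (the status of a successful write).
count_failed_writes() {
  awk '/^HTTP\// && $2 != "204" { failed++ } END { print failed + 0 }'
}

# Made-up sample: one successful write and one rejected write.
printf '%s\n' \
  'Writing netapp_cluster_overview to InfluxDB' \
  'HTTP/1.1 204 No Content' \
  'Writing netapp_cluster_metrics to InfluxDB' \
  'HTTP/1.1 400 Bad Request' | count_failed_writes    # prints 1
```

You could pipe the script straight into it, for example ./netapp_ontap.sh | count_failed_writes, and expect 0 when everything is fine.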
If so, add the script to your crontab, for example every 5 minutes. I don't think our Dashboards need to be updated more often than that, but feel free to lower the interval if yours do:
*/5 * * * * /home/oper/netapp_ontap.sh >> /var/log/netapp.log 2>&1
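A crontab entry has five time fields (minute, hour, day of month, month, day of week) followed by the command. If you prefer to add the entry from the command line instead of with crontab -e, something like the following sketch could work. The add_cron_entry helper is hypothetical, and the preview runs against a made-up crontab so nothing is installed:

```shell
#!/usr/bin/env bash
# Five time fields, then the command; the path is the one used in this post.
entry='*/5 * * * * /home/oper/netapp_ontap.sh >> /var/log/netapp.log 2>&1'

# Reads a crontab on stdin and prints it with exactly one netapp_ontap.sh
# entry: older lines for the script are dropped, the new one is appended.
add_cron_entry() {
  # grep exits non-zero when no lines are left, which is fine here.
  grep -vF 'netapp_ontap.sh' || true
  printf '%s\n' "$1"
}

# Preview against a made-up crontab; nothing is installed.
printf '%s\n' \
  '0 1 * * * /bin/true' \
  '*/10 * * * * /home/oper/netapp_ontap.sh' | add_cron_entry "$entry"

# To install it for real (assumes your crontab accepts "-" to read stdin):
# crontab -l 2>/dev/null | add_cron_entry "$entry" | crontab -
```

Running the install line twice leaves a single entry, so it is safe to repeat after changing the interval.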
We can check that the data is arriving using Grafana Explore:
We are ready to go to the next step.
Grafana Dashboards
I created the Dashboard from scratch, selecting the best queries against the database, settling on the colors, and thinking about which graphs to use and how to display them. Everything is automated to fit your environment without any problems and without having to edit anything manually. The Dashboard can be found here; once imported, you can use the top drop-down menus to select between Cluster, SVM, etc.:
Importing the Grafana Dashboard the easy way
So you don’t have to waste hours configuring a new Dashboard and debugging queries, I have already created a Dashboard with everything you need to monitor your environment in a very simple way; it will look like the image I showed you above. Select the name you want and enter the ID 14179, which is the unique ID of the Dashboard, or the URL:
With the menus above, we can move between Cluster, SVM, etc.:
Do you want something more extensive, with more Dashboards, etc?
Since I posted the image on Twitter, even at this early stage, some people have told me to take a look at https://nabox.org/, a very interesting open-source project that includes a lot of dashboards, an appliance, etc.
Positive points:
- It has a huge number of dashboards, certainly more polished than this one
- It comes as an appliance and seems simple to get working
Negative points:
- It comes as an appliance, which means deploying something in addition to the Grafana system we already have, and it makes it difficult to upgrade internal packages without knowing whether we are going to break something.
- In my series, we use InfluxDB, Telegraf, and Grafana, but nabox uses Graphite, which is one more technology to learn and maintain.
- I think it also requires us to install NetApp Harvest and the NMSDK in our environment.
I think it is good to have alternatives, and mine is a simple bash script that calls the API of the NetApp nodes directly, without installing anything else. nabox seems to be much more complex: maybe fine if we have nothing installed yet, but complicated to tie together if we already have our own system.
Please leave your comments here, or on GitHub, thanks a lot for reading!
I hope you like it, and I would like to leave you the complete series here, so you can start playing with the plugins that I have been telling you about all these years:
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part I (Installing InfluxDB, Telegraf, and Grafana on Ubuntu 20.04 LTS)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part II (Installing the Telegraf Agent on Remote Linux Nodes)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part III Integration with PRTG
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part IV (Installing the Telegraf Agent on Remote Windows Nodes)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part V (Enabling Specific Inputs: Network, MySQL/MariaDB, Nginx)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part VI (Monitoring Veeam)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part VII (Monitoring vSphere)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part VIII (Monitoring Veeam with Enterprise Manager)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part IX (Monitoring Zimbra Collaboration)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part X (Grafana Plugins)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XI – (Monitoring URLs and IPs with Telegraf and Ping)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XII (Native Telegraf Plugin for vSphere)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XIII (Veeam Backup for Microsoft Office 365 v4)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XIV – Veeam Availability Console
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XV (IPMI Monitoring of our ESXi Hosts)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XVI (Performance and Advanced Security of Veeam Backup for Microsoft Office 365)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XVII (Showing Dashboards on Two Monitors Using Raspberry Pi 4)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XVIII – Monitoring Temperature and Status of Raspberry Pi 4
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XIX (Monitoring Veeam with Enterprise Manager) Shell Script
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXIV (Monitoring Veeam Backup for Microsoft Azure)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXV (Monitoring Power Consumption)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXVI (Monitoring Veeam Backup for Nutanix)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXVII (Monitoring ReFS and XFS (block-cloning and reflink))
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXVIII (Monitoring HPE StoreOnce)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXIX (Monitoring Pi-hole)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXX (Monitoring Veeam Backup for AWS)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXI (Monitoring Unifi Protect)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXII (Monitoring Veeam ONE – experimental)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXIII (Monitoring NetApp ONTAP)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXIV (Monitoring Runecast)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXV (GPU Monitoring)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXVI (Monitoring Goldshell Miners – JSONv2)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXVII (Monitoring Veeam Backup for Google Cloud Platform)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XXXVIII (Monitoring Temperature and Humidity with Xiaomi Mijia)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XL (Veeam Backup for Microsoft 365 – Restore Audit)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XLI (Veeam Backup for Salesforce)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XLII (Veeam ONE v12 Audit Events)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XLIII (Monitoring QNAP using SNMP v3)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XLIV (Monitoring Veeam Backup & Replication API)
- Looking for the Perfect Dashboard: InfluxDB, Telegraf, and Grafana – Part XLV (Monitoring Synology using SNMP v3)
Dsru Bin says
When I run the script, it gives me some errors:
[root@dashboard ~]# /etc/telegraf/telegraf.d/netapp_ontap.sh
Writing netapp_cluster_overview to InfluxDB
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: 2f12f198-11a2-11ec-922b-0050568ba5de
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.9
X-Request-Id: 2f12f198-11a2-11ec-922b-0050568ba5de
Date: Thu, 09 Sep 2021 19:14:41 GMT
jq: error (at <stdin>:7): Cannot iterate over null (null)
jq: error (at <stdin>:7): Cannot iterate over null (null)
jq: error (at <stdin>:7): Cannot iterate over null (null)
jq: error (at <stdin>:7): Cannot iterate over null (null)
Writing netapp_cluster_metrics to InfluxDB
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: 3028870f-11a2-11ec-9231-0050568ba5de
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.9
X-Request-Id: 3028870f-11a2-11ec-9231-0050568ba5de
Date: Thu, 09 Sep 2021 19:14:43 GMT
This is for ONTAP 9.6P7.
The Dashboard ends up with some data, but no data for Capacity, Volumes, SVMs, LUNs, or Shares. I started out by using a user that only had http/readonly rights to ONTAP, but I get the same results when I use a user that has admin rights to all NetApp connection methods (console, http, ontapi, ssh, service-processor).
jorgeuk says
Hello,
I will try with 9.6P7 and let you know.
thank you
Jhony Hidayat Nasution says
Is there any update? I got a problem when running the script, as follows:
[root@zabbixsvr telegraf.d]# /etc/telegraf/telegraf.d/netapp_ontap.sh
Writing netapp_cluster_overview to InfluxDB
HTTP/1.1 400 Bad Request
Content-Type: application/json
Request-Id: f0f14e45-9f79-11ec-a0a9-6608728fa5b2
X-Influxdb-Build: OSS
X-Influxdb-Error: unable to parse 'netapp_cluster_overview,clustername=null,uuid=null,clusterversion=null,managementnetwork=null versiongeneration=null,versionmajor=null,versionminor=null': invalid number
X-Influxdb-Version: 1.7.1
X-Request-Id: f0f14e45-9f79-11ec-a0a9-6608728fa5b2
Date: Wed, 09 Mar 2022 07:24:22 GMT
Content-Length: 199
{"error":"unable to parse 'netapp_cluster_overview,clustername=null,uuid=null,clusterversion=null,managementnetwork=null versiongeneration=null,versionmajor=null,versionminor=null': invalid number"}
jq: error (at <stdin>:1): Cannot iterate over null (null)
jq: error (at <stdin>:1): Cannot iterate over null (null)
jq: error (at <stdin>:1): Cannot iterate over null (null)
jq: error (at <stdin>:1): Cannot iterate over null (null)
jq: error (at <stdin>:1): Cannot iterate over null (null)
jq: error (at <stdin>:1): Cannot iterate over null (null)
[root@zabbixsvr telegraf.d]#
Rodrigo R says
Doesn’t work on influx2
Is there any configuration to make it work on influx2?
jorgeuk says
Hello,
If you are interested, I can spend some time on it in the lab, but basically we will need to adjust the script to write to a bucket, and then the dashboard to read from that bucket, etc.
Salehuddin says
Hello,
I’m having issues with the script as below.. is there any solution?
Writing netapp_cluster_overview to InfluxDB
HTTP/1.1 401 Unauthorized
Content-Type: application/json; charset=utf-8
X-Platform-Error-Code: unauthorized
Date: Sun, 29 Oct 2023 18:25:06 GMT
Content-Length: 48
{"code":"unauthorized","message":"Unauthorized"}
Running on influx 2.0.8
Saleh says
Hello Jorge,
I have a problem as below and I’m running on influxdb v2.0. is there any solution for this?
[XL226810@vblpgraprdap001 netapp_ontap-grafana-main]$ ./netapp_ontap.sh
Writing netapp_cluster_overview to InfluxDB
HTTP/1.1 400 Bad Request
Content-Type: application/json; charset=utf-8
X-Platform-Error-Code: invalid
Date: Tue, 31 Oct 2023 15:44:44 GMT
Content-Length: 192
{"code":"invalid","message":"unable to parse 'netapp_cluster_overview,clustername=,uuid=,clusterversion=,managementnetwork= versiongeneration=,versionmajor=,versionminor=': missing tag value"}
[XL226810@vblpgraprdap001 netapp_ontap-grafana-main
jorgeuk says
Hello, the script was created to be used with regular InfluxQL and InfluxDB v1. I need to check this script against influx2 and influx3.
Saleh says
Hello Jorge,
By chance, do you have the configuration that works on influxdb2?
jorgeuk says
Hello, I haven’t had time to port it yet, no. I will see if I can do it soon. Doesn’t this work in InfluxDB v3 by default?
Saleh says
Hello Jorge,
I don’t see any OSS InfluxDB v3 download available.