Resources Monitoring

V1.1 – December 2023

Version	Author	Description
V1.0 – 2023-12-20	Diogo Hatz 50037923	Initial Version
V1.0 – 2023-12-21	Wisley da Silva 00830850	Document Review

Introduction

Cloud Eye (CES) is a free tool for monitoring Huawei Cloud resources. In addition to resource monitoring, Cloud Eye can also be used to create event- or metric-based alarms, identify resource malfunctions, and quickly react to resource changes. It is worth noting that, although Cloud Eye is a free service, charges for sending notifications when alarms are triggered are charged.

This document aims to describe the main functionalities of the Cloud Eye service and guide the reader to use CES for monitoring cloud resources, such as ECSs, VPNs, and CBRs, etc. In addition, it also describes how to create event- or metric-based alarms and customize dashboards for resource monitoring.

Cloud Eye on the console

Overview

When you open Cloud Eye on the console, the home page that will load is the Overview, where you can see an overview of all resources used in Huawei Cloud, the overall network, CPU, memory, and disk utilization, and which resources have recently triggered alarms and need further attention.

Resource Overview: Allows you to view the total number of monitored resources and the alarms generated for these resources.
Alarm Statistics: Shows the alarms triggered in the last seven days by alarm severity.
Server Monitoring: Allows you to view the overall CPU and memory utilization of monitored servers and a list of the top 5 ECSs ranked by CPU or memory utilization.
Network Monitoring: Shows the overall bandwidth utilization of EIPs and a list of the top 5 EIPs ranked by bandwidth utilization.
Storage Monitoring: Allows you to view the overall disk utilization (EVS) by read and write IOPS and a list of the top 5 disks ranked by IOPS.

You can see what the Cloud Eye home page looks like in the images below:

Server Monitoring

Server Monitoring (ECSs and BMSs) can be viewed in the Server Monitoring section. It is worth noting that for server monitoring, installing the agent (Telescope) is recommended, since it provides more specific and accurate metrics, according to appendix 4.1.

The agent can be installed in three different ways: manually, automatically, or in batch mode. Regardless of the installation method chosen, you must configure the permissions for the agent in advance: in the server monitoring section, click Configure on the warning that the agent permission has not been configured for the current region.

Automatic:

To install the agent automatically, simply click on the puzzle piece in the server monitoring section and in the agent status column in the corresponding ECS/BMS and wait for the agent to install.

Manual:

To install the agent manually, first go to the section related to ECS or BMS, depending on the type of server on which the agent will be installed.

Select Remote Login to log in to the desired server

Log in to the server by entering the username and password configured when the server was created and then enter the following command, if the region where the server is located is LA-Santiago:

cd /usr/local && curl -k -O https://uniagent-la-south-2.obs.la-south-2.myhuaweicloud.com/script/agent_install.sh && bash agent_install.sh

If the region where the server is located is different from LA-Santiago, you can find the list of commands by region in the following link: https://support.huaweicloud.com/intl/en-us/usermanual-ces/ces_01_0029.html

If the red message above appears at the end of the installation, the agent has been successfully installed.

Dashboard

The Dashboard section concerns the area where custom charts can be created for monitoring selected services and resources, with the chosen metrics.

To create a dashboard, navigate to the My Dashboards section in Dashboards and click Create Dashboard.

Choose a name for the dashboard in Name and click OK.

To add graphs for monitoring specific metrics, graphs can be added to dashboards. To add a graph, click on the created dashboard and click Add Graph.

Choose the type of chart to create and click OK.

Certain settings can be made when adding a chart to a dashboard, such as whether the same chart will have multiple metrics or only one metric, the period in which the data was collected, the type of data to be displayed (raw data, maximum, minimum, average or sum) and the metrics to be displayed.

Under Metric Display, select One graph for a single metric to add a single metric to the graph, or select One graph for multiple metrics to add multiple metrics to the graph.

Click Select Resource and Metric to select the resource to be monitored and the metric for that resource.

Select the type of service to be monitored on the left side of the Select Resource and Metric page, the specific resource to be monitored in the middle area of the page, and the metrics for that resource on the right. In this example, CPU, disk, memory, and network usage will be monitored on “ecs-9152”.

Adjust the data collection time in the upper right corner of the page Add Graph.

A sample of the generated graph will appear on the page. Click Save to confirm and add the graph to the dashboard.

On the dashboard, you can create a legend for the graph, edit it, make it full screen, reload the data shown in the graph, and move the graph.

In Cloud Eye, you can create numerous dashboards with several graphs in each dashboard, and each graph can show multiple monitoring metrics. In addition, as described in topic 3.1, in the CES Overview section, you can have an overview of the monitored resources with the main metrics used, such as CPU, memory and disk usage on servers; network usage and a total of alarms triggered in Cloud Eye.

Cloud service monitoring

In the Cloud Service Monitoring section, dashboards for each resource of the ECS, EIP and bandwidth, NAT and VPN services are automatically created during the creation of these resources. The main monitoring metrics of these services are added in the form of a graph in this section for quick and general monitoring of these services.

In addition to viewing the graphs related to the main monitored metrics, it is also possible to export the collected data by clicking the Export Data button.

Attachments

Server Monitoring Metrics

Metrics	Agentless	Agent Installed
CPU Usage	Yes	Yes / Dedicated
Disk Usage	Yes	Yes
Memory Usage	Yes	Yes / Dedicated
Disk Write Bandwidth	Yes	Yes
Disk Read Bandwidth	Yes	Yes
Disk Write IOPS	Yes	Yes
Disk Read IOPS	Yes	Yes
Bandwidth Input Rate	Yes	Yes
In-band egress rate	Yes	Yes
Out-of-band egress rate	Yes	Yes
Out-of-band egress rate	Yes	Yes
CPU credit usage	Yes	Yes
CPU credit balancing	Yes	Yes
CPU credit balancing surplus	Yes	Yes
CPU credit loaded surplus	Yes	Yes
Network connections	Yes	Yes
Inbound bandwidth per server	Yes	Yes
Outbound bandwidth per server	Yes	Yes
Inbound PPS	Yes	Yes
Outbound PPS	Yes	Yes
New connections	Yes	Yes
Aggregate ECC uncorrectable errors	Yes	Yes
Pages retired with single bit errors	Yes	Yes
Pages retired with double bit errors	Yes	Yes
GPU health status	Yes	Yes
GPU encoder usage	Yes	Yes
GPU decoder usage	Yes	Yes
ECC volatile correctable errors	Yes	Yes
ECC volatile uncorrectable errors	Yes	Yes
Idle CPU	No	Yes / Dedicated
User space CPU usage	No	Yes / Dedicated
Kernel space CPU usage	No	Yes / Dedicated
Other processes CPU usage	No	Yes / Dedicated
Optimal processes CPU usage	No	Yes / Dedicated
Time CPU is waiting for I/O operations	No	Yes / Dedicated
CPU interrupt time	No	Yes / Dedicated
Software CPU interrupt time	No	Yes / Dedicated
Available memory	No	Yes / Dedicated
Idle memory	No	Yes / Dedicated
Buffer	No	Yes / Dedicated
Cache	No	Yes / Dedicated
Inbound bandwidth per NIC	No	Yes / Dedicated
Outbound bandwidth per NIC	No	Yes / Dedicated
Packet rate sent per NIC	No	Yes / Dedicated
Packet rate received per NIC	No	Yes / Dedicated
Packet rate with error received per NIC	No	Yes / Dedicated
Packet rate with error transmitted per NIC	No	Yes / Dedicated
Packet rate received dropped per NIC	No	Yes / Dedicated
Packet rate transmitted dropped per NIC	No	Yes / Dedicated
Processes running	No	Yes / Dedicated
Idle processes	No	Yes / Dedicated
Zombie processes	No	Yes / Dedicated
Blocked processes	No	Yes / Dedicated
Sleeping processes	No	Yes / Dedicated
Total processes	No	Yes / Dedicated
TCP retransmission rate	No	Yes / Dedicated
TCP SYS_SENT	No	Yes / Dedicated
TCP SYS_RECV	No	Yes / Dedicated
TCP FIN_WAIT1	No	Yes / Dedicated
TCP FIN_WAIT2	No	Yes / Dedicated
TCP CLOSE	No	Yes / Dedicated
TCP LAST_ACK	No	Yes / Dedicated
TCP LISTEN	No	Yes / Dedicated
TCP CLOSING	No	Yes / Dedicated
Average CPU load in the last minute	No	Yes / Dedicated
Average CPU load in the last 15 minutes	No	Yes / Dedicated
Average CPU load in the last 5 minutes	No	Yes / Dedicated
TCP ESTABLISHED	No	Yes / Dedicated
TCP TOTAL	No	Yes / Dedicated
UDP TOTAL	No	Yes / Dedicated
NTP Offset	No	Yes / Dedicated
Total files processed	No	Yes / Dedicated

VPN Gateway Monitoring Metrics

Metrics	Supported
Inbound Packet Rate	Yes
Outbound Packet Rate	Yes
Inbound Bandwidth	Yes
Outbound Bandwidth	Yes
Inbound Bandwidth Usage	Yes
Number of Connections	Yes
Outbound Bandwidth Usage	Yes

VPN Connection Monitoring Metrics

Metrics	Supported
Tunnel Average RTT	Yes
Tunnel Max RTT	Yes
Tunnel Packet Loss Rate	Yes
Link Average RTT	Yes
Link Max RTT	Yes
Link Packet Loss Rate	Yes
VPN Connection Status	Yes
Packet Receive Rate	Yes
Packet Send Rate	Yes
Traffic Receive Rate	Yes
Traffic Send Rate	Yes
SA Packet Send Rate	Yes
SA Packet Receive Rate	Yes
SA Traffic Send Rate	Yes
SA Traffic Receive Rate	Yes

NAT Monitoring Metrics

Metrics	Supported
SNAT Connections	Yes
Inbound Bandwidth	Yes
Outbound Bandwidth	Yes
Inbound PPS	Yes
Outbound PPS	Yes
Inbound Traffic	Yes
Outbound Traffic	Yes
SNAT Connection Usage Rate	Yes
Ingress bandwidth usage rate	Yes
Egress bandwidth usage rate	Yes
Total egress bandwidth (UDP)	Yes
Total egress bandwidth (TCP)	Yes
Total ingress bandwidth (UDP)	Yes
Total ingress bandwidth (TCP)	Yes
Packets lost due to excessive SNAT connections	Yes
Packets lost due to excessive PPS	Yes
Packets lost by all allocated EIP ports	Yes

References

CES documentation: https://support.huaweicloud.com/intl/en-us/function-ces/index.html
CES limitations: https://support.huaweicloud.com/intl/en-us/productdesc-ces/ces_07_0007.html
FAQ: https://support.huaweicloud.com/intl/en-us/ces_faq/ces_faq_0059.html
CES agent batch installation: https://support.huaweicloud.com/intl/en-us/usermanual-ces/ces_01_0033.html