Difficulty: beginner
Estimated Time: 50 minutes

An important aspect of Cloud Ops (Operations) is knowing what is happening in the platform and applications under scrutiny. Key metrics provide insight into the current state of, and trends in, performance. Having access to the right metrics and being able to trigger automated alarms when specific conditions are violated is a necessity for efficient and effective monitoring.

Virtually all OCI services publish metrics about the health, capacity, and performance of your cloud resources. These metrics are published when functions are invoked, files are written, the API Gateway handles a request, events are published, a user is created, or a network transfers a packet. By querying the Monitoring service for this data, you can understand how well your systems and processes are working to achieve the service levels you commit to your customers. For example, you can monitor the CPU utilization and disk reads of your Compute instances. You can then use this data to determine when to launch more instances to handle increased load, troubleshoot issues with an instance, or better understand system behavior.
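As a minimal sketch of such a query, the snippet below composes a Monitoring Query Language (MQL) expression for the mean CPU utilization per minute and wraps the CLI call in a function. It assumes that the OCI CLI is installed and configured (covered in Step 1), that the variable compartmentId holds the OCID of a compartment containing Compute instances, and that those instances publish to the oci_computeagent namespace. The function name query_cpu is hypothetical.

```shell
# Compose an MQL expression of the form metric[interval].statistic()
METRIC_NAME="CpuUtilization"
INTERVAL="1m"
STATISTIC="mean"
MQL="${METRIC_NAME}[${INTERVAL}].${STATISTIC}()"
echo "$MQL"

# Hypothetical helper: assumes a configured OCI CLI and a compartmentId variable.
# It is only defined here; call query_cpu once your environment is set up.
query_cpu() {
  oci monitoring metric-data summarize-metrics-data \
    --compartment-id "$compartmentId" \
    --namespace oci_computeagent \
    --query-text "$MQL"
}
```

Changing INTERVAL or STATISTIC (for example to 5m and max) yields a different aggregation over the same metric.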

You can publish your own metrics to Monitoring using the API. You can view charts of your published metrics using the Console, query metrics using the API, and set up alarms using the Console or API. You can access your published custom metrics the same way you access any other metrics stored by the Monitoring service.
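To make the publishing step more concrete, here is a rough sketch that writes a custom metric payload to a temporary file and defines a function that would post it with the CLI. The namespace mymetricsnamespace, the metric name productOrder, the product dimension, the placeholder compartment OCID and the function name post_metric are all assumptions for illustration. The telemetry-ingestion endpoint shown is the one you would typically use for posting metric data; verify it for your region.

```shell
# All names below are hypothetical and only serve as an illustration.
METRIC_NAMESPACE="mymetricsnamespace"
METRIC_NAME="productOrder"
COMPARTMENT="${compartmentId:-ocid1.compartment.oc1..exampleuniqueID}"
TIMESTAMP="$(date -u +%Y-%m-%dT%H:%M:%SZ)"   # RFC 3339 timestamp in UTC
PAYLOAD_FILE="$(mktemp)"

# One metric with a single datapoint
cat > "$PAYLOAD_FILE" <<EOF
[
  {
    "namespace": "$METRIC_NAMESPACE",
    "compartmentId": "$COMPARTMENT",
    "name": "$METRIC_NAME",
    "dimensions": { "product": "widget" },
    "datapoints": [ { "timestamp": "$TIMESTAMP", "value": 12 } ]
  }
]
EOF
cat "$PAYLOAD_FILE"

# Defined but not invoked: requires a configured OCI CLI and a real compartment OCID.
post_metric() {
  oci monitoring metric-data post \
    --metric-data "file://$PAYLOAD_FILE" \
    --endpoint "https://telemetry-ingestion.${REGION:-us-ashburn-1}.oraclecloud.com"
}
```

Once the CLI is configured and compartmentId is set, calling post_metric would submit the payload.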

Metrics are retained for 14 days.

In this scenario, you will look at standard OCI resource metrics as well as custom metrics. You will perform some activities that cause metrics to be produced, and then you will inspect these metrics through the OCI Monitoring facilities. Subsequently, you will look at Alarms and Notifications and cause these to be triggered. Notification topics support several kinds of subscriptions: email, Slack, PagerDuty, webhooks and OCI Functions. All of these channels can be triggered by a message published to the Notification Topic.

You will also briefly look at the Audit service, which provides insight into activity in the OCI tenancy from the perspective of Who did What at Which moment. Finally, you will make a brief acquaintance with the Health Checks service, which lets you monitor the health of services anywhere in the world from anywhere in the world.


OCI Documentation on Metrics and Monitoring

OCI Documentation - Publishing Custom Metrics: https://docs.cloud.oracle.com/en-us/iaas/Content/Monitoring/Tasks/publishingcustommetrics.htm

CLI Reference for publishing custom metrics: https://docs.cloud.oracle.com/en-us/iaas/tools/oci-cli/2.9.1/oci_cli_docs/cmdref/monitoring/metric-data/post.html

REST API reference for publishing custom metrics: https://docs.cloud.oracle.com/en-us/iaas/api/#/en/monitoring/latest/MetricData/PostMetricData

OCI Documentation on Audit Service

OCI Documentation on Health Checks


This completes your introduction to Metrics, Monitoring, Alarms and Notifications, both on the out-of-the-box metrics emitted by most OCI resources and on custom metrics that may represent functional behavior of application components.

You have inspected the metrics that are produced when files are uploaded or downloaded, created an alarm to trap a specific condition (a large number of file downloads) and associated the alarm with a notification topic to which your email address is subscribed. You then caused the alarm to be triggered and received at least one email as a result. Subsequently you produced custom metrics (regarding product orders), inspected those in the metrics explorer and created a second alarm that is triggered by special product order circumstances (over one hundred products ordered in five minutes). Again, after you created the business condition that this alarm was designed to detect, an email was sent to inform you of that condition. Notifications can also trigger PagerDuty and Slack, as well as a webhook and OCI Functions, for automated reactions to the situation.
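The condition behind the second alarm (over one hundred products ordered in five minutes) can be illustrated locally. The sketch below sums made-up per-minute order counts over a five-minute window and compares the total to the threshold. In the Monitoring service a comparable condition would be written as an MQL expression along the lines of productOrder[5m].sum() > 100, where the metric name productOrder is an assumption.

```shell
# Hypothetical order counts for five consecutive one-minute intervals
ORDERS_PER_MINUTE="18 25 9 31 22"
THRESHOLD=100

# Sum the counts in the five-minute window
total=0
for n in $ORDERS_PER_MINUTE; do
  total=$((total + n))
done

# Compare against the threshold, mirroring the alarm's trigger rule
if [ "$total" -gt "$THRESHOLD" ]; then
  ALARM_STATE="FIRING"
else
  ALARM_STATE="OK"
fi
echo "total=$total state=$ALARM_STATE"   # prints: total=105 state=FIRING
```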

You should now have an understanding of how metrics represent activity in Oracle Cloud Infrastructure and how these metrics can be inspected: in the console, through the CLI, and through alarms that continuously evaluate conditions for special, usually undesired, situations. Metrics are available on what is happening, on how things are happening and on the status of resources in general over time.

Using metrics, monitoring tools, alarms and notifications you can set up a consistent, far reaching, highly automated monitoring process that helps you to assure performance, availability and in general proper functioning of your cloud resources.

Monitoring, Metrics, Alarms and Notifications, Audit and Health Checks


Step 1 - Introduction Metrics and Monitoring

Some of the steps in this scenario require the use of the OCI Command Line Interface.

Execute the following command to install the OCI CLI:

curl -L https://raw.githubusercontent.com/oracle/oci-cli/master/scripts/install/install.sh > install-oci-cli.sh
chmod +x install-oci-cli.sh
sudo ./install-oci-cli.sh --accept-all-defaults

# add this line to ~/.profile - to make oci a recognized shell command
echo 'oci() { /root/bin/oci "$@"; }' >> ~/.profile
# reload ~/.profile
. /root/.profile

You need to provide details on the OCI tenancy you will work in and the OCI user you will work as. Please open the IDE tab and edit these two files:

  • ~/.oci/config
  • ~/.oci/oci_api_key.pem

Paste the contents that you prepared in the OCI Tenancy preparation scenario.
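If you are unsure what the config file should look like, the sketch below prints a template with the standard keys of an OCI CLI configuration profile. Every value is a placeholder to replace with the details you prepared earlier; the region shown follows the Ashburn assumption of this scenario, and the function name print_config_template is hypothetical.

```shell
# Print a template for ~/.oci/config; all values are placeholders.
print_config_template() {
  cat <<'EOF'
[DEFAULT]
user=ocid1.user.oc1..<your-user-ocid>
fingerprint=<fingerprint-of-your-api-key>
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..<your-tenancy-ocid>
region=us-ashburn-1
EOF
}
print_config_template

# Once the key file exists, restricting its permissions avoids a CLI warning:
# chmod 600 ~/.oci/oci_api_key.pem
```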

Finalizing the Environment

Set the environment variable LAB_ID to 1 - unless you are in a workshop with multiple participants and each uses their own number.

export LAB_ID=1

Try out the following command to retrieve the Object Storage namespace of your tenancy, based on the OCI configuration defined above.

oci os ns get

If you get a proper response, the OCI CLI is configured correctly and you can proceed. If you run into an error, ask your instructor for help.

Environment Preparation

Prepare a number of environment variables. Note: this assumes that you are working in a tenancy in the Ashburn region, that a compartment called lab-compartment exists, that an API Gateway called lab-apigw exists in that compartment, and that an API Deployment called MY_API_DEPL# exists on the API Gateway. You need references to these resources in order to create new resources in the right place.

# Region name and key of the first subscribed region
export REGION=$(oci iam region-subscription list | jq -r '.data[0]."region-name"')
export REGION_KEY=$(oci iam region-subscription list | jq -r '.data[0]."region-key"')
# OCID of the oldest user in the tenancy
export USER_OCID=$(oci iam user list --all | jq -r '.data | sort_by(."time-created") | .[0]."id"')
# Users live in the root compartment, whose OCID is the tenancy OCID
export TENANCY_OCID=$(oci iam user list --all | jq -r '.data[0]."compartment-id"')
# OCID of the compartment named lab-compartment
cs=$(oci iam compartment list)
export compartmentId=$(echo "$cs" | jq -r --arg display_name "lab-compartment" '.data | map(select(."name" == $display_name)) | .[0] | .id')
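Because jq -r prints the literal string null when a lookup finds nothing (for example when lab-compartment does not exist), it can help to verify the variables before continuing. The following is a small sketch of such a check; the helper name require_var is hypothetical.

```shell
# Return non-zero when a variable is unset, empty, or the literal "null"
# (jq -r prints "null" when the selected field does not exist).
require_var() {
  name="$1"
  eval "value=\${$name:-}"
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "missing: $name" >&2
    return 1
  fi
  echo "ok: $name"
}

# Check every variable prepared above; keep going after a failure so all are reported
for v in REGION REGION_KEY USER_OCID TENANCY_OCID compartmentId; do
  require_var "$v" || true
done
```

Any variable reported as missing points to a resource that could not be found or to a CLI configuration problem.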