Difficulty: intermediate
Estimated Time: 30 minutes

Welcome!

Welcome to the Yellow Belt DevOps Dojo Monitoring module.

Purpose

The primary objective of the Monitoring module is to explain the benefits of implementing comprehensive automated Monitoring of IT systems.

In some references and resources that cover this topic, the term Telemetry and Feedback is used instead of Monitoring; the two terms are effectively equivalent. Logging and Alerting are closely related topics that are also introduced in this module.

The secondary objective is to illustrate how the Pet Clinic team plans to implement Monitoring for their application and the benefit they expect to gain from it.

If you have not completed the Welcome module of the Yellow Belt DevOps Dojo - Stripe 2, you should do so before continuing with the Monitoring module.

By the end of the module you will be able to:

  • Explain the different types of metrics which are beneficial to monitor.

  • Articulate the benefits of Monitoring to your development team, to IT admins and SREs, and to business teams.

  • Describe the differences between Logging, Monitoring and Alerting.

  • Understand how best to approach Monitoring of your own applications.

Conclusion

Congratulations, you have completed the Monitoring module of the Yellow Belt DevOps Dojo - Stripe 2.

How does what you have covered in the module apply to your team's application, its environment and its components? Are these being monitored? If not, some of the things you may want to consider monitoring include:

  • Server load and availability.

  • Disk space usage.

  • Memory consumption.

  • Application performance.

  • Application uptime.

  • Application-specific metrics.

  • Cloud-related issues.

  • Potential security breaches.

  • User experience.
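
As an illustration of one item from the list above, disk space usage, here is a minimal Python sketch using only the standard library. It is not part of the Pet Clinic team's tooling; the function names `disk_usage_percent` and `check_threshold` and the 90% limit are hypothetical choices made for this example. A real setup would normally rely on a dedicated monitoring tool rather than a hand-written script.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of disk space used on the volume containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def check_threshold(name, value, limit):
    """Return an alert message if value exceeds limit, otherwise None.

    The limit is an assumed example value, not a recommended setting.
    """
    if value > limit:
        return f"ALERT: {name} at {value:.1f}% exceeds {limit:.1f}%"
    return None


if __name__ == "__main__":
    percent = disk_usage_percent("/")
    # Recording the value is the monitoring part.
    print(f"disk usage: {percent:.1f}%")
    # Raising a message when a limit is crossed is the alerting part.
    message = check_threshold("disk usage", percent, 90.0)
    if message:
        print(message)
```

The sketch separates measuring a value from deciding whether it needs attention, which mirrors the distinction between Monitoring and Alerting introduced in this module.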

Monitoring

Step 1 of 12

The Challenge

The Pet Clinic team has made significant progress on their DevOps journey. They have all of their code under Version Control, and they have automated tests which, depending on the test type, are run from the Continuous Integration and Continuous Delivery pipelines.

As a result, the team has more confidence when making application changes in order to deliver additional business benefit. Most of these benefits are delivered in the form of incremental changes which are released frequently and deployed without incident.

However, there are still occasions when performance degrades or outages occur in Production. Sometimes:

  • Code changes have unexpected consequences which were not highlighted by the automated tests.

  • There are changes or failures in the wider IT environment which affect the functionality or performance of the Pet Clinic application.

  • Changes to the way the application is used can cause operational issues, for example when usage peaks at a particular time of day or during particular events.

When these outages happen, there can be a significant impact on the operation of the Pet Clinic business. They cause additional work for the team and frustration for the customers.
