The Google Cloud Operations suite (formerly Stackdriver) includes a wide variety of tools to help you monitor and debug your GCP-hosted application.
Once you have created your GCP account and built your infrastructure, you need to understand the environment better, set up appropriate performance and availability indicators, and be proactive rather than reactive. The first thing you’ll want to do is set up a monitoring system that alerts you when there are major problems. To achieve this, you can use Cloud Operations, Google’s powerful monitoring, logging, and debugging tool.
Procedure
To get started with Monitoring, do the following:
- Go to the Cloud Console.
- Select an existing project or create a project.
- In the navigation panel, select Monitoring.
You don’t need to install an agent to use Monitoring, but you will need one if you want more detailed information about an instance.
Create uptime checks
A classic use case is monitoring a server and getting notified if it goes down, so the first thing you need to do is create an uptime check.
To create an uptime check by using the Google Cloud Console, do the following:
1- In the Cloud Console, select Monitoring.
2- Click Uptime checks.
3- Click Create Uptime check.
4- Enter a descriptive title for the Uptime check and then click Next.
5- Specify the target of the uptime check:
a) Select the protocol. You have the options of HTTP, HTTPS, and TCP.
b) Choose one of the following resource types to monitor:
URL: Any IPv4 address or hostname.
App Engine: App Engine applications (modules).
Instance: Compute Engine or AWS EC2 instances.
Elastic Load Balancer: AWS load balancer.
c) Enter the protocol-specific fields:
– For TCP checks, enter the port.
– For HTTP and HTTPS checks, optionally enter a path within your host or resource.
d) Enter the resource-specific fields:
– For URL resources, enter the host name in the Hostname field.
– For App Engine resources, enter the service name in the Service field.
– For Elastic Load Balancer and Instance resources, complete the Applies to field. To issue an uptime check to a single instance or load balancer, select Single and then use the menu to select the specific instance or load balancer. To issue an uptime check to a Monitoring group, select Group and then use the menu to select the group name.
e) The Check frequency field controls how often the uptime check executes. You can leave it at the default value or select a value from the menu of options.
6- Configure the response requirements:
– Select the Response Timeout from the menu of options. You can choose any value between 1 and 60 seconds. An uptime check fails if no response is received from more than one location within this timeout.
– For content matching, ensure that the toggle label reads Content matching is enabled (it is disabled by default).
Note: If you don’t want uptime checks sent to Cloud Logging, then uncheck Log check failures.
Click Next
7- Alert & Notification
Create an alerting policy. When your uptime check is monitored by an alerting policy and the uptime check fails, an incident is created and a notification is sent to all notification channels attached to the policy.
If you don’t want to create an alerting policy as part of this flow, then ensure the text of the toggle button is Do not create an alert. Click the button to change the toggle state.
To verify your uptime check configuration, click Test. If the result isn’t what you expect, correct your configuration and test again.
Click Create. If required data is missing, the save action fails and a list of fields that require data is displayed next to the dialog buttons. After you save your changes, the Uptime check created dialog is displayed.
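If you prefer to keep your checks in code, the choices made in steps 5 and 6 can be written down as plain data. The following is only a minimal sketch: the field names are assumed to follow the Cloud Monitoring v3 REST representation of an uptime check configuration, the function name build_uptime_check and all example values are hypothetical, and nothing here calls Google Cloud. It only builds and validates the configuration for a URL resource.

```python
import json

# Assumption: these are the check frequency options offered in the menu,
# expressed in seconds (1, 5, 10, and 15 minutes).
ALLOWED_PERIODS_S = (60, 300, 600, 900)


def build_uptime_check(title, host, protocol="HTTPS", path="/", port=None,
                       period_s=60, timeout_s=10, content=None):
    """Return a dict describing an uptime check for a URL resource.

    Hypothetical helper for illustration; field names are assumed to match
    the Cloud Monitoring v3 REST representation.
    """
    if protocol not in ("HTTP", "HTTPS", "TCP"):
        raise ValueError("protocol must be HTTP, HTTPS, or TCP")
    if not 1 <= timeout_s <= 60:
        raise ValueError("timeout must be between 1 and 60 seconds")
    if period_s not in ALLOWED_PERIODS_S:
        raise ValueError(f"period must be one of {ALLOWED_PERIODS_S}")

    config = {
        "displayName": title,
        "monitoredResource": {"type": "uptime_url", "labels": {"host": host}},
        "period": f"{period_s}s",
        "timeout": f"{timeout_s}s",
    }
    if protocol == "TCP":
        # TCP checks only need a port.
        if port is None:
            raise ValueError("TCP checks need a port")
        config["tcpCheck"] = {"port": port}
    else:
        use_ssl = protocol == "HTTPS"
        config["httpCheck"] = {
            "path": path,
            "port": port or (443 if use_ssl else 80),
            "useSsl": use_ssl,
        }
    if content is not None:
        # Content matching is optional and off unless you ask for it.
        config["contentMatchers"] = [
            {"content": content, "matcher": "CONTAINS_STRING"}
        ]
    return config


if __name__ == "__main__":
    check = build_uptime_check("my-site-check", "example.com", content="OK")
    print(json.dumps(check, indent=2))
```

Keeping the validation next to the data means that a timeout outside the 1 to 60 second range is caught before anything is submitted, which mirrors what the console form does for you.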
To see the results of the uptime check, go to the menu on the left and select Uptime Checks. It’ll take a while before it runs for the first time, so don’t worry if you don’t see anything in the dashboard right away.
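When you read the results, keep in mind the failure rule from step 6: a check counts as failed only when more than one location gets no response within the timeout. Here is a small sketch of that rule, assuming you have one boolean result per checker location. The function name and the location names are made up for illustration.

```python
def uptime_check_failed(responses):
    """Apply the "more than one location" rule to a set of results.

    `responses` maps a checker location name to True if that location
    received a response within the timeout, and False otherwise.
    """
    failed_locations = [loc for loc, ok in responses.items() if not ok]
    return len(failed_locations) > 1


if __name__ == "__main__":
    # Hypothetical location names and results.
    one_miss = {"location-a": True, "location-b": False, "location-c": True}
    two_misses = {"location-a": False, "location-b": False, "location-c": True}
    print(uptime_check_failed(one_miss))
    print(uptime_check_failed(two_misses))
```

A single location missing a response is treated as noise, which helps avoid alerts caused by a network problem on the checker's side rather than on your server.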
For the purpose of this post, we stopped our VM instance to show how Cloud Monitoring triggers the alert and displays the incident in the Alerting dashboard.
Logging, Error Reporting, Debugging, and Tracing will be covered in upcoming posts, so stay tuned!
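The alerting policy from step 7 can also be described as data. The sketch below is an approximation, not the exact policy the console creates: the field names are assumed to follow the Cloud Monitoring v3 REST representation of an alerting policy, the metric type and label names are assumed to be the ones uptime checks report, and the helper name, check ID, and notification channel name are hypothetical. It defaults to a URL resource; for a VM like the one we stopped, the resource type would differ.

```python
import json


def build_uptime_alert_policy(check_id, channel_names,
                              resource_type="uptime_url",
                              display_name="Uptime check failed"):
    """Return a dict describing an alerting policy for one uptime check.

    Hypothetical helper for illustration; nothing here calls Google Cloud.
    """
    # Assumption: uptime results are reported on the check_passed metric,
    # labeled with the ID of the uptime check.
    metric_filter = (
        'metric.type="monitoring.googleapis.com/uptime_check/check_passed" '
        f'AND metric.label.check_id="{check_id}" '
        f'AND resource.type="{resource_type}"'
    )
    return {
        "displayName": display_name,
        "combiner": "OR",
        "conditions": [{
            "displayName": f"Failure of uptime check {check_id}",
            "conditionThreshold": {
                "filter": metric_filter,
                "aggregations": [{
                    "alignmentPeriod": "1200s",
                    "perSeriesAligner": "ALIGN_NEXT_OLDER",
                    # Count the locations whose check did not pass.
                    "crossSeriesReducer": "REDUCE_COUNT_FALSE",
                    "groupByFields": ["resource.label.*"],
                }],
                # Open an incident when more than one location fails.
                "comparison": "COMPARISON_GT",
                "thresholdValue": 1,
                "duration": "60s",
                "trigger": {"count": 1},
            },
        }],
        "notificationChannels": list(channel_names),
    }


if __name__ == "__main__":
    policy = build_uptime_alert_policy(
        "my-site-check",  # hypothetical check ID
        ["projects/my-project/notificationChannels/1234567890"],  # hypothetical
    )
    print(json.dumps(policy, indent=2))
```

The threshold of "greater than 1 failing location" matches the failure rule described in step 6, so the policy and the check agree on what counts as an outage.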
Additional Resources
https://cloud.google.com/products/operations
https://cloud.google.com/stackdriver/docs
https://cloud.google.com/monitoring/docs/monitoring-overview