Viewing Metrics in Datadog

This is the last post in the Datadog series. The previous two, PostgreSQL Metrics in Datadog and Sending Metrics to Datadog with Java, explained why the data needs to be reported and how to report it. This one focuses on how those metrics can be observed and used.

Understanding what is measured

The example introduced previously was built around PostgreSQL data related to autovacuum. Autovacuum will be triggered when the following condition is met:

dead tuples > vacuum threshold

where

vacuum threshold = vacuum base threshold +
    vacuum scale factor * live tuples

vacuum base threshold is a constant: the minimum number of dead tuples that must accumulate before a vacuum can start. It prevents vacuuming from being triggered for only a few dead rows, which isn’t very efficient. Let’s assume it is 50.

vacuum scale factor is a constant fraction of live tuples that is added to the base threshold. Let’s assume it is 0.01 (1%). Usually it is a good idea to keep it low and make vacuum more aggressive, especially in databases where data changes frequently.
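The condition above can be sketched in a few lines of Java. This is only an illustration of the formula, not PostgreSQL's actual implementation. The class and method names are made up for this example, and the values 50 and 0.01 are the ones assumed in the text.

```java
/** Illustrative sketch of the autovacuum trigger condition (hypothetical names). */
final class AutovacuumThreshold {

    private AutovacuumThreshold() {
    }

    /** vacuum threshold = vacuum base threshold + vacuum scale factor * live tuples */
    static double vacuumThreshold(long baseThreshold, double scaleFactor, long liveTuples) {
        return baseThreshold + scaleFactor * liveTuples;
    }

    /** Autovacuum is triggered once dead tuples exceed the vacuum threshold. */
    static boolean shouldTrigger(long deadTuples, long baseThreshold,
                                 double scaleFactor, long liveTuples) {
        return deadTuples > vacuumThreshold(baseThreshold, scaleFactor, liveTuples);
    }

    public static void main(String[] args) {
        long baseThreshold = 50;    // value assumed in the text
        double scaleFactor = 0.01;  // value assumed in the text
        long liveTuples = 1_500;

        System.out.println("threshold: " + vacuumThreshold(baseThreshold, scaleFactor, liveTuples));
        System.out.println("65 dead tuples: " + shouldTrigger(65, baseThreshold, scaleFactor, liveTuples));
        System.out.println("66 dead tuples: " + shouldTrigger(66, baseThreshold, scaleFactor, liveTuples));
    }
}
```

With 1 500 live tuples the threshold is 65, so 65 dead tuples are still tolerated and the 66th triggers the vacuum.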

When should the autovacuum be triggered?

The table below gives an idea of how many dead tuples can be in a database table before it will be vacuumed. The vacuum threshold is in a sense the maximum number of dead tuples allowed in a table.

base threshold | scale factor | live tuples | vacuum threshold
50             | 0.01         | 100         | 51
50             | 0.01         | 1 500       | 65
50             | 0.01         | 20 000      | 250
50             | 0.01         | 350 000     | 3 550
50             | 0.01         | 2 000 000   | 20 050

Displaying values in Datadog

Let’s assume there is a table for which the dead tuples and live tuples gauges are reported. The example presented previously already sends such metrics.

Metrics in Datadog can be observed in a number of ways.

Metrics Summary is a basic tool that can be used for searching and viewing metrics:

Datadog metrics summary example

Metrics Explorer offers search plus instant visualisation:

Datadog metrics explorer example

Dashboards offer a more sophisticated way of visualising data:

Datadog dashboard example

To find out more about visualisations, visit the Graphing docs website.

Receiving notifications

Dashboards are great for visualising metrics, but it is the alerting functionality that matters most for monitoring. After all, who would like to watch their screens all the time? Datadog uses so-called Triggered Monitors, and Alert is one of the states such a monitor can be in.

The Manage Monitors option under the Monitors category allows adding new ones. Press the New Monitor button and choose Metric. This type of monitor compares the values of a metric with a user-defined threshold.

Select Threshold Alert, which means that an alert is triggered whenever the metric crosses a threshold:

Defining Datadog alert

The condition that triggers autovacuum can be written as the following inequality:

dead tuples > vacuum base threshold +
    vacuum scale factor * live tuples

Used directly, it would require the Alert threshold visible on the screenshot above to vary with the number of live tuples. This cannot be achieved, since the threshold has to be a constant value. To obtain a metric that can be compared with a constant, the inequality can be rearranged (assuming the number of live tuples is greater than zero) into the following:

((dead tuples - vacuum base threshold) 
    / live tuples) > vacuum scale factor

vacuum scale factor is constant, so it can be used as the Alert threshold. The metric and the threshold used in the example are defined exactly like that.
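The rearranged inequality can be checked with a small Java sketch. The class and method names below are hypothetical. The code only mirrors the arithmetic that the monitor evaluates, and it rejects a non-positive number of live tuples because the division is not defined for it.

```java
/** Illustrative sketch of the rearranged monitor metric (hypothetical names). */
final class AutovacuumMonitorMetric {

    private AutovacuumMonitorMetric() {
    }

    /** (dead tuples - vacuum base threshold) / live tuples */
    static double monitorMetric(long deadTuples, long baseThreshold, long liveTuples) {
        if (liveTuples <= 0) {
            throw new IllegalArgumentException("liveTuples must be positive");
        }
        return (double) (deadTuples - baseThreshold) / liveTuples;
    }

    /** The alert condition: the metric is strictly above the vacuum scale factor. */
    static boolean isAlert(long deadTuples, long baseThreshold,
                           double scaleFactor, long liveTuples) {
        return monitorMetric(deadTuples, baseThreshold, liveTuples) > scaleFactor;
    }

    public static void main(String[] args) {
        long baseThreshold = 50;    // value assumed in the text
        double scaleFactor = 0.01;  // value assumed in the text
        long liveTuples = 20_000;   // vacuum threshold is 250 for this table size

        System.out.println("250 dead tuples: " + isAlert(250, baseThreshold, scaleFactor, liveTuples));
        System.out.println("251 dead tuples: " + isAlert(251, baseThreshold, scaleFactor, liveTuples));
    }
}
```

For 20 000 live tuples the metric reaches exactly 0.01 at 250 dead tuples and exceeds it at 251, which matches the vacuum threshold of 250 from the table.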

Now, if the metric (dead tuples - vacuum base threshold) / live tuples rises above the vacuum scale factor (0.01), the monitor will enter the Alert state:

Datadog monitor in alert state
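The same monitor can also be written as a query string, which is what Datadog stores behind the form. The sketch below builds such a string in Java. The metric names, the table:orders scope, the avg aggregation and the last_5m evaluation window are all assumptions made for this illustration, so replace them with the names and settings your application actually uses.

```java
import java.util.Locale;

/** Illustrative builder for a metric monitor query (hypothetical names and values). */
final class AutovacuumMonitorQuery {

    private AutovacuumMonitorQuery() {
    }

    /**
     * Builds a query of the form:
     * avg(last_5m):(avg:DEAD{SCOPE} - BASE) / avg:LIVE{SCOPE} > SCALE
     */
    static String build(String deadTuplesMetric, String liveTuplesMetric, String scope,
                        long baseThreshold, double scaleFactor) {
        // Locale.ROOT keeps the number formatting independent of the JVM's default locale.
        return String.format(Locale.ROOT,
                "avg(last_5m):(avg:%s{%s} - %d) / avg:%s{%s} > %s",
                deadTuplesMetric, scope, baseThreshold,
                liveTuplesMetric, scope, Double.toString(scaleFactor));
    }

    public static void main(String[] args) {
        // Metric names and scope are assumptions, not taken from the original example.
        System.out.println(build("postgres.custom.dead_tuples",
                "postgres.custom.live_tuples", "table:orders", 50, 0.01));
    }
}
```

Keeping the base threshold and the scale factor as parameters makes it easier to keep the monitor in line with the autovacuum settings of the database.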

To make sure that a notification is sent to a recipient and has a meaningful title and content, you can define them in the Say what’s happening section while creating or editing a monitor:

Defining Datadog alert message

The result of such a template is an email that looks like the following:

Datadog alert notification email

When the alert is resolved, i.e. the alert condition no longer holds, the monitor returns to the OK state. The email notification in this case looks like the following:

Datadog alert recovery notification email

This completes the monitoring configuration of the autovacuum process in the PostgreSQL database using Datadog. If you would like to find out more about Datadog, visit their docs website.
