This is the last post in the Datadog series. The previous two, PostgreSQL Metrics in Datadog and Sending Metrics to Datadog with Java, showed why the data needs to be reported and how to report it. This one focuses on how those metrics can be observed and used.
Understanding what is measured
The example introduced previously was built around PostgreSQL data related to autovacuum. Autovacuum will be triggered when the following condition is met:
dead tuples > vacuum threshold
vacuum threshold = vacuum base threshold + vacuum scale factor * live tuples
vacuum base threshold – a constant: the minimum number of dead tuples required before a vacuum starts (the PostgreSQL setting autovacuum_vacuum_threshold). It prevents vacuuming from being triggered for only a few dead rows, which isn’t very efficient. Let’s assume it is 50.
vacuum scale factor – a constant fraction of live tuples that is added to the base threshold (the PostgreSQL setting autovacuum_vacuum_scale_factor). Let’s assume it is 0.01 (1%). It is usually a good idea to keep it low and make vacuum more aggressive, especially in databases where data changes frequently.
When should the autovacuum be triggered?
The table below gives an idea of how many dead tuples can be in a database table before it will be vacuumed. The vacuum threshold is in a sense the maximum number of dead tuples allowed in a table.
| base threshold | scale factor | live tuples | vacuum threshold |
|---|---|---|---|
| 50 | 0.01 | 350 000 | 3 550 |
| 50 | 0.01 | 2 000 000 | 20 050 |
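The values in the table can be reproduced with a few lines of code. The following is a minimal sketch of the formula from this section, not PostgreSQL's own implementation; the function names `vacuum_threshold` and `needs_autovacuum` are made up for illustration, and the defaults are the values assumed in this post.

```python
def vacuum_threshold(base_threshold, scale_factor, live_tuples):
    """vacuum threshold = vacuum base threshold + vacuum scale factor * live tuples."""
    return base_threshold + scale_factor * live_tuples


def needs_autovacuum(dead_tuples, live_tuples, base_threshold=50, scale_factor=0.01):
    """True when dead tuples > vacuum threshold (the trigger condition above)."""
    return dead_tuples > vacuum_threshold(base_threshold, scale_factor, live_tuples)


# The two table rows: 350 000 and 2 000 000 live tuples.
for live in (350_000, 2_000_000):
    print(live, round(vacuum_threshold(50, 0.01, live)))
```

With these assumptions, a table with 350 000 live tuples is not vacuumed at 3 000 dead tuples but is at 4 000.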
Displaying values in Datadog
Let’s assume there is a table for which the dead tuples and live tuples gauges are reported – the example presented previously already sends such metrics.
Metrics in Datadog can be observed in a number of ways.
Metrics Summary is a basic tool that can be used for searching and viewing metrics:
Metrics Explorer offers search plus instant visualisation:
Dashboards offer a more sophisticated way of visualising data:
To find out more about visualisations, visit the Graphing docs website.
Dashboards are great for visualising metrics, but it is the alerting functionality that makes monitoring practical. Who would want to watch their screens all the time? Datadog uses so-called Triggered Monitors, and Alert is one of the states such a monitor can be in.
The Manage Monitors option under the Monitors category allows new monitors to be added. Press the New Monitor button and choose Metric. This type of monitor compares the values of a metric with a user-defined threshold.
Define it as a Threshold Alert, which means that an alert will be triggered whenever the metric crosses a threshold:
The condition that triggers autovacuum can be written as the following inequality:
dead tuples > vacuum base threshold + vacuum scale factor * live tuples
It shows that the Alert threshold visible in the screenshot above would have to vary with the number of live tuples. This cannot be achieved, because the threshold has to be a constant value. To obtain a metric that can be compared with a constant, subtract the vacuum base threshold from both sides and divide by live tuples (a positive number, so the direction of the inequality is preserved):
((dead tuples - vacuum base threshold) / live tuples) > vacuum scale factor
The vacuum scale factor is constant, so it can be used as the Alert threshold. The metric and the threshold used in the example are defined exactly like that.
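To see that the transformed metric behaves the same as the original condition, both forms can be compared on sample values. This is only an illustrative sketch: the function names are hypothetical, the defaults are the values assumed in this post, and an empty table (zero live tuples) is treated as "no alert" because the division is undefined there.

```python
def dead_tuple_ratio(dead_tuples, live_tuples, base_threshold=50):
    """Return (dead tuples - vacuum base threshold) / live tuples.

    Returns None for an empty table, where the division is undefined.
    """
    if live_tuples <= 0:
        return None
    return (dead_tuples - base_threshold) / live_tuples


def monitor_alerts(dead_tuples, live_tuples, base_threshold=50, scale_factor=0.01):
    """True when the transformed metric is above the constant Alert threshold."""
    ratio = dead_tuple_ratio(dead_tuples, live_tuples, base_threshold)
    return ratio is not None and ratio > scale_factor


def autovacuum_condition(dead_tuples, live_tuples, base_threshold=50, scale_factor=0.01):
    """Original form: dead tuples > base threshold + scale factor * live tuples."""
    return dead_tuples > base_threshold + scale_factor * live_tuples


# Both forms should agree for tables that contain live tuples.
for dead, live in [(1_000, 350_000), (5_000, 350_000), (30_000, 2_000_000)]:
    assert monitor_alerts(dead, live) == autovacuum_condition(dead, live)
    print(dead, live, dead_tuple_ratio(dead, live), monitor_alerts(dead, live))
```

The guard for zero live tuples is a design choice of this sketch; how a monitor treats a metric with a zero denominator is worth checking separately.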
Now, if the metric (dead tuples - vacuum base threshold) / live tuples is above the vacuum scale factor (0.01), this monitor will enter the Alert state:
To make sure that a notification is sent to a recipient with a meaningful title and content, define them in the Say what’s happening section while creating or editing the monitor:
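For readers who prefer text to screenshots, the monitor configured above can be summarised as a plain data structure. The sketch below only builds and prints a definition; it does not call Datadog. The metric names, the `table:my_table` tag, the monitor name and the recipient address are all hypothetical placeholders, and the query string is an assumption about how the expression from this post would be written in Datadog's monitor query syntax. The `{{#is_alert}}`, `{{#is_recovery}}`, `{{value}}` and `{{threshold}}` markers are Datadog's template variables for notification messages.

```python
import json

BASE_THRESHOLD = 50   # vacuum base threshold assumed in this post
SCALE_FACTOR = 0.01   # vacuum scale factor, used as the Alert threshold

# Hypothetical metric names and tag; use whatever your reporter actually sends.
DEAD = "avg:postgres.custom.dead_tuples{table:my_table}"
LIVE = "avg:postgres.custom.live_tuples{table:my_table}"

monitor = {
    "name": "Autovacuum condition met on my_table",
    "type": "query alert",
    # (dead tuples - vacuum base threshold) / live tuples > vacuum scale factor
    "query": f"avg(last_5m):({DEAD} - {BASE_THRESHOLD}) / {LIVE} > {SCALE_FACTOR}",
    # Not an f-string, so the double braces reach Datadog unchanged.
    "message": (
        "{{#is_alert}}Dead tuple ratio is {{value}}, "
        "above the threshold of {{threshold}}.{{/is_alert}}\n"
        "{{#is_recovery}}Dead tuple ratio is back below {{threshold}}.{{/is_recovery}}\n"
        "@dba@example.com"
    ),
    "options": {"thresholds": {"critical": SCALE_FACTOR}},
}

print(json.dumps(monitor, indent=2))
```

The title and message correspond to the fields in the Say what’s happening section; the two conditional blocks let one template serve both the alert and the recovery email.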
The result of such a template is an email that looks like the following:
When the alert is resolved, i.e. the alert condition no longer holds, the monitor returns to the OK state. The email notification in this case looks like the following:
This completes the monitoring configuration of the autovacuum process in the PostgreSQL database using Datadog. If you would like to find out more about Datadog, visit their docs website.