I currently work on a project were AWS RDS PostgreSQL is used as a data storage. Since Postgres is using a mechanism called Multiversion Concurrency Control – MVCC, an UPDATE or DELETE command does not remove old versions of a row immediately. These are left on a disk, waiting to be collected and cleaned by a vacuum process. Vacuum can be automated and autovacuum serves that purpose.
Once autovacuum is configured, how do I know it works as expected? Is it triggered when I expect it to be?
If you are using Datadog, like it is in my case, metrics related to autovacuum can be presented there. Datadog is a monitoring & analytics tool. It can receive any numeric metric, either provided by Datadog integrations out of the box, or a custom defined by a user.
It can be achieved in a number of ways. I chose to use Datadog API in Java and send my custom metrics from code. Operations come in the following order: FETCH – TRANSFORM – SEND.
This step creates a connection to Postgres instance and gathers statistics of your choice. Mine was pgstattuple extension and the pgstattuple_approx function. It gives exact numbers in terms of dead tuples and approximated data when it comes to live tuples. In order to make sure that autovacuum does its job it was sufficient.
Since pgstattuple_approx returns data in bytes, it was more convenient to convert it to MiB. It looks better in Datadog charts because the scale is smaller. I also added some custom metrics that are calculated in Java – since Datadog offers only basic arithmetic operations I found it more convenient to do some computation beforehand.
Last step is to send all the metrics to Datadog using Datadog API. Define dashboards, charts and set up alerts when autovacuum has not been triggered.
Styed tuned for implementation details!
Update: see Sending Metrics to Datadog with Java for continuation.