

HOW TO: Purge Mistral Executions (garbage collection)

StackStorm has the ability to garbage collect its own operational data, detailed here: Purging Old Operational Data — StackStorm 2.9.1 documentation
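For reference, StackStorm's side of this is configured in /etc/st2/st2.conf and handled by the st2garbagecollector service. The snippet below is only a sketch based on the documented [garbagecollector] options; confirm the option names against your StackStorm version:

[garbagecollector]
# Purge action executions older than 30 days
action_executions_ttl = 30
# Purge trigger instances older than 30 days
trigger_instances_ttl = 30

After editing, restart the collector with systemctl restart st2garbagecollector (or st2ctl restart).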

Mistral also has this ability, but it is not enabled by default.

The following settings can be changed in /etc/mistral/mistral.conf to enable Mistral’s internal garbage collector:

[execution_expiration_policy]

#
# From mistral.config
#

# How often the executions will be evaluated, in minutes. For example,
# with a value of 120 the check runs every 2 hours. (integer value)
#evaluation_interval = <None>

# Age threshold, in minutes, beyond which executions are removed. For
# example, with older_than = 60, all executions that finished 60
# minutes ago or more are removed. Minimum value is 1. Note that only
# executions in a final state (SUCCESS / ERROR) are removed. (integer
# value)
#older_than = <None>

Example

[execution_expiration_policy]
# 10 minutes
evaluation_interval = 10
# 30 days
older_than = 43200
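A quick way to read back the values Mistral will pick up (assuming the crudini tool is available; plain grep works just as well):

$ crudini --get /etc/mistral/mistral.conf execution_expiration_policy evaluation_interval
10
$ crudini --get /etc/mistral/mistral.conf execution_expiration_policy older_than
43200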

After changing the config, simply restart Mistral:

systemctl restart mistral
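To confirm the policy is active, watch the service come back up and check its log after the restart (service and unit names may differ slightly between installs; this assumes the default StackStorm packaging):

$ systemctl status mistral
$ journalctl -u mistral -f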

When doing this we ran into a server that had hundreds of thousands of records older than our 30-day window. We tried letting Mistral delete these by itself, but it never finished. Instead, we ran the following steps, which completed in seconds:
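Before going the manual route, it is worth gauging how much data is actually involved. A rough sketch using the same tables and cutoff as below (run from psql as described further down):

SELECT count(*) FROM workflow_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';
SELECT count(*) FROM task_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';
SELECT count(*) FROM action_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';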

Edit the Postgres config: /var/lib/pgsql/data/postgresql.conf

#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 1024MB
temp_buffers = 512MB
work_mem = 512MB
maintenance_work_mem = 1024MB           # min 1MB
max_stack_depth = 6MB                   # min 100kB
effective_cache_size = 1024MB            # default 128MB


#shared_buffers = 32MB # min 128kB

Restart Postgres

systemctl restart postgresql

Log in to Postgres, create indexes on the columns that don’t have them (otherwise the delete takes FOREVER), and do a bulk delete.

$ sudo -s -u mistral
$ psql

> CREATE INDEX workflow_executions_v2_task_execution_id ON workflow_executions_v2(task_execution_id);
> CREATE INDEX action_executions_v2_task_execution_id ON action_executions_v2(task_execution_id);
>
> SELECT count(*) FROM workflow_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';
>
> DELETE FROM workflow_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';
> DELETE FROM task_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';
> DELETE FROM action_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';
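If a single DELETE still runs too long (or holds locks longer than you would like), deleting in batches is an option. This is a sketch, not part of the original procedure, and assumes id is the primary key of these tables:

> DELETE FROM action_executions_v2 WHERE id IN (
>     SELECT id FROM action_executions_v2
>     WHERE updated_at < NOW() - INTERVAL '30 days'
>     LIMIT 10000);

Repeat until it reports 0 rows deleted, then do the same for task_executions_v2 and workflow_executions_v2.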

Per a discussion with a user in our Slack community, I updated the post above to also delete operational data from the task_executions_v2 and action_executions_v2 tables.

The user had an issue with those two tables growing very large and slowing things down.

public.task_executions_v2     | 6592 MB
public.action_executions_v2   | 4725 MB
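Sizes like the ones above can be pulled with a query along these lines (standard Postgres catalog views; filter or limit as needed):

> SELECT schemaname || '.' || relname AS table_name, pg_size_pretty(pg_total_relation_size(relid)) AS size
>   FROM pg_catalog.pg_statio_user_tables
>   ORDER BY pg_total_relation_size(relid) DESC;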

Hi! Thanks for updating the post. We managed to free some space after also doing a VACUUM FULL on the following tables:

vacuum full workflow_executions_v2;
vacuum full task_executions_v2;
vacuum full action_executions_v2;
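One caveat worth adding (not from the original replies): VACUUM FULL takes an ACCESS EXCLUSIVE lock and rewrites each table, so expect Mistral activity to block while it runs. If that is not acceptable, a plain vacuum reclaims the space for reuse without the exclusive lock, although it generally does not shrink the files on disk:

vacuum (verbose, analyze) workflow_executions_v2;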