HOW TO: Purge Mistral Executions (garbage collection)


(Nick Maludy) #1

StackStorm has the ability to garbage collect itself, detailed here: Purging Old Operational Data — StackStorm 2.9.1 documentation

Mistral also has this ability, but is not enabled by default.

The following settings change be changed in /etc/mistral/mistral.conf to enable Mistral’s internal garbage collector:


# From mistral.config

# How often will the executions be evaluated (in minutes). For example
# for value 120 the interval will be 2 hours (every 2 hours). (integer
# value)
#evaluation_interval = <None>

# Evaluate from which time remove executions in minutes. For example
# when older_than = 60, remove all executions that finished a 60
# minutes ago or more. Minimum value is 1. Note that only final state
# execution will remove ( SUCCESS / ERROR ). (integer value)
#older_than = <None>


# 10 minutes
evaluation_interval = 10
# 30 days
older_than = 43200

After changing the config, simply restart Mistral:

systemctl restart mistral

(Nick Maludy) #2

When doing this we ran into a server that had 100,000s of records older than our 30 day window. We tried letting Mistral delete these by itself, but it never finished. Instead we ran the following steps to have it complete in seconds:

Edit the Postgres config: /var/lib/pgsql/data/postgres.conf


# - Memory -

shared_buffers = 1024MB
temp_buffers = 512MB
work_mem = 512MB
maintenance_work_mem = 1024MB           # min 1MB
max_stack_depth = 6MB                   # min 100kB
effective_cache_size = 1024MB            # default 128MB

#shared_buffers = 32MB # min 128kB

Restart Postgres

systemctl restart postgresql

Login to Postgres, create indexes on columns that don’t have them (otherwise delete takes FOREVER) and do a bulk delete.

$ sudo -s -u mistral
$ psql

> CREATE INDEX workflow_executions_v2_task_execution_id ON workflow_executions_v2(task_execution_id);
> CREATE INDEX action_executions_v2_task_execution_id on action_executions_v2(task_execution_id);
> SELECT count(*) FROM workflow_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';
> DELETE FROM workflow_executions_v2 WHERE updated_at < NOW() - INTERVAL '30 days';