We are using StackStorm policies to limit the number of concurrent executions of a particular workflow, as documented at the top of this page: Policies — StackStorm 3.1.0 documentation. The policy looks like this:
```yaml
name: my_action.concurrency
description: Limits the concurrent executions for my action.
enabled: true
resource_ref: demo.my_action
policy_type: action.concurrency
parameters:
  action: delay
  threshold: 10
```
This works great: we get 10 executions running, and the 11th, 12th, etc. go to “delayed” until some of the first 10 finish.
The app consumes messages from Kafka. There are two options I can see for consuming from Kafka:
A. Read everything from Kafka and invoke a StackStorm execution per message. Once we hit the limit of 10 running executions, the rest go to ‘delayed’ until ST2 can work them. This uses ST2 as a queue, and we have hit a limit of several hundred delayed executions, after which ST2 almost crashed. (Rough sketch below, after Option B.)
OR
B. Detect when ST2 is able to accept more work (backpressure). For each Kafka message, we would invoke ST2 and then check whether the execution went to ‘delayed’. If so, sleep and poll ST2 until the execution reaches ‘running’ or a terminal state, then get the next Kafka message and repeat. (Also sketched below.)
Questions:
- Which is best: Option A or Option B? Is there a better way to do this?
- For Option B: how long will the ‘delay’ hold? Is there a timeout for the delay, and can we set it? What state do the workflows go to after the timeout is reached?