Sensor Behavior with StackStorm HA Cluster in Kubernetes - BETA

We are planning to implement StackStorm HA Cluster in Kubernetes - BETA, but have a few questions:

  • Can we horizontally scale the pods?
  • How can we make sure the sensors are always running?
  • When a sensor is polling for events and stops working for any reason (a bug, or the instance or process dies), how will the events be handled? What is the expected behavior?

Please take a look at the High Availability Deployment page of the StackStorm 3.1.0 documentation, which describes the current features and limitations of HA sensors.

As for the Helm chart we provide, stackstorm/stackstorm-ha, you have 3 options:

  1. Default configuration, no HA: runs all sensors in a single st2sensorcontainer pod, where they are forked from the main process.
  2. No HA, distributed: run every sensor in its own dedicated pod, configure liveness/readiness probes, and rely on K8s rescheduling capabilities. We're using this in our internal ST2 K8s infrastructure. The StackStorm/stackstorm-ha GitHub repository (the K8s Helm chart) highlights an example configuration.
  3. HA: design your sensors to rely on external locking, and then run several replicas of every sensor. See StackStorm/st2 issue #4301 on GitHub, "Feature Request: `LockingPollingSensor` to make HA aware polling sensors", for the feature request and some examples.
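
To make option 3 a bit more concrete, here is a rough sketch of what "rely on external locking" could look like inside a poll cycle. It is not the StackStorm sensor API and not the design from issue #4301. `InMemoryLock`, `poll_once`, `fetch_events`, and `dispatch` are hypothetical names invented for illustration, and the in-memory lock only stands in for a real external lock service shared by all replicas. The idea is that every replica tries to take a lock with an expiry before polling: only the holder polls, and if the holder dies, the lock expires and another replica takes over.

```python
import threading
import time


class InMemoryLock:
    """Hypothetical stand-in for an external lock with an expiry (TTL).

    A real deployment would need a lock service shared by all replicas;
    this class only imitates the acquire-or-fail behavior in one process.
    """

    def __init__(self):
        self._guard = threading.Lock()
        self._owner = None
        self._expires_at = 0.0

    def acquire(self, owner, ttl, now=None):
        """Take or renew the lock; return False if another owner holds it."""
        now = time.monotonic() if now is None else now
        with self._guard:
            free = self._owner is None or now >= self._expires_at
            if free or self._owner == owner:
                self._owner = owner
                self._expires_at = now + ttl
                return True
            return False


def poll_once(lock, replica_id, fetch_events, dispatch, ttl=30.0, now=None):
    """Run one poll cycle, but only if this replica holds the lock.

    fetch_events and dispatch are assumed callables supplied by the sensor:
    one returns a list of new events, the other emits a single event.
    Returns the number of events dispatched (0 when the lock is held elsewhere).
    """
    if not lock.acquire(replica_id, ttl, now=now):
        return 0  # another replica is active; stay passive this cycle
    events = fetch_events()
    for event in events:
        dispatch(event)
    return len(events)
```

Each replica would call `poll_once` on every poll interval with its own `replica_id`. The TTL should be longer than the poll interval, so the active replica renews the lock before it expires. Note that this only prevents two replicas from polling at the same time; events that appear while no replica holds the lock are picked up only if the polled source still returns them on the next cycle.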

How do we handle sensor partitioning in case of option 2?