Kubernetes HA not working for me

I installed the StackStorm HA chart using "helm install prudhveer . --namespace development", but most of the pods are not running.


Can you help me with this?

I hope you installed it as described in StackStorm HA Cluster in Kubernetes - BETA — StackStorm 3.2.0 documentation.

What do the logs say? You can describe the pods that are in CrashLoopBackOff with the following command:

kubectl describe pod <pod-name>
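If several pods are failing, a small helper can save some typing. This is only a sketch: the function name inspect_failing_pods is made up for illustration, and the default namespace "development" is taken from the install command quoted above.

```shell
# Describe every pod whose STATUS is neither Running nor Completed,
# and print the tail of the logs from its previous container run.
# Hypothetical helper; the namespace defaults to "development".
inspect_failing_pods() {
  ns="${1:-development}"
  # Without headers the columns are: NAME READY STATUS RESTARTS AGE
  kubectl get pods --namespace "$ns" --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1 }' |
    while read -r pod; do
      echo "=== $pod ==="
      kubectl describe pod "$pod" --namespace "$ns"
      # --previous shows the last terminated container, if there is one
      kubectl logs "$pod" --namespace "$ns" --previous --tail=50 || true
    done
}
```

Call it as "inspect_failing_pods development". The Events section at the end of each describe output is usually the first place to look.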

Hi, thank you for replying. I just deleted the repo and tried to re-install it using the same command, but now it shows the error "Error: failed post-install: timed out waiting for the condition". What should I do?

These are the outputs of "kubectl describe" and "kubectl logs".

I see that you are running on minikube, so make sure you have enough resources to run all of these services. Try increasing the minikube VM resources. You can do that with the following command; adjust the values to what your machine can spare.

minikube start --memory=8192 --cpus=4 --disk-size=50g

Another easy way to get started is to use docker-compose; see Docker — StackStorm 3.2.0 documentation.

StackStorm services rely on the RabbitMQ, MongoDB, and etcd clusters to work properly.
If any of those are unavailable, the st2 pods keep restart-looping while they try to re-establish the connection, until these backends are back up.

Try debugging MongoDB and etcd further; based on the screenshots, those are the ones that had issues.
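A quick way to see only the backend pods is to filter the pod list by name. This is a rough sketch: the helper name show_backend_pods is hypothetical, and matching on "mongo", "rabbitmq", and "etcd" in the pod names is an assumption about how your release names its pods.

```shell
# Print the status line of each backend pod (MongoDB, RabbitMQ, etcd).
# Matching on pod names is an assumption; adjust the pattern if needed.
show_backend_pods() {
  ns="${1:-development}"
  kubectl get pods --namespace "$ns" --no-headers |
    grep -E 'mongo|rabbitmq|etcd'
}
```

If a backend pod is missing from the output or is not in Running state, start there before looking at the st2 pods.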

@Sheshagiri also made a good point about resource constraints, especially on an MBP. You can increase the helm timeout to give the ST2-HA cluster more time to come up; for an example, see stackstorm-ha/config.yml at master · StackStorm/stackstorm-ha · GitHub.
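As a sketch of what a longer timeout looks like on the command line, assuming Helm 3 (where --timeout takes a duration such as 10m0s and defaults to 5m0s; Helm 2 expects a number of seconds instead): the wrapper name install_st2_ha and the 10m0s value are illustrative choices, not values taken from the chart.

```shell
# Install the chart in the current directory with a longer timeout.
# Hypothetical wrapper; release name and namespace default to the
# ones used earlier in this thread, the timeout default is illustrative.
install_st2_ha() {
  release="${1:-prudhveer}"
  ns="${2:-development}"
  timeout="${3:-10m0s}"
  helm install "$release" . --namespace "$ns" --timeout "$timeout"
}
```

For example, "install_st2_ha prudhveer development 15m0s" gives the post-install step up to fifteen minutes before helm reports a timeout.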

@Sheshagiri I tried it with "minikube start --memory=8192 --cpus=4 --disk-size=50g", but it is still not working. I deleted the cluster and restarted it, but it still doesn't work. Can you be more specific about which values need to be updated? I am installing the very basic StackStorm setup as described in StackStorm HA Cluster in Kubernetes - BETA — StackStorm 3.2.0 documentation.
@armab I am attaching a few screenshots of the logs for MongoDB and RabbitMQ. Can you please check them and let me know?
Thank you.

Based on the screenshot, did you notice that the etcd cluster didn't start for some reason? As I said, StackStorm services depend on those backends in the first place.

@armab
No, I can't find any reason for that. Here is a screenshot of "kubectl describe pod etcd-cluster-hzx57tnl4".

The output for “kubectl logs etcd-cluster-hzx57tnl4p” is

Given the "No replica set members found yet" symptom in the logs, the following bug report was created for the StackStorm-HA Helm chart. It relates to the MongoDB cluster: MongoDB cluster issue - No replica set members found yet · Issue #139 · StackStorm/stackstorm-ha · GitHub.

The issue got solved for me. In my case, I did two things:

  1. I changed the mongodb-replicaset version to 3.12.0. It now also works fine with 3.16.2.
  2. If you are using minikube with driver=none, please make sure that the StorageClass (SC), PersistentVolumes (PV), and PersistentVolumeClaims (PVC) are working correctly. I personally deleted the "standard" SC that minikube provides and created a new one.
    Also check that the storage-provisioner pod for minikube is installed and running in the kube-system namespace.
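For anyone who wants to run the storage checks from point 2 in one go, here is a minimal sketch. The helper name check_storage is made up, and the default namespace is the one used earlier in this thread.

```shell
# List the storage objects and confirm that minikube's
# storage-provisioner pod exists in kube-system.
# Hypothetical helper; the namespace defaults to "development".
check_storage() {
  ns="${1:-development}"
  kubectl get storageclass
  kubectl get pv
  kubectl get pvc --namespace "$ns"
  # grep returns non-zero when no storage-provisioner pod is listed
  kubectl get pods --namespace kube-system --no-headers |
    grep storage-provisioner
}
```

PVCs stuck in Pending, or a missing storage-provisioner line, point to the storage setup rather than to StackStorm itself.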

StackStorm HA works completely fine unless there is a problem with your system.

If you have any further questions, do join the StackStorm community on stackstorm-community.slack.com; they are quite helpful.

@armab @Sheshagiri Thank you.

Thanks a lot for sharing the solution, @prudhveer!