Your Cassandra cluster failed to deploy. Replica State changed to PERMANENTLY_FAILING. Replica was unhealthy 2 consecutive times

Posted by 强颜欢笑 on 2020-01-05 15:18:06

Question


I tried to deploy a Cassandra cluster using Google Compute Engine, without success. I tried several times, and the error was always the same:

module: DEPLOYMENT_FAILED
Replica module-1234 failed with status PERMANENTLY_FAILING: Replica State
changed to PERMANENTLY_FAILING. Replica was unhealthy 2 consecutive times.

After following these short troubleshooting guidelines: https://cloud.google.com/solutions/cassandra/click-to-deploy#troubleshooting, I got the following log:

antoniogallo88_gmail_com@cassandra-coord-v8ip:/gagent/metaOutput$ tail $(ls -1tr /gagent/metaOutput/stderr.*.txt | tail -n 1)
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
Still waiting for resourceview cassandranode-4da4e to have 3 members ...
[ERROR] resourceview cassandranode-4da4e does not have 3 members after 60 attempts.

Do you have any idea how to fix this?

Thanks.

Antonio


Answer 1:


Can you check whether the instance type you've chosen (in number of cores) multiplied by the number of cluster members exceeds the CPU quota for the project you're using? Also check the disk capacity value against your overall disk quota.

You can check max allowable disk and CPU quota in the console under Compute Engine > Quotas.

This sounds like a quota issue even though the console is not surfacing a quota error.
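
The arithmetic behind that check can be sketched as a small shell function. This is only an illustration: the function name `check_cpu_quota` and the example numbers (4 cores per node, a quota of 8 CPUs, 1 CPU already in use) are assumptions, not values from the question; only the 3 cluster members come from the log above. Read the real limit and usage from the Quotas page mentioned above. If the gcloud CLI is installed, `gcloud compute regions describe` with your region name should also list regional quotas.

```shell
#!/bin/sh
# Hypothetical helper: compare the CPUs a deployment needs with the remaining quota.
# Arguments (all illustrative):
#   $1 = cores per instance   $2 = number of cluster members
#   $3 = CPU quota limit      $4 = CPUs already in use
check_cpu_quota() {
    cores_per_node=$1
    nodes=$2
    limit=$3
    used=$4
    needed=$((cores_per_node * nodes))
    available=$((limit - used))
    if [ "$needed" -gt "$available" ]; then
        echo "over quota: need $needed CPUs, only $available available"
        return 1
    fi
    echo "ok: need $needed CPUs, $available available"
}

# Example with assumed numbers: three 4-core nodes, quota of 8 CPUs, 1 in use.
check_cpu_quota 4 3 8 1 || true
```

Keep in mind that the short-lived coordinator instance also counts against the CPU quota while the deployment is running, so leave some headroom.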

Another thing you can do is create another deployment, then quickly switch to the instance list page and look for an instance called "Cassandra-coord-foo". This is a short-lived instance that manages disk creation. If you SSH into that node during deployment and run the following command, you may see a disk or CPU quota warning:

tail -f /gagent/metaOutput/*
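
If the output scrolls by too quickly, a filter along these lines might help. It is a sketch under assumptions: the helper name `find_quota_warnings` is made up, and the search word "quota" is a guess, since the exact wording of the warning is not known.

```shell
#!/bin/sh
# Hypothetical helper: print log lines that mention "quota" (case-insensitive).
# The search word is an assumption; the actual warning text may differ.
find_quota_warnings() {
    log_dir=${1:-/gagent/metaOutput}
    # -i: ignore case, -H: always prefix matches with the file name.
    # Errors (for example from subdirectories) are discarded.
    grep -iH 'quota' "$log_dir"/* 2>/dev/null
}

# Usage on the coordinator node (not run here):
#   find_quota_warnings /gagent/metaOutput
```

The function returns a non-zero status when nothing matches, which by itself does not rule out a quota problem.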

Chris



Source: https://stackoverflow.com/questions/27283139/your-cassandra-cluster-failed-to-deploy-replica-state-changed-to-permanently-fa
