'Unable to connect Net/http: TLS handshake timeout' — Why can't Kubectl connect to Azure Kubernetes server? (AKS)

温柔的废话 2020-12-03 08:09

My question (to MS and anyone else) is: why is this issue occurring, and what workaround can users / customers implement themselves, as opposed to

4 Answers
  •  旧时难觅i
    2020-12-03 08:29

    Workaround 1 (May Not Work for Everyone)

    An interesting solution (worked for me) to test is scaling the number of nodes in your cluster up, and then back down...

    1. Log into the Azure Console — Kubernetes Service blade.
    2. Scale your cluster up by 1 node.
    3. Wait for scale to complete and attempt to connect (you should be able to).
    4. Scale your cluster back down to the normal size to avoid cost increases.

    Alternatively, you can (maybe) do this from the command line:

    az aks scale --name <cluster-name> --node-count <new-node-count> --resource-group <resource-group>

    Since this is a finicky issue and I used the web interface, I am uncertain whether the command above is equivalent or would work.
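The scale-up-then-down workaround described above could be scripted roughly as follows. This is a minimal sketch, not a tested fix: the function names `next_count` and `scale_bounce` are made up for illustration, the resource group and cluster name are placeholders you pass in, and it assumes a cluster with a single node pool (clusters with several pools would also need `--nodepool-name` on `az aks scale`).

```shell
#!/bin/sh
# Sketch only: bump an AKS cluster up by one node, try kubectl, scale back.
# Assumes one node pool; function names are hypothetical.

next_count() {
  # Print the given node count plus one.
  echo $(( $1 + 1 ))
}

scale_bounce() {
  # $1 = resource group, $2 = cluster name (placeholders from the caller)
  local rg="$1" name="$2" current bumped

  # Node count of the first node pool, as reported by Azure.
  current="$(az aks show --resource-group "$rg" --name "$name" \
    --query 'agentPoolProfiles[0].count' --output tsv)" || return 1
  bumped="$(next_count "$current")"

  # Scale up by one node; az blocks until the operation finishes.
  az aks scale --resource-group "$rg" --name "$name" \
    --node-count "$bumped" || return 1

  # Attempt to connect, as in step 3 of the workaround.
  if kubectl get nodes --request-timeout=30s; then
    echo "kubectl connected after scaling to $bumped nodes"
  else
    echo "kubectl still cannot connect" >&2
  fi

  # Scale back down to the original size to avoid extra cost.
  az aks scale --resource-group "$rg" --name "$name" \
    --node-count "$current"
}
```

It would be called as `scale_bounce <resource-group> <cluster-name>`. Note that it scales back down even if the connection attempt fails, so a failed run should not leave the extra node behind unless the final scale operation itself fails.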

    In total it took me about 2 minutes; for my situation that is MUCH better than re-creating / configuring a cluster (potentially multiple times...)

    That being said...

    Zimmergren makes a good point that scaling is not a true solution:

    "It worked sometimes, where the cluster self-healed a period after scaling. It failed sometimes with the same errors. I don't consider scaling a solution to this problem, as that causes other challenges depending on how things are set up. I wouldn't trust that routine for a GA workload, that's for sure. In the current preview, it's a bit wild west (and expected), and I'm happy to blow up the cluster and create a new one when this fails continuously." (https://github.com/Azure/AKS/issues/268#issuecomment-395299308)
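Because the cluster reportedly recovers only some time after scaling, it may be more useful to poll for connectivity than to try once. The sketch below is one way to do that; `wait_for_kubectl` is a hypothetical helper name, and the attempt count, delay, and 15-second request timeout are arbitrary values chosen for illustration.

```shell
#!/bin/sh
# Sketch only: retry "kubectl get nodes" a few times before giving up.
# wait_for_kubectl is a made-up name; tune attempts and delay to taste.

wait_for_kubectl() {
  # $1 = maximum number of attempts, $2 = seconds to wait between attempts
  local attempts="$1" delay="$2" i=1

  while [ "$i" -le "$attempts" ]; do
    if kubectl get nodes --request-timeout=15s >/dev/null 2>&1; then
      echo "connected on attempt $i"
      return 0
    fi
    # Only sleep if another attempt is coming.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$(( i + 1 ))
  done

  echo "no connection after $attempts attempts" >&2
  return 1
}
```

For example, `wait_for_kubectl 10 30` would try for roughly five minutes. A non-zero exit status means the API server never answered in that window, which matches the "failed sometimes" half of the report above.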

    Azure Support Feedback

    Since I had a support ticket open at the time I ran into the above scaling solution, I was able to get feedback (or rather a guess) on why it might have worked. Here's a paraphrased response:

    "I know that scaling the cluster can sometimes help if you get into a state where the number of nodes is mismatched between “az aks show” and “kubectl get nodes”. This may be similar."
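The mismatch mentioned in that response can be checked roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the resource group and cluster name are placeholders, and it sums the counts of all node pools that `az aks show` reports. If kubectl cannot reach the API server at all, its count comes out as 0, so a "mismatch" result then only confirms the connection problem.

```shell
#!/bin/sh
# Sketch only: compare the node count Azure reports with what kubectl sees.
# All function names here are made up for illustration.

azure_node_count() {
  # $1 = resource group, $2 = cluster name
  # Sum the per-pool node counts reported by Azure.
  az aks show --resource-group "$1" --name "$2" \
    --query 'agentPoolProfiles[].count' --output tsv |
    awk '{ total += $1 } END { print total + 0 }'
}

kube_node_count() {
  # Count the nodes the Kubernetes API server knows about.
  kubectl get nodes --no-headers --request-timeout=30s | wc -l | tr -d ' '
}

compare_node_counts() {
  local azure kube
  azure="$(azure_node_count "$1" "$2")"
  kube="$(kube_node_count)"
  if [ "$azure" = "$kube" ]; then
    echo "match: $azure nodes"
  else
    echo "mismatch: Azure reports $azure, kubectl reports $kube"
  fi
}
```

Usage would be `compare_node_counts <resource-group> <cluster-name>`.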

    Workaround References:

    1. A GitHub user scaled nodes from the console, which fixed the problem: https://github.com/Azure/AKS/issues/268#issuecomment-375722317

    Workaround Didn't Work?

    If this DOES NOT work for you, please post a comment below, as I am going to try to keep an up-to-date list of how often the issue crops up, whether it resolves itself, and whether this solution works across Azure AKS users (it looks like it doesn't work for everyone).

    Users Scaling Up / Down DID NOT work for:

    1. omgsarge (https://github.com/Azure/AKS/issues/112#issuecomment-395231681)
    2. Zimmergren (https://github.com/Azure/AKS/issues/268#issuecomment-395299308)
    3. sercand — scale operation itself failed — not sure if it would have impacted connectability (https://github.com/Azure/AKS/issues/268#issuecomment-395301296)

    Scaling Up / Down DID work for:

    1. Me
    2. LohithChanda (https://github.com/Azure/AKS/issues/268#issuecomment-395207716)
    3. Zimmergren (https://github.com/Azure/AKS/issues/268#issuecomment-395299308)

    Email Azure AKS Specific Support

    If after all the diagnosis you still suffer from this issue, please don't hesitate to send email to aks-help@service.microsoft.com
