Stop doing LoadBalancer tests in gce-master-scale-correctness #36720
k8s-ci-robot merged 1 commit into kubernetes:master
The tests do not pass reliably at that scale.
/lgtm
/assign @wojtek-t
I'm actually pretty worried about that. If we remove those tests completely from the large-scale tests, we will be on a slippery slope to new degradations in this area.
OK, but note that load balancing is expected to be unusable in a 5000-node cluster on GCP, when using load balancers configured in the way k8s configures them by default. We are testing a configuration that Google does not support (and then sending out release-informing alerts when it turns out that the configuration that wasn't expected to work actually doesn't work).
IIRC @bowei said our load balancer configuration should be fully supported on 1000-node clusters.
Kube-proxy doesn't do anything that would make LoadBalancer Services scale any differently than ClusterIP and NodePort Services, so we shouldn't need the LB tests to catch kube-proxy scaling problems. And if there are cloud-provider-gcp scaling problems, then those should trigger cloud-provider-gcp alerts, not k/k alerts.
But right now it's the E2E Test That Cried "Wolf!". We ignore the failures anyway, because they're always GCP's fault...
OK - fair. That I agree with.
We ignore them because they are flakes.
But I guess the argument that "it's not a k/k issue anyway" is a valid one and kind of outweighs all my arguments. /approve
/assign
OK, so the problem here is not the job itself, the tests, or the cloud provider angle; there are multiple jobs that are cloud-provider specific or feature specific. The problem is that the job is informing, and that carries some expectation of stability, since there is a group, "ci-signal", that reviews these jobs, and we cannot special-case an informing job to say "well, it can flake, just check whether it always fails". So the right way to proceed, IMHO, is to keep the informing job stable (remove the LoadBalancer tests) and, to keep the current coverage, find owners to maintain #36773. /hold cancel /lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: aojea, danwinship, wojtek-t
gce-master-scale-correctness periodically starts flaking because the LoadBalancer tests do not reliably pass under GCE at that scale, at least in the way this test configures the cluster.
Discussion with @bowei and @serathius at KubeCon led to the conclusion that we should just drop the LB tests from this particular job.
(Which is to say, we should change the skip rule from "skip any tests with [Feature:...] tags that aren't [Feature:LoadBalancer]" to "skip any tests with [Feature:...] tags".)
Fixes kubernetes/kubernetes#131863
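For readers who want the skip-rule change from the description spelled out, here is a minimal sketch of the two rules as predicates over test names. It is only an illustration: the function names (`skip_old`, `skip_new`) and the sample test names are hypothetical, and the real job expresses its skip rule through the test runner's configuration, which is not shown on this page.

```python
import re

# Matches any [Feature:...] tag embedded in a test name.
FEATURE_TAG = re.compile(r"\[Feature:[^\]]+\]")


def skip_old(test_name: str) -> bool:
    """Old rule: skip tests with any [Feature:...] tag that isn't [Feature:LoadBalancer]."""
    return any(tag != "[Feature:LoadBalancer]" for tag in FEATURE_TAG.findall(test_name))


def skip_new(test_name: str) -> bool:
    """New rule: skip tests with any [Feature:...] tag at all."""
    return FEATURE_TAG.search(test_name) is not None


# Hypothetical test names, made up for this example.
SAMPLE_TESTS = [
    "[sig-network] LoadBalancers should be reachable [Feature:LoadBalancer]",
    "[sig-storage] Volumes should mount [Feature:Volumes]",
    "[sig-network] Services should serve a basic endpoint",
]

for name in SAMPLE_TESTS:
    print(f"old skip={skip_old(name)!s:5} new skip={skip_new(name)!s:5} {name}")
```

Under this sketch, the only tests whose status changes are those tagged solely with [Feature:LoadBalancer]: the old rule ran them, the new rule skips them.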