Commit 2c5f476

Rahuldrabit and Copilot committed
fix: handle None return from get_container_runtime()
The get_container_runtime() method was returning None when called immediately after cluster creation, before any nodes reached Ready state. This caused a TypeError: argument of type 'NoneType' is not iterable when the code tried to check if 'docker' was in the runtime string.

Changes:
- Add retry logic with 60s timeout to get_container_runtime() to wait for at least one node to become Ready
- Add explicit None check in SymptomFaultInjector.__init__() with a clear error message if container runtime cannot be detected

Fixes network_delay, pod_failure, and other chaos mesh injection tasks that were crashing on initialization.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 6c38435 commit 2c5f476
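
The failure described in the commit message can be reproduced without a cluster. The sketch below is illustrative only: `check_uses_docker` is a hypothetical helper, not a function from the repository, and it assumes nothing beyond the membership test the message describes. The exact wording of the `TypeError` message varies between Python versions, so the example only relies on the exception type.

```python
def check_uses_docker(container_runtime):
    """Hypothetical helper: the same membership test the commit message describes."""
    return "docker" in container_runtime


# Stand-in for what get_container_runtime() returned before any node was Ready.
runtime_before_ready = None

try:
    check_uses_docker(runtime_before_ready)
    raised = None
except TypeError as exc:
    # The `in` operator cannot search inside None, so Python raises TypeError.
    raised = exc

print(type(raised).__name__, "-", raised)
```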

2 files changed

Lines changed: 23 additions & 5 deletions

aiopslab/generators/fault/inject_symp.py

Lines changed: 6 additions & 0 deletions
@@ -28,6 +28,12 @@ def __init__(self, namespace: str):
 
         container_runtime = self.kubectl.get_container_runtime()
 
+        if container_runtime is None:
+            raise ValueError(
+                "Could not detect container runtime. "
+                "Ensure the cluster is running and at least one node is Ready."
+            )
+
         if "docker" in container_runtime:
             pass
         elif "containerd" in container_runtime:
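
To see how the new guard behaves in isolation, here is a minimal sketch. `StubKubectl` and `require_container_runtime` are hypothetical names introduced for illustration; only the `None` check and the error message are taken from the diff above, and the runtime string is a made-up sample value.

```python
from typing import Optional


class StubKubectl:
    """Hypothetical stand-in for the project's kubectl wrapper."""

    def __init__(self, runtime: Optional[str]):
        self._runtime = runtime

    def get_container_runtime(self) -> Optional[str]:
        return self._runtime


def require_container_runtime(kubectl) -> str:
    """Apply the same guard as the diff: fail fast when no runtime was detected."""
    container_runtime = kubectl.get_container_runtime()
    if container_runtime is None:
        raise ValueError(
            "Could not detect container runtime. "
            "Ensure the cluster is running and at least one node is Ready."
        )
    return container_runtime


# A detected runtime passes through unchanged (sample value, not from the source).
print(require_container_runtime(StubKubectl("containerd://1.7.0")))

# A missing runtime now produces a descriptive ValueError instead of a TypeError.
try:
    require_container_runtime(StubKubectl(None))
except ValueError as exc:
    print(f"ValueError: {exc}")
```

Raising early keeps the later `"docker" in container_runtime` check from ever seeing `None`, so callers get an actionable message rather than an opaque `TypeError`.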

aiopslab/service/kubectl.py

Lines changed: 17 additions & 5 deletions
@@ -54,15 +54,27 @@ def get_cluster_ip(self, service_name, namespace):
         service_info = self.core_v1_api.read_namespaced_service(service_name, namespace)
         return service_info.spec.cluster_ip  # type: ignore
 
-    def get_container_runtime(self):
+    def get_container_runtime(self, max_wait: int = 60, poll_interval: int = 2):
         """
         Retrieve the container runtime used by the cluster.
         If the cluster uses multiple container runtimes, the first one found will be returned.
+
+        Args:
+            max_wait: Maximum seconds to wait for a Ready node (default: 60)
+            poll_interval: Seconds between checks (default: 2)
+
+        Returns:
+            Container runtime version string, or None if no Ready node found within max_wait.
         """
-        for node in self.core_v1_api.list_node().items:
-            for status in node.status.conditions:
-                if status.type == "Ready" and status.status == "True":
-                    return node.status.node_info.container_runtime_version
+        elapsed = 0
+        while elapsed < max_wait:
+            for node in self.core_v1_api.list_node().items:
+                for status in node.status.conditions:
+                    if status.type == "Ready" and status.status == "True":
+                        return node.status.node_info.container_runtime_version
+            time.sleep(poll_interval)
+            elapsed += poll_interval
+        return None
 
     def get_pod_name(self, namespace, label_selector):
         """Get the name of the first pod in a namespace that matches a given label selector."""

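The polling loop in `get_container_runtime()` can be exercised without a cluster. The sketch below copies the loop into a free function and drives it with fake objects. `FakeCoreV1Api`, `make_node`, the injectable `sleep` parameter, and the sample runtime string are all assumptions made for illustration; none of them come from the repository.

```python
import time
from types import SimpleNamespace


def make_node(ready: bool, runtime: str = "containerd://1.7.0"):
    """Build a fake node with only the attributes the loop reads (sample runtime value)."""
    condition = SimpleNamespace(type="Ready", status="True" if ready else "False")
    node_info = SimpleNamespace(container_runtime_version=runtime)
    status = SimpleNamespace(conditions=[condition], node_info=node_info)
    return SimpleNamespace(status=status)


class FakeCoreV1Api:
    """Hypothetical fake: its single node becomes Ready on the N-th list_node() call."""

    def __init__(self, ready_after_calls=None):
        self.calls = 0
        self.ready_after_calls = ready_after_calls  # None means "never Ready"

    def list_node(self):
        self.calls += 1
        ready = (
            self.ready_after_calls is not None
            and self.calls >= self.ready_after_calls
        )
        return SimpleNamespace(items=[make_node(ready)])


def get_container_runtime(core_v1_api, max_wait=60, poll_interval=2, sleep=time.sleep):
    """Same loop as the diff, with the API client and sleep passed in for testing."""
    elapsed = 0
    while elapsed < max_wait:
        for node in core_v1_api.list_node().items:
            for status in node.status.conditions:
                if status.type == "Ready" and status.status == "True":
                    return node.status.node_info.container_runtime_version
        sleep(poll_interval)
        elapsed += poll_interval
    return None


def no_sleep(seconds):
    """Skip real waiting so the example finishes instantly."""


eventually_ready = FakeCoreV1Api(ready_after_calls=3)
print(get_container_runtime(eventually_ready, sleep=no_sleep), eventually_ready.calls)

never_ready = FakeCoreV1Api()
print(get_container_runtime(never_ready, sleep=no_sleep), never_ready.calls)
```

Note that `elapsed` only counts the time spent sleeping, not the time spent in `list_node()`, so against a slow API server the wall-clock wait may exceed `max_wait`. With the defaults, the loop polls at most 30 times before returning `None`.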