NodeOutOfMemory
Kubernetes Node Out Of Memory
Overview
Resolve the issue of a Kubernetes node running out of memory to ensure the stability of the cluster.
Initial Response
An alert is received indicating that a node is out of memory.
Acknowledge the alert and assign yourself as the incident owner.
Notify the team about the ongoing incident using the primary communication channel.
Update the incident status on the incident tracking system.
Detailed Steps
1) Identify Affected Node
Use the kubectl get nodes command to list all nodes in the cluster and identify the node that is running out of memory.
kubectl get nodes
2) Check Node Resource Usage
Check the resource usage of the affected node to understand what is consuming the memory. Use the following command to get detailed information about the node, including its Conditions (such as MemoryPressure) and its Allocated resources:
kubectl describe node <node-name>
3) Check Pods Resource Usage
Identify the pods running on the affected node that might be consuming excessive memory. Use the following command to list all pods on the node:
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node-name>
Escalation:
If the issue persists or is severe, escalate to a senior SRE for additional support and guidance.
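When escalating, it may help to hand over the output of the three checks from Detailed Steps in one place, so the next responder sees the same data. The following is a minimal sketch, assuming kubectl is already configured for the affected cluster. The function name collect_node_memory_report and the section banners are hypothetical and not part of any existing tooling; only the three kubectl commands come from this runbook.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tooling): runs the three
# commands from Detailed Steps for one node and prints them with banners.
# Usage: collect_node_memory_report <node-name>
collect_node_memory_report() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: collect_node_memory_report <node-name>" >&2
        return 1
    fi

    # Step 1: list all nodes in the cluster.
    echo "== Step 1: nodes =="
    kubectl get nodes

    # Step 2: detailed information about the affected node.
    echo "== Step 2: node details for $node =="
    kubectl describe node "$node"

    # Step 3: all pods scheduled on the affected node.
    echo "== Step 3: pods on $node =="
    kubectl get pods --all-namespaces -o wide \
        --field-selector "spec.nodeName=$node"
}
```

After sourcing the snippet, the output can be redirected to a file and attached to the incident, for example: collect_node_memory_report <node-name> > node-report.txt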
Further Information