On Monday, Jan 8, at 20:04, some Appfarm customer solutions running on a specific Kubernetes node became unstable. At 21:05, the node returned to a healthy state.
The root cause of the downtime was a Kubernetes pod that consumed excessive system memory, which rendered the entire node unhealthy.
Changes we have made to prevent this from happening again:
- Identified and addressed the specific pods causing memory exhaustion.
- Implemented measures to prevent pods from overwhelming a node.
We sincerely apologize for any inconvenience this may have caused.