Commit b54ad2ea authored by Arunprasad Rajkumar

Adjust NodeFilesystemSpaceFillingUp thresholds according to default kubelet GC behavior

Previously[1] we attempted to do the same, but a misunderstanding
about the GC behavior caused the alert to fire before GC came into
play.

According to [2][3], kubelet GC kicks in only when `imageGCHighThresholdPercent` is hit, which is set to 85% by default. However, `NodeFilesystemSpaceFillingUp` is set to fire as soon as 80% usage is hit.

This commit changes `fsSpaceFillingUpWarningThreshold` to 15% so
that GC has ample time to reclaim unwanted images before the warning
fires. It also changes `fsSpaceFillingUpCriticalThreshold` to 10%, which gives admins more time to react to the warning before a critical alert is sent.
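
The arithmetic behind these numbers can be sketched as follows. This is an illustration only, not part of the commit: it assumes the thresholds are percentages of available space (inferred from the message equating the old warning value of 20 with 80% usage), and the helper names `usage_at_alert` and `fires_before_gc` are hypothetical. The real alert rule may include further conditions (the diff comments mention "a prolonged duration"), which this sketch ignores.

```python
# Kubelet defaults quoted in the commit message and the diff comments.
IMAGE_GC_LOW_THRESHOLD_PERCENT = 80
IMAGE_GC_HIGH_THRESHOLD_PERCENT = 85


def usage_at_alert(available_threshold_percent: int) -> int:
    """Disk usage (%) at which an 'available space' threshold is crossed.

    Assumption: the threshold is a percentage of available space.
    """
    return 100 - available_threshold_percent


def fires_before_gc(available_threshold_percent: int) -> bool:
    """True if the alert threshold is crossed before image GC starts."""
    return usage_at_alert(available_threshold_percent) < IMAGE_GC_HIGH_THRESHOLD_PERCENT


# Threshold values before and after this commit, taken from the diff.
OLD_THRESHOLDS = {"warning": 20, "critical": 15}
NEW_THRESHOLDS = {"warning": 15, "critical": 10}

if __name__ == "__main__":
    for label, thresholds in (("old", OLD_THRESHOLDS), ("new", NEW_THRESHOLDS)):
        for severity, value in thresholds.items():
            print(
                f"{label} {severity}: crossed at {usage_at_alert(value)}% usage, "
                f"before GC starts: {fires_before_gc(value)}"
            )
```

Under that assumption, the old warning threshold is crossed at 80% usage, below the 85% point where GC starts, while the new warning threshold coincides with the GC high threshold and the new critical threshold sits 5 points above it.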

[1] https://github.com/prometheus-operator/kube-prometheus/pull/1357
[2] https://docs.openshift.com/container-platform/4.10/nodes/nodes/nodes-nodes-garbage-collection.html#nodes-nodes-garbage-collection-images_nodes-nodes-configuring
[3] https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/



Signed-off-by: Arunprasad Rajkumar <arajkuma@redhat.com>
(cherry picked from commit 6ff8bfbb)
parent 125fb56d
@@ -35,9 +35,12 @@ local defaults = {
  // GC values,
  // imageGCLowThresholdPercent: 80
  // imageGCHighThresholdPercent: 85
+ // GC kicks in when imageGCHighThresholdPercent is hit and attempts to free upto imageGCLowThresholdPercent.
  // See https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/ for more details.
- fsSpaceFillingUpWarningThreshold: 20,
- fsSpaceFillingUpCriticalThreshold: 15,
+ // Warn only after imageGCHighThresholdPercent is hit, but filesystem is not freed up for a prolonged duration.
+ fsSpaceFillingUpWarningThreshold: 15,
+ // Send critical alert only after (imageGCHighThresholdPercent + 5) is hit, but filesystem is not freed up for a prolonged duration.
+ fsSpaceFillingUpCriticalThreshold: 10,
  diskDeviceSelector: 'device=~"mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+"',
  runbookURLPattern: 'https://runbooks.prometheus-operator.dev/runbooks/node/%s',
  },