- Jul 25, 2020
-
-
tafkam authored
-
- Jul 23, 2020
-
-
Frederic Branczyk authored
Remove instance:node_filesystem_usage:sum
-
Frederic Branczyk authored
jsonnet: update component versions
-
Frederic Branczyk authored
Regenerate dashboards and prometheus alerts
-
Adin Hodovic authored
Merged https://github.com/kubernetes-monitoring/kubernetes-mixin/pull/463 to remove duplicate entries for memory usage; however, I'd like to move these changes to the Prometheus-Operator helm chart (https://github.com/helm/charts/pull/23024#issuecomment-661967101). I've regenerated the dashboards/alerts.
-
Simon Pasquier authored
-
Simon Pasquier authored
-
paulfantom authored
-
paulfantom authored
-
- Jul 15, 2020
-
-
Frederic Branczyk authored
Add PrometheusOperatorListErrors and fix PrometheusOperatorWatchErrors threshold
-
Lili Cosic authored
-
Lili Cosic authored
Watch error alert
-
- Jul 14, 2020
-
-
Frederic Branczyk authored
chore(jsonnet): update jsonnet to master
-
Frederic Branczyk authored
prometheus-operator.libsonnet: Add PrometheusOperatorWatchErrors alert
-
- Jul 13, 2020
-
-
Lili Cosic authored
-
Lili Cosic authored
-
Weston McNamee authored
Pulls in recent performance improvements to speed up rendering. Resolves #537.
-
- Jul 09, 2020
-
-
Lili Cosic authored
jsonnet/kube-prometheus: Bump default versions of prometheus and alertmanager
-
Lili Cosic authored
-
Lili Cosic authored
-
- Jul 03, 2020
-
-
Frederic Branczyk authored
enable etcd latency metrics in kube-apiserver
-
Abu Kashem authored
kube-apiserver has a histogram, etcd_request_duration_seconds, that measures latency between the kube-apiserver and the etcd instance. This metric is currently dropped by cluster-prometheus. Enable it so we have visibility into etcd latency. We ensured that this does not enable other unwanted metrics: the query count by(__name__) ({__name__=~"etcd_request.+"}) returns only etcd_request_duration_seconds_bucket, etcd_request_duration_seconds_count, and etcd_request_duration_seconds_sum.
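The check described in this commit message can be approximated offline. The following is a minimal sketch, assuming a hypothetical list of metric names (`sample_names`) and a hypothetical helper (`matching_series_names`); it is not the project's own tooling. It relies on the fact that Prometheus regex matchers are fully anchored, which `re.fullmatch` mirrors.

```python
import re

# Prometheus anchors =~ matchers on both ends, so fullmatch mirrors that behavior.
SELECTOR = re.compile(r"etcd_request.+")


def matching_series_names(names):
    """Return the sorted, de-duplicated metric names the selector would keep."""
    return sorted({name for name in names if SELECTOR.fullmatch(name)})


# Hypothetical sample of metric names exposed by a kube-apiserver (illustrative only).
sample_names = [
    "etcd_request_duration_seconds_bucket",
    "etcd_request_duration_seconds_count",
    "etcd_request_duration_seconds_sum",
    "etcd_object_counts",        # does not start with "etcd_request"
    "apiserver_request_total",   # unrelated metric
]

matched = matching_series_names(sample_names)
for name in matched:
    print(name)
```

With this sample, only the three series of the etcd_request_duration_seconds histogram pass the selector, which is the property the commit message says was verified against a live cluster.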
-
- Jun 30, 2020
-
-
Matthias Loibl authored
Update the Issue templates to redirect to GitHub Discussions.
-
Matthias Loibl authored
-
Frederic Branczyk authored
Update kubernetes-mixin to remove KubeAPILatencyHigh & KubeAPIErrorsHigh
-
- Jun 29, 2020
-
-
Matthias Loibl authored
-
- Jun 26, 2020
-
-
Lucas Servén Marín authored
Fix typo
-
André Sterba authored
-
- Jun 24, 2020
-
-
Simon Pasquier authored
Bump Grafana to v6.7.4
-
Simon Pasquier authored
-
Simon Pasquier authored
-
- Jun 23, 2020
-
-
Frederic Branczyk authored
Updated prometheus adapter deployment to use a multi arch image repo
-
Frederic Branczyk authored
Make node-exporter listening address configurable
-
- Jun 22, 2020
-
-
Tom Quinn authored
-
- Jun 21, 2020
-
-
Kristoffer Dalby authored
-
Kristoffer Dalby authored
-
- Jun 19, 2020
-
-
Frederic Branczyk authored
Fix AlertmanagerConfigInconsistent alert
-
Frederic Branczyk authored
Update prometheus-adapter endpoint
-
Simon Pasquier authored
-
Simon Pasquier authored
Previously, the alert would fire when the number of Alertmanager pods didn't match the number of replicas defined in the Alertmanager spec, even though all the running pods had the same configuration hash. This type of issue is already covered by KubeStatefulSetUpdateNotRolledOut (and possibly KubePodNotReady); having AlertmanagerConfigInconsistent also active in this situation creates unnecessary noise. With this change, the alert expression only returns a result when Alertmanager pods have different configuration hash values, irrespective of the number of pod replicas. The message annotation has also been enhanced to report the configuration hash for each pod.
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
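The difference between the old and new alert conditions can be illustrated with a small sketch. This is a Python model of the logic described in the commit message, not the actual PromQL expression; the function names, pod names, and hash values are hypothetical.

```python
def old_condition(pod_hashes, expected_replicas):
    """Model of the earlier behavior: a replica-count mismatch also fires."""
    distinct = len(set(pod_hashes.values()))
    return distinct > 1 or len(pod_hashes) != expected_replicas


def new_condition(pod_hashes):
    """Model of the new behavior: fire only when the hashes differ between pods."""
    return len(set(pod_hashes.values())) > 1


# Hypothetical state: 3 replicas requested, only 2 pods running, same config hash.
missing_pod = {"alertmanager-main-0": "abc123", "alertmanager-main-1": "abc123"}

# Hypothetical state: all pods running, but one has loaded a different config.
drifted = {
    "alertmanager-main-0": "abc123",
    "alertmanager-main-1": "abc123",
    "alertmanager-main-2": "def456",
}

print(old_condition(missing_pod, expected_replicas=3))  # replica mismatch alone
print(new_condition(missing_pod))
print(new_condition(drifted))
```

In this model, a missing pod alone no longer triggers the alert, while a genuine configuration drift still does.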
-