diff --git a/README.md b/README.md
index 679302e2e7ffdd54883609cec9b52a3040b5005b..044979881701bf94011dc08a67920a2b9f812e4b 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 # kube-prometheus
 
 This repository collects Kubernetes manifests, dashboards, and alerting rules
-combined with documentation and scripts to deploy them to get a full cluster
-monitoring setup working.
+combined with documentation and scripts to provide single-command deployments
+of end-to-end Kubernetes cluster monitoring.
 
 ## Prerequisites
 
@@ -12,62 +12,20 @@ instructions of [bootkube](https://github.com/kubernetes-incubator/bootkube) or
 repository are adapted to work with a [multi-node setup](https://github.com/kubernetes-incubator/bootkube/tree/master/hack/multi-node)
 using [bootkube](https://github.com/kubernetes-incubator/bootkube).
 
-Prometheus discovers targets via Kubernetes endpoints objects, which are automatically
-populated by Kubernetes services. Therefore Prometheus can
-automatically find and pick up all services within a cluster. By
-default there is a service for the Kubernetes API server. For other Kubernetes
-core components to be monitored, headless services must be setup for them to be
-discovered by Prometheus as they may be deployed differently depending
-on the cluster.
-
-For the `kube-scheduler` and `kube-controller-manager` there are headless
-services prepared, simply add them to your running cluster:
-
-```bash
-kubectl -n kube-system create manifests/k8s/
-```
-
-> Hint: if you use this for a cluster not created with bootkube, make sure you
-> populate an endpoints object with the address to your `kube-scheduler` and
-> `kube-controller-manager`, or adapt the label selectors to match your setup.
-
-Aside from Kubernetes specific components, etcd is an important part of a
-working cluster, but is typically deployed outside of it. This monitoring
-setup assumes that it is made visible from within the cluster through a headless
-service as well.
-
-> Note that minikube hides some components like etcd so to see the extend of
-> this setup we recommend setting up a [local cluster using bootkube](https://github.com/kubernetes-incubator/bootkube/tree/master/hack/multi-node).
-
-An example for bootkube's multi-node vagrant setup is [here](/manifests/etcd/etcd-bootkube-vagrant-multi.yaml).
-
-> Hint: this is merely an example for a local setup. The addresses will have to
-> be adapted for a setup, that is not a single etcd bootkube created cluster.
-
-Before you continue, you should have endpoints objects for:
-
-* `apiserver` (called `kubernetes` here)
-* `kube-controller-manager`
-* `kube-scheduler`
-* `etcd` (called `etcd-k8s` to make clear this is the etcd used by kubernetes)
-
-For example:
-
-```bash
-$ kubectl get endpoints --all-namespaces
-NAMESPACE     NAME                                            ENDPOINTS          AGE
-default       kubernetes                                      172.17.4.101:443   2h
-kube-system   kube-controller-manager-prometheus-discovery    10.2.30.2:10252    1h
-kube-system   kube-scheduler-prometheus-discovery             10.2.30.4:10251    1h
-monitoring    etcd-k8s                                        172.17.4.51:2379   1h
-```
-
 ## Monitoring Kubernetes
 
 The manifests used here use the [Prometheus Operator](https://github.com/coreos/prometheus-operator),
-which manages Prometheus servers and their configuration in your cluster. To install the
-controller, the [node_exporter](https://github.com/prometheus/node_exporter),
-[Grafana](https://grafana.org) including default dashboards, and the Prometheus server, run:
+which manages Prometheus servers and their configuration in a cluster.
+With a single command we can install
+
+* The Operator itself
+* The Prometheus [node_exporter](https://github.com/prometheus/node_exporter)
+* [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics)
+* The [Prometheus specification](https://github.com/coreos/prometheus-operator/blob/master/Documentation/prometheus.md) based on which the Operator deploys a Prometheus setup
+* A Prometheus configuration covering monitoring of all Kubernetes core components and exporters
+* A default set of alerting rules on the cluster components' health
+* A Grafana instance serving dashboards on cluster metrics
+
+Simply run:
 
 ```bash
 export KUBECONFIG=<path> # defaults to "~/.kube/config"
@@ -86,23 +44,24 @@ hack/cluster-monitoring/teardown
 ```
 
 > All services in the manifest still contain the `prometheus.io/scrape = true`
-> annotations. It is not used by the Prometheus controller. They remain for
+> annotations. They are not used by the Prometheus Operator but remain for
 > pre Prometheus v1.3.0 deployments as in [this example configuration](https://github.com/prometheus/prometheus/blob/6703404cb431f57ca4c5097bc2762438d3c1968e/documentation/examples/prometheus-kubernetes.yml).
 
 ## Monitoring custom services
 
 The example manifests in [/manifests/examples/example-app](/manifests/examples/example-app)
-deploy a fake service into the `production` and `development` namespaces and define
-a Prometheus server monitoring them.
+deploy a fake service exposing Prometheus metrics. They additionally define a new Prometheus
+server and a [`ServiceMonitor`](https://github.com/coreos/prometheus-operator/blob/master/Documentation/service-monitor.md),
+which specifies how the example service should be monitored.
+The Prometheus Operator will deploy and configure the desired Prometheus instance and continuously
+manage its life cycle.
 
 ```bash
-kubectl --kubeconfig="$KUBECONFIG" create namespace production
-kubectl --kubeconfig="$KUBECONFIG" create namespace development
 hack/example-service-monitoring/deploy
 ```
 
-After all pods are ready you can reach the Prometheus server monitoring your services
-on node port `30100`.
+After all pods are ready, you can reach the Prometheus server on node port `30100` and observe
+how it monitors the service as specified.
 
 Teardown:
 
@@ -120,15 +79,58 @@ sidecar container aims to emulate the behavior, by keeping the Grafana database
 with the provided ConfigMap. Hence, the Grafana pod is effectively stateless.
 This allows managing dashboards via `git` etc. and easily deploying them via CD pipelines.
 
-In the future, a separate Grafana controller should support gathering dashboards from multiple
-ConfigMaps, which are selected by their labels.
-Prometheus servers deployed by the Prometheus controller should be automatically added as
-Grafana data sources.
+In the future, a separate Grafana operator will support gathering dashboards from multiple
+ConfigMaps based on label selection.
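+
+To illustrate the current ConfigMap-based mechanism described above, a dashboard
+shipped this way could look roughly like the following. This is a hypothetical
+sketch: the ConfigMap name, namespace, and key are illustrative and not the names
+used by the manifests in this repository.
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: grafana-dashboards   # hypothetical name
+  namespace: monitoring
+data:
+  # Each key holds one dashboard definition as JSON; the sidecar keeps the
+  # Grafana database in sync with the contents of this ConfigMap.
+  example-dashboard.json: |
+    {
+      "title": "Example Dashboard",
+      "rows": []
+    }
+```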
 
 ## Roadmap
 
-* Incorporate [Alertmanager controller](https://github.com/coreos/kube-alertmanager-controller)
-* Grafana controller that dynamically discovers and deploys dashboards from ConfigMaps
+* Alertmanager Operator automatically handling HA clusters
+* Grafana Operator that dynamically discovers and deploys dashboards from ConfigMaps
 * KPM/Helm packages to easily provide production-ready cluster-monitoring setup (essentially
   contents of `hack/cluster-monitoring`)
 * Add meta-monitoring to default cluster monitoring setup
+* Build out the provided dashboards and alerts for cluster monitoring to have full coverage
+  of all system aspects
+
+## Monitoring Other Cluster Components
+
+Discovery of API servers and kubelets works the same across all clusters.
+Depending on a cluster's setup, several other core components, such as etcd or the
+scheduler, may be deployed in different ways.
+The easiest integration point is for the cluster operator to expose headless services
+for all of those components, providing a common interface for discovering them. With that
+setup they will automatically be discovered by the provided Prometheus configuration.
+
+For the `kube-scheduler` and `kube-controller-manager` there are headless
+services prepared; simply add them to your running cluster:
+
+```bash
+kubectl -n kube-system create -f manifests/k8s/
+```
+
+> Hint: if you use this for a cluster not created with bootkube, make sure you
+> populate an endpoints object with the address of your `kube-scheduler` and
+> `kube-controller-manager`, or adapt the label selectors to match your setup.
+
+Aside from Kubernetes-specific components, etcd is an important part of a
+working cluster, but is typically deployed outside of it. This monitoring
+setup assumes that it is made visible from within the cluster through a headless
+service as well.
+
+> Note that minikube hides some components like etcd, so to see the full extent of
+> this setup we recommend setting up a [local cluster using bootkube](https://github.com/kubernetes-incubator/bootkube/tree/master/hack/multi-node).
+
+An example for bootkube's multi-node Vagrant setup is [here](/manifests/etcd/etcd-bootkube-vagrant-multi.yaml).
+
+> Hint: this is merely an example for a local setup. The addresses will have to
+> be adapted for any setup that is not a single-etcd cluster created by bootkube.
+
+With that setup in place, the headless services provide endpoints objects that
+Prometheus consumes to discover its scrape targets:
+```bash
+$ kubectl get endpoints --all-namespaces
+NAMESPACE     NAME                                            ENDPOINTS          AGE
+default       kubernetes                                      172.17.4.101:443   2h
+kube-system   kube-controller-manager-prometheus-discovery    10.2.30.2:10252    1h
+kube-system   kube-scheduler-prometheus-discovery             10.2.30.4:10251    1h
+monitoring    etcd-k8s                                        172.17.4.51:2379   1h
+```
\ No newline at end of file
diff --git a/manifests/prometheus/prometheus-k8s.yaml b/manifests/prometheus/prometheus-k8s.yaml
index a2b97753fc498936d3ebb53d43b1dcf8935d40cf..0d13f1e152797301a29abf252b672c068312176c 100644
--- a/manifests/prometheus/prometheus-k8s.yaml
+++ b/manifests/prometheus/prometheus-k8s.yaml
@@ -4,4 +4,5 @@ metadata:
   name: prometheus-k8s
   labels:
     prometheus: k8s
-spec: {}
+spec:
+  version: v1.3.0
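
As a quick verification after applying this change, the image tag of the Prometheus
pods deployed by the Operator should match the pinned `spec.version`. A sketch of such
a check, assuming the Operator propagates the `prometheus: k8s` label to the pods it
creates:

```bash
# Print the container images of the Prometheus pods in the monitoring
# namespace; the tag should read v1.3.0 once the change is rolled out.
kubectl --namespace=monitoring get pods -l prometheus=k8s \
  -o jsonpath='{.items[*].spec.containers[*].image}'
```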