diff --git a/docs/src/concepts/images/monitoring.excalidraw.png b/docs/src/concepts/images/monitoring.excalidraw.png
new file mode 100644
index 0000000000000000000000000000000000000000..8c129e9f1ec55f46a2c86456b20b62e8baf66419
Binary files /dev/null and b/docs/src/concepts/images/monitoring.excalidraw.png differ
diff --git a/docs/src/concepts/monitoring.md b/docs/src/concepts/monitoring.md
new file mode 100644
index 0000000000000000000000000000000000000000..e0b4f197fdb28b253612e32f77c38531aa603e2f
--- /dev/null
+++ b/docs/src/concepts/monitoring.md
@@ -0,0 +1,104 @@
+# Monitoring
+
+The Shivering-Isles infrastructure provides various services and aims for a good service level. To validate that these service levels are met, internal and external monitoring systems constantly check the status of the system and notify administrators if something goes wrong.
+
+Since monitoring systems are supposed to report outages, they must keep working during outages, while also keeping costs in check.
+
+## The overall setup
+
+![Overview of the connections between the Kubernetes cluster's internal monitoring and external services such as Grafana Cloud, StatusCake, UptimeRobot and SI-GitLab.](images/monitoring.excalidraw.png)
+
+The Shivering-Isles infrastructure monitoring is split between internal and external monitoring.
+
+Internal monitoring uses the `kube-prometheus-stack`, which provides insights into all running applications and overall cluster health.
+
+External monitoring uses multiple providers to regularly check the availability of externally reachable services such as the Shivering-Isles Blog or Microblog.
+
+## Internal Monitoring
+
+Internal monitoring is defined using prometheus-operator resources such as `ServiceMonitor` or `PodMonitor` in combination with `PrometheusRule`.
+
+```yaml
+---
+apiVersion: monitoring.coreos.com/v1
+kind: ServiceMonitor
+metadata:
+  name: example
+  namespace: example
+spec:
+  selector:
+    matchLabels:
+      app: example
+  namespaceSelector:
+    matchNames:
+    - example
+  endpoints:
+  - port: metrics
+---
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: example
+  namespace: example
+  labels:
+    app.kubernetes.io/name: example
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: example
+  podMetricsEndpoints:
+    - port: metrics
+---
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+  name: example
+  namespace: example
+spec:
+  groups:
+  - name: example
+    rules:
+    - alert: ExampleAlert
+      annotations:
+        description: Very examplish Alert that will trigger for some reason. Just ignore it, it's just an example.
+        summary: Examplish Alert, please ignore.
+      expr: absent(prometheus_sd_discovered_targets{config="serviceMonitor/example/example/0"})
+      for: 10m
+      labels:
+        issue: Just ignore it, it's just an example.
+        severity: info
+```
+
+To view metrics and details, an internal Grafana instance provides dashboards that are created directly from ConfigMaps deployed alongside the applications.
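+
+As a minimal sketch of how such a dashboard could be shipped, the following ConfigMap assumes the Grafana dashboard sidecar of the `kube-prometheus-stack` chart with its default `grafana_dashboard` label. The names and the dashboard JSON are placeholders and not taken from the actual setup.
+
+```yaml
+---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: example-dashboard
+  namespace: example
+  labels:
+    # Assumption: default label the Grafana sidecar watches for
+    grafana_dashboard: "1"
+data:
+  # Placeholder dashboard; a real one would be exported from Grafana
+  example.json: |
+    {
+      "title": "Example",
+      "uid": "example",
+      "panels": []
+    }
+```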
+
+Finally, an Alertmanager forwards all critical alerts to the external systems. It also keeps a heartbeat with the external Alertmanager to make sure the cluster monitoring is still functional, and it sends critical alerts to the SI-GitLab, where they are opened as issues so they aren't missed.
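+
+The routing described above could look roughly like the following Alertmanager configuration. This is only a sketch: it assumes the always-firing `Watchdog` alert from the `kube-prometheus-stack` default rules is used as the heartbeat signal, and all receiver names and URLs are hypothetical placeholders.
+
+```yaml
+route:
+  # Everything not matched below is dropped
+  receiver: "null"
+  routes:
+    # Assumption: the Watchdog alert acts as the heartbeat signal
+    - receiver: heartbeat
+      matchers:
+        - alertname = "Watchdog"
+      group_wait: 0s
+      group_interval: 1m
+      repeat_interval: 1m
+    # Critical alerts are sent to the external alerting system ...
+    - receiver: oncall
+      matchers:
+        - severity = "critical"
+      continue: true
+    # ... and additionally to GitLab, where they become issues
+    - receiver: gitlab
+      matchers:
+        - severity = "critical"
+receivers:
+  - name: "null"
+  - name: heartbeat
+    webhook_configs:
+      - url: https://heartbeat.example.invalid/ # placeholder
+  - name: oncall
+    webhook_configs:
+      - url: https://oncall.example.invalid/ # placeholder
+  - name: gitlab
+    webhook_configs:
+      - url: https://gitlab.example.invalid/ # placeholder
+```
+
+The `continue: true` on the first critical route lets the same alert also match the following route, so one alert reaches both receivers.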
+
+## External Monitoring
+
+External monitoring is set up across various external systems: most importantly Grafana Cloud, but also StatusCake and UptimeRobot.
+
+### UptimeRobot, StatusCake and Synthetic Monitoring
+
+UptimeRobot, StatusCake and Synthetic Monitoring are cloud services that send requests to public endpoints at regular intervals from various locations and measure the results, providing external visibility for the infrastructure.
+
+In the Shivering-Isles infrastructure, this monitoring validates external connectivity independently of the internal monitoring. This is especially important since the [Ingress Termination](./ingress-termination.md) allows externally available services to remain fully available from within the home network while external connectivity is interrupted. This is not a theoretical scenario; it has happened many times in the past.
+
+UptimeRobot and StatusCake send their outage reports via email.
+
+### SI-GitLab
+
+GitLab runs on an external VPS, which makes it independent of the home infrastructure. It only keeps track of issues sent by the internal Alertmanager.
+
+### Grafana Cloud Alertmanager and Prometheus
+
+Besides Synthetic Monitoring, which is discussed in a previous section, Grafana Cloud also provides its own Prometheus instance, which isn't used for anything other than the metrics of Synthetic Monitoring. It is accompanied by an Alertmanager that is triggered by Prometheus alerts when Synthetic Monitoring reports outages of websites and services.
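+
+An alert rule for this could be sketched as follows. It assumes that Synthetic Monitoring exposes the usual `probe_success` metric with `job` and `instance` labels. The rule name, threshold and duration are illustrative and not taken from the actual setup.
+
+```yaml
+groups:
+  - name: synthetic-monitoring-example
+    rules:
+      - alert: ExampleEndpointDown
+        # Fires when fewer than half of the probe locations succeed
+        expr: avg by (job, instance) (probe_success) < 0.5
+        for: 5m
+        labels:
+          severity: critical
+        annotations:
+          summary: "{{ $labels.instance }} fails from most probe locations."
+```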
+
+### Grafana OnCall
+
+Grafana OnCall is the central place for all critical alerts. It monitors the Grafana Cloud Alertmanager as well as the Alertmanager running in the Kubernetes cluster for heartbeats. In addition, both Alertmanagers forward critical alerts to the OnCall instance, which then triggers an escalation that notifies an admin about outages via SMS and the Grafana OnCall app.
+
+This is particularly relevant since the SI-Infrastructure also runs mail servers, which can and do become unavailable. When that happens, the email reports from UptimeRobot and StatusCake cannot be delivered.
+
+## SLOs and SLAs
+
+Topics around SLOs and SLAs are described in the [SRE section](sre.md).