Amazon Elastic Kubernetes Service (EKS) is a managed AWS service that runs Kubernetes on AWS without requiring you to install and operate your own control plane or nodes. Monitoring the EKS cluster is essential to ensure the application performs smoothly. In this blog, we will see how to monitor an EKS cluster with the help of Prometheus.
Prometheus is an open-source systems monitoring and alerting tool that collects and stores metrics as time-series data. The Kubernetes API server exposes a number of metrics that are helpful for monitoring and analysis, served via the /metrics HTTP endpoint. On Amazon EKS, this endpoint is exposed on the control plane like the other API endpoints. In this blog, we will use Prometheus to scrape these exposed endpoints and collect the data.
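To give a feel for what Prometheus actually scrapes, the /metrics endpoint serves plain text in the Prometheus exposition format: one metric per line, with labels in braces followed by the value. The excerpt below is a hypothetical but representative sample (the metric values are made up); on a live cluster you could fetch the real output with `kubectl get --raw /metrics`.

```shell
# A representative (hypothetical) excerpt of the /metrics exposition format;
# on a live EKS cluster, fetch the real thing with: kubectl get --raw /metrics
cat <<'EOF' > sample_metrics.txt
# HELP apiserver_request_total Counter of apiserver requests broken out by verb and code.
# TYPE apiserver_request_total counter
apiserver_request_total{verb="GET",code="200"} 1024
apiserver_request_total{verb="POST",code="201"} 256
EOF

# Each sample line is: metric_name{labels} value -- e.g. pull out the GET counter:
awk '/verb="GET"/ {print $2}' sample_metrics.txt   # prints 1024
```

Prometheus parses exactly this format on every scrape and stores each sample against its label set as a time series.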
To deploy Prometheus in the cluster with Helm, follow the steps below. The Prometheus stack we will install with Helm contains Kubernetes manifests for Prometheus, Alertmanager, and Grafana. Using Grafana we will be able to visualize the data collected by Prometheus, and using Alertmanager we will be able to automatically trigger alerts when problems occur.
Step 1: Install Helm
- Use the following commands to install Helm.
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 > get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh
- Verify that Helm installed successfully by checking the version.
helm version
Step 2: Add Helm repositories
- Add the Helm repositories with the following commands.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add stable https://charts.helm.sh/stable
helm repo update
Step 3: Install Prometheus
- First, create a namespace for monitoring with the following command.
kubectl create ns monitoring
- Install the Prometheus stack with the following command.
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring
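The chart installs with sensible defaults, but it can also be customized with a values file. A minimal sketch, assuming the standard kube-prometheus-stack values schema (the retention length and password below are illustrative choices, not recommendations):

```yaml
# values.yaml -- illustrative overrides for kube-prometheus-stack
prometheus:
  prometheusSpec:
    retention: 15d            # keep metrics for 15 days (example value)
grafana:
  adminPassword: "change-me"  # replace the default Grafana admin password
```

Pass it at install time with `helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring -f values.yaml`.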
- Check that all Kubernetes objects are deployed in the monitoring namespace with the following command.
kubectl get all -n monitoring
Step 4: Access the Prometheus and Grafana dashboards
- By default, all the services are of type ClusterIP. To access Prometheus from outside the cluster, we will edit the service manifest and change the type from ClusterIP to LoadBalancer.
- Use the following command to edit the Prometheus service file.
kubectl edit svc prometheus-kube-prometheus-prometheus -n monitoring
- Scroll down and replace ClusterIP with LoadBalancer.
- Save the file and run the following command.
kubectl get svc -n monitoring
- Now, copy the load balancer DNS name, paste it into the browser with port 9090, and access Prometheus.
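If you prefer a non-interactive alternative to `kubectl edit`, the service type can also be switched with `kubectl patch`. This is a sketch assuming the same service and namespace names as above; the patch itself needs access to the cluster, so it is shown commented out:

```shell
# Non-interactive alternative to `kubectl edit` (requires cluster access):
#   kubectl patch svc prometheus-kube-prometheus-prometheus -n monitoring \
#     --type merge -p '{"spec":{"type":"LoadBalancer"}}'

# The patch body must be valid JSON; a quick local sanity check:
echo '{"spec":{"type":"LoadBalancer"}}' | python3 -m json.tool
```

The same patch works for the Grafana service in the next step, with the service name swapped accordingly.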
- Similarly, edit the Grafana service manifest file to access the Grafana dashboard.
kubectl edit svc prometheus-grafana -n monitoring
- Copy the load balancer DNS name for Grafana and paste it into the browser to access the Grafana dashboard.
- The default username and password for the Grafana dashboard are “admin” and “prom-operator” respectively.
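If the Grafana password has been changed from the chart default, it can be read back from the secret the chart creates (this assumes the default release name "prometheus" used above). Kubernetes stores secret values base64-encoded, so they must be decoded:

```shell
# Read the Grafana admin password from the chart-created secret
# (requires cluster access, so shown commented out):
#   kubectl get secret prometheus-grafana -n monitoring \
#     -o jsonpath='{.data.admin-password}' | base64 --decode

# Secret values are base64-encoded; decoding the chart's default locally:
echo 'cHJvbS1vcGVyYXRvcg==' | base64 --decode   # prints prom-operator
```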
Step 5: Set up Alertmanager
- Now we will configure Alertmanager and integrate it with Slack to receive notifications.
- In Slack, create a new alerts channel and then click on Add Apps. Search for Incoming WebHooks and click Add to Slack.
- Choose the alerts channel that we created.
- Next copy the webhook URL.
- Now go to the machine where kubectl is configured, create a file named "alertmanager.yaml", and paste the following into it.
global:
  slack_api_url: 'enter your webhook URL here. Leave the quote marks in place'
route:
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 1m
  repeat_interval: 10m
  receiver: 'slack'
receivers:
- name: 'slack'
  slack_configs:
  - channel: '#alerts'
    icon_emoji: ':bell:'
    send_resolved: true
    text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}\nmessage: {{ .CommonAnnotations.message }}"
- Now run the following command.
kubectl get secret -n monitoring
- Next, we will delete the default Alertmanager secret with the following command.
kubectl delete secret -n monitoring alertmanager-prometheus-kube-prometheus-alertmanager
- Now create a new secret with the alertmanager.yaml file using the following command.
kubectl create secret generic --from-file=alertmanager.yaml -n monitoring alertmanager-prometheus-kube-prometheus-alertmanager
- Within a few minutes, you will start receiving notifications in your Slack channel.
Please contact our technical consultants if you would like to discuss anything related to cloud infrastructure.