Prometheus and Service Discovery: A Perfect Match for Dynamic Environments
Introduction to Service Discovery
Service discovery is a mechanism by which services or applications dynamically identify and connect to networked services or resources. In the context of monitoring with Prometheus, service discovery automates the process of finding and scraping targets for metrics collection. Instead of manually configuring each target, service discovery allows Prometheus to detect and start monitoring new targets as they appear or change within the environment.
Why Service Discovery is Crucial
In modern dynamic environments, such as those using microservices architecture or running on cloud platforms, the number of services and their instances can frequently change. Manually updating the monitoring configuration every time a service instance is added, removed, or relocated is not feasible. This is where service discovery becomes extremely beneficial.
Key Benefits of Service Discovery
- Automation: Automatically detect and configure new services without manual intervention.
- Scalability: Efficiently handle environments with a large number of services and instances.
- Adaptability: Quickly adapt to changes in the infrastructure, such as scaling up or down.
Relabeling
Relabeling is a powerful Prometheus feature for dynamically rewriting labels. Rules under relabel_configs rewrite the labels of discovered targets before they are scraped, and rules under metric_relabel_configs rewrite the labels of scraped samples before they are stored. Relabeling can modify, add, or drop the metadata of discovered targets, and can filter out targets entirely, so that the final labels match your monitoring and querying requirements.
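As a rough illustration of what a relabel rule with the default replace action does, here is a minimal Python sketch. It is not Prometheus code: the function name relabel_replace and the sample target are hypothetical, and Python's re module stands in for the RE2 engine Prometheus uses, so only simple patterns are expected to behave the same way. The sketch also leaves out details such as removing a label whose new value is empty.

```python
import re


def relabel_replace(labels, source_labels, regex, replacement, target_label, separator=";"):
    """Sketch of one relabel rule using the default `replace` action."""
    # Join the source label values; a missing label counts as an empty string.
    value = separator.join(labels.get(name, "") for name in source_labels)
    # Prometheus anchors the regex at both ends, which fullmatch imitates.
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)  # no match: the labels are left unchanged
    # Translate ${1}-style group references into Python's \g<1> syntax.
    template = re.sub(r"\$\{(\w+)\}", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


# Hypothetical target; the rule mirrors the file-based example in this article.
target = {"__address__": "192.0.2.10:9100", "team": "Testing"}
relabeled = relabel_replace(target, ["team"], "Test.*", "QAteam", "team")
print(relabeled["team"])  # QAteam
```

A target whose team label is "Development" does not match "Test.*" and passes through unchanged, which is why only some targets end up with the rewritten label.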
Types of Service Discovery in Prometheus
Prometheus supports several service discovery mechanisms, including file-based discovery and integrations with cloud providers like AWS EC2. Let’s delve into these methods and understand how they work.
File-based Service Discovery
File-based service discovery reads the list of scrape targets from one or more files. Prometheus loads the files, picks up changes to them, and scrapes the targets they list. This method is simple and effective for environments where services are relatively static or where changes are scripted.
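Where changes are scripted, the target file can be generated rather than edited by hand. The following is a minimal sketch, assuming the JSON form of the target file (a list of groups, each with targets and labels). The function name write_target_file, the file name File-SD.json, and the 192.0.2.x addresses are hypothetical placeholders.

```python
import json
import os
import tempfile


def write_target_file(path, groups):
    """Write file_sd target groups to `path` as JSON, replacing the file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it into
    # place, so a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(groups, handle, indent=2)
        # mkstemp creates the file readable by its owner only; relax that so a
        # Prometheus process running as another user can still read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


# Hypothetical inventory; in practice this could come from a deployment script.
EXAMPLE_GROUPS = [
    {
        "targets": ["192.0.2.10:9100"],
        "labels": {"region": "India", "team": "Testing", "platform": "AWS"},
    },
    {
        "targets": ["192.0.2.11:9100"],
        "labels": {"region": "US", "team": "Development", "platform": "AZURE"},
    },
]

if __name__ == "__main__":
    write_target_file("File-SD.json", EXAMPLE_GROUPS)
```

If a JSON file like this is used, the files entry of the file_sd_configs section has to point at it; file-based discovery accepts the .json, .yml, and .yaml extensions.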
- Create one instance for the Prometheus server and two more instances for the targets (running node-exporter).
Step-by-Step Guide for File-based Service Discovery
1. Create a Target File: On the Prometheus server instance, create a JSON or YAML file listing the targets to be monitored.
vim File-SD.yml
    - targets: ["<target-node-ip>:9100"]
      labels:
        region: "India"
        team: "Testing"
        platform: "AWS"
    - targets: ["<target-node-ip>:9100"]
      labels:
        region: "US"
        team: "Development"
        platform: "AZURE"
2. Configure Prometheus: Modify the prometheus.yml configuration file to include the file-based service discovery job under scrape_configs.
vim prometheus.yml
    - job_name: "FILE-SD"
      file_sd_configs:
        - files:
            - File-SD.yml
      relabel_configs:
        - source_labels: [team]
          regex: "Test.*"
          replacement: "QAteam"
          target_label: team
3. Run Prometheus: Start or restart Prometheus to apply the new configuration.
./prometheus &
Prometheus now watches the target file for changes and also re-reads it at a regular interval (refresh_interval, 5 minutes by default), updating its list of scrape targets without a restart.
4. Verify Targets: Access the Prometheus web UI (http://<prometheus-server-ip>:9090/targets) to verify that the targets are being discovered and scraped.
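The same check can be scripted against the Prometheus HTTP API, which exposes target state at /api/v1/targets. Below is a minimal sketch, assuming only the activeTargets fields used here (labels, scrapeUrl, health). The helper names and the sample response are hypothetical, and the sample stands in for a live server so the snippet runs on its own.

```python
import json
from urllib.request import urlopen


def summarize_targets(payload):
    """Return (job, scrape URL, health) for each active target in the response."""
    return [
        (target["labels"].get("job", ""), target["scrapeUrl"], target["health"])
        for target in payload["data"]["activeTargets"]
    ]


def fetch_targets(base_url):
    """Fetch the targets endpoint of a running Prometheus server."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=10) as response:
        return json.load(response)


# Trimmed, made-up response used here instead of a live server.
SAMPLE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "FILE-SD", "instance": "192.0.2.10:9100", "team": "QAteam"},
                "scrapeUrl": "http://192.0.2.10:9100/metrics",
                "health": "up",
            }
        ]
    },
}

for job, url, health in summarize_targets(SAMPLE):
    print(f"{job:10} {health:8} {url}")
```

Against a real server, summarize_targets(fetch_targets("http://<prometheus-server-ip>:9090")) would list the discovered targets, with the relabeled team value visible in each target's labels.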
EC2 Service Discovery
EC2 service discovery is particularly useful for environments running on AWS. Prometheus can automatically discover EC2 instances using the AWS API, making it ideal for dynamic cloud environments where instances are frequently created and terminated.
For this demo, the EC2 instances live in a different AWS account from the one where Prometheus is running.
1. Configure Prometheus: Modify the prometheus.yml configuration file to include EC2 service discovery.
    - job_name: "EC2-SD"
      ec2_sd_configs:
        - access_key: <Your-Access-Key>
          secret_key: <Your-Secret-Key>
          region: us-east-2
2. Launch EC2 Instances: Ensure you have EC2 instances running with the appropriate tags (e.g., Name=node_exporter) and that they expose metrics (e.g., Node Exporter running on port 9100).
- To set up Node Exporter, put the following script in the instance's user data.
#!/bin/bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar -xvzf node_exporter-1.8.2.linux-amd64.tar.gz
node_exporter-1.8.2.linux-amd64/node_exporter &
- Launch the instance; Node Exporter will be downloaded and started automatically.
- Verify at http://public-ip:9100/metrics
3. Start Prometheus: Run Prometheus with the updated configuration.
./prometheus &
4. Verify Targets: Access the Prometheus web UI (http://<prometheus-server-ip>:9090/targets) to verify that the EC2 instances are being discovered and scraped.
- The configuration has been applied successfully, but the instance is still in the UNKNOWN state.
- This is because the endpoint Prometheus derives is the private IP of the exporter instance with port 80 (the default port for EC2 service discovery), whereas the actual endpoint is http://public-ip:9100/metrics. So we need to do relabeling here.
- We will build the __address__ label from __meta_ec2_public_ip instead of __meta_ec2_private_ip, and append port 9100.
    relabel_configs:
      - source_labels: [__meta_ec2_public_ip]
        regex: "(.*)"
        replacement: "${1}:9100"
        target_label: __address__
- To make the running Prometheus process reload its configuration, send it a SIGHUP with the command below:
kill -HUP `pgrep prometheus`
- After the reload, Prometheus picks up the change and the state of the instance becomes UP.
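To see why the address rewrite matters, it helps to look at how a scrape URL is put together from a target's labels: the scheme, the __address__ label, and the metrics path. The sketch below is only an illustration under that assumption. The function name scrape_url and both IP addresses are hypothetical, and the defaults (http and /metrics) are the usual job defaults.

```python
def scrape_url(labels):
    """Compose the scrape URL from a target's special labels."""
    scheme = labels.get("__scheme__", "http")
    path = labels.get("__metrics_path__", "/metrics")
    return f"{scheme}://{labels['__address__']}{path}"


# What EC2 discovery produces by default: private IP plus port 80.
discovered = {
    "__address__": "172.31.5.20:80",
    "__meta_ec2_private_ip": "172.31.5.20",
    "__meta_ec2_public_ip": "203.0.113.7",
}
print(scrape_url(discovered))  # http://172.31.5.20:80/metrics

# After the relabel rule: public IP plus the Node Exporter port.
relabeled = dict(discovered, __address__=f"{discovered['__meta_ec2_public_ip']}:9100")
print(scrape_url(relabeled))  # http://203.0.113.7:9100/metrics
```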
5. Giving Labels to Dynamic Targets:
- To label the targets, first add tags (here Group, Project, and Team) to the AWS EC2 instances when launching them.
- Next, copy these EC2 tags into target labels so that they match the desired querying requirements.
- Open prometheus.yml and add the relabel configs below to the relabel_configs of the EC2-SD job.
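    # Sets Group to the fixed value "QATeam" for every target,
    # whatever the value of the Group tag is.
    - source_labels: [__meta_ec2_tag_Group]
      regex: "(.*)"
      replacement: "QATeam"
      target_label: Group
    # Copies the Project tag value into the Project label.
    - source_labels: [__meta_ec2_tag_Project]
      regex: "(.*)"
      replacement: "${1}"
      target_label: Project
    # Copies the Team tag value into the Team label.
    - source_labels: [__meta_ec2_tag_Team]
      regex: "(.*)"
      replacement: "${1}"
      target_label: Team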
- Save the configuration file and reload Prometheus.
- Now we can query all the instances that share the same labels.
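As an example of such a query, the sketch below builds a PromQL selector from label values and turns it into a request URL for the Prometheus HTTP API endpoint /api/v1/query. The helper names selector and query_url are hypothetical, and http://localhost:9090 is a placeholder for the Prometheus server address.

```python
from urllib.parse import urlencode


def selector(metric, matchers):
    """Build a PromQL instant-vector selector with equality matchers."""
    parts = []
    for name, value in sorted(matchers.items()):
        # Escape backslashes and double quotes inside the label value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{name}="{escaped}"')
    return metric + "{" + ",".join(parts) + "}"


def query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url}/api/v1/query?" + urlencode({"query": promql})


# The Group label is set to "QATeam" by the relabel rule above.
promql = selector("up", {"Group": "QATeam"})
print(promql)  # up{Group="QATeam"}
print(query_url("http://localhost:9090", promql))
```

The same selector can be typed directly into the expression box of the Prometheus web UI to list every instance carrying that label.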
Conclusion
Service discovery in Prometheus significantly enhances the efficiency and scalability of monitoring in dynamic environments. By automating the detection of targets, it reduces the operational burden and ensures that your monitoring setup adapts to changes in real-time. Whether using file-based service discovery for static environments or leveraging EC2 service discovery for dynamic cloud infrastructure, Prometheus offers robust solutions to keep your monitoring up-to-date and accurate.