Prometheus and Service Discovery: A Perfect Match for Dynamic Environments

@Harsh
6 min read · Jul 19, 2024


Introduction to Service Discovery

Service discovery is a mechanism by which services or applications dynamically identify and connect to networked services or resources. In the context of monitoring with Prometheus, service discovery automates the process of finding and scraping targets for metrics collection. Instead of manually configuring each target, service discovery allows Prometheus to detect and start monitoring new targets as they appear or change within the environment.

Why Service Discovery is Crucial

In modern dynamic environments, such as those using microservices architecture or running on cloud platforms, the number of services and their instances can frequently change. Manually updating the monitoring configuration every time a service instance is added, removed, or relocated is not feasible. This is where service discovery becomes extremely beneficial.

Key Benefits of Service Discovery

  1. Automation: Automatically detect and configure new services without manual intervention.
  2. Scalability: Efficiently handle environments with a large number of services and instances.
  3. Adaptability: Quickly adapt to changes in the infrastructure, such as scaling up or down.

Relabeling

Relabeling is a powerful feature in Prometheus that lets you rewrite the labels of a target before it is scraped (relabel_configs) or rewrite the labels of scraped samples before they are stored (metric_relabel_configs). It provides a flexible way to modify and filter the metadata of discovered targets so that the labels match your monitoring and querying requirements.
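
To make the mechanics concrete, here is a minimal Python sketch of how a single relabel rule with the default replace action behaves. It is a simplified model, not Prometheus code: the function name apply_replace_rule and the sample label values are made up for illustration, and only numbered references such as $1 or ${1} are handled.

```python
import re


def apply_replace_rule(labels, source_labels, target_label,
                       regex="(.*)", replacement="$1", separator=";"):
    """Return a copy of `labels` after one simplified `replace` relabel rule."""
    # Join the source label values; a missing label counts as an empty string.
    value = separator.join(labels.get(name, "") for name in source_labels)
    # Prometheus anchors the regex, so it must match the whole joined value.
    match = re.fullmatch(regex, value)
    if match is None:
        return dict(labels)  # no match: the labels are left unchanged
    # Translate Go-style $1 / ${1} references into Python's \g<1> form.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


# Hypothetical targets, used only to show the matching behaviour.
for labels in ({"team": "Testing"}, {"team": "Development"}):
    print(apply_replace_rule(labels, ["team"], "team",
                             regex="Test.*", replacement="QAteam"))
```

The key point the sketch illustrates is that a rule only writes the target label when the regex matches the entire source value; otherwise the labels pass through untouched.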

Types of Service Discovery in Prometheus

Prometheus supports several service discovery mechanisms, including file-based discovery and integrations with cloud providers like AWS EC2. Let’s delve into these methods and understand how they work.

File-based Service Discovery

File-based service discovery uses one or more files that list targets and their labels. Prometheus reads these files and scrapes the listed targets for metrics. This method is simple and effective for environments where services are relatively static or where changes are scripted.

  • Create one instance for the Prometheus server and two more instances to act as targets (each running node-exporter).

Step-by-Step Guide for File-based Service Discovery

  1. Create a Target File: On the Prometheus server instance, create a JSON or YAML file listing the targets to be monitored.

vim File-SD.yml

- targets: ["<target-node-ip>:9100"]
  labels:
    region: "India"
    team: "Testing"
    platform: "AWS"

- targets: ["<target-node-ip>:9100"]
  labels:
    region: "US"
    team: "Development"
    platform: "AZURE"

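2. Configure Prometheus: Modify the prometheus.yml configuration file to add a job under scrape_configs that uses file-based service discovery. The relabel rule in this job rewrites any team label whose value starts with "Test" to "QAteam".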

vim prometheus.yml

  - job_name: "FILE-SD"
    file_sd_configs:
      - files:
          - File-SD.yml
    relabel_configs:
      - source_labels: [team]
        regex: "Test.*"
        replacement: "QAteam"
        target_label: team

3. Run Prometheus: Start or restart Prometheus to apply the new configuration.

./prometheus &

Prometheus now watches the target file for changes and also re-reads it at a regular interval (refresh_interval, 5 minutes by default), so edits to the file update the list of scrape targets without a restart.

4. Verify Targets: Access the Prometheus web UI (http://<prometheus-server-ip>:9090/targets) to verify that the targets are being discovered and scraped.
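
Since file-based discovery suits setups where changes are scripted, the target file from step 1 can also be generated by a small program. The following is a rough Python sketch under a few assumptions: the helper names write_targets_file and build_groups are invented here, the addresses are placeholder documentation IPs, and the output is written as JSON (which file-based discovery also accepts) so that only the standard library is needed.

```python
import json
import os
import tempfile


def build_groups(inventory):
    """Turn (address, labels) pairs into the file-SD list-of-groups structure."""
    return [{"targets": [address], "labels": dict(labels)}
            for address, labels in inventory]


def write_targets_file(path, groups):
    """Write file-SD target groups as JSON, replacing `path` atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it over
    # the destination so a reader never sees a partially written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(groups, handle, indent=2)
        # mkstemp creates the file readable by its owner only; relax that so
        # a Prometheus process running as another user can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    # Placeholder inventory mirroring the labels used in step 1.
    inventory = [
        ("192.0.2.10:9100", {"region": "India", "team": "Testing", "platform": "AWS"}),
        ("192.0.2.11:9100", {"region": "US", "team": "Development", "platform": "AZURE"}),
    ]
    write_targets_file("File-SD.json", build_groups(inventory))
```

If you generate a JSON file like this, the files entry in prometheus.yml would need to point at File-SD.json instead of File-SD.yml.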

EC2 Service Discovery

EC2 service discovery is particularly useful for environments running on AWS. Prometheus can automatically discover EC2 instances using the AWS API, making it ideal for dynamic cloud environments where instances are frequently created and terminated.

For this demo, we will discover instances in a different AWS account from the one where Prometheus is running.
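
As a rough illustration of what EC2 discovery hands to Prometheus, the Python sketch below models the labels attached to one discovered instance. It is a simplified approximation rather than the real implementation: the input dictionary shape and the function name ec2_discovered_labels are assumptions made for this example, and only a handful of the __meta_ec2_* labels are shown.

```python
import re


def ec2_discovered_labels(instance, port=80):
    """Approximate the labels EC2 discovery attaches to one instance."""
    labels = {
        # Default scrape address: private IP plus the configured port
        # (the `port` option of ec2_sd_configs defaults to 80).
        "__address__": f"{instance['private_ip']}:{port}",
        "__meta_ec2_instance_id": instance["instance_id"],
        "__meta_ec2_private_ip": instance["private_ip"],
    }
    if instance.get("public_ip"):
        labels["__meta_ec2_public_ip"] = instance["public_ip"]
    for key, value in instance.get("tags", {}).items():
        # Characters that are not valid in a label name become underscores.
        labels["__meta_ec2_tag_" + re.sub(r"[^a-zA-Z0-9_]", "_", key)] = value
    return labels


# Hypothetical instance, for illustration only.
example = {
    "instance_id": "i-0123456789abcdef0",
    "private_ip": "10.0.1.5",
    "public_ip": "203.0.113.7",
    "tags": {"Name": "node_exporter", "Team": "QA"},
}
print(ec2_discovered_labels(example))
```

Labels starting with __meta_ are only available during relabeling, which is why any tag you want to keep has to be copied into a regular label.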

  1. Configure Prometheus: Modify the prometheus.yml configuration file to include EC2 service discovery.

  - job_name: "EC2-SD"
    ec2_sd_configs:
      - access_key: <Your-Access-Key>
        secret_key: <Your-Secret-Key>
        region: us-east-2

2. Launch EC2 Instances: Ensure you have EC2 instances running with the appropriate tags (e.g., Name=node_exporter) and that they expose metrics (e.g., running Node Exporter on port 9100).

  • To set up Node Exporter, we will put the script below in the instance's user data.
#!/bin/bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar -xvzf node_exporter-1.8.2.linux-amd64.tar.gz
node_exporter-1.8.2.linux-amd64/node_exporter &

3. Start Prometheus: Run Prometheus with the updated configuration.

./prometheus &

4. Verify Targets: Access the Prometheus web UI (http://<prometheus-server-ip>:9090/targets) to verify that the EC2 instances are being discovered and scraped.

  • The configuration has been applied successfully, but the instance is still in the UNKNOWN state.
  • This is because the endpoint Prometheus derives by default is the private IP of the exporter instance on port 80, whereas the actual endpoint is http://public-ip:9100/metrics. So we need relabeling here.
  • We will write __meta_ec2_public_ip, together with port 9100, into the __address__ label in place of the default built from __meta_ec2_private_ip.
    relabel_configs:
      - source_labels: [__meta_ec2_public_ip]
        regex: "(.*)"
        replacement: "${1}:9100"
        target_label: __address__
  • To make the running Prometheus process reload its configuration, we will use the command below:
kill -HUP `pgrep prometheus`
  • After the reload, Prometheus picks up the change and the state of the instance becomes UP.

5. Giving labels to Dynamic Targets:

  • To attach labels to the targets, we first have to tag the AWS EC2 instances when launching them.
  • Next, we want to copy these EC2 tags into our target labels so that they align with the desired querying requirements.
  • Open prometheus.yml and add the rules below under relabel_configs of the EC2-SD job. The first rule sets Group to the fixed value QATeam, while the other two copy the tag values as they are.
      - source_labels: [__meta_ec2_tag_Group]
        regex: "(.*)"
        replacement: "QATeam"
        target_label: Group
      - source_labels: [__meta_ec2_tag_Project]
        regex: "(.*)"
        replacement: "${1}"
        target_label: Project
      - source_labels: [__meta_ec2_tag_Team]
        regex: "(.*)"
        replacement: "${1}"
        target_label: Team
  • Save the configuration file and reload Prometheus.
  • Now we can query all the instances that share the same labels.
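
Querying by these labels can also be done through the Prometheus HTTP API. The Python sketch below builds an instant-query URL for the up metric filtered by label and interprets a response body. The function names and the http://localhost:9090 base URL are assumptions for illustration, and label values are not escaped, so it only suits simple values such as the QATeam group set above.

```python
import json
import urllib.parse


def build_query_url(base_url, label_matchers):
    """Build an instant-query URL for `up` filtered by the given labels."""
    matchers = ",".join(f'{name}="{value}"'
                        for name, value in sorted(label_matchers.items()))
    query = f"up{{{matchers}}}"
    return (base_url.rstrip("/") + "/api/v1/query?"
            + urllib.parse.urlencode({"query": query}))


def instances_up(response_body):
    """Map each instance in a query response to True (up) or False (down)."""
    payload = json.loads(response_body)
    return {series["metric"].get("instance", ""): series["value"][1] == "1"
            for series in payload["data"]["result"]}


# Assumed base URL; replace it with the address of your Prometheus server.
print(build_query_url("http://localhost:9090", {"Group": "QATeam"}))
```

Fetching the printed URL (for example with urllib.request.urlopen) and passing the response body to instances_up gives a quick up/down summary for every instance carrying that label.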

Conclusion

Service discovery in Prometheus significantly enhances the efficiency and scalability of monitoring in dynamic environments. By automating the detection of targets, it reduces the operational burden and ensures that your monitoring setup adapts to changes in real-time. Whether using file-based service discovery for static environments or leveraging EC2 service discovery for dynamic cloud infrastructure, Prometheus offers robust solutions to keep your monitoring up-to-date and accurate.


Written by @Harsh

A DevOps engineer from India
