How to solve the problem of `prometheus alertmanager failed to start`?

1. Purpose

The problem

Prometheus Alertmanager failed to start. The log in nohup.out shows it shutting down a few seconds after startup:

^C[bswen@iZ235g0vxqjZ alertmanager]$ tail -f nohup.out
ts=2022-08-31T07:33:46.444Z caller=main.go:232 level=info build_context="(go=go1.17.8, user=root@265f14f5c6fc, date=20220325-09:31:33)"
ts=2022-08-31T07:33:46.445Z caller=cluster.go:185 level=info component=cluster msg="setting advertise address explicitly" addr=10.117.134.201 port=9094
ts=2022-08-31T07:33:46.455Z caller=cluster.go:680 level=info component=cluster msg="Waiting for gossip to settle..." interval=2s
ts=2022-08-31T07:33:46.495Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=./alertmanager.yml
ts=2022-08-31T07:33:46.496Z caller=coordinator.go:126 level=info component=configuration msg="Completed loading of configuration file" file=./alertmanager.yml
ts=2022-08-31T07:33:46.499Z caller=main.go:535 level=info msg=Listening address=:9093
ts=2022-08-31T07:33:46.499Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
ts=2022-08-31T07:33:48.456Z caller=cluster.go:705 level=info component=cluster msg="gossip not settled" polls=0 before=0 now=1 elapsed=2.000595828s
ts=2022-08-31T07:33:49.955Z caller=main.go:574 level=info msg="Received SIGTERM, exiting gracefully..."
ts=2022-08-31T07:33:49.955Z caller=cluster.go:689 level=info component=cluster msg="gossip not settled but continuing anyway" polls=1 elapsed=3.500199861s

The start.sh script used to launch it:

nohup /opt/alertmanager/alertmanager-0.24.0.linux-amd64/alertmanager --config.file=./alertmanager.yml > nohup.out &



2. Solution

2.1 The basics

Prometheus Alertmanager is a component of the Prometheus monitoring and alerting toolkit. It handles alerts sent by the Prometheus server and provides the following features:

  1. Notification Deduplication: It ensures that the same alert is not sent repeatedly.
  2. Silencing: It allows you to silence alerts for a certain period of time.
  3. Inhibit Rules: It can suppress notifications for certain alerts while other alerts are already firing.
  4. Routing: It can route different types of alerts to different notification systems or endpoints.
  5. Templates: It supports custom notification templates for various media like email, Slack, PagerDuty, etc.

Alertmanager is designed to work with Prometheus, but it can receive alerts from any client that can send them to its HTTP API.
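To illustrate the point that systems other than Prometheus can feed Alertmanager, here is a minimal sketch that pushes one hand-made alert to the v2 API endpoint `/api/v2/alerts` with curl. It assumes Alertmanager is reachable at http://localhost:9093. The function names and the label values are made up for this example.

```shell
#!/bin/sh
# Sketch: push one hand-made alert to Alertmanager over HTTP.
# The function names and label values are illustrative, not part of Alertmanager.

# Build a JSON array containing a single alert; $1 is the alert name.
build_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"info"},"annotations":{"summary":"manual test alert"}}]' "$1"
}

# POST the payload to /api/v2/alerts; $1 is the base URL, $2 the alert name.
send_test_alert() {
  build_alert_payload "$2" |
    curl -fsS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$1/api/v2/alerts"
}

# Example (needs a running Alertmanager):
#   send_test_alert http://localhost:9093 TestAlert
```

If the call succeeds, the test alert should show up in the Alertmanager web interface shortly afterwards.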

To start using Prometheus Alertmanager on Linux, follow these steps:

  1. Download and Install: You can download the binary for Alertmanager from the Prometheus GitHub releases page. For example, if you are using a 64-bit Linux system, you might download the alertmanager-<version>.linux-amd64.tar.gz file.

  2. Extract the Binary: Once downloaded, extract the binary and move it to a directory that’s in your system’s PATH environment variable, or simply keep it in a directory and specify the full path when you run it.

  3. Configuration: Before you can start Alertmanager, you need to create a configuration file. This file defines how alerts should be routed and what notification methods to use. A basic configuration file (alertmanager.yml) might look like this:

global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'job']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'webhook-receiver'

receivers:
- name: 'webhook-receiver'
  webhook_configs:
  - url: 'http://your-webhook-url'

  4. Run Alertmanager: With the configuration file in place, you can start Alertmanager by running the binary with the --config.file flag pointing to your configuration file:

./alertmanager --config.file=alertmanager.yml

  5. Verify Installation: Open a web browser and navigate to http://<server-ip>:9093, where <server-ip> is the IP address of the machine where Alertmanager is running. You should see the Alertmanager web interface.

  6. Integration with Prometheus: To integrate Alertmanager with Prometheus, you’ll need to edit the Prometheus configuration file (usually named prometheus.yml) and add the Alertmanager configuration under the alerting section:

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - '<alertmanager-server-ip>:9093'

Replace <alertmanager-server-ip> with the IP address or hostname of the server where Alertmanager is running.

  7. Reload Prometheus: After updating the Prometheus configuration, you’ll need to reload Prometheus for the changes to take effect. This can be done by restarting the Prometheus service or by sending the Prometheus process a SIGHUP signal.
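The verification and reload steps above can also be scripted. The following is only a sketch: it polls Alertmanager's `/-/healthy` endpoint and then asks Prometheus to reload through its `/-/reload` endpoint, which assumes Prometheus was started with `--web.enable-lifecycle`. The function names and the URLs in the example are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: wait for Alertmanager to answer, then ask Prometheus to reload.
# Function names and example URLs are illustrative.

# Poll Alertmanager's /-/healthy endpoint until it responds.
# $1 = base URL, $2 = max attempts, $3 = seconds to sleep between attempts.
wait_for_alertmanager() {
  attempt=1
  while [ "$attempt" -le "$2" ]; do
    if curl -fsS -o /dev/null "$1/-/healthy"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$3"
  done
  return 1
}

# Trigger a configuration reload over HTTP instead of restarting.
# Assumes Prometheus was started with --web.enable-lifecycle.
reload_prometheus() {
  curl -fsS -X POST "$1/-/reload"
}

# Example:
#   wait_for_alertmanager http://localhost:9093 10 1 && reload_prometheus http://localhost:9090
```

If the lifecycle endpoint is not enabled, restarting the Prometheus service as described above remains the simplest option.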

Remember to secure your Alertmanager instance, especially if it’s accessible over the internet, by setting up authentication and encryption as necessary. Also, be aware that the configuration and usage may vary depending on your specific requirements and environment.

2.2 The solution

Here is the solution: disable Alertmanager's cluster (high availability) mode.

Turn off high availability: if running Alertmanager in high availability mode is not desired, setting --cluster.listen-address= (with an empty value) prevents Alertmanager from listening to incoming peer requests.

The final start.sh:

nohup /opt/alertmanager/alertmanager-0.24.0.linux-amd64/alertmanager --config.file=./alertmanager.yml --cluster.listen-address= > nohup.out &

Now Alertmanager starts successfully:

[bswen@iZ235g0vxqjZ alertmanager]$ ./start.sh
[bswen@iZ235g0vxqjZ alertmanager]$ nohup: redirecting stderr to stdout

[bswen@iZ235g0vxqjZ alertmanager]$
[bswen@iZ235g0vxqjZ alertmanager]$ tail -f nohup.out
ts=2022-08-31T07:37:19.252Z caller=main.go:231 level=info msg="Starting Alertmanager" version="(version=0.24.0, branch=HEAD, revision=f484b17fa3c583ed1b2c8bbcec20ba1db2aa5f11)"
ts=2022-08-31T07:37:19.252Z caller=main.go:232 level=info build_context="(go=go1.17.8, user=root@265f14f5c6fc, date=20220325-09:31:33)"
ts=2022-08-31T07:37:19.300Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=./alertmanager.yml
ts=2022-08-31T07:37:19.300Z caller=coordinator.go:126 level=info component=configuration msg="Completed loading of configuration file" file=./alertmanager.yml
ts=2022-08-31T07:37:19.304Z caller=main.go:535 level=info msg=Listening address=:9093
ts=2022-08-31T07:37:19.305Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
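To confirm from the log that cluster mode is really off, one option is to look for the same fields that appear in the output above: `component=cluster` lines should be gone and a `msg=Listening` line should be present. The sketch below does that with grep. The function names are made up for this example, and the patterns rely only on the log lines shown in this post.

```shell
#!/bin/sh
# Sketch: inspect an Alertmanager log file such as nohup.out.
# Function names are illustrative; the patterns come from the log lines in this post.

# Succeeds if the log contains any gossip/cluster lines.
log_has_cluster_lines() {
  grep -q 'component=cluster' "$1"
}

# Succeeds if the HTTP listener came up.
log_is_listening() {
  grep -q 'msg=Listening' "$1"
}

# Print a one-line verdict for the given log file.
check_alertmanager_log() {
  if ! log_is_listening "$1"; then
    echo "not listening"
  elif log_has_cluster_lines "$1"; then
    echo "listening, cluster mode on"
  else
    echo "listening, cluster mode off"
  fi
}

# Example:
#   check_alertmanager_log nohup.out
```

Run against the first log in this post, the check would report cluster mode on. Run against the log from the final start.sh, it would report cluster mode off.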



3. Summary

In this post, I demonstrated how to solve the startup problem of Prometheus Alertmanager by disabling cluster mode with an empty --cluster.listen-address= flag. That’s it, thanks for reading.