Prometheus Metrics without Prometheus Server

January 06, 2018

At Latency.at we measure the performance and availability of your sites and services from multiple global locations and provide the results as Prometheus metrics.

But that doesn’t mean you have to use a Prometheus Server to consume them. While we’re big fans of the Prometheus Server, the simple and open exposition format can be consumed by many other systems and is on its way to becoming the de facto standard for metrics exposition.

DataDog

DataDog already supports retrieving metrics from Prometheus endpoints for some of their checks. While they are working on generic support, a custom check is currently needed. For this we wrote datadog-agent-prometheus, which allows you to consume arbitrary Prometheus metrics in DataDog.

Once installed, you can retrieve Latency.at metrics into DataDog with a config like this:

init_config:
  
instances:
  - target: https://sfo1.do.mon.latency.at/probe?target=https://latency.at
    config: &config
      headers: &headers
        Authorization: "Bearer your-token"
      drop:
        - probe_.*
      keep:
        - probe_http_.*
  - target: https://nyc1.do.mon.latency.at/probe?target=https://latency.at
    config: *config
  - target: https://fra1.do.mon.latency.at/probe?target=https://latency.at
    config: *config

InfluxDB/Kapacitor

InfluxDB’s Kapacitor supports scraping Prometheus metrics without any changes. All you need is to configure Kapacitor to scrape Latency.at with your token:

[[static-discovery]]
  enabled = true
  id = "lat-sfo1"
  targets = ["https://sfo1.do.mon.latency.at/probe?target=https://latency.at"]
  bearer-token = "your token"
  [static.labels]
    region = "sfo1.do"
[[static-discovery]]
  enabled = true
  id = "lat-nyc1"
  targets = ["https://nyc1.do.mon.latency.at/probe?target=https://latency.at"]
  bearer-token = "your token"
  [static.labels]
    region = "nyc1.do"
[[static-discovery]]
  enabled = true
  id = "lat-fra1"
  targets = ["https://fra1.do.mon.latency.at/probe?target=https://latency.at"]
  bearer-token = "your token"
  [static.labels]
    region = "fra1.do"

Sensu

Sensu supports scraping Prometheus metrics by using the Sensu Prometheus Collector. We submitted a PR to add support for the Authorization header. You can compile this yourself or wait until it gets merged and released. After that, the Latency.at metrics can be consumed like any other Prometheus endpoint. Just make sure to specify the Authorization header in the check definition, and quote the exporter URL so the shell does not interpret the & in it:

  "checks": {
    "prometheus_metrics": {
      "type": "metric",
      "command": "sensu-prometheus-collector -export-authorization 'Bearer your-token' -exporter-url 'https://nyc1.do.mon.latency.at/probe?module=http_2xx&target=https%3A%2F%2Fapi.latency.at%2F'",
      "subscribers": ["app_tier"],
      "interval": 10,
      "handler": "influx"
    }
  }

Zabbix/Nagios/Custom monitoring

Thanks to the simple text-based exposition format, you can integrate Latency.at with your monitoring system even if it doesn’t have first-class support for Prometheus metrics.

All you need to retrieve the current health and performance metrics for your site is curl/wget:

$ echo 'Authorization: Bearer your-token' | curl -H@- \
  'https://nyc1.do.mon.latency.at/probe?module=http_2xx&target=https%3A%2F%2Fapi.latency.at%2F'
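If you would rather call the endpoint from a script than from the shell, the same request can be sketched in Python with only the standard library. The helper names `build_probe_request` and `fetch_metrics` are our own for this illustration and not part of any Latency.at client. The host pattern and the `module` and `target` parameters are taken from the curl command above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_probe_request(region, target, token, module="http_2xx"):
    """Build an authenticated request for one probe endpoint (hypothetical helper)."""
    # urlencode percent-encodes the target URL so it survives as a query value.
    query = urlencode({"module": module, "target": target})
    url = f"https://{region}.mon.latency.at/probe?{query}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(request, timeout=10):
    """Send the request and return the exposition text (hypothetical helper)."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; only fetch_metrics does.
request = build_probe_request("nyc1.do", "https://api.latency.at/", "your-token")
print(request.full_url)
```

Calling `fetch_metrics(request)` with a valid token should then return the same text that curl prints.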

The returned metrics should look something like this:

# HELP content_length Length of http content response
# TYPE content_length gauge
content_length 29
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.000847059
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.111374556
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.018879332
probe_http_duration_seconds{phase="processing"} 0.021480235
probe_http_duration_seconds{phase="resolve"} 0.000847059
probe_http_duration_seconds{phase="tls"} 0.088617142
probe_http_duration_seconds{phase="transfer"} 5.8852e-05
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 0
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 1
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 1.1
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_ssl_earliest_cert_expiry Returns earliest SSL cert expiry in unixtime
# TYPE probe_ssl_earliest_cert_expiry gauge
probe_ssl_earliest_cert_expiry 1.522358649e+09
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
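Because every sample sits on a line of its own, a few lines of code are enough to turn this output into something a script can work with. The following is a minimal sketch, not a complete parser for the exposition format: it assumes label values contain no `}` or escaped quotes and ignores optional timestamps. The function name `parse_metrics` is ours.

```python
import re

# Metric name, optional {labels}, then the sample value.
# Anything after the value (an optional timestamp) is ignored.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Map 'name' or 'name{labels}' to a float value (simplified sketch)."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        samples[name + (labels or "")] = float(value)
    return samples


example = """\
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.018879332
probe_http_duration_seconds{phase="tls"} 0.088617142
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
"""

metrics = parse_metrics(example)
print(metrics["probe_success"])
print(metrics['probe_http_duration_seconds{phase="tls"}'])
```

For anything beyond quick scripts, a maintained parser from one of the Prometheus client libraries is likely the safer choice.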

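You can grep for 'probe_success 1' to see if your site was reachable. You can turn this into a cronjob to get notified should your site become unresponsive. Cron mails any output a job produces, so the entry below stays silent on success and prints a message on failure. The entry has to be on a single line, and each % in the URL must be escaped as \% because cron otherwise treats it as a newline:

* * * * * nobody echo 'Authorization: Bearer your-token' | curl -s -H@- 'https://nyc1.do.mon.latency.at/probe?target=https\%3A\%2F\%2Fapi.latency.at' | grep -q 'probe_success 1' || echo 'api.latency.at probe failed'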

Some monitoring systems allow you to run custom commands like this for health checks. With Zabbix’s User parameters you can use this to define a custom check:

UserParameter=latency.probe[*],echo 'Authorization: Bearer your-token'|curl -s -H@- "https://$2.mon.latency.at/probe?target=$1"|grep -c 'probe_success 1'

Now this check can be used to get the availability for a target in a given region:

latency.probe[https%3A%2F%2Fapi.latency.at,nyc1.do]

If you’re using Nagios, you can write a custom plugin or configure Nagios to execute check_http like this to check for availability:

./check_http -k 'Authorization: Bearer your-token' -S -H nyc1.do.mon.latency.at \
  -u '/probe?module=http_2xx&target=https%3A%2F%2Fapi.latency.at%2F' -s 'probe_success 1'
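If you go the custom plugin route instead, the core of such a plugin could look like the sketch below. It follows the usual Nagios plugin convention of exit codes 0 (OK), 2 (CRITICAL) and 3 (UNKNOWN) and expects the probe output on standard input, for example piped from the curl command shown earlier. The function names and status messages are our own choices for this illustration.

```python
import sys

# Exit codes from the Nagios plugin convention.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def evaluate_probe(metrics_text):
    """Return (exit_code, message) based on the probe_success sample."""
    for line in metrics_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "probe_success":
            try:
                value = float(parts[1])
            except ValueError:
                return UNKNOWN, "UNKNOWN - could not parse probe_success"
            if value == 1:
                return OK, "OK - probe succeeded"
            return CRITICAL, "CRITICAL - probe failed"
    # No probe_success line at all, e.g. an error page or an empty response.
    return UNKNOWN, "UNKNOWN - probe_success not found in response"


def main():
    """Read metrics from stdin, print the status line, return the exit code."""
    code, message = evaluate_probe(sys.stdin.read())
    print(message)
    return code

# When saved as a script, finish with: sys.exit(main())
```

Unlike a plain string match, this also distinguishes a failed probe from a response that contains no probe result at all.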