Observability

An API gateway is meant to be a central point of management for ingress traffic to a variety of destinations. It can also be a central point of observance, since it is uniquely qualified to know about all traffic traveling between clients and services. Gloo Gateway is built on the Envoy proxy, which exposes a wealth of metrics providing a view into the health of your system as a whole and a detailed look at each Upstream.


Grafana and Prometheus

The default installation of Gloo Gateway Enterprise includes an instance of both Prometheus and Grafana, as well as the Gloo Gateway Observability service.

The Gloo Gateway Observability service is a Gloo Gateway Enterprise feature.

Prometheus is an open-source systems monitoring and alerting toolkit. The Envoy proxy managed by Gloo Gateway publishes metrics on port 8081, and the Gloo Gateway pods publish metrics on port 9091. You can run your own instance of Prometheus to harvest the metrics or use the instance of Prometheus created as part of the Gloo Gateway Enterprise installation.

Grafana is an open-source analytics and monitoring solution that allows you to query, visualize, alert on, and understand metrics. Grafana can use Prometheus as a data source to generate its dashboards. You can run your own instance of Grafana and connect it to the instance of Prometheus that is harvesting metrics from Gloo Gateway.

Gloo Gateway Enterprise’s deployment of Prometheus is configured to scrape metrics from all of the Gloo Gateway pods including the Envoy proxy. The default Grafana deployment uses Prometheus as a data source to generate dashboards and visualizations. The Gloo Gateway Observability service interacts with Grafana to create dynamically generated dashboards for the cluster and individual Upstreams.
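To make the data-source relationship concrete, the following is a minimal sketch of the kind of instant query a dashboard issues against the Prometheus HTTP API (`/api/v1/query`). The base URL `http://localhost:9090` is an assumption (for example, a local port-forward to the Prometheus service), and the helper names are illustrative rather than part of Gloo Gateway.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus is reachable locally, e.g. through a port-forward.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def extract_samples(payload):
    """Return (labels, value) pairs from an instant-query vector response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def query(promql, base_url=PROMETHEUS_URL):
    """Run an instant query and return the parsed samples."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return extract_samples(json.load(resp))
```

With Prometheus reachable, `query("up")` would list the scrape targets and whether each is currently up.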

While Gloo Gateway Enterprise includes an installation of Prometheus and Grafana, you can use your own existing instances of either application. Refer to the configuration guides for Grafana and Prometheus for more information.


Tracing

Tracking the life of a request as it passes through the API gateway and on to other services can be challenging. You want to understand how a flow traversed your system, where latency occurred, and how the request was processed. Envoy has built-in tracing capabilities that enable system-wide tracing through request ID generation, client trace ID joining, and external trace service integration. Gloo Gateway makes it simple to enable and configure tracing in your environment.
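As an illustration of client trace ID joining, a client can supply its own identifier in the `x-client-trace-id` request header, which Envoy documents as being joined with the request ID it generates. The sketch below only builds the headers; the helper name is hypothetical, and using a random UUID as the identifier is an assumption.

```python
import uuid


def with_client_trace_id(headers=None, trace_id=None):
    """Return a copy of the headers that carries a client trace ID.

    An existing x-client-trace-id value is kept; otherwise the given
    trace_id (or a random 32-character hex string) is used.
    """
    merged = dict(headers or {})
    merged.setdefault("x-client-trace-id", trace_id or uuid.uuid4().hex)
    return merged
```

The resulting dictionary can be passed to any HTTP client when calling a route exposed through the gateway.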

Envoy sends its tracing information to an external trace service, such as Zipkin or Lightstep. The tracing service provider settings for Envoy can be set during installation by editing the Helm chart values, or after installation by updating the ConfigMap that holds the Envoy configuration.

Once a tracing service provider has been configured, tracing can be enabled on a per-listener basis in Gloo Gateway. To help identify the path of a flow, a tracing annotation can be added to each route in a Virtual Service.

Please refer to the tracing guide for more information on setup and configuration.


Stats and Admin Ports

Envoy Admin

Gloo Gateway sets the admin port for Envoy to 19000. Through the admin port you can view the Envoy metrics and access a large number of other admin features. You can find more information about the Envoy admin port in the Envoy docs. Separately, Gloo Gateway configures port 8081 on the Envoy proxy for metric scraping by Prometheus. If you plan to use your own instance of Prometheus, connect it to port 8081 for metrics collection.
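The sketch below shows one way to check what the scraping port exposes before pointing your own Prometheus at it. It assumes that port 8081 of the proxy pod has been forwarded to localhost and that the metrics are served at the path `/metrics`; both the URL and the function names are assumptions for illustration. The parser handles only the basic Prometheus text format (comments, optional labels, optional timestamps).

```python
import urllib.request

# Assumptions: port 8081 of the Envoy proxy pod is forwarded to localhost,
# and the metrics are served at /metrics.
ENVOY_METRICS_URL = "http://localhost:8081/metrics"


def parse_metrics(text):
    """Parse Prometheus text-format lines into a {series: value} dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Keep the label set attached to the metric name.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            # rest[0] is the value; an optional timestamp may follow.
            samples[series] = float(rest[0])
    return samples


def fetch_metrics(url=ENVOY_METRICS_URL):
    """Fetch and parse the metrics exposed at the given URL."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics()` with the port-forward in place returns every exposed series, which makes it easy to confirm the endpoint responds before adding it as a scrape target.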

Gloo Gateway Admin

The admin port for all of the Gloo Gateway pods is 9091. The pods listen on this port when the START_STATS_SERVER environment variable is set to true. Functionality available on that port includes Prometheus metrics at /metrics (see more on Gloo Gateway metrics here), as well as admin functionality such as changing the logging level and getting a stack dump.
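As a rough sketch of the log-level change mentioned above, the snippet below builds a request against the admin port. It assumes that port 9091 of a Gloo Gateway pod is forwarded to localhost and that the level is changed with a PUT to a `/logging` path carrying a JSON body of the form `{"level": "debug"}`. The path and body shape are assumptions not stated on this page, so check them against the debugging documentation for your version.

```python
import json
import urllib.request

# Assumption: port 9091 of a Gloo Gateway pod is forwarded to localhost.
ADMIN_URL = "http://localhost:9091"


def build_log_level_request(level, base_url=ADMIN_URL):
    """Build (but do not send) a request that sets the pod's log level.

    Assumption: the admin server accepts PUT /logging with a JSON body.
    """
    body = json.dumps({"level": level}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/logging",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

To apply the change, pass the request to `urllib.request.urlopen(build_log_level_request("debug"), timeout=5)`, and set the level back once you have finished debugging.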


Next Steps

Now that you have an understanding of how Gloo Gateway supports observability, we have a few suggested paths: