Prometheus metrics

Anbox Cloud gathers various performance metrics that you can access through API endpoints to create a monitoring solution. See How to monitor Anbox Cloud for detailed information.

The reference implementation for Prometheus (see Example: Collect metrics) provides some basic dashboards. You can, however, update them to fit your needs.

The following sections list all metrics returned to Prometheus by each component of Anbox Cloud.

Endpoints

The following table lists the API endpoints for every service that supports Prometheus metrics. You can use these endpoints to customise your monitoring solution.

The network ports might be different on your deployment. Run the associated commands to get the appropriate ports.

Service Endpoint Command to get the current port
ams http://ams.example.com:9104/internal/1.0/metrics juju config ams prometheus_target_port
anbox-stream-gateway https://gateway.example.com:9105/internal/1.0/metrics juju config anbox-stream-gateway prometheus_port
lxd https://lxd.example.com:8443/1.0/metrics lxc config get core.https_address

AMS

Metrics prefixed with ams_cluster_ give information about the Anbox management system (AMS). They keep you informed about the status of your cluster.

Name Description Available since
ams_cluster_nodes_total Number of nodes in the cluster 1.0.0
ams_cluster_applications_total Number of applications 1.0.0
ams_cluster_containers_total Number of containers currently in the cluster 1.0.0
ams_cluster_container_boot_time_seconds_count Number of container boot time measured 1.0.0
ams_cluster_container_boot_time_seconds_sum Sum of all container boot times. Can be used to compute the average boot time 1.0.0
ams_cluster_container_boot_time_seconds_bucket Container boot times bucket. Can be used for alerting when above a threshold. See the Prometheus documentation for more information 1.0.0
ams_cluster_containers_per_application_total Number of containers per application 1.0.0
ams_cluster_containers_per_status_total Number of containers per container status 1.0.0
ams_cluster_available_cpu_total Total CPUs available in each worker node 1.0.0
ams_cluster_used_cpu_total Used CPUs in each worker node 1.0.0
ams_cluster_available_memory_total Total memory available in each worker node 1.0.0
ams_cluster_used_memory_total Used memory in each worker node 1.0.0

Anbox Stream Gateway

Metrics prefixed with anbox_stream_gateway_ give you information on your cluster related to streaming: number of sessions, agents, etc.

Name Description Available since
anbox_stream_gateway_sessions_total Total number of sessions, categorised by status 1.7.2
anbox_stream_gateway_accounts_total Total number of accounts 1.7.2
anbox_stream_gateway_agents_total Number of active and unresponsive agents 1.7.2

WebRTC metrics

Metrics prefixed with webrtc_ give you detailed insight of the WebRTC protocol for every streaming instance.
See the official W3C reference for more information.

Name Description Available since
webrtc_frames_encoded Total number of frames successfully encoded 1.8.0
webrtc_key_frames_encoded Total number of key frames, such as key frames in VP8 or IDR-frames in H.264. webrtc_key_frames_encoded - webrtc_frames_encoded will give you the number of delta frames 1.8.0
webrtc_total_encode_time Total number of seconds that has been spent encoding the webrtc_frames_encoded frames. The average encode time can be calculated by dividing this value with webrtc_frames_encoded 1.8.0
webrtc_target_bitrate Reflects the current encoder target in bits per second 1.8.0
webrtc_bytes_sent Total number of bytes sent for a specific SSRC (a SSRC represents one resource sent over a WebRTC track. It can be video, audio or binary data) 1.8.0
webrtc_retransmitted_bytes_sent The total number of bytes that were re-transmitted for a specific SSRC, only including payload bytes 1.8.0
webrtc_retransmitted_packets_sent The total number of packets that were re-transmitted for a specific SSRC 1.8.0
webrtc_total_packet_send_delay The total number of seconds that packets have spent buffered locally before being transmitted onto the network 1.8.0
webrtc_packets_sent Total number of RTP packets sent for this SSRC. This includes re-transmissions 1.8.0
webrtc_nack_count Count the total number of Negative Acknowledgement (NACK) packets received by this sender 1.8.0
webrtc_fir_count Only exists for video. Count the total number of Full Intra Request (FIR) packets received by this sender 1.8.0
webrtc_pli_count Only exists for video. Count the total number of Picture Loss Indication (PLI) packets received by this sender 1.8.0
webrtc_sli_count Only exists for video. Count the total number of Slice Loss Indication (SLI) packets received by this sender 1.8.0

HTTP Metrics

Anbox Cloud services that include an HTTP API are monitored.

Anbox Stream Gateway

A few metrics are available to instrument the API. Routes are indicated in the labels, e.g.:

anbox_stream_gateway_http_request_duration_seconds_bucket{handler="get_sessions",host="juju-2db13b-1",method="get",le="0.5"} 1

The route name would be get_sessions in this case.

Here’s the list of all routes and their corresponding label

Method Route Label
GET /1.0/status get_service_status
POST /1.0/sessions create_new_session
GET /1.0/sessions get_sessions
GET /1.0/sessions/<id> get_session
DELETE /1.0/sessions/<id> delete_session
DELETE /1.0/sessions delete_sessions
POST /1.0/sessions/<id>/join join_session
GET /1.0/sessions/<id>/sockets/master streaming_websocket_master
GET /1.0/sessions/<id>/sockets/slave streaming_websocket_slave
GET /1.0/applications get_applications
GET /1.0/regions get_regions

The different metrics available are as follow:

Name Description Available since
anbox_stream_gateway_http_in_flight_requests Number of HTTP requests being processed at the moment 1.9.0
anbox_stream_gateway_http_request_duration_seconds_bucket The HTTP request latency in seconds 1.9.0
anbox_stream_gateway_http_request_size_bytes_bucket The HTTP request size in bytes 1.9.0
anbox_stream_gateway_http_requests_total Total number of HTTP requests made 1.9.0
anbox_stream_gateway_http_response_size_bytes_bucket The HTTP response sizes in bytes 1.9.0

AMS

AMS can either be reached via HTTP/S or via a Unix domain socket. The API metrics are separate between http, https and unix:

ams_http_request_duration_seconds_bucket{handler="http_applications_GET",host="juju-2db13b-1",method="get",le="0.5"} 1
ams_http_request_duration_seconds_bucket{handler="https_applications_GET",host="juju-2db13b-1",method="get",le="0.5"} 1
ams_http_request_duration_seconds_bucket{handler="unix_applications_GET",host="juju-2db13b-1",method="get",le="0.5"} 1

This distinction gives a more granular approach to monitoring.

All handler names adopt the following convention <transport method>_<object>_<http method>. E.g.: unix_containers_POST
Here’s the list of all routes and their corresponding label (ignoring the communication method prefix)

Method Route Label
GET /1.0 service_GET
GET /1.0/addons addons_GET
POST /1.0/addons addons_POST
GET /1.0/addons/<id> addon_GET
PATCH /1.0/addons/<id> addon_PATCH
DELETE /1.0/addons/<id> addon_DELETE
DELETE /1.0/addons/<id>/<version> addon_version_DELETE
GET /1.0/applications applications_GET
POST /1.0/applications applications_POST
GET /1.0/applications/<id> application_GET
PATCH /1.0/applications/<id> application_PATCH
DELETE /1.0/applications/<id> application_DELETE
GET /1.0/applications/<id>/<version> application_version_GET
PATCH /1.0/applications/<id>/<version> application_version_PATCH
DELETE /1.0/applications/<id>/<version> application_version_DELETE
GET /1.0/applications/<id>/manifest application_manifest_GET
GET /1.0/applications/<id>/<version>/manifest application_version_manifest_GET
GET /1.0/containers containers_GET
POST /1.0/containers containers_POST
GET /1.0/containers/<id> container_GET
DELETE /1.0/containers/<id> container_DELETE
POST /1.0/containers/<id>/exec container_exec_POST
GET /1.0/containers/<id>/logs container_logs_GET
GET /1.0/containers/<id>/logs/<name> container_log_GET
GET /1.0/version version_GET
GET /1.0/nodes nodes_GET
POST /1.0/nodes nodes_POST
GET /1.0/nodes/<id> node_GET
PATCH /1.0/nodes/<id> node_PATCH
DELETE /1.0/nodes/<id> node_DELETE
GET /1.0/images images_GET
POST /1.0/images images_POST
GET /1.0/images/<id> image_GET
PATCH /1.0/images/<id> image_PATCH
DELETE /1.0/images/<id> image_DELETE
DELETE /1.0/images/<id>/<version> image_version_DELETE
GET /1.0/config config_GET
PATCH /1.0/config config_PATCH
GET /1.0/tasks tasks_GET
GET /1.0/registry/applications registry_applications_GET
DELETE /1.0/registry/applications/<id> registry_application_DELETE
POST /1.0/registry/applications/<id>/push registry_application_push_POST
POST /1.0/registry/applications/<id>/pull registry_application_pull_POST
GET /1.0/artifacts/<id> internal_artifacts_GET
PATCH /1.0/containers/<id> internal_containers_PATCH

The different metrics available are as follow:

Name Description Available since
ams_http_in_flight_requests Number of HTTP requests being processed at the moment 1.10.0
ams_http_request_duration_seconds_bucket The HTTP request latency in seconds 1.10.0
ams_http_request_size_bytes_bucket The HTTP request size in bytes 1.10.0
ams_http_requests_total Total number of HTTP requests made 1.10.0
ams_http_response_size_bytes_bucket The HTTP response sizes in bytes 1.10.0

Last updated a month ago.