Monitoring & Metrics


This page covers all the metrics exposed in the Responsive Dashboard as well as how to integrate with our Metrics API (which exposes more metrics than are available in the Dashboard by default).

The Responsive Dashboard

The dashboard in our UI is broken up into three sections:

An Overview that provides at-a-glance information on the health of your application.
A Processing Metrics section that dives deeper into the state of processing.
A Storage Metrics section (only available with Responsive-configured storage) that summarizes the associated state storage.

Overview Metrics

Title	Metric (Exposed via Metrics API)	Description
Running Containers	This is a derived metric which counts the number of containers emitting `kafka_streams_thread_process_rate`	The number of running Kafka Streams containers
Current Processing Rate	`kafka_streams_thread_process_rate`	This is the number of events per second processed across your entire application
Input Append Rate	`diagnoser_latency_expected_partition_append_rate`	This is the number of events per second that are being appended across all input topic(s) processed by your application
Storage Size (Used)	`hardware_disk_metrics_disk_space_used_bytes` for MongoDB and `node_filesystem_size_bytes` for ScyllaDB	The amount of storage utilized by your state store in the remote database.
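
To make the two derived values in this table concrete, here is a minimal sketch of how Running Containers and Current Processing Rate could be computed from raw `kafka_streams_thread_process_rate` samples. The `container` and `thread_id` label names and the sample values are hypothetical; this page does not state which label identifies a container in the Metrics API.

```python
from collections import defaultdict

# Samples as (metric name, labels, value). The "container" label is a
# hypothetical stand-in for whatever label identifies a container.
samples = [
    ("kafka_streams_thread_process_rate", {"container": "app-0", "thread_id": "t1"}, 120.0),
    ("kafka_streams_thread_process_rate", {"container": "app-0", "thread_id": "t2"}, 80.0),
    ("kafka_streams_thread_process_rate", {"container": "app-1", "thread_id": "t1"}, 100.0),
    ("kafka_streams_records_lag", {"container": "app-1", "topic": "orders"}, 7.0),
]


def overview(samples, container_label="container"):
    """Return (running containers, total events per second)."""
    per_container = defaultdict(float)
    for name, labels, value in samples:
        # Only containers emitting the process-rate metric are counted.
        if name == "kafka_streams_thread_process_rate":
            per_container[labels[container_label]] += value
    return len(per_container), sum(per_container.values())


running, rate = overview(samples)
print(running, rate)  # 2 300.0
```

The intermediate per-container totals correspond to the per-container Process Rate graph described under Processing Metrics.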

Processing Metrics

Title	Metric (Exposed via Metrics API)	Description
Process Rate (Graph)	`kafka_streams_thread_process_rate`	This is the number of events per second processed (per container)
Input Append Rate (Graph)	`diagnoser_latency_expected_partition_append_rate`	This is the number of events per second that are being appended to each input topic-partition processed by your application
Lag (Table)	`kafka_streams_records_lag`	This is a table of per topic-partition lag (the number of events behind the latest record)

Title	Metric (Exposed via Metrics API)	Description
Events Processed	Derivative of `responsive_kafka_streams_source_offset_end` summed by topic	The total number of events processed by this application per day (used for billing)
Expected Latency	`diagnoser_latency_expected_node_seconds`	The expected amount of time it would take for an event to be processed if it were enqueued onto the source topic now.
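
As a rough illustration of "derivative of `responsive_kafka_streams_source_offset_end` summed by topic", the sketch below takes two snapshots of end offsets and sums the per-partition increase for each topic. The snapshot layout, topic names, and offsets are made up for the example, and the exact computation Responsive uses for billing is not specified on this page.

```python
from collections import defaultdict


def events_by_topic(earlier, later):
    """Sum the per-partition end-offset increase for each topic."""
    totals = defaultdict(int)
    for (topic, partition), end_offset in later.items():
        # Partitions missing from the earlier snapshot contribute zero here;
        # this is a simplification made for the sketch.
        start_offset = earlier.get((topic, partition), end_offset)
        # Clamp at zero so a reset offset does not produce a negative count.
        totals[topic] += max(end_offset - start_offset, 0)
    return dict(totals)


# Hypothetical end offsets keyed by (topic, partition).
day_start = {("orders", 0): 1_000, ("orders", 1): 2_500, ("payments", 0): 400}
day_end = {("orders", 0): 1_600, ("orders", 1): 3_000, ("payments", 0): 450}

print(events_by_topic(day_start, day_end))  # {'orders': 1100, 'payments': 50}
```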

Title	Metric (Exposed via Metrics API)	Description
Lag (Graph)	`kafka_streams_records_lag`	This is a graph of per topic-partition lag (the number of events behind the latest record)
Processing Ratio	`kafka_streams_thread_{commit,poll,punctuate,process}_ratio`	The percentage of time spent in each of the main phases of computation (commit, poll, punctuate, process)

Title	Metric (Exposed via Metrics API)	Description
Rebalancing	`kafka_streams_rebalance_rate`	This is 1 if a rebalance occurred during the time period, otherwise 0
Partition Assignment	`kafka_streams_assigned_partitions`	This is the number of partitions assigned to each instance of your application.

Storage Metrics

Title	MongoDB Metric / ScyllaDB Metric	Description
Storage Size	`hardware_disk_metrics_disk_space_used_bytes` / `node_filesystem_size_bytes`	The amount of remote storage utilized (across all applications using this storage)
Read Latency (Avg)	`mongodb_opLatencies_reads_latency` / `rlatencya`	The average read latency
Write Latency (Avg)	`mongodb_opLatencies_writes_latency` / `wlatencya`	The average write latency

Metrics API

The Metrics API for your organization is available at `<org id>-<env id>.metrics.us-west-2.aws.responsive.cloud` and authenticates using the API keys you create for that environment in the UI. If you are using Prometheus, you can configure a scrape job to collect these metrics:

scrape_configs:
  - job_name: responsive-streams-metrics
    scrape_interval: 10s
    scheme: https
    metrics_path: /export
    basic_auth:
      username: <api key> # this is an API key created in your Responsive Cloud environment
      password: <secret>  # this is the secret for the key created above
    static_configs:
      - targets:
          - <org id>-<env id>.metrics.us-west-2.aws.responsive.cloud
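
Before wiring up Prometheus, it can help to confirm that the endpoint and credentials work. The sketch below requests `/export` over HTTPS with basic auth, mirroring the scrape job, and lists the metric names it finds. It assumes the endpoint returns the Prometheus text exposition format (implied by the scrape job, but not stated explicitly here). `fetch_export` and `metric_names` are illustrative helper names, and the placeholder values must be replaced with your own.

```python
import base64
import urllib.request

# Placeholders: substitute your own values; none of these are real.
HOST = "<org id>-<env id>.metrics.us-west-2.aws.responsive.cloud"
API_KEY = "<api key>"
SECRET = "<secret>"


def fetch_export(host: str, api_key: str, secret: str) -> str:
    """GET https://<host>/export with HTTP basic auth and return the body."""
    token = base64.b64encode(f"{api_key}:{secret}".encode()).decode()
    request = urllib.request.Request(
        f"https://{host}/export",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text: str) -> set[str]:
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or space (value).
        end = len(line)
        for sep in ("{", " "):
            idx = line.find(sep)
            if idx != -1:
                end = min(end, idx)
        names.add(line[:end])
    return names


# Example (performs a network call, so it is left commented out):
# print(sorted(metric_names(fetch_export(HOST, API_KEY, SECRET))))
```

If the credentials are wrong, `urlopen` raises `urllib.error.HTTPError`, which is usually enough to tell an authentication problem apart from an empty metrics response.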

Once you have an application reporting metrics to Responsive and an API key for its environment, you can run Prometheus and Grafana locally in Docker containers to pull data from Responsive. First, make sure the prometheus.yml in your current directory is set up correctly (with the scrape job config from above).

Then run the following docker commands:

docker run --name prometheus -d -p 9090:9090 \
  -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus

docker run -d -p 3000:3000 --name=grafana grafana/grafana-enterprise

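Once both containers are up and running, you can set up Grafana to use your local Prometheus instance (use http://host.docker.internal:9090 as the URL for Prometheus if you haven’t set up explicit Docker networking). On Linux, host.docker.internal typically resolves only if the Grafana container is started with --add-host=host.docker.internal:host-gateway.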
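
To check that the local Prometheus instance is actually receiving data before building dashboards, you can query its HTTP API directly. This is a minimal sketch that assumes Prometheus is reachable at `http://localhost:9090` (the port published by the command above). `instant_query` and `vector_values` are illustrative helper names.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # port published by the docker run command


def instant_query(expr, base_url=PROMETHEUS_URL):
    """Run an instant query against the Prometheus HTTP API."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def vector_values(payload):
    """Extract (labels, value) pairs from an instant-vector response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


# Example (requires the Prometheus container to be running):
# print(vector_values(instant_query("sum(kafka_streams_thread_process_rate)")))
```

An empty result list usually means the scrape job has not collected any samples yet; the Targets page in the Prometheus UI shows whether the scrape is succeeding.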

Recreate the Responsive Dashboard

Import the dashboard definition below and you’ll immediately start seeing the metrics show up in Grafana!

Responsive Grafana Dashboard
