Skip to main content

Monitoring & Metrics

info

This page covers all the metrics exposed in the Responsive Dashboard as well as how to integrate with our Metrics API (which exposes more metrics than are available in the Dashboard by default).

The Responsive Dashboard

The dashboard in our UI is broken up into three sections:

  1. An Overview that provides at-a-glance information on the health of your application.
  2. A Processing Metrics section that dives deeper into the state of processing.
  3. A Storage Metrics section (only available with Responsive configured storage) that snapshots some information on associated state storage.

Overview Metrics

Overview Metrics
TitleMetric (Exposed via Metrics API)Description
Running ContainersThis is a derived metric which counts the number of containers emitting kafka_streams_thread_process_rateThe number of running Kafka Streams containers
Current Processing Ratekafka_streams_thread_process_rateThis is the number of events per second processed across your entire application
Input Append Ratediagnoser_latency_expected_partition_append_rateThis is the number of events per second that are being appended across all input topic(s) processed by your application
Storage Size (Used)hardware_disk_metrics_disk_space_used_bytes for MongoDB and node_filesystem_size_bytes for ScyllaDBThe amount of storage utilized by your state store in the remote database.

Processing Metrics

Process RateLag TableInput Append Rate
TitleMetric (Exposed via Metrics API)Description
Process Rate (Graph)kafka_streams_thread_process_rateThis is the number of events per second processed (per container)
Input Append Rate (Graph)diagnoser_latency_expected_partition_append_rateThis is the number of events per second that are being appended across per input topic-partition processed by your application
Lag (Table)kafka_streams_records_lagThis is a table of per topic-partition lag (the number of events behind the latest record)
Events ProcessedExpected Latency
TitleMetric (Exposed via Metrics API)Description
Events ProcessedDerivative of responsive_kafka_streams_source_offset_end summed by topicThe total number of events processed by this application per day (used for billing)
Expected Latencydiagnoser_latency_expected_node_secondsThe expected amount of time it would take for an event to be processed if it were enqueued onto the source topic now.
Lag GraphProcessing Ratio
TitleMetric (Exposed via Metrics API)Description
Lag (Graph)kafka_streams_records_lagThis is a graph of per topic-partition lag (the number of events behind the latest record)
Processing Ratiokafka_streams_thread_{commit,poll,punctuate,process}_ratioThe percentage of time spent in each of the main phases of computation (commit, poll, punctuate, process)
RebalancingAssignment
TitleMetric (Exposed via Metrics API)Description
Rebalancingkafka_streams_rebalance_rateThis will be a value of 1 if during a period of time there was a rebalance, otherwise it will be 0
Partition Assignmentkafka_streams_assigned_partitionsThis is the number of partitions assigned to each instance of your application.

Storage Metrics

Overview Metrics
TitleMongoDB Metric / ScyllaDB MetricDescription
Storage Sizehardware_disk_metrics_disk_space_used_byte
node_filesystem_size_bytes
The amount of remote storage utilized (across all applications using this storage)
Read Latency (Avg)mongodb_opLatencies_reads_latency
rlatencya
The average read latency
Write Latency (Avg)mongodb_opLatencies_writes_latency
wlatencya
The average write latency

Metrics API

The metrics API for your organization is available at <org id>-<env id>.metrics.us-west-2.aws.responsive.cloud and authenticates using the API keys you create for that environment in the UI. This means that if you are using prometheus you can configure a prometheus scrape job to scrape these metrics:

job_name: responsive-streams-metrics
scrape_interval: 10s
scheme: https
metrics_path: /export
basic_auth:
username: <api key> # this is an API key created in your Responsive Cloud environment
password: <secret> # this is the secret for the key created above
static_configs:
- targets:
- <org id>-<env id>.metrics.us-west-2.aws.responsive.cloud

Once you’ve got an application reporting metrics to Responsive and your API keys, you can simply run local docker Prometheus and Grafana containers to pull data from Responsive. First make sure that you have prometheus.yml in your local directory setup correctly (with the scrape job config from above).

Then run the following docker commands:

docker run --name prometheus -d -p 9090:9090 \
-v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus

docker run -d -p 3000:3000 --name=grafana grafana/grafana-enterprise

Once that is up and running, you can setup grafana to use your local prometheus instance (use http://host.docker.internal:9090 as the URL for prometheus if you haven’t set up explicit docker networking).

Recreate the Responsive Dashboard

Import the dashboard definition below and you’ll immediately start seeing the metrics show up in Grafana!

Responsive Grafana Dashboard

Grafana