Observability guidelines
We are still refining our observability guidelines. If in doubt, please reach out on the #protocol-public channel on the Grove Discord.
Metrics
Overview
In our system, metrics are exposed using the Prometheus exporter. This approach aligns with tools like Rollkit, and we use the go-kit metrics package to implement custom metrics. For practical examples of metric definitions, refer to RelayMiner Metrics.
Types of Metrics
Counter: A cumulative metric that represents a single numerical value that only ever goes up. Ideal for counting requests, tasks completed, errors, etc.
Gauge: Represents a single numerical value that can arbitrarily go up and down. Suitable for measuring values like memory usage, number of active goroutines, etc.
Histogram: Captures a distribution of values. It divides the range of possible values into buckets and counts how many values fall into each bucket. Useful for tracking request durations, response sizes, etc.
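To make the difference between the three types concrete, here is a minimal standard-library sketch of their semantics. It is not the go-kit or Prometheus client API: the `Counter`, `Gauge`, and `Histogram` types below are hypothetical stand-ins written only for illustration, and the histogram assumes Prometheus-style cumulative buckets (each bucket counts observations less than or equal to its upper bound).

```go
package main

import (
	"fmt"
	"sort"
)

// Counter is a hypothetical stand-in: its value only ever goes up.
type Counter struct{ value float64 }

// Add increases the counter; negative deltas are ignored in this sketch.
func (c *Counter) Add(delta float64) {
	if delta < 0 {
		return
	}
	c.value += delta
}

// Gauge is a hypothetical stand-in: its value can go up and down.
type Gauge struct{ value float64 }

func (g *Gauge) Set(v float64)     { g.value = v }
func (g *Gauge) Add(delta float64) { g.value += delta }

// Histogram is a hypothetical stand-in with cumulative buckets:
// counts[i] holds the number of observations <= bounds[i].
type Histogram struct {
	bounds []float64
	counts []uint64
	count  uint64  // total number of observations
	sum    float64 // sum of all observed values
}

// NewHistogram copies and sorts the given upper bounds.
func NewHistogram(bounds ...float64) *Histogram {
	sorted := append([]float64(nil), bounds...)
	sort.Float64s(sorted)
	return &Histogram{bounds: sorted, counts: make([]uint64, len(sorted))}
}

// Observe records one value in every bucket whose upper bound covers it.
func (h *Histogram) Observe(v float64) {
	h.count++
	h.sum += v
	for i, upper := range h.bounds {
		if v <= upper {
			h.counts[i]++
		}
	}
}

func main() {
	var requests Counter
	requests.Add(1)
	requests.Add(1)

	var inFlight Gauge
	inFlight.Add(3)
	inFlight.Add(-2)

	latency := NewHistogram(0.1, 0.5, 1)
	for _, seconds := range []float64{0.05, 0.3, 0.7, 2} {
		latency.Observe(seconds)
	}

	fmt.Println(requests.value, inFlight.value, latency.counts, latency.count)
}
```

Note that the observation of 2 seconds exceeds every bound, so it is reflected only in the total count and sum; real Prometheus histograms expose this through an implicit `+Inf` bucket.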
High Cardinality Considerations
Developers should be cautious about label cardinality. Every distinct combination of label values creates a separate time series, so high cardinality labels can significantly increase memory usage and reduce the performance of the Prometheus server. To mitigate this:
Limit the use of labels that have a large number of potential values (e.g., user IDs, email addresses).
Prefer using labels with low cardinality (e.g., status codes, environment names).
Regularly review and clean up unused or less useful metrics.
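As a rough illustration of why a single high cardinality label matters, the sketch below computes an upper bound on the number of time series one metric can produce, namely the product of the number of distinct values of each label. The function `seriesUpperBound`, the label names, and the value counts are all hypothetical and chosen only for the arithmetic.

```go
package main

import "fmt"

// seriesUpperBound returns the maximum number of time series a single
// metric can produce, given the number of distinct values per label.
// Each unique combination of label values is its own series.
func seriesUpperBound(distinctValuesPerLabel map[string]int) int {
	total := 1
	for _, n := range distinctValuesPerLabel {
		total *= n
	}
	return total
}

func main() {
	// Hypothetical low cardinality labels.
	low := map[string]int{"status_code": 5, "environment": 3}

	// The same metric with a hypothetical per-user label added.
	high := map[string]int{"status_code": 5, "environment": 3, "user_id": 100000}

	fmt.Println(seriesUpperBound(low))  // 5 * 3
	fmt.Println(seriesUpperBound(high)) // 5 * 3 * 100000
}
```

Under these assumed numbers, adding the per-user label raises the bound from 15 series to 1,500,000 for one metric.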
Best Practices
Clarity and Relevance: Ensure that each metric provides clear and relevant information for observability.
Documentation: Document each custom metric, including its purpose and any labels used.
Consistency: Follow the Prometheus Metric and Label Naming Guide for consistent naming and labeling. See more at Prometheus Naming Guide.
Defer: When the code being metered includes conditional branches, defer calls to metrics methods to ensure that any referenced variables are in their final state prior to reporting.
Sufficient Variable Scope: Ensure that any variables passed to metrics methods are declared in a scope that is visible to those calls.
Ensure that these variables are not shadowed by a subsequent use of the walrus operator := (a short variable declaration that redeclares the name) in a nested block.
This might require declaring, ahead of time, previously undeclared variables which are part of a multiple return.
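The Defer and Sufficient Variable Scope practices can be sketched together as follows. This is a standard-library illustration only: `recordResult`, `recorded`, `validate`, `process`, and `processShadowed` are hypothetical names, and `recordResult` stands in for whatever metrics method would actually be called.

```go
package main

import (
	"errors"
	"fmt"
)

// recorded is a hypothetical stand-in for a metrics backend.
var recorded []string

// recordResult is a hypothetical metrics helper that labels an
// operation with "ok" or "error" depending on err.
func recordResult(op string, err error) {
	status := "ok"
	if err != nil {
		status = "error"
	}
	recorded = append(recorded, op+":"+status)
}

// validate is a hypothetical function with a multiple return.
func validate(n int) (int, error) {
	if n < 0 {
		return 0, errors.New("negative input")
	}
	return n * 2, nil
}

// process declares result and err up front (as named results) so the
// deferred closure observes their final state on every return path.
func process(n int) (result int, err error) {
	defer func() { recordResult("process", err) }()

	// Plain assignment (=) writes to the function-level err.
	result, err = validate(n)
	if err != nil {
		return 0, err
	}
	return result, nil
}

// processShadowed shows the pitfall: := inside the if-block declares a
// new err that shadows the outer one, so the deferred call sees nil.
func processShadowed(n int) int {
	var err error
	defer func() { recordResult("shadowed", err) }()

	if n < 0 {
		_, err := validate(n) // shadows the outer err
		if err != nil {
			return 0
		}
	}
	return n
}

func main() {
	process(1)
	process(-1)
	processShadowed(-1)
	fmt.Println(recorded)
}
```

In this sketch `process(-1)` is recorded as an error, while `processShadowed(-1)` is recorded as `ok` even though validation failed, because the deferred closure reads the outer, never-assigned `err`.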
Examples
Counter
x/proof/keeper/msg_server_create_claim.go
Gauge
x/tokenomics/module/abci.go
Histogram
Logs
Please refer to our own polylog package.
