To identify issues proactively, you need strong observability across your infrastructure. Full visibility into the health of your entire Kubernetes environment lets you resolve issues faster and reduce time to recovery. Observability rests on three kinds of signal: metrics to measure health, logs to analyse and troubleshoot, and traces to debug slow transactions.
This is intended for the Operations Manager or CTO.
Use custom metrics to add deep context to particular areas of concern, so you can understand which factors can be tuned to optimise performance.
Perform trend analysis to identify recurring issues and automate decisions based on the results.
Structure your logs so they can be queried, and use tracing to identify where logging is most relevant, limiting log noise and surfacing the important messages.
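
As a rough illustration of the structured logging point above, the sketch below emits each log entry as one JSON object per line, so a log backend can filter on individual fields, and attaches a trace ID so a log line can be matched to the trace of a slow transaction. It uses only the Python standard library. The logger name `checkout`, the `context` attribute, and the field names `trace_id`, `pod` and `namespace` are assumptions made for the example, not part of any particular product or schema.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # `context` is a hypothetical attribute passed through `extra=`;
        # its keys become top-level, queryable fields in the output.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry, sort_keys=True)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def emit_example() -> dict:
    """Log one event and return it parsed back from JSON."""
    buffer = io.StringIO()
    logger = build_logger(buffer)
    logger.info(
        "payment failed",
        extra={
            "context": {
                "trace_id": "abc123",    # assumed to come from your tracing system
                "pod": "checkout-7d9f",  # illustrative values only
                "namespace": "shop",
            }
        },
    )
    return json.loads(buffer.getvalue())


if __name__ == "__main__":
    print(emit_example())
```

Because every entry carries the same named fields, a query such as "all ERROR entries for this trace ID" becomes a field match rather than a text search, which is one way to cut through log noise.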