Site Reliability Engineering: Top SRE Tools As Voted On By SREs


Catchpoint is proud to present the top SRE tools as voted on by SREs.

In our fourth annual SRE Survey, compiled in partnership with VMware Tanzu Observability and DevOps Institute, we simply asked, “What are a few tools that every SRE should have available in their toolbelt?” Today, we are excited to share the findings with you.

Download: The SRE Survey was a primary data source for the annual SRE Report; download it here.

While some of the answers were not strictly tools, the analysis gives us valuable insight into the mindset of an SRE. Whether you are a veteran or a newcomer, this list is sure to provide value (and some humor!) for you, your teams, and your business.

SRE toolkit word cloud

Empirical analyst note: The survey question, “What are a few tools that every SRE should have available in their toolbelt?” elicited 590 unique responses. The survey was open for the month of April 2021. The data presented in this post does not constitute endorsement.

According to the Site Reliability Engineering book, “SRE is what happens when you ask a software engineer to design an operations team.” Since SREs work to increase the efficiency of operational activities and mitigate the risk of their transformational activities, this list is divided into appropriate sections:

  • Operational Tools
  • Developer Tools
  • DevOps Lifecycle Tools
  • Other

Operational Tools

Tool                                        Percentage
Grafana                                     3.90%
Prometheus                                  2.88%
Terraform                                   2.37%
Catchpoint                                  2.20%
New Relic                                   1.69%
Observability Solution                      1.36%
Incident Investigation                      1.02%
CMDB - Configuration Management Database    0.34%
ThousandEyes                                0.34%
Digital Experience Monitoring               0.34%
Honeycomb                                   0.34%
Configuration Management Tools              0.34%
iPerf                                       0.17%
Authy                                       0.17%
LogicMonitor                                0.17%
Machine Learning Capabilities               0.17%
Distributed Job Execution                   0.17%
AppDynamics                                 0.17%
Freshdesk                                   0.17%
Icinga                                      0.17%
BigPanda                                    0.17%
Slack                                       0.17%
status.io                                   0.17%
Sumo Logic                                  0.17%
Playbooks/Runbooks                          0.17%
Troubleshooting Tools and Capabilities      0.17%
Devolutions                                 0.17%
Puppet                                      0.17%
VirtualBox                                  0.17%
Runbook Automation                          0.17%

Developer Tools

Tool                                        Percentage

DevOps Lifecycle Tools

Other

Tool                                Percentage
Stack Overflow                      0.17%
Ability To Say No When Necessary    0.17%
Customer Understanding              0.17%
Cloud Computing Capabilities        0.17%
Problem Solving Abilities           0.17%
Regular Reporting                   0.17%

SREs Perform A Broad Range Of Activities

SRE is a specific implementation of broader DevOps principles and philosophies. While we have categorized this list, it’s important to bear in mind that, to inspire innovation and solve complex problems, SREs must perform a broad range of activities. These include low-level operational tasks, tactical implementations, and high-level strategic initiatives.

It’s also important to remember that the value of some of these activities is realized quickly. For others, it may take time to show value, or the value may be more difficult to measure.

DevOps Lifecycle for SREs

Regardless of whether a survey response should have been classified as “Dev” or “Ops,” or assigned to one or more stages of the DevOps lifecycle, one thing is certain: SRE is here to stay.

We hope this data helps you in your efforts to be free to be an SRE! If you are an SRE and have recommendations for additional tools to add to the list, tweet us at @Catchpoint. We plan to keep updating the list over time, and we value your input!

If you'd like to attend SREcon, you can register here: https://www.usenix.org/conference/srecon21

Published on Oct 12, 2021
Director of Product Marketing