Site Reliability Engineers (SREs) Say Incident Resolution, Related Stress and Excessive Manual Tasks Are Key Concerns
NEW YORK--Mar 25, 2019--Catchpoint™, the digital experience monitoring (DEM) leader, today released findings of its 2019 global survey of site reliability engineers (SREs), who say incident resolution is a major part of their role.
The survey showed incident resolution can contribute to a stressful work environment with management sometimes unaware of that impact. Incidents are defined as unplanned service interruptions including outages, operational overload, slowdowns in delivery, notification fatigue and other unanticipated events.
The 2019 survey of 188 professionals with the title or responsibilities of SRE, conducted by Catchpoint in January and February, also revealed:
- 49 percent said they had worked on an incident in the last week, and the same percentage said they have worked on an outage lasting longer than a day at some point in their career.
- Nearly 60 percent said their responsibilities involve excessive amounts of manual, repetitive tasks; only 38 percent said they’ve used automation to reduce that toil.
- 64 percent said their role or SRE team has been in existence for three years or less, indicating the job description is still evolving.
- 67 percent of SREs who feel stress after each incident do not believe their company cares about their well-being.
The full survey report is available here: https://www.sresurvey2019.com/.
“While stress is part of an SRE’s job, the survey shows incidents have been normalized and many organizations are not addressing the impact,” comments Nithyanand Mehta, VP of Professional Services at Catchpoint. “Combine this with the 48 percent who said their company hasn’t defined service level objectives for essential services, and a question emerges: is the SRE role evolving proactively based on business needs and employee satisfaction, or is it becoming reactive and contributing to IT’s high turnover rate?”
Catchpoint Analysis: Organizational culture is often shaped by incidents and stress levels. Respondents reported that the majority of incidents were related to “massive changes made under duress to meet deadlines,” with little interest in conducting a post-incident review. Stress levels won’t fall until better processes are put in place. These include building systems, applications and services better from the start, sharing knowledge around incidents, and automating communication around incident response (e.g., setting up Slack channels, postmortem pages, and status updates).
All companies, whether they have been doing SRE for over a decade or have just started, can find ways to improve their incident management and reduce toil with better automation and improved alerting.
“The role of the SRE is critical in an era where the digital experience is directly connected to business outcomes,” comments Mehdi Daoudi, CEO and co-founder of Catchpoint. “By focusing on the human element, our second SRE survey can hopefully shed light on what effect experience-impacting incidents like outages or slowdowns have on your teams and their ability to avoid or contain them. My biggest takeaway: if most SREs are spending excessive time in repetitive tasks, this does not leave enough room for the key components of a true SRE team - capacity planning, and improving the performance, availability and resiliency of the systems, applications and services for which they are responsible.”
LinkedIn currently lists over 2,000 U.S. job openings for SREs, twice as many as at this time last year, when Catchpoint released its first SRE survey.
Catchpoint is revolutionizing end-user experience monitoring to help companies deliver amazing digital experiences. Our platform provides complete visibility into your users’ experiences from anywhere – and real-time intelligence into your applications and services to detect and fix issues faster. We are proud to partner with digital innovators like Google, L’Oréal, Verizon, Oracle, LinkedIn, Honeywell, Priceline, and Qualtrics, who trust Catchpoint to improve their brand experience and drive their business success. See how Catchpoint can reduce your Mean Time to Detect at www.catchpoint.com/freetrial.
View source version on businesswire.com: https://www.businesswire.com/news/home/20190325005196/en/
CONTACT: The Medialink Group for Catchpoint:
- Kristina LeBlanc, email@example.com
- Frank Cioffi, firstname.lastname@example.org