What a difference a year makes. In a matter of 365 days, the entire planet stared down uncertainty, and while most of the world is far from recovered, we are starting to see a time when some level of normalcy will return. But what will this look like? How will the past year transform our social interactions, our time out of the house, and how we conduct business?
Each year, we issue a survey focused on the Site Reliability Engineer (SRE). This emerging role within DevOps supports businesses in assuring the availability, performance, reachability, and reliability of systems, networks, and applications. While this role mostly comes from the development operations side of the ITOps group, its focus goes beyond code testing. The SRE is pivotal in making sure that not only the code works, but also all the supporting pieces that deliver the code to the user, whether that user is a customer or an employee.
We just launched the annual SRE survey for 2021, and it will be very interesting to see the results. Last year’s first survey focused on the role of an SRE. Because of the events in early 2020, we then launched an addendum survey, after much of the world went into lockdown, to understand how the shift to a remote workforce impacted the SRE community. The collective results from both pieces were very interesting. Four key insights emerged from last year’s results, two of which seem to have been heavily influenced by the global pandemic.
Observability components exist, but observability does not.
What our SRE colleagues told us last year is that performance and availability were measured with more of an internal focus than an external view. This meant that while developed code was highly monitored, there was less focus on the end user’s experience with that code, and less emphasis on monitoring and optimizing the service delivery chain. So even though the organization focused on keeping error rates low, impacts on the end user, such as response time, were not monitored as closely. This creates a risk of a poor digital experience. True observability requires inferring the state of a system from both internal and external outputs.
Heavy ops workload comes at a cost.
Last year, our survey respondents reported a significant emphasis on operations-based work, which means less time was spent on development activities. While one would think this was a product of the disruptive year, the numbers were nearly identical before and after lockdowns occurred. Like all things, a healthy balance is needed. More emphasis on development activities should result in a reduction of operations work. It will be interesting to see whether the distribution of work for an SRE changes this year and whether a stronger focus on DevOps performance is realized.
The shift to remote creates opportunities and challenges.
Most IT organizations had to adjust to working from home, and this created a new set of challenges. It was no longer a discussion about the balance between revenue-generating projects and toil. It became a challenge of work/life balance. While this might have been an area of risk for any home-based worker, it was especially difficult for IT professionals. Most organizations had to rely on the internet to work. This added stress to IT teams because they were no longer supporting employees behind the corporate firewall on the company LAN. Now they were supporting these teammates remotely over a network that they could not control. Add to that the fact that everyone was home all day and night, and maintaining balance, let alone focus, became a problem. This shift has changed how employees must be empowered and trusted to optimize their own performance. It means an acceleration into cloud-based services, infrastructure, and endpoint controls, which could become a significant productivity improvement, if not a strategic advantage.
The future of SRE is remote and bright.
If there has been a sliver of a silver lining over the last year, it is that the user experience has come into sharp focus for most organizations. Whether the goal is offering fast and easy online purchasing or assuring that an employee always has access to their files, emails, and applications, the experience must be measured and improved. Having observable systems in place to measure from click-to-code is imperative, and the role of the SRE is poised for acknowledgment.
Over the last year, we have certainly seen change impact how and where we work. It will be interesting to see whether anything from the last year has become a permanent part of everyday life for the SRE community. With 2021 showing signs of stabilization and a potential return to the office, the core questions are how and when.
If you have a role in site reliability engineering, we encourage you to participate in this year’s SRE survey.