Incident Review: Maintaining Sessions, Cookies, & Persistence
One of our customers, a major financial corporation, recently ran into an issue when Catchpoint triggered intermittent alerts for a specific test. Digging into the data, we found a case involving cookies that showed how much the way CDNs map session and cookie data matters to performance and end-user experience.
Web applications rely on sessions to maintain state. When a user accesses a web application, a session with a unique session ID is created. The session stores data relevant to the function of the application, such as the username or other parameters that customize the end-user experience. These sessions are kept in server memory for a defined period. For example, when you log into your account on a website, a session keeps track of any configuration changes or additional options that customize the user experience while you are logged in.
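As a rough illustration of the idea above, here is a minimal sketch of a server-side session store that keeps session data in memory for a defined period. The class name `SessionStore`, its methods, and the `ttl_seconds` parameter are hypothetical and chosen only for this example; real frameworks provide their own session handling.

```python
import secrets
import time


class SessionStore:
    """Toy in-memory session store; a stand-in for server memory."""

    def __init__(self, ttl_seconds=1800.0):
        self.ttl_seconds = ttl_seconds
        self._sessions = {}  # session_id -> (expires_at, data)

    def create(self, data, now=None):
        """Create a session and return its unique, hard-to-guess ID."""
        now = time.time() if now is None else now
        session_id = secrets.token_urlsafe(16)
        self._sessions[session_id] = (now + self.ttl_seconds, dict(data))
        return session_id

    def get(self, session_id, now=None):
        """Return the session data, or None if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if now >= expires_at:
            # The defined period has elapsed, so drop the session.
            del self._sessions[session_id]
            return None
        return data


store = SessionStore(ttl_seconds=60)
sid = store.create({"username": "alice", "language": "en-US"}, now=0)
print(store.get(sid, now=30))  # inside the 60-second window
print(store.get(sid, now=61))  # past the window, so None
```

Note that the store lives inside one server process, so a second server would know nothing about a session created here.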
Cookies are generated and stored on the client side for that session, and each cookie is linked to the domain that created it. The values stored in the cookies make it easier to retrieve specific information from the server side. If a connection sits idle and is terminated, the application can reconnect to the same session on that server by passing along the session ID that has been stored as a cookie.
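The round trip described above can be sketched with Python's standard `http.cookies` module. The cookie name `session_id` and the value `abc123` are placeholders invented for this example; the sketch only shows how a `Set-Cookie` value from the server turns into the `Cookie` header the client sends back.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"  # placeholder session ID
response_cookie["session_id"]["path"] = "/"
response_cookie["session_id"]["secure"] = True
response_cookie["session_id"]["httponly"] = True
set_cookie_header = response_cookie["session_id"].OutputString()
print("Set-Cookie:", set_cookie_header)

# Client side: store the cookie, then send only name=value pairs back.
jar = SimpleCookie()
jar.load(set_cookie_header)
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items()
)
print("Cookie:", cookie_header)
```

Attributes such as `path`, `secure`, and `HttpOnly` stay with the client; only the name and value travel back to the server on later requests.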
This seems like an easy way to maintain sessions and exchange bits of data. But current web architecture is a lot more complex, as it involves more than just one client and one server. Requests from multiple clients are sent to different servers so that the traffic is distributed, saving server resources and ensuring optimal performance. In such a setup, a user can be routed to a server that does not have any saved session for that particular user. Cookie-based persistence, or stickiness, is used to avoid such scenarios. This technique ensures that requests from the user are mapped to the relevant server for the duration of the session.
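A minimal sketch of cookie-based persistence might look like the following. The backend names, the `lb_backend` cookie name, and the `route` function are all hypothetical; real load balancers typically encode or encrypt the backend identifier rather than storing it in plain text.

```python
import itertools

BACKENDS = ["server-a", "server-b", "server-c"]  # hypothetical pool
STICKY_COOKIE = "lb_backend"  # hypothetical cookie name
_round_robin = itertools.cycle(BACKENDS)


def route(request_cookies):
    """Pick a backend for a request.

    Returns (backend, cookies_to_set). A request that already carries a
    valid sticky cookie goes back to the same backend; any other request
    is distributed round-robin and is handed a sticky cookie.
    """
    pinned = request_cookies.get(STICKY_COOKIE)
    if pinned in BACKENDS:
        return pinned, {}
    backend = next(_round_robin)
    return backend, {STICKY_COOKIE: backend}


# First request: no cookie yet, so the balancer chooses and pins a backend.
cookies = {}
first_backend, to_set = route(cookies)
cookies.update(to_set)

# Later requests carry the cookie and land on the same backend.
later_backends = [route(cookies)[0] for _ in range(5)]
print(first_backend, later_backends)
```

Without the sticky cookie, every call to `route` would advance the round-robin and the user would bounce between servers that do not share session state.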
Figure: Load balancer with and without persistent (sticky) sessions
Our customer was using a similar technique to manage user login on their customer portal. The tests we were running showed that users were intermittently unable to log in to the site.
We looked at the header data to understand what could be happening. In this case, the cookie that was set during the session somehow changed, causing the login process to fail.
The expected behaviour during the login process was:
Object Response Headers:
Set-Cookie: __RequestVerificationToken_L1ZQQw2=UPu_WgLN1ghfOjvCZ6aeNh0tDZ0hCFDCwuoYdTzKgSs2CzFum9IrenXmHwC4f3sVOY9_FRnBKycotMERqHgPiBCZYTM1; path=/; secure; HttpOnly
Redirect request – Request Headers:
Cookie: PaymentControls=cicunba0ugh55rxetjl2nq1f; ASP.NET_SessionId=; LanguageSelected_ _bazic444=en-US; __RequestVerificationToken_L1ZQQw2=UPu_WgLN1ghfOjvCZ6aeNh0tDZ0hCFDCwuoYdTzKgSs2CzFum9IrenXmHwC4f3sVOY9_FRnBKycotMERqHgPiBCZYTM1; lbs=!GV5Yy2cYv344wI3jE2oBe50eA+UGWA7f1BrBpP1NpuVLAb5+l/HGywXuPGP8jH4Jvi5oATYegQPYvPfoYCC8QLS9hoyHNDvkUgT1WGs=; authenKey=87aa4032-eb2e-48dd-94f5-1eddd7f8fd7e__RequestVerificationToken=NVPpYEQI4QH3M6ptiv6j0Yip0sj1J8N3FOmVp5ccvGlEHNlDH7ch7mJH3HKCkNFCcw6BiiHMQmJ7jvlnte6dll_cQ0g1
The headers returned a different cookie in all the failed runs:
Object Response Headers:
Set-Cookie: __RequestVerificationToken_L1ZQQw2=oZZNrVVzP2wnKBv6S65T7PneIwFOpdHM6rlqOV5DGtRFnqbR6Tq6vTp044S-6Cr6D2Nq9ppLydN8rZgC6PsLRjci3D41; path=/; secure; HttpOnly
Redirect request – Request Headers:
Cookie: PaymentControls=iizurivzwp5kfpsmlfe0dtfe; ASP.NET_SessionId=; LanguageSelected_ _bazic444=en-US; __RequestVerificationToken_L1ZQQw2=oZZNrVVzP2wnKBv6S65T7PneIwFOpdHM6rlqOV5DGtRFnqbR6Tq6vTp044S-6Cr6D2Nq9ppLydN8rZgC6PsLRjci3D41; lbs=!Sjt/ICv7N73ll2cEn+fedyiDBMRN93qVClPmEFjwEdROa9LUXiiJ66ZTlVf0XR9U/GY9m/LOL4l19TP+DMSHTgr4GEkqc6YSzPzahxk=UserName=*******************&Password=***********&OldPassword=&NewPassword=&ConfirmPassword=&__RequestVerificationToken=Z-4ET7RU0ug-izi2FnMKpKE7Q0L2u-JfLrAKtuBy_JVGsL1GoLw07U2_mbgfuQpINqJFXBjDebDmCFVj8R7H3ho09Ow1
We were able to identify what was causing the failure by correlating the header information with data from the failed tests. The user requests were being redirected to the wrong server, which caused a mismatch in the cookie values.
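This kind of correlation can be partly automated. The sketch below parses two `Cookie` request headers and reports which cookies differ between a passing run and a failing run. The helper names `parse_cookie_header` and `diff_cookies` are hypothetical, and the header values are short placeholders, not the real values from the captures above.

```python
def parse_cookie_header(header):
    """Split a Cookie request header into a {name: value} dict."""
    cookies = {}
    for pair in header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments that contain no '=' at all
            cookies[name] = value
    return cookies


def diff_cookies(a, b):
    """Return {name: (value_in_a, value_in_b)} for cookies that differ.

    A cookie missing from one side is reported with None on that side.
    """
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Placeholder headers, shortened for readability.
passing = "PaymentControls=aaa; LanguageSelected=en-US; lbs=!node1; authenKey=k1"
failing = "PaymentControls=bbb; LanguageSelected=en-US; lbs=!node2"

changed = diff_cookies(parse_cookie_header(passing), parse_cookie_header(failing))
for name, (ok_value, bad_value) in changed.items():
    print(f"{name}: passing={ok_value!r} failing={bad_value!r}")
```

Session-specific values are expected to differ between runs, so the useful signal is a routing-related cookie that changes or an authentication cookie that is missing only in the failing runs.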
The customer was using multiple datacenters to handle traffic, with Akamai providing load balancing. When user traffic hits a single datacenter, Akamai does not load balance the traffic and the cookie is mapped to that same datacenter, resulting in a successful login. However, when multiple datacenters come into play, the Akamai load balancers redirect the traffic to different datacenters, creating a new session with a new cookie that is then mapped randomly. This could be the result of persistence not being enabled at the CDN level, so the traffic ends up at the wrong datacenter, which breaks the login session.
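To see why such a failure would be intermittent, consider the toy simulation below. It assumes, purely for illustration, that each datacenter only accepts verification tokens it issued itself and that a non-persistent balancer picks a datacenter at random for every request. The `Datacenter` class, `run_logins` function, and datacenter names are hypothetical and do not describe Akamai's actual behavior.

```python
import random
import secrets


class Datacenter:
    """Toy datacenter that only recognizes tokens it issued itself."""

    def __init__(self, name):
        self.name = name
        self._issued = set()

    def serve_login_page(self):
        """Issue a verification token along with the login page."""
        token = secrets.token_hex(8)
        self._issued.add(token)
        return token

    def submit_login(self, token):
        """Accept the login only if this datacenter issued the token."""
        return token in self._issued


def run_logins(datacenters, attempts, sticky, rng):
    """Return how many of `attempts` two-step logins fail."""
    failures = 0
    for _ in range(attempts):
        first = rng.choice(datacenters)
        token = first.serve_login_page()
        # With persistence the second request follows the first one;
        # without it the balancer picks a datacenter again at random.
        second = first if sticky else rng.choice(datacenters)
        if not second.submit_login(token):
            failures += 1
    return failures


rng = random.Random(42)
dcs = [Datacenter("dc-east"), Datacenter("dc-west")]
print("failures without persistence:", run_logins(dcs, 1000, False, rng))
print("failures with persistence:", run_logins(dcs, 1000, True, rng))
```

With persistence the failure count is zero, and with a single datacenter it is zero either way, which matches the observation that logins succeed when traffic hits only one datacenter.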
This was an intermittent issue, but it had a real end-user impact that had gone unnoticed until the Catchpoint tests triggered the alerts. The test data provided visibility into the root cause, which the customer forwarded to their CDN team to get the issue addressed.
Web applications are built on distributed systems. Complex parts work together to form a fully functional application, and each component has its own set of variables that can impact performance. Measuring only load time, downtime, and availability is not enough: this data is not comprehensive, and it will leave you searching for answers when a performance issue occurs.
You need to have visibility at every point in the delivery chain; root cause analysis becomes extremely difficult without it. To quickly resolve an issue, you must have the right data from the right perspectives, and this is only possible when performance is tracked at all levels of the application, including third-party vendors like CDNs.