QA Horror Stories [Chapter 1]: Nightmare On Regression Street
Working as a software QA Manager can be very rewarding. You get the opportunity to be an important part of software development teams, deeply exploring products to understand their mechanisms. This deep dive allows you to leverage your insights to enhance the product, playing a vital role in refining the outcome. It’s a wonderful aspect of the job!
However, there are those challenging days too. Imagine days when it feels like everything is going off-track. You’re testing a small piece of code, and nothing seems to function as expected. Conversations with team members escalate into debates or even arguments. Then, you encounter bug after bug in the application, despite expectations that everything should be running smoothly. On such days, it feels like you’re in a horror movie, desperately trying to escape an overwhelming sense of fear and pressure. Yet, at every turn, there seems to be a new nightmare waiting for you.
This is precisely what I’ll be exploring in my new series ‘QA Horror Stories’. In this series, I’ll delve into my own experiences, covering everything from those tough testing days to very specific, persistent problems during software development that just don’t seem to resolve.
We’ll begin our journey with a topic that sends shivers down the spine of every developer… Regression Testing!
Once Upon a Time, There Was a Project…
Not too long ago, I found myself in a kickoff meeting for a new project at my current job. This wasn’t just any project; it was a major e-commerce platform where we were handling both the backend and the frontend. The team was a mix of experience levels: two junior developers and two senior ones. We had an initial learning period, where everyone had the chance to familiarize themselves with the important frameworks and start using the new tools we needed for the project. At that point, everything appeared to be calm and perfectly planned out!
The Nightmare Begins…Chaos in the Codebase
After discussing the requirements, the developers began their work, but we soon faced our first challenge. The QA or testing environment wasn’t functioning, and our DevOps expert had to step in to resolve it. It turned out to be a complex issue, taking nearly two weeks to fix due to various errors and his limited availability. Despite this setback, we managed to get it up and running, and I was finally able to start testing after almost two weeks of development.
I looked at the Pull Requests (PRs) and tickets in Jira, and then I noticed something terrible… every PR so far had already been merged into the development (main) branch! I had 13 PRs waiting for testing that were, technically speaking, no longer open, and I had to test them all on the dev branch. The project had begun like a nightmare!
First, I tackled the most critical issue: I put an end to the ongoing chaos. During our daily meeting, I insisted that developers provide me with the branch of each Pull Request for testing. I made it clear that we would only merge these branches after my approval. Once this was established, I began testing on the dev branch and, unsurprisingly, found numerous bugs. This led to a complex situation. Since everything had already been merged, pinpointing the exact source of each error was challenging, and each developer denied any connection between the bugs and their code. As a result, we were forced to open additional PRs to fix the ones that had already been merged, and the pile of Pull Requests kept growing!
Lesson #1:
The key takeaway here is the importance of communication with your developers and a clear test strategy. They should always provide the tester with a separate branch where new implementations and features can be tested. This is especially crucial in larger projects, where testing post-merge can lead to confusion. Without a clear testing process, it becomes increasingly difficult to track when changes occur, and the system’s behavior becomes more unpredictable with each newly merged PR.
The Nightmare Continues…Rise Of The Undead Errors
After a considerable amount of time and a few extra gray hairs, we were able to streamline the process of handling PRs. From then on, I received PRs for testing only once the branch had been built successfully and was ready. I tested them and then approved the merge. However, just as we achieved this improvement, the next challenge quickly surfaced on the horizon…
It was a quiet and peaceful Friday when I spoke with the Product Owner, and she informed me that she had come across something unusual on the development (main) branch. She had observed that the menu was not displayed correctly and asked if I had noticed it too. Instantly, my anxiety kicked in, and I started to worry that I had not tested it thoroughly enough. I quickly examined the development branch, and she was absolutely right: there was an issue! How could I have missed such an obvious problem? Fortunately, I had some written documentation of my testing, which allowed me to review the test reports for that specific feature. There, I discovered the problem: the menu had worked perfectly during my testing on the feature branch, and the error had only appeared after the merge!
I had a bad feeling that this might just be the tip of the iceberg, so I delved deeper into the development branch. After two hours of thorough investigation, I uncovered something shocking: there were 27 bugs! What was even more puzzling was that 16 of them, more than half, affected features that had worked correctly during my testing on the feature branch. I couldn’t understand how this had happened. So, I decided to talk to the developers to figure out what went wrong. As I had more discussions with them, I started to uncover the reasons behind this confusing situation, bit by bit.
The first developer admitted that he had encountered merge conflicts frequently and had attempted to resolve them, but he wasn’t sure if he had done it correctly. It seemed that his lack of experience and potential mistakes were contributing factors, which is understandable, especially for a junior developer.
I then approached the senior developers, hoping they might have caught some issues during their code reviews. Surprisingly, their reply wasn’t very comforting. They mentioned that they usually assumed developers did a good job and therefore only gave PRs a brief glance. Really?! Why are we doing the code review in the first place?
Finally, the last developer revealed that he wasn’t very familiar with the framework and often encountered errors he couldn’t fix due to a lack of time for proper learning. So a lot of lessons could be learned here.
Lesson #2:
Make sure your developers, especially the less experienced ones, understand the basic rules for merging pull requests. They also need to know how to update their branch with the latest changes from the main branch. Sometimes, junior developers might be too shy to ask for help and just start working on their own. This can often lead to big problems where the system starts failing little by little.
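One way to make “is my branch up to date?” less of a guessing game is to ask Git directly. The following is only a minimal sketch, not something we used on the project: it assumes Git is installed, that the script runs inside the repository after a fetch, and the helper names (parse_behind_ahead, commits_behind) and the branch names in the usage note are hypothetical.

```python
from __future__ import annotations

import subprocess


def parse_behind_ahead(rev_list_output: str) -> tuple[int, int]:
    """Parse the output of `git rev-list --left-right --count base...branch`.

    Git prints two numbers: commits found only on the left side (base)
    and commits found only on the right side (branch).
    """
    behind, ahead = (int(part) for part in rev_list_output.split())
    return behind, ahead


def commits_behind(base: str, branch: str) -> int:
    """Ask Git how many commits on `base` are still missing from `branch`."""
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count", f"{base}...{branch}"],
        capture_output=True,
        text=True,
        check=True,  # raise if Git reports an error (e.g. unknown branch)
    )
    behind, _ahead = parse_behind_ahead(result.stdout)
    return behind
```

Calling something like commits_behind("main", "feature/menu") before requesting a review gives a number greater than zero whenever the feature branch still needs the latest changes from the main branch, which is a good moment for a junior developer to ask for help with the update instead of resolving conflicts alone.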
Lesson #3:
If you include static testing (like reviews) in your workflows, it’s important to take them seriously. I’ve often heard developers and others mention that they’re too busy for reviews, or they’re not sure how to properly conduct them. If you’re not certain about the code, don’t hesitate to ask the developer who wrote it. If you think something in the code could be improved, leave a comment. And if you don’t have enough time to do a thorough review, it’s okay to ask another developer to step in. Many times, significant issues, especially those related to the architecture and adaptability of the code, go unnoticed because reviews are rushed and not done carefully.
Lesson #4:
If you’re starting to work with a completely new framework, it’s important to inform the other developers in advance. This is also true for QA team members who are testing something unfamiliar. Asking the right questions early on and taking the time to understand the framework you’ll be using or testing is crucial. Clear communication about this learning process ensures that every team member is aware of and prepared for the situation.
Unseen Horrors on Staging
So I discussed all the previously mentioned lessons with the developers, and they implemented them to some extent. As a result, our regression tests started detecting fewer errors on the development branch, leading me to believe we had overcome the worst challenges. Then came the next phase: staging. After our initial deployment to the staging environment, everything appeared to be in order. It all worked smoothly, almost too smoothly, which seemed a bit unsettling.
For those who might not be familiar with this: typically, there are three main environments in software development: testing, staging, and production. The testing or QA environment is where all internal testing happens. Staging is the environment where the customer periodically tests and reviews the current implementation. Finally, production is the environment where everything goes live and becomes accessible to the end users.
Learning from past errors, I immediately initiated a regression test on the staging environment. Everything that had functioned correctly in the testing environment seemed to work well here too. However, I encountered a new bug, one that hadn’t appeared in the testing environment or on the development branch. This raised a new question: were there now fresh regression errors emerging on staging?
Further discussions led us to some answers. We realized that specific settings and scripts in the backend needed to be transferred manually after each deployment to a new environment. Additionally, we discovered that some environment variables weren’t set correctly, among other issues.
Lesson #5:
Remember, errors can pop up anywhere, even during something as seemingly straightforward as deploying to a new environment. Deployments can be time-consuming, especially in bigger, more complex systems, and are often prone to mistakes. Therefore, it’s essential to be ready to conduct a regression test on each new environment where your code base is deployed.
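Some of these deployment mistakes, such as the environment variables that weren’t set correctly on staging, are cheap to catch before the manual regression test even starts. Here is a minimal sketch of such a check. The variable names in REQUIRED_VARS are made up for illustration and are not the ones from our project; you would replace them with whatever your own deployment needs.

```python
from __future__ import annotations

from typing import Mapping, Sequence

# Hypothetical variable names, used only for illustration.
REQUIRED_VARS = ("DATABASE_URL", "PAYMENT_API_KEY", "SHOP_BASE_URL")


def find_missing_vars(
    env: Mapping[str, str],
    required: Sequence[str] = REQUIRED_VARS,
) -> list[str]:
    """Return the required variables that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]
```

Running find_missing_vars(os.environ) right after each deployment (with os imported) returns a list of the variables that still need attention. An empty list doesn’t replace the regression test, but it removes one common source of “it worked in the testing environment” surprises.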
The Final Whisper
This project was an incredible learning experience. It not only gave me a sense of what it’s like to work on an e-commerce project, but it also deepened my understanding of how to test various aspects specific to this domain (stay tuned for a future blog post on that!). However, the most valuable lesson I learned was about the critical importance of regression testing. I often heard it described as a ‘nice-to-have’, with the assumption that everything would work out regardless. I’ll admit, I shared this mindset, partly because it seemed to work out in the past. But relying on past successes isn’t a good strategy for tackling new challenges, and I learned this the hard way.
So, a word of advice: always be prepared for those sneaky, recurring regression errors. They’re lurking in the shadows of your code, ready to invade your dreams and slash through your enthusiasm for testing. These errors might seem to disappear, but beware — they can return unexpectedly, turning your testing process into a real-life nightmare!