I already had a conference talk “in the can” due to being unable to present at the Melbourne Testing Talks conference in 2022, so my preparation only consisted of some minor updates to the slide deck and a practice run to nail down the slide transitions and timing.
The meetup took place on the evening of 12th April and I would be second up (presenting virtually over Zoom), following Ashley Graf’s half-hour talk on “50 questions to faster onboarding (as a QA)”. A decent crowd formed during the first 30-45 minutes of the session and I took the virtual stage just after 6.30pm.
My talk was titled “Lessons Learned in Software Testing” and I shared six lessons I’ve learned during my twenty-odd years in the testing industry. I guess some of my opinions and lessons are a little contrarian, but I’m OK with that as much of what I see presented as consensus around testing (especially on platforms like LinkedIn) doesn’t reflect my lived experience in this industry. If you want to know the six lessons that I shared, you’ll need to watch all 45 minutes of my presentation!
Thanks to Paul for the opportunity to present to the Sydney Testers audience and also for the interesting questions during the Q&A afterwards.
A recording of both talks from this meetup (as well as the Q&A) is available on YouTube (my talk starts at 32 minutes into this recording).
I recently read “Meltdown” by Chris Clearfield & András Tilcsik. It was an engaging and enjoyable read, illustrated by many excellent real-world examples of failure. As is often the case, I found that much of the book’s content resonated closely with testing and I’ll share four of the more obvious cases in this blog post, viz:
An indicator light in the control room led operators to believe that the valve was closed. But in reality, the light showed only that the valve had been told to close, not that it had closed. And there were no instruments directly showing the water level in the core so operators relied on a different measurement: the water level in a part of the system called the pressurizer. But as water escaped through the stuck-open valve, water in the pressurizer appeared to be rising even as it was falling in the core. So the operators assumed that there was too much water, when in fact they had the opposite problem. When an emergency cooling system turned on automatically and forced water into the core, they all but shut it off. The core began to melt.
The operators knew something was wrong, but they didn’t know what, and it took them hours to figure out that water was being lost. The avalanche of alarms was unnerving. With all the sirens, klaxon horns, and flashing lights, it was hard to tell trivial warnings from vital alarms.
Meltdown, p18 (emphasis is mine)
I often see a similar problem with the results reported from large so-called “automated test suites”. As such suites get more and more tests added to them over time (it’s rare for me to see folks removing tests; it’s seen as heresy to do so, even if those tests may well be redundant), the number of failing tests tends to increase and normalization of test failure sets in. Amongst the many failures, there could be important problems but the emergent noise makes it increasingly hard to pick those out.
I often question the value of such suites (i.e. those that have multiple failed tests on every run) but there still seems to be a preference for “coverage” (meaning “more tests”, not actually more coverage) over stability. Suites of tests that tell you nothing different whether they all pass or some fail are to me pointless and pure waste.
So, are you in control of your automated test suites and what are they really telling you? Are they in fact misleading you about the state of your product?
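One way to cut through this kind of noise is to separate failures that are new from failures that have become normalized. The following is only a minimal sketch under my own assumptions: the function name `triage_failures`, the `window` parameter and the sample test names are all hypothetical, and the pass/fail history is assumed to come from wherever your test runner stores past results.

```python
from typing import Dict, List


def triage_failures(history: Dict[str, List[bool]], window: int = 5) -> Dict[str, List[str]]:
    """Split the tests failing on the latest run into 'new' and 'chronic'.

    history maps a test name to its results, oldest first (True = passed).
    The last entry in each list is the latest run.
    """
    new, chronic = [], []
    for name, results in history.items():
        if not results or results[-1]:
            continue  # no data, or the test passed on the latest run
        # Look at up to `window` runs immediately before the latest one.
        previous = results[-(window + 1):-1]
        if previous and not any(previous):
            chronic.append(name)  # failed on every recent run: normalized failure
        else:
            new.append(name)  # passed at least once recently (or has no prior runs)
    return {"new": sorted(new), "chronic": sorted(chronic)}


# Hypothetical history for four tests over five runs.
runs = {
    "test_login": [True, True, True, True, False],
    "test_export_csv": [False, False, False, False, False],
    "test_search": [True, True, True, True, True],
    "test_upload": [False, True, False, True, False],
}
report = triage_failures(runs)
print(report)
```

Under this simple rule, intermittent failures such as `test_upload` land in the “new” bucket alongside genuine regressions, so a real suite would probably want a third category for flaky tests. The chronic bucket is the list to challenge: each test in it either needs fixing or is a candidate for removal.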
2. Systems and complexity
The book focuses on complex systems and how they are different when it comes to diagnosing problems and predicting failures. On this:
Here was one of the worst nuclear accidents in history, but it couldn’t be blamed on obvious human errors or a big external shock. It somehow just emerged from small mishaps that came together in a weird way.
In Perrow’s view, the accident was not a freak occurrence, but a fundamental feature of the nuclear power plant as a system. The failure was driven by the connections between different parts, rather than the parts themselves. The moisture that got into the air system wouldn’t have been a problem on its own. But through its connection to pumps and the steam generator, a host of valves, and the reactor, it had a big impact.
For years, Perrow and his team of students trudged through the details of hundreds of accidents, from airplane crashes to chemical plant explosions. And the same pattern showed up over and over again. Different parts of a system unexpectedly interacted with one another, small failures combined in unanticipated ways, and people didn’t understand what was happening.
Perrow’s theory was that two factors make systems susceptible to these kinds of failures. If we understand those factors, we can figure out which systems are most vulnerable.
The first factor has to do with how the different parts of the system interact with one another. Some systems are linear: they are like an assembly line in a car factory where things proceed through an easily predictable sequence. Each car goes from the first station to the second to the third and so on, with different parts installed at each step. And if a station breaks down, it will be immediately obvious which one failed. It’s also clear what the consequences will be: cars won’t reach the next station and might pile up at the previous one. In systems like these, the different parts interact in mostly visible and predictable ways.
Other systems, like nuclear power plants, are more complex: their parts are more likely to interact in hidden and unexpected ways. Complex systems are more like an elaborate web than an assembly line. Many of their parts are intricately linked and can easily affect one another. Even seemingly unrelated parts might be connected indirectly, and some subsystems are linked to many parts of the system. So when something goes wrong, problems pop up everywhere, and it’s hard to figure out what’s going on.
In a complex system, we can’t go in to take a look at what’s happening in the belly of the beast. We need to rely on indirect indicators to assess most situations. In a nuclear power plant, for example, we can’t just send someone to see what’s happening in the core. We need to piece together a full picture from small slivers – pressure indications, water flow measurements, and the like. We see some things but not everything. So our diagnoses can easily turn out to be wrong.
Perrow argued something similar: we simply can’t understand enough about complex systems to predict all the possible consequences of even a small failure.
Meltdown, p22-24 (emphasis is mine)
I think this discussion of the reality of failure in complex systems makes it clear that trying to rigidly script out tests to be performed against such systems is unlikely to help us reveal these potential failures. Some of these problems are emergent from the “elaborate web” and so our approach to testing these systems needs to be flexible and experimental enough to navigate this web with some degree of effectiveness.
It also makes clear that skills in risk analysis are very important in testing complex systems (see also point 4 in this blog post) and that critical thinking is essential.
3. Safety systems become a cause of failure
On safety systems:
Charles Perrow once wrote that “safety systems are the biggest single source of catastrophic failure in complex, tightly coupled systems.” He was referring to nuclear power plants, chemical refineries, and airplanes. But he could have been analyzing the Oscars. Without the extra envelopes, the Oscars fiasco would have never happened.
DESPITE PERROW’S WARNING, safety features have an obvious allure. They prevent some foreseeable errors, so it’s tempting to use as many of them as possible. But safety features themselves become part of the system – and that adds complexity. As complexity grows, we’re more likely to encounter failure from unexpected sources.
Meltdown, p85 (Oscars fiasco link added, emphasis is mine)
Some years ago, I owned a BMW and, it turns out, it was packed full of sensors designed to detect all manner of problems. I only found out about some of them when they started to go wrong – and they did so much more frequently than the underlying problems they were meant to detect. Sensor failure was becoming an everyday event, while the car generally ran fine. I solved the problem by selling the car.
I’ve often pitched good automation as a way to help development (not testing) move faster with more safety. Putting in place solid automated checks at various different levels can provide excellent change detection, allowing mis-steps during development to be caught soon after they are introduced. But the author’s point is well made – we run the risk of adding so many automated checks (“safety features”) that they themselves become the more likely source of failure – and then we’re back to point 1 of this post!
I’ve also seen similar issues with adding excessive amounts of monitoring and logging, especially in cloud-based systems, “just because we can”. Not only can these give rise to bill shock, but they also become potential sources of failure in themselves and thereby start to erode the benefits they were designed to bring in diagnosing failures with the system itself.
4. The value of pre-mortems
The “premortem” comes up in this book and I welcomed the handy reminder of the concept. The idea is simple and feels like it would work well from a testing perspective:
Of course, it’s easy to be smart in hindsight. The rearview mirror, as Warren Buffett once supposedly said, is always clearer than the windshield. And hindsight always comes too late – or so it seems. But what if there was a way to harness the power of hindsight before a meltdown happened? What if we could benefit from hindsight in advance?
This question was based on a clever method called the premortem. Here’s Gary Klein, the researcher who invented it:
If a project goes poorly, there will be a lessons-learned session that looks at what went wrong and why the project failed – like a medical postmortem. Why don’t we do that up front? Before a project starts, we should say, “We’re looking in a crystal ball, and this project has failed; it’s a fiasco. Now, everybody, take two minutes and write down all the reasons why you think the project failed.”
Then everyone announces what they came up with – and they suggest solutions to the risks on the group’s collective list.
The premortem method is based on something psychologists call prospective hindsight – hindsight that comes from imagining that an event has already occurred. A landmark 1989 study showed that prospective hindsight boosts our ability to identify reasons why an outcome might occur. When research subjects used prospective hindsight, they came up with many more reasons – and those reasons tended to be more concrete and precise – than when they didn’t imagine the outcome. It’s a trick that makes hindsight work for us, not against us.
If an outcome is certain, we come up with more concrete explanations for it – and that’s the tendency the premortem exploits. It reframes how we think about causes, even if we just imagine the outcome. And the premortem also affects our motivation. “The logic is that instead of showing people that you are smart because you can come up with a good plan, you show you’re smart by thinking of insightful reasons this project might go south,” says Gary Klein. “The whole dynamic changes from trying to avoid anything that might disrupt harmony to trying to surface potential problems.”
I’ve facilitated risk analysis workshops and found them to be useful in generating a bunch of diverse ideas about what might go wrong (whether that be for an individual story, a feature or even a whole release). The premortem idea could be used to drive these workshops slightly differently, by asking the participants to imagine that a bad outcome has already occurred and then to come up with ways that could have happened. This might result in the benefit of prospective hindsight as mentioned above. I think this is worth a try and will look for an opportunity to give it a go.
I really enjoyed reading “Meltdown” and it gave me plenty of food for thought from a testing perspective. I hope the few examples I’ve written about in this post are of interest to my testing audience!
I took part in my first “Ask Me Anything” session on 22nd March, answering questions on the topic of “Exploratory Testing” as part of the AMA series organized by The Test Tribe.
Presenting an AMA was a different experience in terms of preparation compared to a more traditional slide-driven talk. I didn’t need to prepare very much, although I made sure to refamiliarize myself with the ET definitions I make use of and some of the most helpful resources so they’d all be front of mind if and when I needed them to answer questions arising during the AMA.
The live event was run using Airmeet.com and I successfully connected about ten minutes before the start of the AMA. The system was easy to use and it was good to spend a few minutes chatting with my host, Sandeep Garg, to go over the nuts and bolts of how the session would be facilitated.
We kicked off a few minutes after the scheduled start time and Sandeep opened with a couple of questions while the attendees started to submit their questions into Airmeet.
The audience provided lots of great questions and we managed to get through them all, in just over an hour. I appreciated the wide-ranging questions which demonstrated a spectrum of existing understanding about exploratory testing. There is so much poor quality content on this topic that it’s unsurprising many testers are confused. I hope my small contribution via this AMA helped to dispel some myths around exploratory testing and inspired some testers to take it more seriously and start to see the benefits of more exploratory approaches in their day-to-day testing work.
Thanks to The Test Tribe for organizing and promoting this AMA, giving me my first opportunity of presenting in this format. Thanks also to the participants for their many questions. I hope I provided useful responses based on my experience of adopting an exploratory approach to testing over the last 15 years or so!
I recently read This Is Marketing by Seth Godin and found it interesting and well-written, as I’d expected. But I didn’t expect this book to have some worthwhile lessons for testing folks who might be trying to change the way testing is thought about and performed within their teams and organizations.
I don’t think I’d previously considered marketing in these terms, but Seth says “If you want to spread your ideas, make an impact, or improve something, you are marketing”. If we’re trying to influence changes in testing, then one of our key skills is marketing the changes we want to make. The following quote from the book (and, no, I didn’t choose this quote simply because it mentions “status quo”!) is revealing:
How the status quo got that way
The dominant narrative, the market share leader, the policies and procedures that rule the day – they all exist for a reason.
They’re good at resisting efforts by insurgents like you.
If all it took to upend the status quo was the truth, we would have changed a long time ago.
If all we were waiting for was a better idea, a simpler solution, or a more efficient procedure, we would have shifted away from the status quo a year or a decade or a century ago.
The status quo doesn’t shift because you’re right. It shifts because the culture changes.
And the engine of culture is status.
I certainly recognise this in some of my advocacy efforts over the years when I was focused on repeating my “truth” about the way things should be from a testing perspective, but less tuned in to the fact that the status quo wasn’t going to shift simply by bombarding people with facts or evidence.
Seth also talks about “The myth of rational choice”:
Microeconomics is based on a demonstrably false assertion. “The rational agent is assumed to take account of available information, probabilities of events, and potential costs and benefits in determining preferences, and to act consistently in choosing the self-determined best choice of action,” says Wikipedia.
Of course not.
Perhaps if we average up a large enough group of people, it is possible that, in some ways, on average, we might see glimmers of this behavior. But it’s not something I’d want you to bet on.
In fact, the bet you’re better off making is: “When in doubt, assume that people will act according to their current irrational urges, ignoring information that runs counter to their beliefs, trading long-term for short-term benefits and most of all, being influenced by the culture they identify with.”
You can make two mistakes here:
1. Assume that the people you’re seeking to serve are well-informed, rational, independent, long-term choice makers.
2. Assume that everyone is like you, knows what you know, wants what you want.
I’m not rational and neither are you.
(Emphasis is mine)
I’m sure that any of us who’ve tried to instigate changes to the way testing gets done in an organization can relate to this! People will often ignore information that doesn’t support their existing beliefs (confirmation bias) and team/organizational culture is hugely influential. It’s almost as though the context in which we attempt to move the needle on testing is important.
I think there are good lessons for testing changemakers in these couple of short passages from Seth’s book, but I would recommend reading the book in its entirety even if you don’t think marketing is your thing – you might just get some unexpected insights like I did.
I was delighted to be invited to participate in a webinar by the Association for Software Testing as part of their “Steel Yourselves” series. The idea is based on the Steel Man technique and I was required to make the strongest case I could for a claim that I fundamentally disagree with – I chose to argue for “Shift Nowhere: A Testing Phase FTW”!
I had plenty of time to prepare for the webinar and to do my research on the use and abuse of testing phases. I also looked into the “shift left” and “shift right” movements as counterpoints to the traditional notion of the testing phase. Sorting through the various conflicting and contradictory ideas around testing phases was an interesting process in itself. I built a short PowerPoint deck and rehearsed it a couple of times (so thanks to my wife, Kylie, and good mate, Paul Seaman, for being my audience) to make sure I would comfortably fit my arguments for a testing phase into the ten-minute window I would have during the webinar.
January 30th came around quickly and the webinar was timed well for Europe (morning) and Australia (evening) as well as places in-between, so it was good to see an audience from various parts of the world. The session was ably facilitated by James Thomas for the AST and Anne-Marie Charrett went first, to make her case that “Crosby was Right. Quality is Free”. She did a great job, fielded the questions from the audience really well and made some good observations on the experience – and concluded right on time at 30 minutes into the session.
I felt like I delivered my short presentation in defence of a testing phase pretty well, getting a few smiles and interesting body language from the audience along the way! There were plenty of questions from James and the audience to challenge my claims and I tried hard to stay “in character” when answering them! The final section of the webinar allowed me to remove the mask and speak freely on my real points of view in this area.
Preparing for and presenting this defence of a testing phase was a challenging and interesting task. If we’re willing to look past the dogma, there are usually some useful ideas we can take away from most things. While I disagree that the lengthy, pre-planned, scripted test phases I was often involved in during the early stages of my testing career really offer much value, I think the noise around the “shift left” and “shift right” movements has left a gap in-between where we still need to take pause and allow some humans to interact with the software before unleashing it on customers. (I’ve written about this previously in my blog post, The Power of the Pause.) Thanks to the AST for the opportunity to present at this webinar and give myself a refresher on this particular area of testing.
A recording of this “Steel Yourselves” webinar, along with plenty more awesome content, can be found on the AST YouTube channel.
It had been almost a year since I first acted as a Peer Advisor for an RST class with Michael Bolton. When Michael reached out to offer the opportunity to participate again, it was an easy decision to join his RST Explored class for the Australia/New Zealand/South East Asia timezones.
The peer advisor role is voluntary and comes with no obligation to attend for any particular duration, so I joined the classes as my schedule allowed. This meant I was in all of the first two days but only briefly during the second two days due to my commitments at SSW. Each afternoon consisted of three 90-minute sessions with two 30-minute breaks.
The class was attended by over 15 students from across Australia, New Zealand and Malaysia. Zoom was used for all of Michael’s main sessions with breakout rooms being used to split the participants into smaller groups for exercises (with the peer advisors roaming these rooms to assist as needed). Asynchronous collaboration was facilitated via a Mattermost instance (an open source Slack clone), which seemed to work well for posing questions to Michael, documenting references, general chat between participants, etc. It would be remiss of me not to call out the remarkable work of Eugenio Elizondo in his role as PA – he was super quick in providing links to resources, etc. as they were mentioned by Michael and he also kept Michael honest with the various administrivia required to run a smooth virtual class.
While I couldn’t commit as much time to the class this time around, I still enjoyed contributing to the exercises by dropping into the breakout rooms to nudge participants along as needed.
As with any class, the participants make all the difference and there were a bunch of very engaged people in this particular class. It was awesome to witness the growth in many of the more engaged folks in such a short time and I hope that even the less vocal participants gained a lot from their attendance. I enjoyed being on the sidelines to see Michael in action and how the participants engaged with his gifted teaching, and I hope I offered some useful advice here and there along the way.
I first participated in RST in 2007 in a chilly Ottawa and have been a huge advocate for this course ever since. The online version is a different beast to the in-person experience but it’s still incredibly valuable and it’s great to see the class becoming accessible to more people via this format. We continue to live in a world of awful messaging and content around testing, with RST providing a shining light and a source of hope for a better future. Check out upcoming RST courses if you haven’t participated yet; they remain the only testing classes that have the Dr Lee stamp of approval!
I’ve got a couple of testing community events coming up to kick off 2023. Both are free to attend and are interactive, so I’d love to see some of my blog audience getting involved!
30 January 2023 (8pm AEDT): Steel Yourselves webinar
First up is a webinar for the Association for Software Testing as part of their interesting series called “Steel Yourselves”, in which testers make the strongest case they can for claims they fundamentally disagree with.
In this webinar, I will argue the case for a testing phase, hence the title of my talk, “Shift Nowhere: a Testing Phase FTW”. After I make my case, the audience will have the chance to challenge my point of view and finally I’ll reflect on the experience of having to make a case for something I disagree with.
This webinar will also feature another well-respected Australian tester, with Anne-Marie Charrett presenting her case that “Crosby was Right. Quality is Free”.
I’m looking forward to this very different kind of webinar and have been busy putting my case together and practising my 10-minute presentation. It will certainly be interesting to deal with the challenges coming from the audience (“interesting” also encompassing “well out of my comfort zone”!).
8 March 2023 (9.30pm AEDT): Ask Me Anything on Exploratory Testing
My second online event is for The Test Tribe, as part of their “Ask Me Anything” series, in which I’ll be fielding audience questions about exploratory testing. This is a topic I’m passionate about and I have extensive experience of using exploratory testing as an approach, so I hope I can help others get a better understanding of the approach and its power during this session.
This is again something quite different for me as I generally present a prepared presentation of some kind, whereas in this case I don’t know what the audience will ask. I expect this to be quite a challenging experience but I’m excited to share my knowledge and experience on this oft-misunderstood topic.
It feels like much less than a year since I was penning my review of 2021, but the calendar doesn’t lie so it really is time to take the opportunity to review my 2022.
I published just 10 blog posts this year, so didn’t quite meet my personal target cadence of a post every month. There were a few reasons for this, the main one being my unexpected re-entry into employment (more on that below). Perhaps due to my more limited output, my blog traffic dropped by about 40% compared to 2021. I continue to be grateful for the amplification of my blog posts via their regular inclusion in lists such as 5Blogs and Software Testing Weekly.
March was the biggest month for my blog by far this year, thanks to a popular post about a video detailing how testers should fake experience to secure roles. I note in writing this blog post now that the video in question has been removed from YouTube, but no doubt there are similar videos doing the rounds that encourage inexperienced testers to cheat and misrepresent themselves – to the detriment of both themselves and the reputation of our industry.
I again published a critique of an industry report in November (after publishing similar critiques in 2020 and 2021) and this was my second most popular post of the year, so it’s good to see the considerable effort that goes into these critique-style posts being rewarded by good engagement.
I closed out the year with about 1,200 followers on Twitter, steady year on year, but maybe everyone will leave Twitter soon if the outrage many are expressing recently isn’t fake!
For the first few months of 2022, I continued doing a small amount of consulting work through my own business, Dr Lee Consulting. It was good to work directly with clients to help solve testing challenges and I was encouraged by their positive feedback.
Quite unexpectedly, an ex-colleague from my days at Quest persuaded me to interview at SSW, the consultancy he joined after Quest. A lunch with the CEO and some formalities quickly led to an offer to become SSW’s first Test Practice Lead (on a permanent part-time basis). I’ve now been with SSW for about seven months and it’s certainly been an interesting journey so far!
The environment is quite different from Quest. Firstly, SSW is a consultancy rather than a product company and I’ve come to realise how different the approach is in the consulting world compared to the product world. Secondly, SSW is a small Australian company compared to Quest being a large international one, so meetings are all standard working hours (and I certainly don’t miss the very early and very late meetings that so frequently formed part of my Quest working day!).
I have been warmly welcomed across SSW and I’m spreading the word on good testing internally, as well as working directly with some of SSW’s clients to improve their approaches to testing and quality management.
As I announced mid-2021, I was excited to be part of the programme for the in-person Testing Talks 2021 (The Reunion) conference in Melbourne, rescheduled for October 2022. Unfortunately, I had to give up my spot on the programme due to my COVID vaccination status – though, surprise surprise, all such restrictions had been removed by the time the event actually took place. But I did attend the conference and it was awesome to see so many people in the one place for a testing event, after the hiatus thanks to the pandemic and the incredibly harsh restrictions that resulted for Melbourne. (I blogged about my experience of attending Testing Talks 2022.)
In terms of virtual events, I was fortunate to be invited to act as a peer advisor for one of Michael Bolton’s virtual RST classes running in the Australian timezone. This was an awesome three-day experience and I enjoyed interacting with the students as well as sharpening my understanding of some of the RST concepts from Michael’s current version of the class.
Two very enjoyable virtual events came courtesy of the Association for Software Testing (AST) and their Lean Coffees. I participated in the May and September events suited to my timezone and they were enlightening and fun, as well as offering a great way to engage with other testers in an informal online setting.
I had an enjoyable conversation with James Bach too, forming part of his “Testing Voices” series on the Rapid Software Testing YouTube channel.
Although I’ve interacted with James online and also in person several times (especially during his visits to Melbourne), this was our most in-depth conversation to date and it was fun to talk about my journey into testing, my love of mathematics and my approach to testing. I appreciate James’s continued passion for testing and, in particular, his desire to move the craft forward.
I didn’t publish an updated version of my book An Exploration of Testers during 2022, but may do in 2023. I’m always open to additional contributions to this book, so please contact me if you’re interested in telling your story via the answers to the questions posed in the book!
I made good progress on the free AST e-book, Navigating the World as a Context-Driven Tester, though. This book provides responses to common questions and statements about testing from a context-driven perspective, with its content being crowdsourced from the membership of the AST and the broader testing community. I added a further 10 responses in 2022, bringing the total to 16. I will continue to ask for contributions about once a month in 2023. The book is available from the AST’s GitHub.
Paul Seaman, Toby Thompson and I kicked off The 3 Amigos of Testing podcast in 2021 and produced three episodes in that first year, but we failed to reconvene to produce more content in 2022. There were a number of reasons for this, but we did get together to work up our next episode recently, so expect our next podcast instalment to drop in early 2023!
Volunteering for the UK Vegan Society
I’ve continued to volunteer with the UK’s Vegan Society both as a proofreader and also contributing to their web research efforts. I’ve learned a lot about SEO as a result of the web-related tasks and I undertook an interesting research project on membership/join pages to help the Society to improve its pages around joining with the aim of increasing new memberships.
I really enjoy working with The Vegan Society, increasing my contribution to and engagement with the vegan community worldwide. It was particularly rewarding and humbling to be awarded “Volunteer of the Season” and be featured in the Society’s member magazine, The Vegan, towards the end of the year.
As always, I’m grateful for the attention of my readers here and also followers on other platforms. I wish you all a Happy New Year and I hope you enjoy my posts and other contributions to the testing community to come through 2023 – the first public opportunity to engage with me in 2023 will be the AST’s Steel Yourselves webinar on January 30, when I’ll be arguing the case for a testing phase. I hope to “see you” there!
Thanks to the wonders of modern communication technology, I was interviewed by Rob Sabourin as part of his course on Software Engineering Practice for McGill University undergraduates in Montreal, Canada.
The early evening timeslot for Rob’s lecture on “Estimation” was perfect for me in Australia and I sat in on the lecture piece before my interview.
I’ve spent a lot of time in Rob’s company over the years, in both personal and professional settings, watching him give big keynote presentations, workshops, meetup group talks and so on. But I’d never witnessed his style in the university lecture setting so it was fascinating to watch him in action with his McGill students. He covered the topic very well, displaying his deep knowledge of the history of software engineering to take us from older approaches such as function point analysis, through to agile and estimating “at the last responsible moment”. Rob talked about story points (pointing out that they’re not an agile version of function points!) and estimating via activities such as planning poker. He also covered T-shirt sizing as an alternative approach, before wrapping up his short lecture with some ideas around measuring progress (e.g. burndown charts). Rob’s depth of knowledge was clear, but he presented this material in a very pragmatic and accessible way, perfectly pitched for an undergraduate audience.
With the theory over, it was time for me to be in the hot seat – for what ended up being about 50 minutes! Rob structured the interview by walking through the various steps of the Scrum lifecycle, asking me about my first-person experience of all these moving parts. He was especially interested in my work with highly distributed Scrum teams (spanning Europe, Israel, US, China and Australia) and how these team structures impacted the way we did Scrum. It was good to share my experiences and present a “real world” version of agile in practice for the students to compare and contrast with the theory.
It was a lot of fun spending time with Rob in this setting and I thank him & his students for their engagement and questions. I’m always open to sharing my knowledge and experience; it’s very rewarding and the least I can do given all the help I’ve had along the journey that is my career so far (including from Rob himself).
It’s that time of year again and I’ve gone through the pain of reviewing the latest edition of Capgemini’s annual World Quality Report (to cover 2022/23) so you don’t have to.
I reviewed both the 2018/19 and 2020/21 editions of their report in some depth in previous blog posts and I’ll take the same approach to this year’s effort, comparing and contrasting it with the previous two reports. Although this review might seem lengthy, it’s a mere summary of the 80 pages of the full report!
The survey results in this year’s report are more of the same really and I don’t feel like I learned a great deal about the state of testing from wading through it. My lived reality working with organizations to improve their testing and quality practices is quite different to the sentiments expressed in this report.
It’s good to see the report highlighting sustainability issues, a topic that hasn’t received much coverage yet but will become more of an issue for our industry I’m sure. The way we design, build and deploy our software has huge implications for its carbon footprint, both before release and for its lifetime in production usage.
The previous reports I reviewed were very focused on AI & ML, but these topics barely get a mention this year. I don’t think the promise of these technologies has been realised at large in the testing industry and maybe the lack of focus in the report reflects that reality.
It appears that the survey respondents are drawn from a very similar pool to previous reports and the lack of responses from smaller organizations means that the results are heavily skewed to very large corporate environments.
I would have liked to see some deep questions around testing practice in the survey to learn more about what’s going on in terms of human testing in these large organizations, but alas there was no such questioning here (and these organizations seem to be less forthcoming with this information via other avenues too, unfortunately).
The visualizations used in the report are very poor. They look unprofessional, the use of multiple different styles is unnecessary and many are hard to interpret (as evidenced by the fact that the authors saw fit to include text explanations of what you’re looking at on many of these charts).
I reiterate my advice from last year – don’t believe the hype, do your own critical thinking and take the conclusions from surveys and reports like this with a (very large) grain of salt. Keep an interested eye on trends but don’t get too attached to them and instead focus on building excellent foundations in the craft of testing that will serve you well no matter what the technology du jour happens to be.
The survey (pages 72-75)
This year’s report runs to 80 pages, continuing the theme of being slightly thicker each year. I looked at the survey description section of the report first as it’s important to get a picture of where the data came from to build the report and support its recommendations and conclusions.
The survey size was 1750, suspiciously being exactly the same number as for the 2020/21 report. The organizations taking part were again all of over 1000 employees, with the largest number (35% of responses) coming from organizations of over 10,000 employees. The response breakdown by organizational size was very similar to that of the previous two reports, reinforcing the concern that the same organizations are contributing every time. The lack of input from smaller organizations unfortunately continues.
While responses came from 32 countries, they were heavily skewed to North America and Western Europe, with the US alone contributing 16% and then France with 9%. Industry sector spread was similar to past reports, with “High Tech” (18%) and “Financial Services” (15%) topping the list.
The types of people who provided survey responses this year were also very similar to previous reports, with CIOs at the top again (24% here vs. 25% last year), followed by QA Testing Managers and IT Directors. These three roles comprised over half (59%) of all responses.
Introduction (pages 4-5)
There’s a definite move towards talking about Quality Engineering in this year’s report (though it’s a term that’s not explicitly defined anywhere) and the stage is set right here in the Introduction:
We also heartily agree with the six pillars of Quality Engineering the report documents: orchestration, automation, AI, provisioning, metrics, and skill. Those are six nails in the coffin of manual testing. After all, brute force simply doesn’t suffice in the present age.
So, the talk of the death of manual testing (via a coffin reference for a change) continues, but let’s see if this conclusion is backed up by any genuine evidence in the survey’s findings.
Executive Summary (pages 6-7)
The idea of a transformation occurring from Quality Assurance (QA) to Quality Engineering (QE) is the key message again in the Executive Summary, set out via what the authors consider their six pillars of QE:
Agile quality orchestration
Quality automation
Quality infrastructure testing and provisioning
Test data provisioning and data validation
The right quality indicators
Increasing skill levels
In addition to these six pillars, they also bring in the concepts of “Sustainable IT” and “Value stream management” (more on those later).
Key recommendations (pages 8-9)
The set of key recommendations from the entirety of this hefty tome comprises little more than one page of the report and the recommendations are roughly split up as per the QE pillars.
For “Agile quality orchestration”, an interesting recommendation is:
Track and monitor metrics that are holistic quality indicators across the development lifecycle. For example: a “failed deployments” metric gives a holistic view of quality across teams.
While I like the idea of more holistic approaches to quality (rather than hanging our quality hat on just one metric), the example seems like a strange choice. Deployments can fail for all manner of reasons and, on the flipside, “successful” deployments may well be perceived as low quality by end users of the deployed software.
For “Quality automation”, it’s pleasing to see a recommendation like this in such a report:
Focus on what delivers the best benefits to customers and the business rather than justifying ROI.
It’s far too common for automation vendors to make their case based on ROI (and they rarely actually mean ROI in any traditional financial use of that term) and I agree that we should be looking at automation – just like any other ingredient of what goes into making the software cake – from a perspective of its cost, value and benefits.
Moving on to “Quality and sustainable IT”, they recommend:
Customize application performance monitoring tools to support the measurement of environmental impacts at a transactional level.
This is an interesting topic and one that I’ve looked into in some depth during volunteer research work for the UK’s Vegan Society. The design, implementation and hosting decisions we make for our applications all have significant impacts on the carbon footprint of the application and it’s not a subject that is currently receiving as much attention as it deserves, so I appreciate this being called out in this report.
In the same area, they also recommend:
Bring quality to the center of the strategy for sustainable IT for a consistent framework to measure, control, and quantify progress across the social, environmental, economic, and human facets of sustainable IT, even to the extent of establishing “green quality gates.”
Looking at “Quality engineering for emerging technology trends”, the recommendations are all phrased as questions, which seems strange to me and I don’t quite understand what the authors are trying to communicate in this section.
Finally, in “Value stream management”, they say:
Make sure you define with business owners and project owners the expected value outcome of testing and quality activities.
This is a reasonable idea and an activity that I’ve rarely seen done, well or otherwise. Communicating the value of testing and quality-related activities is far from straightforward, especially in ways that don’t fall victim to simplistic numerical metrics-based systems.
Current trends in Quality Engineering & Testing (p10-53)
More than half of the report is focused on current trends, again around the pillars discussed in the previous sections. Some of the most revealing content is to be found in this part of the report. I’ll break down my analysis into the same sections as the report.
Quality Orchestration in Agile Enterprises
I’m still not sure what “Quality Orchestration” actually is and fluff such as this doesn’t really help:
Quality orchestration in Agile enterprises continues to see an upward trend. Its adoption in Agile and DevOps has seen an evolution in terms of team composition and skillset of quality engineers.
The first chart in this section is pretty uninspiring, suggesting that only around half of the respondents are getting 20%+ improvements in “better quality” and “faster releases” as a result of adopting “Agile/DevOps” (which are frustratingly again treated together as though they’re one thing, the same mistake as in the last report).
The next section used a subset of the full sample (750 out of the 1750, but it’s not explained why this is the case) and an interesting statistic here is that “testing is carried out by business SMEs as opposed to quality engineers” “always” or “often” by 62% of the respondents. This seems to directly contradict the report’s premise of a strong movement towards QE.
For the results of the question “How important are the following QA skills when executing a successful Agile development program?”, the legend and the chart are not consistent (the legend suggesting “very important” response only, the chart including both “very important” and “extremely important”) and, disappointingly, none of the answers have anything to do with more human testing skills.
The next question is “What proportion of your teams are professional quality engineers?” and the chart of the results is a case in point of how badly the visuals have been designed throughout this report. It’s an indication that the visualizations are hard to comprehend when they need text to try to explain what they’re showing:
Using different chart styles for each chart isn’t helpful and it makes the report look inconsistent and unprofessional. This data again doesn’t suggest a significant shift to a “QE first” approach in most organizations.
The closing six recommendations (page 16) are not revolutionary and I question the connection that’s being made here between code quality and product quality (and also the supposed cost reduction):
Grow end-to-end test automation and increase levels of test automation across CI/CD processes, with automated continuous testing, to drive better code quality. This will enable improved product quality while reducing the cost of quality.
Quality Automation
The Introduction acknowledges a problem I’ve seen throughout my career and, if anything, it’s getting worse over time:
Teams prioritize selecting the test automation tools but forget to define a proper test automation plan and strategy.
They also say that:
All organizations need a proper level of test automation today as Agile approaches are pushing the speed of development up. Testing, therefore, needs to be done faster, but it should not lose any of its rigor. To put it simply, too much manual testing will not keep up with development.
This notion of “manual” testing failing to keep up with the pace of development is common, but suggests to me that (a) the purpose of human testing is not well understood and (b) many teams continue to labour under the misapprehension that they can work at an unsustainable pace without sacrificing quality.
In answering the question “What are the top three most important factors in determining your test automation approach?”, only 26% said that “Automation ROI, value realization” was one of the top 3 most important factors (while, curiously, “maintainability” came out top with 46%). Prioritizing maintainability over an ability to realize value from the automation effort seems strange to me.
Turning to benefits, all eight possible answers to the question “What proportion (if any) of your team currently achieves the following benefits from test automation?” were suspiciously close to 50%, so perhaps the intent of the question was not understood and it ended up with a flip of the coin response. (For reference, the benefits in response to this question were “Continuous integration and delivery”, “Reduce test team size”, “Increase test coverage”, “Better quality/fewer defects”, “Reliability of systems”, “Cost control”, “Allowing faster release cycle” and “Autonomous and self-adaptive solutions”.) I don’t understand why “Reduce test team size” would be seen as a benefit and this reflects the ongoing naivety about what automation can and can’t realistically achieve. The low level of benefits reported across the board leads the authors to note:
…it does seem that communications about what can and cannot be done are still not managed as well as they could be, especially when looking to justify the return on investment. The temptation to call out the percentage of manual tests as automated sets teams on a path to automate more than they should, without seeing if the manual tests are good cases for automation and would bring value.
We have been researching the test automation topic for many years, and it is disappointing that organizations still struggle to make test automation work.
Turning to recommendations in this area, it’s good to see this:
Focus on what delivers the best benefits to customers and the business rather than justifying ROI.
It’s also interesting that they circle back to the sustainability piece, especially as automated tests are often run across large numbers of physical/virtual machines and for multiple configurations:
A final thought: sustainability is a growing and important trend – not just in IT, but across everything. We need to start thinking now about how automation can show its benefit and cost to the world. Do you know what the carbon footprint of your automation test is? How long will it be before you have to be able to report on that for your organization? Now’s the time to start thinking about how and what so you are ready when that question is asked.
Quality Infrastructure Testing and Provisioning
This section of the report is very focused on adoption of cloud environments for testing. In answer to “What proportion of non-production environments are provisioned on the cloud?”, they claim that:
49% of organizations have more than 50% of their non-production environments on cloud. This cloud adoption of non-production environments is showing a positive trend, compared to last year’s survey, when only an average of 23% of testing was done in a cloud environment
The accompanying chart does not support this conclusion, showing 39% of respondents having 26-50% of their non-production environments in the cloud and just 10% having 51-75% there. They also conflate “non-production environment” with “testing done in a cloud environment” when comparing this data with the previous report, when in reality there could be many non-testing non-production environments inflating this number.
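As a quick check of the arithmetic, here is a minimal Python sketch using only the two chart values quoted above. The variable names are mine and any other bands on the original chart are left out, so treat it as an illustration rather than a reconstruction of the report's data. Of the quoted bands, only the 10% figure sits above the 50% mark, while the two bands together add up to 49, which happens to match the number in the report's claim.

```python
# Percentages quoted above from the report's chart (share of respondents per band).
# Assumption: only the two bands mentioned in this post are included; any other
# bands on the original chart are not reproduced here.
quoted_bands = {
    "26-50%": 39,
    "51-75%": 10,
}

# Respondents in the quoted bands with strictly more than 50% of
# non-production environments on cloud.
above_half = quoted_bands["51-75%"]

# Respondents across both quoted bands, i.e. anywhere from 26% to 75% on cloud.
combined = sum(quoted_bands.values())

print(f"More than 50% on cloud (quoted bands only): {above_half}%")
print(f"26-75% on cloud (both quoted bands): {combined}%")
```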
They go on to look at the mix of on-premise and cloud environments and whether single vendor or multiple vendor clouds are in use.
In answer to “Does your organization include cloud and infrastructure testing as part of the development lifecycle?”, the data looked like this:
The authors interpreted this data to conclude that “It emerged that around 96% of all the respondents mention that cloud testing is now included as part of the testing lifecycle”, but where does 96% come from? The question is a little odd and the responses even more so – the first answer, for example, suggests that for projects where applications are hosted on the cloud, only 3% of respondents mandate testing in the cloud – doesn’t that seem strange?
The recommendations in this section were unremarkable. I found the categorization of the content in this part of the report (and the associated questions) quite confusing and can’t help but wonder if participants in the survey really understood the distinctions being drawn here.
Test Data Provisioning and Data Validation
Looking at where test data is located, we see the following data (from a subset of just 580 of the total of 1750 responses; the reason is again not provided):
I’m not sure what to make of this data, especially as the responses are not valid answers to the question!
The following example just shows how leading some of the questions posed in the survey really are. Asking a high-level question like this to the senior types involved in the survey is guaranteed to produce a close to 100% affirmative response:
Equally unsurprising are the results of the next questions around data validation, where organizations reveal how much trouble they have actually doing it.
The recommendations in this section were again unremarkable, none really requiring the results of an expensive survey to come up with.
Quality and Sustainable IT
The sustainability theme is new to this year’s report, although the authors refer to it as though everyone knows what “sustainability” means from an IT perspective and that it’s been front of mind for some time in the industry (which I don’t believe to be the case). They say:
Sustainable quality engineering is quality engineering that helps achieve sustainable IT. A higher quality ensures less wastage of resources and increased efficiencies. This has always been a keystone focus of quality as a discipline. From a broader perspective, any organization focusing on sustainable practices while running its business cannot do so without a strong focus on quality. “Shifting quality left” is not a new concept, and it is the only sustainable way to increase efficiencies. Simply put, there is no sustainability without quality!
Getting “shift left” into this discussion about sustainability is drawing a pretty long bow in my opinion. And it’s not the only one – consider this:
Only 72% of organizations think that quality could contribute to the environmental aspect of sustainable IT. If organizations want to be environmentally sustainable, they need to learn to use available resources optimally. A stronger strategic focus on quality is the way to achieve that.
We should be mindful when we see definitive claims, such as “the way” – there are clearly many different factors involved in achieving environmental sustainability of an organization and a focus on quality is just one of them.
I think the results of this question about the benefits of sustainable IT say it all:
It would have been nice to see the environmental benefits topping this data, but it’s more about the organization being seen to be socially responsible than it is about actually being sustainable.
When it comes to testing, the survey explicitly asked whether “sustainability attributes” were being covered:
I’m again suspicious of these results. Firstly, it’s another of the questions only asked of a subset of the 1750 participants (and it’s not explained why). Secondly, the results are all very close to 50% so might simply indicate a flip of the coin type response, especially to such a nebulous question. The idea that even 50% of organizations are deliberately targeting testing on these attributes (especially the efficiency attributes) doesn’t seem credible to me.
One of the recommendations in this section is again around “shift left”:
Bring true “shift left” to the application lifecycle to increase resource utilization and drive carbon footprint reduction.
While the topic of sustainability in IT is certainly interesting to me, I’m not seeing a big focus on it in everyday projects. Some of the claims in the report are hard to believe, but I acknowledge that my lack of exposure to IT projects in such big organizations may mean I’ve missed this particular boat already setting sail.
Quality Engineering for Emerging Technologies
This section of the report focuses on emerging technologies and impacts on QE and testing. The authors kick off with this data:
This data again comes from a subset of the participants (1000 out of 1750) and I would have expected the “bars” for Blockchain and Web 3.0 to be the same length if the values are the same. The report notes that “…Web 3.0 is still being defined and there isn’t a universally accepted definition of what it means” so it seems odd that it’s such a high priority.
I note that, in answer to “Which of the following are the greatest benefits of new emerging technologies improving quality outcomes?”, 59% chose “More velocity without compromising quality” so the age old desire to go faster and keep or improve quality persists!
The report doesn’t make any recommendations in this area, choosing instead to ask pretty open-ended questions. I’m not clear what value this section added, as it feels like crystal ball gazing (and, indeed, the last part of this section is headed “Looking into the crystal ball”!).
Value Stream Management
The opening gambit of this section of the report reads:
One of the expectations of the quality and test function is to assure and ensure that the software development process delivers the expected value to the business and end-users. However, in practice, many teams and organizations struggle to make the value outcomes visible and manageable.
Is this your expectation of testing? Or your organization’s expectation? I’m not familiar with such an expectation being set against testing, but acknowledge that there are organizations that perhaps think this way.
The first chart in this section just makes me sad:
I find it staggering that only 35% of respondents feel that detecting defects before going live is even in their top three objectives from testing. The authors had an interesting take on that, saying “Finding faults is not seen as a priority for most of the organizations we interviewed, which indicates that this is becoming a standard expectation”, mmm.
The rest of this section focused more on value and, in particular, the lean process of “value stream mapping”. An astonishing 69% of respondents said they use this approach “almost every time” when improving the testing process in Agile/DevOps projects – this high percentage doesn’t resonate with my experience but again it may be that larger organizations have taken value stream mapping on board without me noticing (or publicizing their love of it more broadly so that I do notice).
Sector analysis (p54-71)
I didn’t find this section of the report as interesting as the trends section. The authors identify eight sectors (almost identically to last year) and discuss particular trends and challenges within each. The sectors are:
Consumer products, retail and distribution
Energy, utilities, natural resources and chemicals
Healthcare and life sciences
Technology, media and telecoms
Four metrics are given in summary for each sector, viz. the percentage of:
Agile teams that have professional quality engineers integrated
Teams that achieved better reliability of systems through test automation
Agile teams that have test automation implemented
Teams that achieved faster release times through test automation
It’s interesting to note that, for each of these metrics, almost all the sectors reported around the 50% mark, with financial services creeping a little higher. These results seem quite weak and it’s remarkable that, after so long and so much investment, only about half of Agile teams report that they’ve implemented test automation.
The main World Quality Report was supplemented by a number of short reports for specific locales. I only reviewed the Australia/New Zealand one and didn’t find it particularly revealing, though this comment stood out (emphasis is mine):
We see other changes, specifically to quality engineering. In recent years, QE has been decentralizing. Quality practices were merging into teams, and centers of excellence were being dismantled. Now, organizations are recognizing that centralized command and control have their benefits and, while they aren’t completely retracing their steps, they are trying to find a balance that gives them more visibility and greater governance of quality assurance (QA) in practice across the software development lifecycle.