Thursday, November 24, 2005

SugarShack 09: Reflections on QA: Where Can We Go Now?

For more than a year now I have played MxO, through various builds, character wipes, and twists and turns. Two things have struck me about the QA evolution of the game: it has not evolved, and it is not effective. Fortunately, there are simple, relatively cost-free solutions that will help sustain the land we love. But some of the ways the devs work will need to change. Now, here I will be using the term devs broadly. Strictly speaking, the devs are the folks who sit and code features and content all day long, as distinct from content designers, artists, mission designers, world architects, combat architects, etc. For the sake of simplicity I’ll be referring to them all as devs: the people who provide the game.

To begin with, here are four examples from recent history. Believe me, I could provide a whole lot more, but it would be painful and superfluous.

1. The SaiKung Shuffle: Endless running between three or four close buildings permitted people to rack up fast, vast xps. Apparently, the devs did not think about this. Since running from one place to another is not an exploit (nor, by the way, is efficiency), this must be attributed to bad planning and poor design.

Key problem: No tester thought like a player, to find a way to level as quickly as possible. Bingo! Lack of contact with customer!

2. Sudden instability in items. Walrus thoughtfully gives us a long list of items suddenly and silently affected by quick decay. “Lack of communication” prevented this from being conveyed until many people had seen items start to unravel almost before their very eyes. And these are items _not_on_Walrus’_list_.

Key problem (likely): No tester characters were prepped like real characters: with tons of junk. No one noticed the effects on other items. They only focused on the items on the list, and never thought about others. See the methodological problem here?

3. The mission timer. Remember this one? Players were penalized for being efficient, and rewarded for being disorganized and slow! Is there anywhere else in the universe where this happens? What were the devs thinking!

Key problem: No one played like players do, and no one thought about how players would respond to the ham-handed communications (or lack thereof).

4. The /afk emote. Remember this? Announced and documented, and DOA/MIA. How hard would it have been to test this and make sure it worked in the final patch build before sending the patch out? Apparently too hard, because no one did.

Key problem (likely): No one tested it, or no one communicated it. Either is a dismal choice.

These are, to my mind, representative, not definitive. And they have gone on for more than a year, so it’s hard to say “Oh, it was due to team turnover” or “management transition” or whatever. So, having pointed to a problem, I would like to suggest a solution.

One, the devs need to spend more time in the game.

We all occasionally hear of clans that claim to have devs or admins in them. Or, anyway, people who claim to be devs. Sure could have fooled me! Were this to be true, more realistic feedback to the devs would have prevented calamities like the ones I have described. Remember back in beta when some of the devs came in to play their own game, and got soundly whipped? That’s a symptom of a dev team inadequately experienced with their own creation. And it is seldom a recipe for success.

Two, the QA team needs to use characters for testing which model real characters.

In fact, they would do well to simply copy or model various existing characters from the game. This would have identified, for example, the problem with items decaying. Clearly they tested it with the items on the list. Clearly they did not test it with other items. Testing with characters outfitted the way real characters are would help avoid mistakes like this.

Three, in many areas the players know more than the devs do, when it should be the reverse.

For example, there are people in my clan and other clans who have spent months and months tinkering with builds to get the right combination of skills at the right level for any occasion. They are more cognizant of this than the devs are. The devs need to find a way to approach, learn, and assimilate this knowledge. Otherwise, they will be reinventing the wheel, and there’s no guarantee that theirs will be round.

Four, the devs need to be more systematic and disciplined in QA and testing.

Most of the problems I have discussed so far stem from inadequate testing. But how can testing be improved? Make it more systematic and more realistic. In the Software Development LifeCycle, you have what is called unit testing: you try each piece of code or each process individually to make sure it does what it is supposed to do. Then, in integration testing, you try the pieces in small combinations with others to make sure they still work together, without unintended consequences. This is easy to do with a few modules. It’s very hard to do with dozens or hundreds. But there are standardized tools, called test scripts, test plans, and test cases, that help focus on likely problems. For example, if you want to make sure that an item has an accelerated rate of decay, you might:

- Have a test character with the item and nothing else run in circles for hours and establish a baseline rate of decay.
- Have the same character with a new instance of the item run through 100 hardlines (25 from each major area) and see if the rate is present, and consistent.
- Have the same character use the teleporters in dungeons 100 times, and see if the rate is present, and consistent.
- Have the same character email the item back and forth 30 times and see if the rate is present, and consistent.
- Have the same character transfer the item to someone else and see if the rate is present, and consistent.
- Have the same character jack in and out a few dozen times with the same item and see if the rate is present, and consistent.
- Have a computer with the same character suddenly lose its network connection or get hard-rebooted a few dozen times, and see if the decay rate is present, and consistent.
- Have a coder decompile and recompile the item 10 or 20 times, and make sure the recreations are all uniform.

Then do the above with other buffed or non-buffed items to establish a real baseline, and make sure it works. Each of the above is a test case. All of them together form a test plan.
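
To make that concrete, here is a minimal sketch in Python of how such a test plan might be written down. Everything in it is an assumption for illustration: the `TestCase` and `check_rate` names, the tolerances, and above all the measurements, which are made-up stand-in numbers rather than anything taken from the game. A real harness would replace each stand-in with a script that actually performs the action and records the decay.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Dict, List, Tuple


@dataclass
class TestCase:
    """One test case: a name plus a way to collect observed decay rates."""
    name: str
    measure: Callable[[], List[float]]  # hypothetical hook into the game


def check_rate(samples: List[float], expected: float,
               tolerance: float = 0.10,
               max_spread: float = 0.05) -> Tuple[bool, bool]:
    """Return (present, consistent) for a list of observed decay rates.

    present:    the average is within `tolerance` of the expected rate
    consistent: the spread between trials is small relative to that rate
    """
    present = abs(mean(samples) - expected) <= tolerance * expected
    consistent = pstdev(samples) <= max_spread * expected
    return present, consistent


def run_plan(cases: List[TestCase],
             expected: float) -> Dict[str, Tuple[bool, bool]]:
    """A test plan is just every test case run against the same expectation."""
    return {case.name: check_rate(case.measure(), expected) for case in cases}


# Stand-in measurements (invented numbers, for illustration only).
plan = [
    TestCase("run in circles (baseline)", lambda: [2.0, 2.0, 2.1, 1.9]),
    TestCase("100 hardlines", lambda: [2.0, 2.1, 2.0, 1.9]),
    TestCase("email back and forth 30 times", lambda: [2.0, 3.5, 0.5, 2.0]),
    TestCase("jack in and out", lambda: [0.0, 0.0, 0.0, 0.0]),
]

report = run_plan(plan, expected=2.0)
failures = [name for name, (present, consistent) in report.items()
            if not (present and consistent)]

for name, (present, consistent) in report.items():
    print(f"{name}: present={present}, consistent={consistent}")
```

With these invented numbers, the emailing case comes out present but not consistent, and the jacking case comes out consistent but not present. Those are exactly the two kinds of surprise a plan like this is meant to catch before a patch ships.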

Maybe some of these cases are irrelevant, and maybe the numbers of test instances are excessive, but you get the idea: you have to test something in as many relevant circumstances as can be imagined, in order to make sure that things go the right way. Monolith did some things along these lines in beta. Remember the hordes of numbered little bots that ran around and did things? That was more server load testing, if I recall correctly. And there are many automated test tools available for quality assurance work.

Now, many people reading this will say this is too expensive and too much hassle. I encourage them to review the performance blunders noted above, and reconsider. After all, in life the cost of not testing is all too often much higher than the cost of testing. Think of the Pinto, the O-Rings, and any number of air crashes.

These approaches will yield a better game. But there is likely no budget for more testers (assuming there are any!). But who needs more paid testers when they have us?

How can the community help? If SOE ever re-establishes a QA server, we need to invest serious time on it. But simply having us all transfer our characters over there is of limited value. A flood of 50s tells the devs nothing about how low, intermediate, and developing levels experience any proposed changes, and they are the new markets the game must cultivate. I’m willing to roll up a new character for this (named prettyprettyprincess), and I hope others are as well.

So I would like to challenge the devs to prepare tasks for us on the QA server. Try new abilities in various combinations. Try new items in various actions and combinations. Try new mishes individually and with groups. Try emailing this and that. Etc. In essence, simply opening a QA server will not yield the input the devs need. They need a more disciplined, systematic approach to QA. There is likely no budget for a flood of testing staff (even timeshared with other SOE games), but careful and constructive use of the user community can achieve much the same effect.
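
As a rough illustration of what "prepared tasks" could look like, here is a small Python sketch that spreads every combination of level band, task, and play mode across a pool of volunteers. The band labels, task names, the `build_assignments` helper, and the extra tester names are all hypothetical; only the idea of covering more than level 50s comes from the argument above.

```python
from itertools import cycle, product
from typing import Dict, List, Tuple

Combo = Tuple[str, str, str]


def build_assignments(testers: List[str], bands: List[str],
                      tasks: List[str],
                      modes: List[str]) -> Dict[str, List[Combo]]:
    """Hand out every (band, task, mode) combination round-robin."""
    assignments: Dict[str, List[Combo]] = {t: [] for t in testers}
    for tester, combo in zip(cycle(testers), product(bands, tasks, modes)):
        assignments[tester].append(combo)
    return assignments


# Illustrative inputs; the labels are assumptions, not an official list.
testers = ["prettyprettyprincess", "tester_b", "tester_c"]
bands = ["low", "intermediate", "developing", "level 50"]
tasks = ["new abilities", "new items", "new mishes", "emailing items"]
modes = ["solo", "group"]

assignments = build_assignments(testers, bands, tasks, modes)

for tester, combos in assignments.items():
    print(f"{tester}: {len(combos)} tasks, starting with {combos[0]}")
```

Nothing is left to chance in a scheme like this: every band gets every task, both solo and grouped, instead of whatever a crowd of 50s happens to wander into.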

This might all sound negative, so let me hasten to say they have done some things well. The Halloween stuff, very tightly focused, went very smoothly. And the Pandora’s Box mish arcs have been well-planned, imaginatively conceived and written, and not unbalancing at all. Grats grats to all involved, for showing that it can be done! There has clearly been some QA success in MxO; we can only hope that some best practices are percolating through the rest of the game team.

Conclusion:

- Devs need more time ingame. Serious time!
- QA has been dismal, and needs a radical re-think.
- QA needs to be systematic and disciplined.
- The QA server is a fabulous opportunity but it must also be used in a systematic manner.

This will avoid past catastrophes, enhance dev control and contribute to excellence in execution. What’s not to like?