a strategy doesn’t have to be a big giant document. It starts as an idea in your head and you have to get other people on board as part of that strategy. So you need to share some knowledge in some format to help share your idea. This blog is about how I’d go about developing a new test strategy in a new team.
History of the term
First let’s take some time to understand this term; strategy. Historically the word strategy is associated with war and battle:
Strategy is to help you win or achieve some goal. Many people talk about their tactics when they are thinking of their strategy. Tactics are your how. They aren’t your whole strategy.
A tactic is a conceptual action or short series of actions with the aim of achieving of a short-term goal. This action can be implemented as one or more specific tasks.
This book helped me understand the term, “Strategy” in a visual and fun way.
According to this book a strategy has 4 parts:
A distinct, measurable goal
A sequence of actions or tactics
Start with a purpose
If I was dumped into a new team tomorrow and asked to develop a test strategy, I’d start by interviewing/surveying a few people. Depending on the size of the team and who I was working with it could be an online survey or a casual chat over a coffee. I’d ask something along the following lines:
What does quality mean to you?
What are common problems in the testing process here?
If you could fix just one thing about our quality, what would it be?
Now different people are going to answer this differently. Developers might say test code coverage, easily maintainable code and easy deployments make a high quality product. Your project manager might say happy customers. Testers might say less bugs found in the test phase.
Develop a goal
Once I’ve surveyed enough people (5 people is a good enough number for most user research interviews), I’ll work on constructing a goal. it might be;
improve our continuous integration build times
increase our test coverage
reduce the amount of negative customer feedback
Make sure it is measurable. You could use SMART or OKR goal formats.
Develop a plan
Now what are some things I or the team could do to achieve our goals? We could create tasks during our sprint to help us work towards our goal. Once you’ve achieved something you survey those original interviewers to see if the perceived quality has actually improved.
Measure the improvements in quality of your product. For my team we are tracking the average app store ratings, crash rates and engagement with in app features to see if they are actually useful. https://bughuntersam.com/metrics-and-quality/
Risks and Gaps
A Test Strategy could also have a section about risks or gaps in this approach. For example things like performance testing and security testing might not be included. Having a brief explainer why these aren’t part of your strategy can be useful for explaining the context and scope.
if you are working on improving the UI Test automation coverage you can use this visual risk based framework to help focus on where to start and what to automate first and measure progress against it as part of your strategy.
I’m more comfortable with the term marketing strategy over test strategy because it’s easier to measure your impact and easier to come up with concrete goals. Software testing isn’t as tangible as many other parts of the business process and can be hard to measure.
Can your strategy be summarised by this comic:
What resources have helped you understand test strategies? I’d love to check them out.
The superannuation and investment mobile app I’ve been working on over the last year has finally been released. It’s been on the app store for just over a month now* and this blog is about how we are using metrics to help keep tabs on the quality of our app.
The average app store rating is one useful metric to keep track of. We are aiming to keep it above 4 stars and we are also monitoring the feedback raised for future feature enhancement ideas. I did an analysis of the average app store reviews of other superannuation apps here to get a baseline of what the industry average is. If we are better than the industry average, we have a good app.
Analytics in mobile apps
We are using Adobe Analytics for tracking page views and interactions for our web and mobile app. On previous mobile app teams I’ve used mParticle and mixpanel. The framework here doesn’t matter, I’ve found adobe workspace to be a great tool for insights, once you know how to use it. Also Adobe has tons of online web tutorials for building out your own dashboards.
App versions over time
Here’s our app usage over time broken down by app version:
We have version 1.1 on the app store and released 1.0 nearly 2 months ago. We did an internal beta release with version 0.5.0. If anyone on the old versions tries to log in they’ll see a forced update view.
Crashes are a fact of life with any mobile app team, there are so many different variables that go into app crashes. However keeping track of them and aiming for low rates is a good thing to measure.
With version 1.1 we improved our crash rates on android from 2.77% to 0.11%. You can use a UI exerciser that is called monkey from the command line in your android emulator to try and find more crashes too. With the following command I can send a 1000 random UI events to the emulator:
I can also keep on eye on how many error messages are seen. The spike in the android app error messages was me throwing the chaos monkey at out production build for a bit. However when there is both a spike in android and iOS, I know I can ask, “was there something wrong with our backend that day?”
Test Vs Prod – page views
If every page has one event being tracked, we can compare our upcoming release candidate against production; say we see that 75 page views were triggered on the test build and we compare this to the 100 page views we can see in production. We can then say we’ve tested 75% of the app and haven’t seen any issues so far.
There’s no need to aim for 100% coverage, our unit tests do cover every screen but because they run on the internal CI network those events are never sent to adobe. We have over 500 unit/UI tests on both android and iOS (not that number of tests is a good metric, it’s an awful one by the way).
But if you’ve tested the main flows through your app and that’s gotten you 50% or 75% coverage you are now approaching diminishing returns. What’s the chances in finding a new bug? Or a new bug that someone cares about?
You could spend that extra hour or two getting to 90-95% but you could also be doing more useful stuff with your time. You should read my risk based framework if you are interested in finding out more.
If you are working on a new feature or flow of your app, you can measure how many people actually complete the task. E.g. first time log in, how many people actually log in successfully? How many people lock their accounts? If you are trying to improve this process you can track to see if the rates improve or decline.
You could also measure satisfaction after a task is completed and ask for feedback, a quick out of 5 score along the lines of, “did this help you? was it easy to achieve?”. You can put a feedback section somewhere in your app.
The tip of the iceberg
These metrics and insights I’ve shared with you are just a small subset of everything we are tracking. And is a small part of our overall test strategy. Adobe has been useful for digging down into mobile device’s and operating systems breakdowns too. There’s many ways you can cut the data to provide useful information.
What metrics have you found useful for your team and getting a gauge on quality? What metrics didn’t work as well as you had hoped?
This is not financial advice and the views expressed in this blog are my own. They are not reflective of my employers views
Bugasura is an android app and a chrome extension. it helps with keeping track of exploratory testing sessions and comes with screenshot annotation and jira integration.
Here are a couple of screenshots of the android app in action, being used for an exploratory session on our test app.
First I selected the testing session:
While I’m testing I see this Bugasura overlay which I can tap to take a screenshot and write up a bug report on the spot:
Here’s their reporting a bug flow:
And here’s a testing report after I finished my exploratory testing where I can push straight to Jira if I want:
Here’s the sample report link (caveat, the screenshots attached to the bug are now public information on the internet, so there’s a privacy concern right there), but OMG, the exploratory session recorded the whole flow too (so a developer could see exactly what I did to find that bug).
Here’s that bug report in chrome paused at screen 13 out of 18:
Some caveats I’ve found so far; the test report is public (not private by default), you wouldn’t want to include screenshots of private or confidential information.
Bugasura only works on android/chrome. There isn’t an ios version but I guess with some remote device access running through Chrome it could work? We use Gigafox’s Mobile Device Cloud at work to access a central server of mobile devices and I imagine Bugasura could work with it.
Also I think they may have misspelt Elisabeth’s name in her quote.
This blog post reflects my opinions only and do not reflect the views held by my employer.
My team is going through a beta release for our mobile app to get early feedback. We’ve noticed that our android app is struggling compared to iOS. It seems that having an extra hurdle with signing up for the android beta program impacts installations. Naturally we’d expect the android engagement to lag a little behind iOS based on the Aussie mobile usage market analysis but there is still a significant drop.
We have 428 iOS installs, and 99 android installs. That’s a 19% Android installation rate. We have roughly 75-80% successful registrations and return log ins once people actually figure out how to install the app.
Google Groups vs Test Flight
We are using google groups to manage the distribution of the android beta app and because it’s harder to use than test flight for iOS we’ve gotten less installations. It’s fascinating how an extra hurdle in the sign up process can impact installations.
Our android numbers appear to be higher than usual here but I think it’s to do with the timeframe I’m collecting these numbers over. We’ve had a few people install the android app before we officially started the beta release and I think they’ve been counted in this statistics.
I like to use mind maps to help me test. Mind maps are a visual way of brainstorming different ideas. On a previous mobile team I printed and laminated this mind map to bring along to every planning session to help remind me to ask questions like, “What about accessibility? Automation? Security? or Performance?”:
As I go through exploratory testing (or pair testing), I’ll tick things off as I go and take session notes. Often this will involve having conversations with people, sometimes bugs are raised. Here is a quick mind map I’ll use for log in testing:
Heuristics for testing
This mind map approach can be combined with a heuristic test strategy approach or a nemonic test approach. Heuristics is a rule of thumb that helps you solve problems, they often have gaps because no mental model is perfect.
SFDPOT is a common nemonic that was developed by James Bach; who also developed Rapid Software Testing; a context driven methodology. James Developed his RST course with Michael Bolton.
We truly live in a global and inter connected society. But have you tested your app using a Right to Left (RTL) language such as Arabic? This blog post is a reflection on some of the design considerations to keep in mind when accomodating this.
Why does this matter?
Arabic is one of the top 5 spoken languages in the world with around 3 hundred million speakers and it is the third most spoken language in Australia. Even if you only release apps for the Australian market someone out there will have Arabic set as their default device language. It’s ok if you haven’t translated your app, but you should check that these people can still use it.
How do I test this?
Enable developer options and select “Force RTL layout direction”. On My Samsung S10 this is what my screen and options look like after enabling this option:
In Xcode you can change the build target language to a Pseudo RTL language to see how your app renders in this way without having to change the language on your device.
You don’t actually need to render your key pads in Right To Left, in fact it’s actually more jarring to render numbers in a RTL arrangement because ATM’s and phone pads are left to right in Arabic. Most Arab’s are use to globalised number pads. Samsung has an in-depth article on when RTL should be applied.
When I have RTL rendering set on my android phone, the log in pin screen and phone call functionality is in LTR. However some of my banking apps render their pin pads in RTL.
Common RTL Issues
I was pleasantry surprised to find out how many of my apps weren’t broken when I switched to RTL rendering. Facebook, twitter and email still look very good. Some apps (like my calculator) do not make sense to render RTL and they remain LTR:
Bug One: Overlapping labels
You will have to watch out for when labels overlap like in the domain app here:
Bug Two: Visuals doesn’t match written language
And when your text is rendered RTL but the visual cue is still LTR like in the shade bar for representing countries visitors to my blog in this wordpress statistics view:
Bug Three: Menu’s that animate from the side
In the app I’m helping build, the side menu renders pretty funkily in RTL mode, I can’t show you a screenshot of this behaviour but it’s probably the quirkiest RTL bug I’ve seen. If you find an app with bad side menu behaviour in RTL please share your screenshots with me. I’ve also seen a pin login screen where the icons where flipped but the button presses weren’t.
Bug Four: Icon’s aren’t flipped
Often icon’s have a direction associated with them like the walking person when you get google maps directions. Sometimes it can look a little odd when they aren’t flipped correctly (as if they are walking backwards).
Have you seen these bugs before?
Please let me know your thoughts or experiences in supporting RTL languages. I’d love to hear your stories.
Test strategy, what a funny concept. Now this strategy isn’t going to help you win any battles (this is where the word strategy comes from after all) but for lack of a better well understood term, this blog post is a reflection on what I imagine will work for my team*.
*disclaimer: what might work for my team might not work for yours. People are amazingly diverse and your team and company context is fundamentally different. Also this here is a wish list of what I think will work. It’s subject to change as we learn and evolve.
First let’s set the scene. Our scrum team includes 1 Android developer, 1 iOS developer, 2 back end developers, 2 business analysts (1 is our scrum master), 2 testers and a team/tech lead. We are changing our team structure and I’ve come on board as a software engineer in test. Our team closely collaborates with the design team and they are included in our group email threads but don’t come to our retro’s. We have a 10 day sprint cycle that looks a little like this:
We have a daily standup, a few kick off meetings at the start of the sprint to lock in what we are working on for the next 2 weeks, some mid sprint review/next sprint refinement sessions and a few meetings at the end that help tie up what we’ve completed. Consider this a crash course in Scrum Agile if you will. Not everyone is required to attend all of these meetings and I won’t be covering these meetings in detail in this blog post.
Get to the Test Strategy
Yes I know, that was a rambling tangent but the context is important. Before I get into the good bits I’ll ask you a question;
Why do we even bother with testing?
Some people say, “to ensure the product works as expected” or, “to find bugs”, “it’s my job to test things” and these are all ok answers but they miss the point a little. Here’s my answer:
We test to get feedback on the state of the product. To help us answer the question, “are there any known risks with shipping this product to production?”
Paraphrased from conversations with Michael Bolton, the tester not the singer
Every part of the following strategy is all tied into facilitating feedback. The more timely and accurate the feedback the better.
Testing and quality is a team responsibility, it’s not just up to one person to be the quality gate keeper. My role is to help facilitate feedback
Layer One: The product design feedback loop
This is all a little out of scope of my teams day to day activities but this how our design team tests if we are building the right thing that users need.
This might involve researching our market for current trends. How many of our customers care about their superannuation? What is their financial literacy? What type’s of problems are they facing? What are our competitors doing and how does their experience deliver value?
Eventually someone will need to start sketching out some design ideas. What’s the user flow through a particular feature?
This won’t happen for every new design, for example log in hasn’t gone through this process. Our new big features will go through this type of testing. This helps get feedback on the design and layout. Does it all make sense?
Design and user story creation
Out of all of the work, eventually the design team and the business analyst will work together to create acceptance criteria, refine the UI and get the rest of the team up to speed with the context of a feature. Our user stories and designs are usually shared on a confluence page and linked to Jira tasks. We use a GIVEN WHEN THEN structure for our user stories.
Layer Two: The code feedback loop
All testing is exploratory in nature, its front and centre. It’s across everything we do. chaos engineering is a type of it as well as building the code locally. We use our skills, plans and judgement to determine when and how much testing is needed at any point.
When we do the code review we will do exploratory testing based on the risk of the feature. Time boxed to a session or two depending on what has been built. We will look at the user stories, brain storm any more edge cases and consider if they are worth testing. Checking if the experience of the feature makes sense and if there are any ways people can get into some sticky unexpected situations.
As a developer builds a feature, they will create unit tests based on the user acceptance criteria. Developers will use the tools they are most comfortable with the write these tests. If you’d like to read more Martin Fowler has this blog post on Unit Testing.
I have my visual risk board next to our team which we use to prioritise how much testing we build at this layer. We use Espresso for Android and XCUITest for the iOS app.
“Why not Appium?”, I hear you ask
Simple, when test code lives in a repository outside of your production code, you decrease collaboration with the whole team. Also you can’t easily run your appium tests during pre commit testing or locally as a developer. You can follow the interactive visual risk for UI automation exercise here to understand more.
When a new API is being developed, I’ll often pair with the developer to do a code review. We will talk about the architecture, brain storm testing ideas, do a bit of testing (usually through postman if we are testing an API), we will chat about test coverage. Is it adequate? Is there any thing missing? Can we see the tests fail under the expected conditions?
If it’s a front end feature, I’ll check out the code locally and use a different emulator/simulator than what the developer uses. I’ll give the feature a good shake out and check the test coverage. I’ll also test for accessibility if it’s a new front end feature.
For our mobile app, we are able to do most of our code review testing without ever talking to a backend. The engineers have built some mock servers into out apps, when the app would call an API, our mock server returns a canned response. This helps us test that the UI and the flow hangs together even when test environments aren’t available. If you’d like to read more, check out this article on mock testing for android or this one for iOS.
We have different pipelines for different applications. We are using TeamCity as our Continuous Integration tool. Generally all of our unit tests and UI tests will be run. Maybe our contract tests. I have a few other ideas to increase the value from our build pipelines that I’ll talk about in Chaos Testing. If our main builds start failing, we won’t release the software.
We don’t necessarily focus on doing device testing for each feature that comes through. I try to pick a different emulator/simulator than what the developers do. I always make sure features get tested on a Samsung. Some features, if they are 3 star features from our risk analyst we will spend more time testing on a wide variety of devices. We currently have an on premise mobile device cloud server delivered by Mobile Labs. If you don’t have a device cloud, you could set up your own device farm.
Samsung has a wide market saturation and they always do funky stuff to the android UI. The android emulators are awesome at vanilla android. However, most people out there aren’t using vanilla Android :(.
We are moving towards having contract testing in place that lets us know if an API starts to break, if someone changes the JSON payload in an API our contract will break and someone will know the have more stuff to clean up. We don’t have contract testing for our mobile app yet but some of our downstream micro services are starting to build these. If you’d like to find more, read this article by Martin Fowler.
We have an integration test environment where our code is being constantly deployed into. Sometimes it can mean an API is down because it’s being deployed. We do a lot of our API testing in this environment.
With android there’s this command line tool called chaos monkey. This tool is a UI exerciser, it throws random user input at your UI to try and find where it crashes. I’m hoping to include this in a build pipeline for an overnight build. Run it for a few hours on an android device and see if it crashes. The next night, do the same thing but on a different device/os combination. This will give us reasonable device testing over a sprint. I don’t know of a similar tool for iOS. You can read more about chaos engineering on wikipedia.
Layer Three: Shipping the product feedback loop
A few days before the end of the sprint, our team and invited guests will sit down and do some exploratory testing on the features that have just been built. If anyone wants to explore a new API that’s been built, they can. If they’ve had their head in unit tests lately, they have the chance to explore some of the new UI. You can read how to run a bug bash to find out more. If major bugs are found here we won’t release the software. We might do a mob programming session when we don’t have enough features for a bug bash.
On the last day of the sprint we will demo our features to a broader audience. Feedback is gathered and turned into Jira items/research for the design team.
Then we release to internal staff members. Many other companies call this “eating your own dog food”. This gives people the chance to raise more feedback before we put the product in front of customers. You can read more on wikipedia here.
We can release our app to our high value or digitally savvy customers who want to ahead of the curve. This is a customer engagement strategy as well as a test strategy.
Percentage roll outs
The google play store allows you to do percentage rollouts. Say you rollout to 5% on the new version, monitor production for any new crashes or customers complaining. If it’s all smooth for a few days you can continue the rollout to 50% and then 100%. The google play store allows you to roll back if major bugs do occur. The apple play store has a similar feature.
Monitoring in production
What metrics should be communicated back to the team? How can we respond to issues in production? I like this quote from a 5 minute google talk back in 2007:
Sufficiently Advanced Monitoring is Indistinguishable from Testing
Layer Four: Supporting the product feedback loop
We should support all of the devices that 80% of our market uses. We will be support from Android 6 (marsh-mellow) and from iOS 11. There probably will be some obscure android devices out there that don’t play nice with our app. Android is a beast like that.
Facilitating customer feedback
There will be an easy way for customers to provide feedback in app. I have some ideas on how to make that experience better but there are privacy concerns to consider. We will also be monitoring our google/apple play store reviews for bugs.
Someone should be monitoring all of this feedback, attempting to reproduce bugs if customers are facing them and raising them in the teams backlog for prioritisation next sprint.
Soap Opera Testing
Maybe in the future, we could try some soap opera testing with the business? Soap opera testing is a condensed and over dramatised approach to testing. What are the wackiest scenarios our customers have actually tried? How does our system break? You can read more about this exercise here.
Why the layers?
Consider each of these layers like a net. It won’t catch everything, bugs in production will still happen. But when we have all of these feedback loops layered on top of each other, we get a pretty tight net, where hopefully no major issues get into production.
What about auditing or compliance?
Our source of truth is the code, Jira and Confluence. When we have all of it integrated, we can prove we tested a feature thoroughly without too much extra overhead. An auditors mindset and a testers mindset are very similar. Testers are concerned with product risks, auditors are concerned with business risk.
Their main question is, “did you do what you say you do? Did you follow your process?” and, “Is the existing process adequate?”.
Where are your test cases?
Michael Bolton has a 7 part series on breaking the test case addiction. You can read series one here. You don’t need test cases to prove you did adequate testing. They create unnecessary overhead that detract from adding business value.
What else is missing?
Security testing is not included in this test strategy. Neither is performance testing. Getting these included can be challenging. I’m open to your suggestions in how I can incorporate this type of feedback in a timely manner.
Have you wanted to start with automation testing and not known where to begin? Or maybe you have 100’s or thousands of test cases in your current automation pipeline and you want to reduce the build times. Here I will walk you through one way you could consider slicing up this problem. Using examples from Tyro’s banking app (I use to work on their mobile iOS team).
Break into flows
Analyse your app/site/tool and brainstorm the main flows that people will take through it. I picked 6 flows using tyro as an example app. Next I numbered them.
2. Transfer Funds
3. View Transaction
4. Contact Us
5. Change Pin
6. Log in
Mapping those flows to a risk board
Draw a graph, put frequency of use on the x axis down the bottom; things that are more used will be on the right hand side. On the vertical y axis put impact if broken. This is from a person point of view, how much would they care if that feature was a broken? From a business point of view you may have a different understanding of risk and that’s fine two. We will go into how to reflect that later.
Add your flows
Move the flows to your graph
It helps to pair on this exercise to help build up a shared understand. Do your designers and engineers have the same understanding of risk as you do? It’s ok if your answer is different to mine, we all have a different context and understanding.
Reflect other elements of risk
You might want to reflect other elements of risk such as security, financial, regulatory and anything else you can think of. At the end of the day this is only a 2 representation of risk and risk is a little more complex than these dimensions we put here.
Neat, what’s next?
If you are thinking, well that’s cool and all but what does that have to do with automation testing? Then please continue reading. You could use this board to decide which tests you should focus on building/refactoring next (hint, the stuff with 3 stars is pretty important). You could also use this to priortise your performance testing efforts. I took this board to our planning sessions to talk about new features and it helped with deciding how much automation/testing effort we may need. At the end of the day, your software will be more complex than this example.
Here is the actual board I used at Tyro with a bit more detail:
I then broke down each flow into a test case, and grouped similar test cases into a barebones automation test suite. You can also use this approach to generate exploratory testing ideas for each screen in your flow.
At Insight Timer we’ve just ordered a whole bunch of refurbished second hand android phones from Green Gadgets Australia for testing our Android app. We managed to get 9 devices for under $2K and it also gave as a pretty good manufacturer spread from Samsung to Google. This blog is how I went about building a home made charging station for these phones.
Keeping all of these phones charged
The two tablets came in large boxes. I decided I wanted to convert one of those boxes into a charging station. All of the phones had bits of foam that were used to hold the phones in place. I cut up these pieces of foam and hot glued them into the box. I used the left over phone boxes to store extra cables.
However, I forgot to counter for actually plugging the USB’s into something. Next on the order list was a bunch of USB charging stations, extra cables and cable ties. All ordered via MWave.
Charging Station 2nd iteration
After the USB charging stations, cables and 2 more phones from a Chinese supplier turned up I got to work on organising cables. We now have an Xaomi and a Lenovo in our device list.
I even have all of the cable types segregated, so there’s spaces to charge some of the test iPhones we have floating around too.
I’m quite pleased with our spread of devices and the budget of this set up. These devices aren’t exactly the highest end phones today but it’s a good thing our developers love to have the latest and greatest tech toys so they already have the high end covered. I might add one or two more phones that represent tiny screens. Have you built your own device farm before? How’d you keep all of the phones charged?