Test data in manual testing

Posted: 29.11.2013 in Uncategorized

This blog entry starts my series on test data. I will post a new one every now and then. If you have questions or comments, please let me know. I like to learn and I like to share new ideas, so feel free to disagree with me.

Throughout the series I use a single example application: discussion forum software (e.g. SMF or phpBB). Its key features are registration, logging in and out, different kinds of user rights, posting public and private messages, reading them, and searching.

One key feature of manual testing is the human aspect. A human submits the data, and a human checks the results. All test data should take that into account: we have a limited ability to notice small differences.

A common bias I’ve seen in manual testing is the use of repetitive patterns. If the date format is ddmmyy, even testers quite often use dates like 010101 or 111111. They are simple and quick to type, and always valid. But as test data they miss many possible error cases. What if the age calculation swaps the day and the month? Data like that won’t reveal it. Much better is a value that makes the mistake impossible to miss, for example 230142. The day, month and year are all different, so if something is swapped, it is noticed immediately.
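The difference between the two kinds of dates can be checked mechanically. The following is a minimal Python sketch, not part of any forum software; the function names are made up for illustration. It asks three things about a ddmmyy value: is it a real date, would a day/month swap change it, and are the three fields all different.

```python
from datetime import datetime


def swap_day_month(ddmmyy: str) -> str:
    """Return the value with the day and month fields exchanged."""
    return ddmmyy[2:4] + ddmmyy[0:2] + ddmmyy[4:6]


def is_valid_ddmmyy(value: str) -> bool:
    """True if the six digits parse as a real calendar date."""
    try:
        datetime.strptime(value, "%d%m%y")
        return True
    except ValueError:
        return False


def exposes_swap(ddmmyy: str) -> bool:
    """True if a day/month swap changes the value, so the bug becomes visible."""
    return swap_day_month(ddmmyy) != ddmmyy


def fields_distinct(ddmmyy: str) -> bool:
    """True if day, month and year are three different two-digit values."""
    return len({ddmmyy[0:2], ddmmyy[2:4], ddmmyy[4:6]}) == 3


if __name__ == "__main__":
    for candidate in ("010101", "111111", "230142"):
        print(
            candidate,
            "valid:", is_valid_ddmmyy(candidate),
            "exposes swap:", exposes_swap(candidate),
            "fields distinct:", fields_distinct(candidate),
        )
```

For 010101 and 111111 a swap leaves the value unchanged, so the bug stays hidden. For 230142 the swapped value is 012342, which is not even a valid date, so the failure is hard to overlook.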

The forum also requires text data. Testing messages needs plenty of text. Testers and developers usually use lorem ipsum for this, but that practice should be avoided because it has multiple failure points. You can’t read the text, so you most likely won’t notice if characters are replaced with others. Line wrapping should also work correctly, and that is difficult to judge from lorem ipsum. All other content-related errors are likewise masked behind the nonsense. If I need plenty of text, I usually take it from Project Gutenberg.

Many organizations still do scripted manual testing. There you have to decide whether the test data is part of the test case or not. When it is not, the tester is free to use different kinds of inputs, but some of them might be weak, and reproducing a situation becomes harder. When it is, the same data is used over and over again, and in the end we can only say that the software works with the given data, not whether it works with other data.

In my opinion you should consider which inputs are important, and specify those. If it is possible to say “this kind of data” instead of “exactly this data”, prefer “this kind”. It gives more variation in inputs than an exact value, but still tests what is wanted. “Not so important” configuration data should be specified so that it is easy to take into use. I’ve been on a project where configuring the test environment took almost a whole day. In a case like that, all configurations should be specified so that I can find them right away and, in the best case, also take them into use right away.
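One way to get “this kind of data” without losing reproducibility is to generate the value from a short description plus a seed, and to write the seed into the test notes. The sketch below is only an illustration in Python. The “kind” (a user name of a given length that starts with a letter and continues with letters and digits) and the function name are my assumptions, not rules of any real forum software.

```python
import random
import string


def username_of_kind(length: int, seed: int) -> str:
    """Build one user name of the assumed 'kind'.

    The kind here is hypothetical: `length` characters, first one a letter,
    the rest letters or digits. The seed makes the result repeatable.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    rng = random.Random(seed)  # private generator, same seed gives same name
    first = rng.choice(string.ascii_letters)
    rest = "".join(
        rng.choice(string.ascii_letters + string.digits)
        for _ in range(length - 1)
    )
    return first + rest


if __name__ == "__main__":
    # Note the seed next to the test result; the same seed recreates the input.
    print(username_of_kind(length=12, seed=2013))
```

A new seed gives a different input of the same kind for each test round, and the noted seed brings back the exact input if a failure has to be reproduced.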

Thanks to @HelenaJ_M for the question about reusing data versus creating it from scratch every time.

  1. Why not extract a data subset from a production database, depending on what test data your application needs? There are very simple tools on the market that make it easy to extract the necessary data and even anonymize confidential or sensitive data. Your comment would be appreciated.

    • Teemu Vesala says:


      I’ll write a separate post about using production data. It carries many different kinds of risks, and that’s the main reason why I don’t recommend using it. Even masking or anonymization has its own problems related to privacy and security. It is usually easier to create a model from the production data, and then create the test data based on that.
