Tag Archives: test data

Summary of Eurostar 2016

About Eurostar

Eurostar is Europe’s largest conference that is focused on testing and this year the conference was held in Stockholm October 31 – November 3. Since I have been working with test automation lately it seemed like a good opportunity to go my first test conference (I was there for two days). The conference had the usual mix of tutorials, presentations and expo, very much a traditional conference setup unlike the community driven style.

Key take away

Continuous delivery and DevOps changes much of the conventional thinking around test. The change is not primarily related to that you should automate everything related to test but that, in the same way as you drive user experience testing with things like A/B testing, a key enabler of quality is monitoring in production and the ability to quickly respond to problems. This does not mean that all automated testing is useless. But it calls for a very different mindset compared to conventional quality wisdom where the focus has been on finding problems as early as possible (in terms of phases of development). Instead there is increased focus on deploying changes fast in a gradual and controlled way with a high degree of monitoring and diagnostics, thereby being able to diagnose and remedy any issues quickly.


Roughly in order of how much I liked the sessions, here is what I participated in:

Sally Goble et. al. – How we learned to love quality and stop testing

This was both well presented and thought provoking. They described the journey at Guardian from having a long (two weeks) development cycle with a considerable amount of testing to the current situation where they deploy continuously. The core of the story was how the focus had been changed from test to quality with a DevOps setup. When they first started their journey in automation they took the approach of creating Selenium tests for their full manual regression test suite. This is pretty much scrapped now and they rely primarily on the ability to quickly detect problems in production and do fixes. Canary releases and good use of monitoring / APM, and investments in improved logging were the key enablers here.

Automated tests are still done on a unit, api and integration test level but as noted above really not much automation of front end tests.

Declan O´Riordan – Application security testing: A new approach

Declan is an independent consultant and started his talk claiming that the number of security related incidents continue to increase and that there is a long list of potential security breaches that one need to be aware of. He also talked about how continuous delivery has shrunk the time frame available for security testing to almost nothing. I.e., it is not getting any easier to secure your applications. Then he went on claiming that there has been a breakthrough in terms of what tools can with regard to security testing in the last 1-2 years. These new tools are categorised as IAST (Interactive Analysis Security Testing) and RASP (Runtime Application Self-Protection). While traditional automated security testing tools find 20-30% of the security issues in an application, IAST-tools find as much as 99% of the issues automatically. He gave a demo and it was impressive. He used the toolset from Contrast but there are other supplier with similar tools and many hustling to catch up. It seems to me that an IAST tool should be part of your pipeline before going to production and a RASP solution should be part of your production monitoring/setup. Overall an interesting talk and lots to evaluate, follow up on, and possibly apply.

Jan van Moll – Root cause analysis for testers

This was both entertaining and well presented. Jan is head of quality at Philips Healthcare but he is also an independent investigator / software expert that is called in when things go awfully wrong or when there are close escapes/near misses, like when a plane crashes.

No clear takeaway for me from the talk that can immediately be put to use but a list of references to different root cause analysis techniques that I hope to get the time to look into at some point. It would have been interesting to hear more as this talk only scratched the surface of the subject.

Julian Harty – Automated testing of mobile apps

This was interesting but it is not a space that I am directly involved in so I am not sure that there will be anything that is immediately useful for me. Things that were talked about include:

  • Monkey testing, there is apparently some tooling included in the Android SDK that is quite useful for this.
  • An analysis that Microsoft research has done on 30 000 app crash dumps indicates that over 90% of all crashes are caused by 10 common implementation mistakes. Failing to check http status codes and always expecting a 200 comes to mind as one of the top ones.
  • appdiff.com a free and robot-based approach to automated testing where the robots apply machine learned heuristics. Simple and free to try and if you are doing mobile and not already using it you should probably have a look.

Ben Simo – Stories from testing healthcare.gov

The presenter rose to fame at the launch of the Obamacare website about a year ago. As you remember there were lots of problems the weeks/months after the launch. Ben approached this as a user from the beginning but after a while when things worked so poorly he started to look at things from a tester point of view. He then uncovered a number of issues related to security, usability, performance, etc. He started to share his experience on social media, mostly to help others trying to use the site, but also rose to fame in mainstream media. The presentation was fun and entertaining but I am not sure there was so much to learn as it mostly was a run-through of all of the problems he found and how poorly the project/launch had been handled. So it was entertaining and interesting but did not offer so much in terms of insight or learning.

Jay Sehti – What happened when we switched our data center off?

The background was that a couple of years ago Financial Times had a major outage in one of their data centres and the talk was about what went down in relation to that. I think the most interesting lesson was that they had built a dashboard in Dashing showing service health across their key services/applications that each are made up of a number of micro services. But when they went to the dashboard to see what was still was working and where there were problems they realised that the dashboard had a single point of failure related to the data centre that was down. Darn. Lesson learned: secure your monitoring in the same way or better as your applications.

In addition to that specific lesson I think the most interesting part of this presentation was what kind of journey they had gone through going to continuous delivery and micro services. In many ways this was similar to the Guardian story in that they now relied more on monitoring and being able to quickly respond to problems rather than having extensive automated (front end) tests. He mentioned for example they still had some Selenium tests but that coverage was probably around 20% now compared to 80% before.

Similar to Guardian they had plenty of test automation at Unit/API levels but less automation of front end tests.

Tutorial – Test automation management patterns

This was mostly a walk-through of the website/wiki testautomationpatterns.wikispaces.com and how to use it. The content on the wiki is not bad as such but it is quite high-level and common-sense oriented. It is probably useful to browse through the Issues and Automation Patterns if you are involved in test automation and have a difficult time to get traction. The diagnostics tool did not appear that useful to me.

No big revelations for me during this tutorial if anything it was more of a confirmation of that the approach we have taken at my current customer around testing of backend systems is sound.

Liz Keogh – How to test the inside of your head

Liz, an independent consultant of BDD-fame, talked among other things about Cynefin and how it is applicable in a testing context. Kind of interesting but it did not create much new insight for me (refreshed some old and that is ok too).

Bryan Bakker – Software reliability: Measuring to know

In this presentation Bryan presented an approach (process) to reliability engineering that he has developed together with a couple of colleagues/friends. The talk was a bit dry and academic and quite heavily geared towards embedded software. Surveillance cameras were the primary example that was used. Some interesting stuff here in particular in terms of how to quantify reliability.

Adam Carmi – Transforming your automated tests with visual testing

Adam is CTO of an Israeli tools company called Applitools and the talk was close to a marketing pitch for their tool. Visual Testing is distinct from functional testing in that is only concerned with visuals, i.e., what the human eye can see. It seems to me that if your are doing a lot of cross-device, cross-browser testing this kind of automated test might be of merit.

Harry Collins – The critique of AI in the age of the net

Harry is a professor in sociology at Cardiff University. This could have been an interesting talk about scientific theory / sociology / AI / philosophy / theory of mind and a bunch of other things. I am sure the presenter has the knowledge to make a great presentation on any of these subjects but this was ill-prepared, incoherent, pretty much without point, and not very well-presented. More of a late-night rant in a bar than a keynote.


As with most conferences there was a mix of good and not quite so good content but overall I felt that it was more than worthwhile to be there as I learned a bunch of things and maybe even had an insight or two. Hopefully there will be opportunity to apply some of the things I learned at the customers I am working with.

Svante Lidman
LinkedIn profile

Test data – part 1

When you run an integration or system test, i.e. a test that spans one or more logical or physical boundaries in the system, you normally need some test data, as most non­trivial operations depends on some persistent state in the system. Even if the test tries to follow the advice of favoring to verify behavior over state, you may still need specific input to even achieve a certain behavior. For example, if you want to test an order flow for a specific type of product, you must know how to add a product of that type to the basket, e.g. knowing a product name.

But, and here is the problem, if you don’t have strict control of that data it may change over time, so suddenly your test will fail.

When unit testing, you’ll want to use mocks or fakes for dependencies (and have well factored code that lets you easily do that), but here I’m talking about tests where you specifically want to use the real dependency.

Basically, there are only two robust ways to manage test data:

  1. Each tests creates the data it needs.
  2. Create a managed set of data that covers all of your test needs.

You can also use a combination of the two.

For the first strategy, either you have an idempotent approach so that you just ensure a certain state, or, you create and delete the data for each run. In some cases you can use transactions to be able to safely parallelize your tests and not modify persistent state. Just open one at the start of the test and then abort it instead of committing at the end. Obviously you cannot test functionality that depends on transactions this way.

The second strategy is a lot easier if you already have a clear separation between reference data, application data and transactional data.

By reference data I mean data that change with very low frequency and that often is of limited size and has a list or key/value structure. Examples could be a list of supported languages or zip code to address lookup. This should be fairly easy to keep in one authoritative, version controlled location, either in bulk or as deltas.

The term application data is not as established as reference data. It is data that affects the behavior of the application. It is not modified by normal end user actions, but is continuously modified by developers or administrators. Examples could be articles in a CMS or sellable products in an eCommerce website. This data is crucial for tests. It’s typically the data that tests use as input or for assertions.

The challenge here is to keep the production data and the test data set in synch. Ideally there should be a process that makes it impossible (or at least hard) to update the former without updating the second. However, there are often many complicating factors: the data can be in another system owned by another team and without a good test double, the data can be large, or it can have complex relationships or dependencies that sometimes very few fully grasp. Often it is managed by non­technical people so their tool set, knowledge and skills are different.

Unit or component tests can often overcome these challenges by using a strategy to mock systems or create arbitrary test data and verify behavior and not exact state, but acceptance tests cannot do that. We sometimes need to verify that a specific product can be ordered, not a fictional one created by the test.

Finally, transactional data is data continuously created by the application. It is typical large, fast growing and of medium complexity. Example could be orders, article comments and logs.

One challenge here is how to handle old, ‘obsolete’ data. You may have data stored that is impossible to generate in the current application because the business rules (and the corresponding implementation) have changed. For the test data it means you cannot use the application to create the test data if that was you strategy. Obviously, this can make the application code more complicated, and for the test code, hopefully you have it organized so it’s easy to correlate the acceptance test to the changed business rule and easy to change them accordingly. The tests may get more complicated because there can now e.g. be different behavior for customers with an ‘old’ contract. This may be hard for new developers in the team that only know of the current behavior of the app. You may even have seemingly contradicting assertions.

Another problem can be the sheer size. This can be remediated by having a strategy for aggregating, compacting and/or extracting data. This is normally easy if you plan for it up front, but can be hard when your database is 100 TB. I know that hardware is cheap, but having a 100 TB DB is inconvenient.

The line between application data and transactional data is not always clear cut. For example when an end user performs an action, such as a purchase, he may become eligible for certain functionality or products, thus having altered the behavior of the application. It’s still a good approach though to keep the order rows and the customer status separated.

I hope to soon write more on the tougher problems in automated testing and of managing test data specifically.

Marcus Philip