Is AI-based Test Automation the new Capture-Replay?

Test Automation is ‘broken’, and it has been for years. Even today, 25 years into the evolution of the discipline, 20-25% automation seems to be the industry norm. Why? Fundamentally, it’s a knowledge capture issue.

Domain experts have the knowledge. It’s about articulation of that knowledge into a formal structure that can drive an automated test tool. How do we get from A (process knowledge) to B (a machine readable structure for replaying that process = code)?

a) Capture-Replay – Here be Dragons: Teach a tool, by limited example, the steps that you want to automate, then replay them. The problems are well known: accuracy, robustness, maintenance, re-use, application of verification; the list goes on. Fundamentally not a good idea, and I think we all accept that.

b) Abstraction frameworks – Provide a mechanism for domain experts to articulate process in a formalized, abstract manner, e.g. tables or spreadsheets. This has some merits but requires a technical champion to build and maintain the framework, with varying degrees of input: from keyword frameworks (large amounts of input in coding of keywords) to class-action frameworks (smaller amounts of input for coding specific requirements).

c) BDD – A kind of abstraction framework that mashes up the capture of domain knowledge and test structure into a ubiquitous language with shared application, i.e. spec and test code structure, e.g. Gherkin, Cucumber, SpecFlow et al. Does this work in reality? The jury is still out on BDD testing. Evidence suggests it works well for feature-led development and testing, but integration or larger E2E journeys seem harder, and technical input is still needed to implement clauses in code.

d) TBC.

e) Artificial Intelligence (Hurrah!) – So we can now teach a tool using large quantities of examples through RL and ML techniques and just let bots do the testing for us! A bright future awaits! Here be Dragons (again). In essence this is a kind of aggregated capture-replay, learning how to interpret process “intent” by example. It still requires a formal mechanism for documenting intent for replay in an abstract form that is machine readable. On top of that, a way of specifying success/failure criteria and differentiating defects from automation failures is still needed. In this world, is a failure a real defect, an application change or a gap in the AI’s learning?
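
As a rough illustration of the abstraction idea in b) and the triage question that closes e), here is a minimal Python sketch of a keyword-table runner. Everything in it is hypothetical: FakeApp stands in for a real application, and the keyword names, the AutomationError class and the "automation failure" / "possible defect" labels are assumptions made for this example, not any particular tool's API. The domain expert writes the table; a technical champion writes the keywords.

```python
from dataclasses import dataclass, field


class AutomationError(Exception):
    """The test itself could not run (unknown keyword, missing element)."""


@dataclass
class FakeApp:
    """Hypothetical stand-in for the application under test."""
    fields: dict = field(default_factory=dict)
    page: str = "login"

    def enter(self, name, value):
        # A missing field is a problem with the automation, not the app.
        if name not in ("username", "password"):
            raise AutomationError(f"no such field: {name}")
        self.fields[name] = value

    def click(self, name):
        if name != "login":
            raise AutomationError(f"no such button: {name}")
        ok = (self.fields.get("username") == "alice"
              and self.fields.get("password") == "s3cret")
        self.page = "home" if ok else "login"

    def verify_page(self, expected):
        # A failed verification is the only thing treated as a possible defect.
        if self.page != expected:
            raise AssertionError(f"expected page {expected!r}, got {self.page!r}")


# Keywords coded once by the technical champion.
KEYWORDS = {
    "enter": FakeApp.enter,
    "click": FakeApp.click,
    "verify_page": FakeApp.verify_page,
}


def run_table(rows):
    """Replay a table of (keyword, *args) rows and classify the outcome."""
    app = FakeApp()
    for keyword, *args in rows:
        action = KEYWORDS.get(keyword)
        try:
            if action is None:
                raise AutomationError(f"unknown keyword: {keyword}")
            action(app, *args)
        except AutomationError as exc:
            return f"automation failure: {exc}"
        except AssertionError as exc:
            return f"possible defect: {exc}"
    return "pass"


# The table a domain expert might write, e.g. exported from a spreadsheet.
LOGIN_TEST = [
    ("enter", "username", "alice"),
    ("enter", "password", "s3cret"),
    ("click", "login"),
    ("verify_page", "home"),
]

print(run_table(LOGIN_TEST))
```

In this sketch a broken locator or an unknown keyword is reported as an automation failure, while a failed verification is flagged as a possible defect. It makes no attempt to tell an intended application change from a defect, which is exactly the part that remains hard.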

Whilst I am genuinely immensely excited by the advances in AI and their application in the test automation space, particularly in interpreting micro-intent, e.g. “Click the Login Button”, to minimize automation brittleness, there still seems to be a disconnect between defining larger test processes from domain expertise and automation code.

I suspect that natural language processing (to infer test intent) is a candidate for research and development, but fundamentally there is still a need to capture domain expertise and process in some form.

So back to d): how do we capture domain expertise in a simple and flexible fashion that is machine readable? Well, block coding might provide that interim step: https://www.scriptworks.io
