The Case For Better Behavior Driven Testing

Behavior driven testing is a bit of an oddity in software development. Ask any developer about the tests they write on a daily basis and you'll commonly hear about unit and integration tests. Bring up behavior testing, however, and you'll likely be met with a bemused stare, followed by them walking away while muttering a soliloquy about how those are just integration tests or that it's someone else's job.

"But we use integration testing to verify our behavior" is a common response, and to be fair it's not entirely wrong. There absolutely is a significant amount of overlap between the two, and it doesn't help that the strata of testing methodologies are littered with proverbial landmines that can lead (and probably have led) to gun fights between warring factions. The terms are so loaded and ambiguous that some companies (e.g. Google) have dropped them entirely in favor of their own categorization strategy.

As software developers, we develop a relationship with the tools we use daily and have a strong propensity to want to leverage them as much as possible. We're familiar with them and understand how they work, their strengths, their quirks, etc. So it's not unexpected to get push-back when frameworks like Gauge or SpecFlow get brought up. After all, seeking the comfort of the familiar is human psychology 101, and nothing is quite as scary as the unknown. But in this particular instance, using these systems is likely a behavior (pardon the pun) born out of necessity rather than desire.

Just Let the Developers Write the Tests Already

The prevailing wisdom is that behavior tests should be the responsibility of the stakeholders. After all, they're the ones who cook up the requirements for the system, so that should make them eminently qualified to understand what the system's behavior should be as well, right? Well, to be blunt, they aren't. If they were, all of the developers could just call it a day and go home, since this would imply that the stakeholders already possess the technical aptitude necessary to write the system themselves. Behavior testing, much like any other engineering exercise, requires an enormous amount of forethought to implement correctly. Even though companies don't ship their test code, it is their single greatest safeguard for verifying that the product they do ship works as advertised. As such, great care must go into the testing portion of an application. Additionally, even though the tests themselves appear simple, that simplicity belies the complexity of the plumbing needed to execute them, and building that plumbing is a highly technical task that demands real skill and expertise.

Finally, software developers are experts at translating requirements into actual software specifications. It's literally our job to figure out how to turn a half-baked, high-level idea into working code. There aren't very many Tom Smykowskis in the real world. We already know how to translate other types of requirements into code. Just let us translate the tests as well already.

Stop Hiding Behind DSLs

There's a school of thought that if only we had a DSL expressive enough, yet simple enough to read and write, the stakeholders could finally be the masters of their own destiny when it comes to authoring behavior tests. I hate to be the bearer of bad news, but no such DSL exists. It never has, and it never will. If it did, we would already be using it to write the system itself. "Silver bullets" like this are heavily scrutinized (and ultimately dismissed) in The Mythical Man-Month. That said, products such as FitNesse (wiki), Gauge (Markdown), and SpecFlow (Gherkin) have all made significant strides in promoting a simple, easy-to-read syntax for expressing system behavior. Unfortunately, they all suffer from an inherent impedance mismatch between their structure and actual programming code. To "glue" the two together, strategies such as code generation (SpecFlow) or reflection (FitNesse) are used to get types to "match up" with their specification counterparts. SpecFlow and Gauge in particular do deserve extra credit for the great lengths they've gone to in reducing this friction between test specifications and their underlying implementation.

A complaint I've often heard from developers when discussing the use of DSLs for business specifications is "why do I need this when I can just use C#?". It's a reasonable question. C# is a remarkably robust language, and with a little skill it's entirely possible to write code that is nearly indistinguishable from a DSL. In fact, some have already done this. Consider xBehave, which allows developers to break scenarios (behaviors) into a series of named steps with their actual implementation hidden behind a lambda. These scenarios are seamlessly integrated into the xUnit framework and executed using the exact same test runner that all of the other tests go through. That means that the existing quality gates, reporting systems, and metrics gathering tools all continue to work exactly as before.
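To make the named-steps-plus-lambdas idea concrete, here is a minimal sketch of a scenario in that style. It assumes the xbehave and xunit NuGet packages (the 2.x line, where a step is declared with the `.x()` string extension). The `Calculator` class and the step wording are invented purely for illustration.

```csharp
using Xbehave;
using Xunit;

// Hypothetical system under test, invented for this sketch.
public class Calculator
{
    public int Add(int x, int y) => x + y;
}

public class CalculatorFeature
{
    // Scenario parameters that receive no values are passed in as defaults,
    // which makes them a convenient place to hold state shared between steps.
    [Scenario]
    public void Addition(int x, int y, Calculator calculator, int answer)
    {
        "Given the number 1"
            .x(() => x = 1);

        "And the number 2"
            .x(() => y = 2);

        "And a calculator"
            .x(() => calculator = new Calculator());

        "When I add the numbers together"
            .x(() => answer = calculator.Add(x, y));

        "Then the answer is 3"
            .x(() => Assert.Equal(3, answer));
    }
}
```

Because the scenario is an ordinary method in an ordinary test class, it is picked up by the same runner as any other xUnit test.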

This is not to say that DSLs are useless. Quite the opposite! Gherkin is a fantastic syntax for describing system behavior and we all love to write documentation in Markdown syntax, but at the end of the day these are data structures, not programming languages, and tests are ultimately written in programming languages.

C# to the Rescue

It's hard to believe, but C# is 20 years old at the time of writing. I was a scrappy college student when it first hit the scene, and next year it'll be able to order a beer without a fake ID. The language today is very different from what it was in 2002, with many now-common features that didn't even exist in its earlier days. While it is not a metalanguage in the same vein as LISP or (to a lesser degree) C/C++, it does have a very malleable syntax, and there are many tricks to make it look different from conventional programming code. Instead of relying on a DSL, an alternative is to construct a programming API that "looks" like a DSL but is still compiled as regular C# code.
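As a rough illustration of an API that "looks" like a DSL, the sketch below chains Given/When/Then calls using nothing but plain C#. The `Scenario` and `Example` types are hypothetical and exist only for this example; they are not the API of any real library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mini-API invented for this sketch.
public sealed class Scenario
{
    private readonly string _title;
    private readonly List<(string Text, Action Body)> _steps =
        new List<(string Text, Action Body)>();

    private Scenario(string title) => _title = title;

    public static Scenario Named(string title) => new Scenario(title);

    // Each keyword method records a step and returns the scenario for chaining.
    public Scenario Given(string text, Action body) => Add("Given", text, body);
    public Scenario When(string text, Action body) => Add("When", text, body);
    public Scenario Then(string text, Action body) => Add("Then", text, body);

    private Scenario Add(string keyword, string text, Action body)
    {
        _steps.Add(($"{keyword} {text}", body));
        return this;
    }

    // Runs the steps in order and returns the text of each executed step.
    // An exception thrown by a step body propagates and fails the scenario.
    public IReadOnlyList<string> Run()
    {
        var log = new List<string> { $"Scenario: {_title}" };
        foreach (var (text, body) in _steps)
        {
            body();
            log.Add(text);
        }
        return log;
    }
}

public static class Example
{
    public static IReadOnlyList<string> AddTwoNumbers()
    {
        int x = 0, y = 0, answer = 0;

        return Scenario.Named("Add two numbers")
            .Given("the number 1", () => x = 1)
            .Given("the number 2", () => y = 2)
            .When("the numbers are added", () => answer = x + y)
            .Then("the answer is 3", () =>
            {
                if (answer != 3)
                    throw new InvalidOperationException($"Expected 3 but got {answer}");
            })
            .Run();
    }
}
```

The step text stays readable to a non-programmer, yet the whole thing is type-checked by the compiler and needs no translation layer.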

Gherkin Without the Hassle

Gherkin is great. It's simple, succinct, and can be used to express a wide variety of ideas. The difficulty with it, however, is the aforementioned friction between the specification files, which need to go through a translation layer, and the code itself. To its credit, Tricentis (the company behind SpecFlow) has done a terrific job of minimizing this by making clever use of build targets to seamlessly translate specifications (features) into code for the chosen testing framework. However, it can be argued that this complexity isn't really necessary. To write a feature using SpecFlow, the developer must first create a .feature file, fill it in with the markup for the scenarios to test, then create one or more corresponding C# class files containing the actual code for each step. For SpecFlow to make the connection between the two, specific [Attribute]s are used with magic strings containing regexes that capture parameters. In addition, since a real-world application will likely have many features (and therefore many, many more individual steps), it is necessary to devise a strategy for tagging steps early on to avoid global step pollution: by default, any step can be used in any scenario regardless of whether it has anything to do with that scenario.
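For readers who haven't seen the two halves side by side, here is a small sketch of that workflow. It assumes the SpecFlow NuGet package (`TechTalk.SpecFlow` namespace) with xUnit for assertions. The feature text, the `@calculator` tag, and the `CalculatorSteps` class are all made up for illustration, and the Gherkin is shown in a comment because it would normally live in its own .feature file.

```csharp
using TechTalk.SpecFlow;
using Xunit;

// Calculator.feature (Gherkin, kept in a separate file):
//
//   @calculator
//   Feature: Calculator
//     Scenario: Add two numbers
//       Given the first number is 1
//       And the second number is 2
//       When the two numbers are added
//       Then the result should be 3

[Binding]
[Scope(Tag = "calculator")] // restricts these steps to scenarios tagged @calculator
public class CalculatorSteps
{
    private int _first;
    private int _second;
    private int _result;

    // The regex in each attribute must match the step text in the feature file;
    // capture groups are converted and passed in as method parameters.
    [Given(@"the first number is (\d+)")]
    public void GivenTheFirstNumberIs(int number) => _first = number;

    [Given(@"the second number is (\d+)")]
    public void GivenTheSecondNumberIs(int number) => _second = number;

    [When(@"the two numbers are added")]
    public void WhenTheTwoNumbersAreAdded() => _result = _first + _second;

    [Then(@"the result should be (\d+)")]
    public void ThenTheResultShouldBe(int expected) => Assert.Equal(expected, _result);
}
```

Nothing checks the strings in the attributes against the feature file at compile time, so a typo on either side only shows up when the tests are generated or run.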

This isn't to besmirch SpecFlow. It's a great product and works well. I personally use it professionally. But it isn't without its complexities, and as an engineer I often ask myself "Is there a better way to do this?".

Empower the Developers

With JustBehave we want to create a full-featured behavior testing library without any dependency on foreign DSLs. JustBehave provides a clean way to write behavior tests, using the Gherkin syntax championed by Cucumber, entirely in C# code. By eschewing custom DSLs, the developer can focus on writing the tests themselves rather than layers of proxy code that act as a go-between for high-level business specifications, which are usually written by the developers anyway during the requirements gathering phase. Since the library ties into the existing VSTest platform, it can be integrated directly into an existing build pipeline with little-to-no additional effort. Finally, testing frameworks are already fully capable of generating their own reports and can be leveraged to turn a behavior test into a human-readable report.

JustBehave provides the canvas to write behavior tests with many features automatically built-in. How the tests are actually written is up to the developers.

The repo for this project can be found here.
