Introducing Magpie: A new open-source end-to-end test tool

Visiba Care
7 min read · Oct 11, 2023

Are you struggling with the maintainability of your test suites? Too much code to maintain, and hard to understand what it does? Do you need large numbers of similar scripts to cover all the combinations of an aspect you want to test?

Magpie was created at Visiba Care to address issues like these. We are thrilled to announce that Magpie is now released as open source for everyone to enjoy. You can find the repository, called magpie_core, on GitHub, released under the MIT license.

Background

The decision to create Magpie stems from the fact that the product Visiba Care offers requires rigorous end-to-end testing before each new revision can be released. Each end-to-end test represents a flow of events spanning functionality created by multiple teams, as in the picture below:

Fig 1. Interactions between healthcare professionals and patients

The challenge arose because no single team had overall responsibility for the entire end-to-end experience, which put it all in the hands of the QA engineers. Slightly simplified, each team had full-stack responsibility, but responsibility for functionality was split: one team handled the healthcare professional view, another the patient view.

All in all, we needed a way of expressing tests that made the intent of each test, and how it was implemented, easy to understand. It should be easy to explain to people from other teams and from management what a test does, so they can see its value and its return on investment.

Design and unique selling points

To address this, we came up with the design of a multi-actor, model-based testing tool which we called Magpie. Magpies are good at finding bugs. Legend has it that magpies predict fortune, good or bad. Fortune telling would be a fantastic property of a test tool, wouldn’t it?

The idea was simply to let a test consist of multiple interacting models, each model representing the end-to-end flow from one user’s perspective. By doing this, each team could create actors governed by models they understand. A complete end-to-end test could then be constructed by putting several interacting models together. Now, the backend only reacts to inputs from either end, so no model was needed to check its behavior.

In general, models should be created for the purpose of testing a single aspect of the system, e.g. messaging or the handling of bookings. There could be one model describing login/logout behavior, another describing how to interact with messages, and so on, for each type of end user.

Fig 2. Different models represent the behavior of each end user.

Describing what to test

Models are implemented as finite state machines, describing the behavior that should be examined. For simplicity, the models are created in a domain-specific language, resembling normal text.

The terminology used in this project is:
state: A condition in which we expect observable aspects to be the same
action: Something that initiates a transition
transition: A relation between states
oracle: A mechanism for telling if an observable aspect behaves correctly or not (pass/fail)

A model is constructed using all of these. Let’s give an example. We want to search the web and check that the web page opens and that the search delivers results. The simplest possible model would look like this:

Fig 3. Simple internet search model

We can see that this model has three states and two actions.
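Magpie's actual model files are written in its own text-based DSL, documented in the magpie_core repository. Purely as an illustration of the structure, the same three-state machine could be sketched in plain Python as a mapping from each state to its outbound transitions:

```python
# Illustrative only: a plain-Python sketch of the model in Fig 3.
# The state and action names mirror the figure; this is not Magpie's format.
model = {
    "Start": [("Open search page", "At search page")],
    "At search page": [("Search for magpie", "At result page")],
    "At result page": [],  # end state: no outbound transitions
}

def walk(model, state="Start"):
    """Follow the first available transition until an end state is reached."""
    path = [state]
    while model[state]:
        action, state = model[state][0]
        path.append(state)
    return path

print(walk(model))  # ['Start', 'At search page', 'At result page']
```

Note that the actions ("Open search page", "Search for magpie") label the transitions, while the oracles will later be attached to the states.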

Describing how to test

So far, we have only designed a model. We need to instruct Magpie how to perform the actions specified in the model in order to make anything happen.

To explain this, we need to introduce three more terms:
actor: A runnable application (a thread really) executing a specified model
session: The context in which to put an actor (e.g. a web browser)
test file: A file on disk specifying which sessions to run in order to build a test

Magpie is written in Python and uses the Python flavor of Playwright for web testing.

So, now we need to write code that actually does something. The magpie_core repository contains detailed instructions, but let's briefly touch on how this works. Magpie searches for functions matching the names of states and actions. Since we are using Python, we want the code to follow Python coding standards, so state and action names are transformed to lowercase and spaces are replaced by underscores. E.g. "Start" matches the function "start()", "At search page" matches the function "at_search_page()" and so on.
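The name mapping fits in one line. The function name below is my own for illustration; the real implementation lives in magpie_core:

```python
def to_function_name(name: str) -> str:
    """Map a state/action name to a Python function name:
    lowercase, with spaces replaced by underscores."""
    return name.lower().replace(" ", "_")

print(to_function_name("Start"))           # start
print(to_function_name("At search page"))  # at_search_page
```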

In the actions.py file belonging to the internet search actor, we would do something like this:

# Example for Google in Swedish

def open_search_page(page):
    """Open Google in the web browser"""
    page.goto("https://www.google.se")


def search_for_magpie(page):
    """Dismiss the cookie dialog and search for 'magpie'"""
    # Dismiss cookies ("Avvisa alla" is Swedish for "Reject all"):
    page.get_by_role("button", name="Avvisa alla").click()
    # Do the search:
    search_field = page.get_by_role("combobox", name="Sök")
    search_field.click()
    search_field.fill("magpie")
    search_field.press("Enter")

Notice: the page argument refers to Playwright's Page object and is automatically passed as an argument to all functions for an actor running in a web session.

Adding oracles

As you might have noticed, we haven't tested anything yet. Previously, we stated that oracles should be added to states. An oracle is a mechanism telling whether things went as expected or not. For this purpose, Magpie provides a function called expect() that extends Playwright's expect() function, adding checks for a number of additional datatypes.

So, let’s examine this. In the states.py file belonging to the internet search actor, we would typically write something like this:

def start():
    """The initial state, we don't do anything here"""
    pass


def at_search_page(page):
    """Let's check that we are on the right page"""
    expect(page).to_have_url("https://www.google.se")


def at_result_page(page):
    """Let's check that the search returned enough results containing 'magpie'"""
    magpie_count = page.get_by_text("magpie").count()
    expect(magpie_count).is_greater_than(10)

Every time Magpie finishes running a function without any failures in expect(…) clauses, a Passed result will be logged. If an expect(…) expression fails, a Failed result will be logged.

Cyclical models

The test designer has an option to create cyclical models, like in Fig 2 above. To do that, make sure that there are no end states (states with no outbound transitions) in your model.

The major benefit of cyclical models is that you may test the same thing over and over again, in order to increase confidence in the application under test. Naturally, it should be possible to stop the execution. To address this, a time cap (max run time) can be set in the session definition.
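Conceptually, a time-capped run of a cyclical model amounts to walking the state machine until the deadline passes. The sketch below is my own illustration of that idea, not Magpie's implementation:

```python
import time

def run_cyclical(model, state, do_action, max_run_time):
    """Conceptual sketch (not Magpie's implementation): keep walking a
    cyclical model until the time cap (in seconds) is reached."""
    deadline = time.monotonic() + max_run_time
    steps = 0
    while time.monotonic() < deadline:
        action, state = model[state][0]  # follow a transition
        do_action(action)
        time.sleep(0.001)  # stand-in for real browser work
        steps += 1
    return steps

# A two-state cycle; the "actions" here just record their own names.
cycle = {"A": [("go", "B")], "B": [("back", "A")]}
performed = []
steps = run_cyclical(cycle, "A", performed.append, max_run_time=0.05)
```

Because the model has no end states, only the time cap stops the loop, which is exactly why cyclical models require one.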

To transform our example model above into a cyclical model, we may add a new row:

Fig 4. Cyclical internet search model

Code is not provided for the expanded example. The article is long enough anyway :)

Combining actors

Remember that we mentioned the possibility of creating larger tests from smaller entities. Let’s say you have actors representing message processing for healthcare professionals and for patients. Provided that the models are carefully designed, it should be possible to create different tests using the same actors:

  • One healthcare professional, one patient
  • More than one healthcare professional, one patient
  • One healthcare professional, more than one patient

So, it is possible to create different types of tests using the same actor code.
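Since an actor is essentially a thread executing a model (see the terminology above), combining actors can be pictured as launching one thread per session. The sketch below illustrates the idea only; it is not Magpie's code, and the actor names and steps are made up:

```python
import threading

def run_actor(name, model_steps, log):
    """Conceptual sketch (not Magpie's code): each actor runs its own
    model in its own thread, appending results to a shared log."""
    for step in model_steps:
        log.append((name, step))

log = []
# One healthcare professional, two patients: same actor code, three sessions.
actors = [
    threading.Thread(target=run_actor,
                     args=("professional", ["login", "send message"], log)),
    threading.Thread(target=run_actor,
                     args=("patient 1", ["login", "read message"], log)),
    threading.Thread(target=run_actor,
                     args=("patient 2", ["login", "read message"], log)),
]
for t in actors:
    t.start()
for t in actors:
    t.join()
```

Varying the number of threads per actor type is what turns one set of actor code into the different test combinations listed above.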

Running a test

Having set up a Python environment with all the required dependencies, running a test from the command-line interface is a simple matter: pass the path of the test file to Magpie's main file for execution. Magpie runs the test file and produces a set of artifacts:

  • A text log to STDOUT in the command-line interface
  • A test summary in Markdown format (for humans)
  • A result log in CSV format (for machines)
  • On failure: screenshots and Playwright trace files, saved for post-mortem analysis
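The actual column layout of the CSV result log is defined by Magpie and documented in the magpie_core repository. Assuming hypothetical columns actor, state, and result purely for illustration, a CI gate could consume the log like this:

```python
import csv
import io

# Hypothetical CSV contents; the real column layout is defined by
# Magpie and documented in the magpie_core repository.
csv_text = """actor,state,result
searcher,at_search_page,Passed
searcher,at_result_page,Failed
"""

def has_failures(csv_file):
    """Return True if any row in the result log is marked Failed."""
    return any(row["result"] == "Failed" for row in csv.DictReader(csv_file))

print(has_failures(io.StringIO(csv_text)))  # True
```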

Conclusion

Using Magpie, we have a number of valuable possibilities:

  • We may separate what to test from how to test, which is crucial for good maintainability
  • Responsibility for models can be split between development teams while still being part of the same test
  • Using cyclical models, we can examine the repeatability of the application under test
  • We may reuse the same models for different purposes

Magpie contains more functionality than can be covered in an article as brief as this. Check out the documentation in the magpie_core repository on GitHub for more information!

Let’s hope Magpie has a bright future!

A final note: since I (David) have been the lead engineer on this project and I am leaving Visiba Care, Visiba Care and I have mutually agreed to release Magpie in a repository controlled by my new company, to simplify future maintenance. That is why the link above does not lead to Visiba Care.

Thank you for reading this far, have a pleasant day!

David Kaplan
Test Automation Specialist at Visiba Care
