Go Go Gnome
the website of Sander Kooijmans

How to write tests that need a lot of data?

Posted: October 09, 2019
Last modified: October 10, 2019

Based on a true story.

Introduction

For 20 years I developed applications in Java. Over the years I have gained a lot of experience with writing tests (unit tests/integration tests). I have used different styles of writing tests. Some styles seemed promising at first, but turned out horrible, for example, when new features needed to be added to existing classes. Often the cause of tests becoming horrible was the large dataset required by the test. A couple of techniques tried out by my teammates and I helped us to write clean tests that used a lot of data. In this post I want to share these techniques with you.

The main goal of a test is to show that a specific piece of code works correctly. The piece of code that is being tested is called Subject Under Test.

A test consists of 3 parts:

  1. Arrange: sets up the test data
  2. Act: calls the code that must be tested
  3. Assert: verifies that the Subject Under Test worked correctly

Sometimes these parts are referred to as 'given/when/then'. This post describes how to keep the arrange part clean and readable when it needs to set up a lot of data in the database.

I present the techniques with code samples of a Warehouse Management System (WMS). As you can find on my website, I have worked for a paint factory for about 8 years and worked on a WMS, which later grew into an ERP. That application was developed in Java.

In March 2019 I started to work for a completely different company, Protix, and since that time I develop in Python. Since this post is a preparation for my presentation at PyCon DE & PyData Berlin, I show code samples in Python using a WMS implemented in Django.

The code samples I show you are not from the real WMS I have worked on before. Nevertheless, the code samples are realistic and complex enough to explain the techniques for writing tests that need a lot of data. That is why I say: this post is based on a true story.

Finally I want to thank John Castelijn who was one of my teammates at the factory. His input and feedback over the years helped to develop and improve these techniques.

What is a Warehouse Management System?

To explain what a Warehouse Management System (WMS) is, I first explain what a warehouse is. A warehouse is a large building, full of racks. The racks store pallets. Each pallet contains items, for example 200 cans of white paint, 1 liter each.

The following pictures give an impression of the building, racks and pallets:

A warehouse

Photo by Ramon Cordeiro on Unsplash

An aisle in a warehouse

Photo by Ruchindra Gunasekara on Unsplash

Operators drive forklifts to move pallets and to pick the orders.

A forklift carrying a pallet

Photo by ELEVATE on Pexels

The purpose of a warehouse is to handle orders. Just as software architecture is about decomposing functionality into components, the warehouse can be decomposed in areas with their own purpose:

Plan of a warehouse

Everything in the warehouse has an id:

Here is an example of a barcode and text of an id of a location in the pick area:

A pick location barcode

And here is an example of a barcode and text of a pallet id:

A pallet barcode

A WMS supports the planners and operators to fulfil orders. It does so by giving planners and operators an overview of the current stock and supporting the workflows in the warehouse. The main workflows in a WMS are:

What is a lot of data in a test?

When learning Test-Driven Development many people practice implementing small algorithms or data structures, for example, a stack. This is what unit tests for a Stack class look like:

class TestStack(TestCase):
    stack = Stack()

    def test_empty_stack_is_empty(self):
        self.assertTrue(self.stack.is_empty())

    def test_empty_stack_count_returns_zero(self):
        self.assertEqual(self.stack.count(), 0)

    def test_pop_from_empty_stack_raises_exception(self):
        self.assertRaises(StackIsEmptyError, self.stack.pop)

    def test_stack_with_one_item_is_not_empty(self):
        self.stack.push("foo")

        self.assertFalse(self.stack.is_empty())

    def test_stack_with_one_item_has_count_one(self):
        self.stack.push("foo")

        self.assertEqual(self.stack.count(), 1)

    def test_pop_from_stack_with_one_item_returns_item(self):
        self.stack.push("foo")

        self.assertEqual(self.stack.pop(), "foo")

    def test_multiple_items_on_stack_are_popped_in_reverse_order_of_push(self):
        self.stack.push("foo")
        self.stack.push("bar")

        self.assertEqual(self.stack.pop(), "bar")
        self.assertEqual(self.stack.pop(), "foo")

These tests use at most two elements of data: the strings foo and bar. It is very easy to provide such few elements in tests.

Most developers work on applications that store data in a database. Over time, the database consists of tens to hundreds of tables. Implementing new features might introduce a new table or add a few columns to an existing table. To test the new feature, requires data in tables. If you are lucky, your test only needs data form one or two tables. If you are unlucky, you need data from tens of different tables or a hundred records in a single table.

Now we return to writing tests for a WMS. Depending on the Subject Under Test, different locations need to be created. For example, to test stocking, a relatively large bulk area is needed, for example with 100 locations. To test replenishment, a small bulk and pick area can suffice. For picking, a relatively large pick area must be used. See the following table to get an impression of the number of locations that are needed to test a process. Note that apart from this, we always need items, users and at least one forklift.

Table of workflow locations

The Good, the Bad and the Ugly

For the Subject Under Test it does not matter how the data was set up, as long as the data is present. I have seen different ways to set up test data in a database:

When to fill the database?

  1. Fill database for each test case
  2. Fill database only once and use that database for all the test cases

How to fill the database?

  1. Fill database using SQL statements
  2. Fill database using Python code

What is stored in the database?

  1. The minimum set of data needed by the Subject Under Test. The data may be incomplete and non-realistic.
  2. Complete and realistic data. May be more than needed by the Subject Under Test.

Imagine we want to test moving a pallet from one location to another location. We first need to create locations in the database. Here is the Django model for locations:

class LocationType(Enum):
    BULK = "B"
    PICK = "P"
    PICK_AND_DROP = "&"
    AUDIT = "A"
    STAGING = "S"
    DOCK = "D"
    PROBLEM = "!"
    FORKLIFT = "F"


class Location(models.Model):
    id = models.CharField(max_length=8, primary_key=True)
    type = models.CharField(max_length=1, choices=[(tag, tag.value) for tag in LocationType])
    sequence = models.SmallIntegerField()
    level = models.SmallIntegerField()
    aisle = models.CharField(max_length=8, null=True, blank=True)
    blocked = models.BooleanField()

Here is a SQL statement that could be used to create a location in the database as part of the arrange part of a test case:

INSERT INTO location (id, type, sequence, level, aisle, blocked) 
VALUES ('BF145C', 'B', 543, 3, 'BULK-BF', false);

Instead of using SQL statements I have seen variations where data was copied from a production or test database and then stored as XML files. That is just another way of representing SQL statements and I consider them just as bad and ugly as SQL statements:

<testdata>
  <location id="BF145C" type="B" sequence="543" level="3" aisle="BULK-BF" blocked="false" />
</testdata>

Django makes it very easy to create an object using Python code:

location = Location.objects.create(
    id="BF145C", type=LocationType.BULK, aisle="BULK-BF", sequence=543, level=3, blocked=False
)

Imagine we add a new column, rename a column or rename a table. If you use SQL statements, you have to update a lot of SQL statements. If you use Python code as above, you would have to update a lot of Python code. The rest of this post describes techniques to keep your arrange part short, clean and readable even when you need to set up a lot of data. The techniques will make it way easier to deal with changes in the structure of the database than when you use SQL statements.

Technique 1: Test Data Builder

A pallet at a problem location

One of the features of our WMS is that when a pallet gets moved to a problem location, the pallet gets blocked. A problem location is a location with the type Location.PROBLEM.

Here is a test case to illustrate this scenario:

from django.test import TestCase

from wms.models import Item, ItemOnPallet, Location, LocationType, Pallet
from wms.service import Service
from wms.tests import test_data_builder as tdb


class TestPalletMove(TestCase):
    service = Service()

    def test_pallet_moved_to_problem_location_gets_blocked(self):
        forklift = Location.objects.create(
            id="FORKLIFT01", sequence=0, level=0, type=LocationType.FORKLIFT, blocked=False
        )
        problem_location = Location.objects.create(
            id="PROBLEM", sequence=0, level=0, type=LocationType.PROBLEM, blocked=False
        )
        item = Item.objects.create(description="White paint (1 liter)")
        pallet = Pallet.objects.create(location=forklift, blocked=False)
        pallet.items.add(ItemOnPallet.objects.create(pallet=pallet, item=item, quantity=100, batch="2019401234"))

        self.service.move_pallet(pallet, problem_location)

        self.assertTrue(pallet.blocked)
        self.assertEqual(pallet.location, problem_location)

The arrange part is the largest part of the test. Creating a location requires to fill in a lot of parameters, most of which are not even relevant for the test.

The test becomes more readable by defining default values for sequence and level. But sometimes it is not desirable to use default values. There is a way to improve readability without using default values in models: extract the code to create the forklift, the problem location and the pallet to separate methods:

def test_pallet_moved_to_problem_location_gets_blocked(self):
    forklift = self.create_forklift()
    problem_location = self.create_problem_location()
    item = self.create_white_paint()
    pallet = self.create_pallet(forklift, {item: 100})

    self.service.move_pallet(pallet, problem_location)

    self.assertTrue(pallet.blocked)
    self.assertEqual(pallet.location, problem_location)

def create_forklift(self):
    return Location.objects.create(id="FORKLIFT01", sequence=0, level=0, type=LocationType.FORKLIFT, blocked=False)

def create_problem_location(self):
    return Location.objects.create(id="PROBLEM", sequence=0, level=0, type=LocationType.PROBLEM, blocked=False)

def create_white_paint(self):
    return Item.objects.create(description="White paint (1 liter)")

def create_pallet(self, forklift, items):
    pallet = Pallet.objects.create(location=forklift, blocked=False)
    for item, quantity in items.items():
        pallet.items.add(
            ItemOnPallet.objects.create(pallet=pallet, item=item, quantity=quantity, batch="2019401234")
        )
    return pallet

See how the arrange part of the test case has improved? See how the intention of the arrange part becomes clear?

Most variables in the arrange part could be inlined to make the arrange part even smaller. The only variables that cannot be inlined are problem_location and pallet because they are used more than once. But what if we changed the create methods to methods into methods that get or create an object. For example, the method create_problem_location() can be changed into a method problem_location() that will create a problem location when it is called the first time, and returns the same problem location on all subsequent calls. We can apply this same technique tot he other create methods as well except for the create_pallet(). The reason for keeping create_pallet() as is, is that most test cases require pallets with specific contents or at specific locations and sometimes need multiple pallets. Using the create-or-get-trick for pallets is just not handy.

The test code now looks like this:

def test_pallet_moved_to_problem_location_gets_blocked(self):
    pallet = self.create_pallet(self.forklift(), {self.white_paint(): 100})

    self.service.move_pallet(pallet, self.problem_location())

    self.assertTrue(pallet.blocked)
    self.assertEqual(pallet.location, self.problem_location())

Wow, the arrange part is now just a single line! And notice how clear the important values stand out: we create a pallet on forklift and then move it to a problem location. The pallet has 100 cans of white paint. (Ok, I must admit that for this test the value 100 x white paint is not relevant, as long as the pallet is not empty.)

This technique of extracting methods, and using the create-or-get-trick, is really worth the effort. You can do this in each test class. But should each class get a method create_pallet() and white_paint() and forklift()? That would be violating the DRY principle (DRY = Don't Repeat Yourself). So the next step is to move these methods to a separate module. I name this module test_data_builder.

Within this module, the methods are grouped in classes, one class per model class. To make the classes in the test_data_builder module stand out from the model classes, they are prefixed with Tdb. So all methods that get or create a Pallet are grouped in a TdbPallet class. And instead of writing TdbPallet.createPallet() I write TdbPallet.create() to avoid duplication of the word Pallet and to make the lines shorter.

The advantage of introducing a class TdbLocation.forklift() instead of creating a function forklift() in the test_data_builder module is that it is easier to discover what kind of locations can be created. Just inspect the TdbLocation class or type TdbLocation. in the IDE and use auto-completion to find out which locations can be created.

The result of moving the methods to this module looks like this:

from wms.tests.test_data_builder import TdbItem, TdbLocation, TdbPallet

def test_pallet_moved_to_problem_location_gets_blocked(self):
    pallet = TdbPallet.create(TdbLocation.forklift(), items={TdbItem.white_paint(): 100})

    self.service.move_pallet(pallet, TdbLocation.problem())

    self.assertTrue(pallet.blocked)
    self.assertEqual(pallet.location, TdbLocation.problem())

Here is the code from the test_data_builder module:

from datetime import date

from wms.models import AssignedLocation, Customer, Item, ItemOnPallet, Location, LocationType, Order, OrderLine, Pallet


class TdbItem:
    @staticmethod
    def white_paint():
        return Item.objects.get_or_create(description="White paint (1 liter)")[0]

    @staticmethod
    def black_paint():
        return Item.objects.get_or_create(description="Black paint (1 liter)")[0]

    @staticmethod
    def yellow_paint():
        return Item.objects.get_or_create(description="Yellow paint (1 liter)")[0]


class TdbLocation:

    _next_sequence = 123

    @staticmethod
    def forklift(id="FORKLIFT01"):
        return Location.objects.get_or_create(id=id, sequence=0, level=0, type=LocationType.FORKLIFT, blocked=False)[0]

    @staticmethod
    def problem(id="PROBLEM"):
        return Location.objects.get_or_create(id=id, sequence=0, level=0, type=LocationType.PROBLEM, blocked=False)[0]

    @staticmethod
    def audit(id="AUDIT", blocked=False):
        return Location.objects.get_or_create(
            id=id, aisle=None, sequence=0, level=0, type=LocationType.AUDIT, blocked=blocked
        )[0]

    @staticmethod
    def bulk(id=None, aisle="BULK", sequence=None, level=0, blocked=False):
        sequence = TdbLocation._get_sequence(sequence)
        return Location.objects.get_or_create(
            id=id or f"{aisle}{sequence:03d}{chr(65 + level)}",
            aisle=aisle,
            sequence=sequence,
            level=level,
            type=LocationType.BULK,
            blocked=blocked,
        )[0]

    @staticmethod
    def create_pick(id=None, aisle="PICK", sequence=None, level=0, blocked=False, item=None, max_quantity=100):
        sequence = TdbLocation._get_sequence(sequence)
        location = Location.objects.create(
            id=id or f"{aisle}{sequence:03d}{chr(65 + level)}",
            aisle=aisle,
            sequence=sequence,
            level=level,
            type=LocationType.PICK,
            blocked=blocked,
        )

        AssignedLocation.objects.create(
            location=location, item=item or TdbItem.white_paint(), max_quantity=max_quantity or 100
        )

        return location

    @staticmethod
    def staging(id=None, aisle="STAGING", sequence=0, level=0, blocked=False):
        sequence = TdbLocation._get_sequence(sequence)
        return Location.objects.get_or_create(
            id=id or f"{aisle}{sequence:03d}{chr(65 + level)}",
            aisle=aisle,
            sequence=sequence,
            level=level,
            type=LocationType.STAGING,
            blocked=blocked,
        )[0]

    @staticmethod
    def dock(id="DOCK01", blocked=False):
        return Location.objects.get_or_create(
            id=id, aisle=None, sequence=0, level=0, type=LocationType.DOCK, blocked=blocked
        )[0]

    @staticmethod
    def _get_sequence(sequence=None):
        if not sequence:
            sequence = TdbLocation._next_sequence
            TdbLocation._next_sequence += 1
        return sequence


class TdbPallet:

    _next_batch = 1

    @staticmethod
    def create(location: Location, items=None, blocked=False, pick_list=None):
        """Creates a pallet.

        :param location: the location where the pallet is created.
        :param items: a dictionary of items and quantities. Use None to create an empty pallet.
        :param blocked: indicates whether the pallet is blocked.
        :param pick_list: indicates whether the pallet contains or will contain items picked for a pick list.
        """
        pallet = Pallet.objects.create(location=location, blocked=blocked, pick_list=pick_list)

        for item, quantity in (items or {}).items():
            batch = f"{TdbPallet._next_batch:06d}"
            TdbPallet._next_batch += 1
            pallet.items.add(ItemOnPallet.objects.create(pallet=pallet, item=item, quantity=quantity, batch=batch))

        return pallet


class TdbCustomer:
    @staticmethod
    def create(name="John Doe") -> Customer:
        return Customer.objects.create(name=name)


class TdbOrder:
    @staticmethod
    def create(items, customer=None, shipping_date=None):
        order = Order.objects.create(
            customer=customer or TdbCustomer.create(), shipping_date=shipping_date or date.today()
        )

        for item, quantity in items.items():
            order.lines.add(OrderLine.objects.create(order=order, item=item, quantity=quantity))

        return order

See how some functions create a thing and return the same thing the next time the function is called? That is useful for things that are constant for your tests. TdbItem.white_paint() will always return the same item. The exact details of the item are not relevant for most of the test. The Test Data Builder makes it easy to use a small set of predefined items.

Other functions create a new thing every time you call them. Pallets are typically created specifically for a test, for example the function TdbPallet.create().

The code in the Test Data Builder can get a bit complex, like TdbPallet.create() shows. Note that you typically write the methods in the Test Data Builder once, and modify/extend it a couple of times, but you use these methods hundreds of times. So while writing functions for Test Data Builder, focus on ease of use and cleanliness of the test code that uses the Test Data Builder.

Methods in Test Data Builders have the following properties:

A trick to ensure the method is simple to use is to first write a call to the method in a test and then implement the function.

When using the Test Data Builder in a test, the test code should be simple and readable. However, tests should not depend on a specific default values chosen by the Test Data Builder. If a test needs some pallet that is empty, and a Test Data Builder TdbPallet.create() is called without specific items, then this function must be called with an argument that makes clear that the pallet is empty: TdbPallet.create(TdbLocation.forklift(), items=None). If you need empty pallets in many tests, then you had better create a new method named TdbPallet.create_empty().

And one final tip: you can also use the Test Data Builder to test a REST API:

class TdbLocation:

    @staticmethod
    def forklift(id="FORKLIFT01"):
        return {"id": id, "sequence": 0, "level": 0, "type": LocationType.FORKLIFT, "blocked": False}


def test_create_location(client):
    response = client.post('/locations', data=TdbLocation.forklift())
    assert response.status_code == 201

Technique 2: Visualize the setup data

The second technique is illustrated by testing an important feature of stocking. Remember that stocking is the process of putting full pallets, which just arrived from the factory, in locations in the bulk area. The diagram below shows a side view of an aisle in the bulk area. The WMS will tell the forklift driver which location should be filled with the next pallet.

Filling the aisle level by level from bottom to top is not efficient, because the horizontal distance the forklift has to travel increases a lot. In real life there are more than 5 locations at one level. Filling the aisle column by column from left to right is more efficient. However, forklifts move faster horizontally than vertically. So the optimal way to fill the aisle is under an angle.

Stocking

Here is the code to test that the correct location is returned to stock a pallet in an empty aisle.

def test_stocking_pallet_in_empty_aisle(self):
    pallet = TdbPallet.create(TdbLocation.forklift(), items={(TdbItem.white_paint()): 100})
    for sequence in range(1, 6):
        for level in range(0, 5):
            TdbLocation.bulk(aisle=self.AISLE, sequence=sequence, level=level)

    destination = self.service.get_stock_location(pallet, self.AISLE)

    self.assertEqual(destination, TdbLocation.bulk(aisle=self.AISLE, sequence=1, level=0))

See how using a Test Data Builder in nested loops build an empty aisle in just 3 lines of code. This way of generating data gets messy when certain locations in the aisle are already occupied by pallets.

The diagram above inspires me to represent the aisle textually as a multiline string. For example, the following string represents the aisle containing 3 pallets and the asterisk indicates the location where the next pallet must be stocked:

 """|     |
    |     |
    |o    |
    |oo*  |"""

This multiline string defines both the arrange and assert part of the test!

Here is the code that uses such textual representations to specify the aisle that has to be configured and indicates the expected location where the next pallet must be stocked.

@parameterized.expand(
    [
        (
            "empty aisle",
            """|     |
               |     |
               |     |
               |*    |""",
        ),
        (
            "one pallet present",
            """|     |
               |     |
               |     |
               |o*   |""",
        ),
        (
            "two pallets present",
            """|     |
               |     |
               |*    |
               |oo   |""",
        ),
        (
            "three pallets present",
            """|     |
               |     |
               |o    |
               |oo*  |""",
        ),
        (
            "four pallets present",
            """|     |
               |     |
               |o*   |
               |ooo  |""",
        ),
        (
            "no free location in aisle",
            """|ooooo|
               |ooooo|
               |ooooo|
               |ooooo|""",
        ),
        (
            "first gap is filled",
            """|     |
               |     |
               | o   |
               |o*o  |""",
        ),
        (
            "blocked location is skipped",
            """|     |
               |     |
               |     |
               |x*   |""",
        ),
    ]
)
def test_stocking_pallet(self, _, bulk_aisle_map: str) -> None:
    """
    Test if a non-blocked pallet gets the correct stock location within an aisle.
    :param _: is added to the name of the test. Not used otherwise.
    :param bulk_aisle_map: two-dimensional map of the aisle. The pipes indicate the start and end of a level
    in the rack. The meaning of the characters between the pipes is:
    o: a pallet
    *: empty location, this is the expected location
    x: empty location, blocked
    """
    expected_location = self._create_locations_in_aisle(bulk_aisle_map)
    pallet = TdbPallet.create(TdbLocation.forklift(), items={TdbItem.white_paint(): 100})

    destination = self.service.get_stock_location(pallet, self.AISLE)

    if expected_location:
        self.assertEqual(destination, expected_location)
    else:
        self.assertIsNone(destination)

def _create_locations_in_aisle(self, bulk_aisle_map):
    lines = re.findall("[|][^|]+[|]", bulk_aisle_map)
    lines.reverse()
    expected_location = None
    level = 0
    for line in lines:
        for sequence in range(1, len(line) - 1):
            bulk_location = TdbLocation.bulk(aisle=self.AISLE, sequence=sequence, level=level)
            if line[sequence] == "o":
                items = {TdbItem.white_paint(): 100}
                TdbPallet.create(bulk_location, items=items)
            if line[sequence] == "*":
                expected_location = bulk_location
            if line[sequence] == "x":
                bulk_location.blocked = True
                bulk_location.save()
        level += 1
    return expected_location

The logic of parsing a multiline string and generating the data is implemented by _create_locations_in_aisle(). This code is a bit complex, but once this method has been written, generating test cases becomes a piece of cake. You can pair program with your product owner to write more test cases.

Another situation where this technique can be applied is when you test actor based code. You could use one string per actor to indicate at which moment in time a specific message is sent to that actor. Another string could be used to describe the expected messages sent by a specific actor:

in_1: "1        4          3     "
in_2: "  a        b      c       "
out:  "   (a,1)    (b,4)    (c,3)"

Technique 3: Workflow

The third technique is illustrated by testing parts of the picking workflow. The following diagram describes the workflow for order handling:

Order handling flow diagram

The step 'Operator picks pick list' can be further detailed as follows:

Picking workflow diagram

The WMS implements the following methods that are used by the terminal during the picking process:

To properly test these methods, we need an order and a pick list and a pallet on a forklift that contains the items that have already been picked. There are many scenarios we want to test:

The more picked items are required by the arrange part of the test, the more lines of code are added to the arrange part. This reduces the readability and maintainability of these tests.

Since the picking workflow is well defined, it is quite easy to write a PickWorkflow class that will use the business logic to get the database in a specific state. The class has methods that match with steps within the workflow diagrams shown above. Each method ensures that any preceding steps of the flow will have been performed.

from typing import Optional

from wms.models import Location, LocationType, Pallet, PickList
from wms.service import Service
from wms.tests.test_data_builder import TdbLocation, TdbOrder, TdbPallet


class PickWorkflow:
    def __init__(self, items, customer=None, shipping_date=None, generate_pick_locations=True, forklift=None):
        self.service = Service()
        self.forklift = forklift or TdbLocation.forklift()
        self.order = TdbOrder.create(items, customer, shipping_date)
        self.pick_list: Optional[PickList] = None
        self.pick_pallet: Optional[Pallet] = None
        self.picked_pallets = []

        if generate_pick_locations:
            self._create_pick_locations()

    def _create_pick_locations(self):
        for order_line in self.order.lines.all():
            location = TdbLocation.create_pick(item=order_line.item, max_quantity=order_line.quantity)
            items = {order_line.item: order_line.quantity}
            TdbPallet.create(location, items=items)

    def generate_pick_list(self):
        if self.pick_list:
            raise Exception("A pick list has already been generated for the order.")

        self.pick_list = self.service.create_pick_list(self.order)

        return self

    def pick(self, items):
        self._ensure_pick_list_is_generated()
        self._ensure_pick_pallet_is_created()
        for item, quantity in items.items():
            location = self.service.get_next_pick_location(self.pick_list, None)
            self.service.pick(self.pick_list, location, self.pick_pallet, item, quantity)
        return self

    def put_pallet_in_staging_lane(self, staging_location: Optional[Location] = None):
        if not self.pick_pallet:
            raise Exception("The forklift carries no pallet.")

        staging_location = staging_location or TdbLocation.staging()
        if not staging_location:
            raise Exception("The warehouse has no staging location.")

        self.service.move_pallet(self.pick_pallet, staging_location)
        self.pick_pallet = None

        return self

    def pick_and_put_pallet_in_staging(self, staging_location: Optional[Location] = None):
        self._ensure_pick_list_is_generated()
        location = Location.objects.filter(type=LocationType.PICK).first()
        while not self.pick_list.is_completely_picked():
            self._ensure_pick_pallet_is_created()
            location = self.service.get_next_pick_location(self.pick_list, location)
            if location:
                quantity = self.service.get_pick_quantity_for(self.pick_list, location)
                self.service.pick(self.pick_list, location, self.pick_pallet, location.assignment.item, quantity)
            else:
                raise Exception("Not enough items on stock for pick list.")

        self.put_pallet_in_staging_lane(staging_location)

        return self

    def pick_and_put_pallet_in_truck(self, dock: Optional[Location] = None):
        self.pick_and_put_pallet_in_staging()

        dock = dock or Location.objects.filter(type=LocationType.DOCK).first()
        if not dock:
            raise Exception("The warehouse has no dock location.")

        for pallet in self.picked_pallets:
            if not pallet.shipping_label:
                self.service.print_shipping_label(pallet)
            self.service.move_pallet(pallet, dock)

        return self

    def _ensure_pick_list_is_generated(self):
        if not self.pick_list:
            self.generate_pick_list()

    def _ensure_pick_pallet_is_created(self):
        if not self.pick_pallet:
            self.pick_pallet = TdbPallet.create(self.forklift, pick_list=self.pick_list)
            self.picked_pallets.append(self.pick_pallet)

Here are examples of tests that use this PickWorkflow class:

import re

from django.test import TestCase

from wms.models import Location
from wms.service import Service
from wms.tests.pick_workflow import PickWorkflow
from wms.tests.test_data_builder import TdbItem, TdbLocation, TdbPallet


class TestPicking(TestCase):
    service = Service()

    def test_get_pick_quantity_when_nothing_picked_yet(self):
        self._create_pick_locations("P_1: 100xwhite_paint")
        workflow = PickWorkflow(generate_pick_locations=False, items={TdbItem.white_paint(): 5}).generate_pick_list()

        quantity = self._get_pick_quantity("P_1", workflow)

        self.assertEqual(quantity, 5)

    def test_get_pick_quantity_for_location_that_has_less_than_remaining_quantity_to_be_picked(self):
        self._create_pick_locations("P_1: 3xwhite_paint")
        workflow = PickWorkflow(generate_pick_locations=False, items={TdbItem.white_paint(): 5}).generate_pick_list()

        quantity = self._get_pick_quantity("P_1", workflow)

        self.assertEqual(quantity, 3)

    def test_get_pick_quantity_for_empty_location(self):
        self._create_pick_locations("P_1: 0xwhite_paint")
        workflow = PickWorkflow(generate_pick_locations=False, items={TdbItem.white_paint(): 5}).generate_pick_list()

        quantity = self._get_pick_quantity("P_1", workflow)

        self.assertEqual(quantity, 0)

    def test_pick_part_of_pick_list_get_pick_quantity_for_item(self):
        self._create_pick_locations("P_1: 100xwhite_paint")
        workflow = PickWorkflow(generate_pick_locations=False, items={TdbItem.white_paint(): 5}).pick(
            items={TdbItem.white_paint(): 2}
        )

        quantity = self._get_pick_quantity("P_1", workflow)

        self.assertEqual(quantity, 3)

    def test_pick_complete_pick_list_get_pick_quantity_for_item(self):
        self._create_pick_locations("P_1: 100xwhite_paint")
        workflow = PickWorkflow(generate_pick_locations=False, items={TdbItem.white_paint(): 5}).pick(
            items={TdbItem.white_paint(): 5}
        )

        quantity = self._get_pick_quantity("P_1", workflow)

        self.assertEqual(quantity, 0)

    def test_pick_item_1_completely_get_pick_quantity_for_item_2(self):
        self._create_pick_locations("P_1: 100xwhite_paint", "P_2: 100xblack_paint")
        workflow = PickWorkflow(
            generate_pick_locations=False, items={TdbItem.white_paint(): 5, TdbItem.black_paint(): 10}
        ).pick(items={TdbItem.white_paint(): 5})

        quantity = self._get_pick_quantity("P_2", workflow)

        self.assertEqual(quantity, 10)

    def test_pick_item_1_partly_get_pick_quantity_for_item_2(self):
        self._create_pick_locations("P_1: 100xwhite_paint", "P_2: 100xblack_paint")
        workflow = PickWorkflow(
            generate_pick_locations=False, items={TdbItem.white_paint(): 5, TdbItem.black_paint(): 10}
        ).pick(items={TdbItem.white_paint(): 3})

        quantity = self._get_pick_quantity("P_2", workflow)

        self.assertEqual(quantity, 10)

    def test_get_pick_quantity_for_item_that_has_been_picked_partly_on_other_pallet(self):
        self._create_pick_locations("P_1: 100xwhite_paint")
        workflow = (
            PickWorkflow(generate_pick_locations=False, items={TdbItem.white_paint(): 5})
            .pick(items={TdbItem.white_paint(): 3})
            .put_pallet_in_staging_lane()
        )

        quantity = self._get_pick_quantity("P_1", workflow)

        self.assertEqual(quantity, 2)

    def _get_pick_quantity(self, location_id, workflow):
        location = Location.objects.get(id=location_id)
        return self.service.get_pick_quantity_for(workflow.pick_list, location)

    def _create_pick_locations(self, *args):
        for arg in args:
            match = re.match("(.+): ([0-9]+)x(.+)", arg)
            self.assertIsNotNone(match, f"The argument {arg} is invalid!")
            item = getattr(TdbItem, match.group(3))()
            location = TdbLocation.create_pick(id=match.group(1), item=item)
            items = {item: int(match.group(2))}
            TdbPallet.create(location, items=items)

The PickWorkflow is not only handy for testing picking. After pallets have been loaded in a truck, transport documentation can be generated. One type of transport documentation is a transport notice which describes the pallets and their contents in the truck. Here is an example of a test for generating the transport notice for three pallets from three different orders:

def test_transport_notice(self):
    workflow_1 = PickWorkflow(items={TdbItem.white_paint(): 10}).pick_and_put_pallet_in_truck(TdbLocation.dock())
    workflow_2 = PickWorkflow(items={TdbItem.black_paint(): 25}).pick_and_put_pallet_in_truck(TdbLocation.dock())
    workflow_3 = PickWorkflow(items={TdbItem.yellow_paint(): 45}).pick_and_put_pallet_in_truck(TdbLocation.dock())

    transport_notice = self.service.generate_transport_notice(TdbLocation.dock())

    self.assertEqual(
        transport_notice,
        [
            {
                "shipping_label": workflow_1.picked_pallets[0].shipping_label,
                "contents": [{"item": TdbItem.white_paint().description, "quantity": 10}],
            },
            {
                "shipping_label": workflow_2.picked_pallets[0].shipping_label,
                "contents": [{"item": TdbItem.black_paint().description, "quantity": 25}],
            },
            {
                "shipping_label": workflow_3.picked_pallets[0].shipping_label,
                "contents": [{"item": TdbItem.yellow_paint().description, "quantity": 45}],
            },
        ],
    )

Focus on clean test code

When writing tests, it might happen that you write the same lines of code in a couple of tests. When extracting this code to a method, make sure that using the extracted method is as simple as possible.

The WMS will check each pallet that is about to be loaded into a truck. If for a pallet an audit result is registered that indicates a difference was found between the actual contents on the pallet and the contents according to the WMS, the pallet is not allowed to be loaded. The idea is that the difference must be resolved first before the pallet can be loaded.

Pallet with missing can in dock

Imagine that you want to extract the two lines calling move_pallet() and register_audit_result() from this test:

def test_move_pallet_to_dock_with_differences_found_during_audit(self):
    pallet = TdbPallet.create(TdbLocation.bulk())
    service.move_pallet(pallet.id, TdbLocation.audit().id)
    service.register_audit_result(pallet.id, True)

    self.assertRaises(Exception, self.service.move_pallet, pallet.id, TdbLocation.dock().id)

Don't do it like this:

def test_move_pallet_to_dock_with_differences_found_during_audit(self):
    pallet = TdbPallet.create(TdbLocation.bulk())
    self.audit_pallet_with_differences_found(pallet.id, TdbLocation.audit().id)

    self.assertRaises(Exception, self.service.move_pallet, pallet.id, TdbLocation.dock().id)

But do it like this:

def test_move_pallet_to_dock_with_differences_found_during_audit(self):
    pallet = TdbPallet.create(TdbLocation.bulk())
    self.audit_pallet_with_differences_found(pallet, TdbLocation.audit())

    self.assertRaises(Exception, self.service.move_pallet, pallet.id, TdbLocation.dock().id)

Let the extracted method worry about ids of the pallet and dock. Keep the test code clean.

Conclusions

Three techniques to generate test data have been explained:

Using these techniques it is possible to setup the database

When using these techniques focus on ease of use and readability for the tests.

Updates

December 12, 2021: Fixed typos and replaced 'setup' by 'arrange part'.

August 23, 2020: Changed style of the Test Data Builder code from tdb.create_pallet() to TdbPallet.create(). Also formatted the code with Black.

October 10, 2019: Initial published version.