In this article, I explain why and how we use containers for testing at Synthesio. For simplicity’s sake, most examples assume a project written in Go using a MySQL database, with Docker as the container runtime, but the method discussed can be, and is, used with any language or technology, and with more complex setups.
Dependencies are complex pieces of software, for a good reason. A database, for example, abstracts away a huge amount of complexity and features so our code doesn’t have to handle them itself and can stay simple and focused on business features.
Testing code that uses such a dependency, however, is a harder job than it seems, made unnecessarily complex by most testing frameworks. One of the reasons is that when thinking, speaking, or reading about testing, we could almost use the term unit-testing interchangeably. And all in all, unit tests are badly suited to testing code that directly uses a complex dependency. Which isn’t surprising, given that the whole point of unit testing is to not use any dependency.
Mocking is not the solution you’re looking for
In the current state of the art, the most commonly accepted solution for unit-testing a piece of code with a dependency is to fake it: to give the unit under test something that mimics the dependency and behaves as we want, so we can verify that the code reacts as expected. This method, called mocking, is actually a bad solution to the problem at hand.
To be perfectly honest, faking the dependency is a good solution for a certain class of dependencies. Unit testing was born during the rise of object-oriented programming, and is unsurprisingly well adapted to code that can take as a dependency an interface of reasonable complexity, the mock generally being an in-memory, dumb version of the real dependency. A database, for example, is generally too complex for this.
Let’s take a concrete example to illustrate the point. Go has a library for mocking a SQL database during tests, sqlmock. Here is an example, taken from the library’s GitHub repository.
func TestShouldUpdateStats(t *testing.T) {
	db, mock, err := sqlmock.New()
	if err != nil {
		t.Fatalf("an error '%s' was not expected when opening a stub database connection", err)
	}
	defer db.Close()

	mock.ExpectBegin()
	mock.ExpectExec("UPDATE products").WillReturnResult(sqlmock.NewResult(1, 1))
	mock.ExpectExec("INSERT INTO product_viewers").WillReturnResult(sqlmock.NewResult(1, 1))
	mock.ExpectCommit()

	// now we execute our method
	if err = recordStats(db, 2, 3); err != nil {
		t.Errorf("error was not expected while updating stats: %s", err)
	}

	// we make sure that all expectations were met
	if err := mock.ExpectationsWereMet(); err != nil {
		t.Errorf("there were unfulfilled expectations: %s", err)
	}
}
If we think about it, the real interface between our code and the database isn’t the actual method called, it’s the query language itself. And it isn’t the query language that is mocked here, but simply the implementation code that is supposed to communicate with the database, leaving the user to do the actual mocking, verification, etc.
The result is that tests using this kind of library generally do only half of the job, and generally not the half that is actually worth testing. Checking that a query wrapper really calls the expected function is near useless, akin to a test that verifies the value of a constant.
A useful test would check that, after calling the method, the value in the database is the expected one. That would require faking the logic of the database itself and letting the implementation do whatever it wants with it. A useful test wouldn’t need to be completely rewritten when the implementation of the method changes ever so slightly.
The problem is that a library implementing such a feature would be hugely complex. It would probably have to parse the SQL syntax, understand the queries, etc. If such a library existed, it would resemble a full-fledged in-memory SQL database more than a mocking library. (I actually did exactly that using in-memory SQLite databases once upon a time.)
The amount of work needed would be huge, to say the least. And even then, it wouldn’t be complete: to be ultimately useful, it would have to understand vendor extensions and bugs, and take into account the versions of the target vendor database. Pouring that much time into this is probably not worth it when we have a simpler, ready-made solution: why not just use the real database?
This approach has existed for a long time too, but it suffered from several drawbacks, the most important being the need to install and configure a database for the tests themselves, which generally led to projects needing a complex setup for something that should be simple and quick. Or you would need to maintain an active database in your organization just for running tests. And have one for each version used in production. Clumsy at best, not reusable, and generally too constraining.
Luckily for us, the situation has changed in recent years with the arrival of containers as an easy, simple, and globally accessible solution. With a little not-even-complex tooling, using a real database in a container for testing is surprisingly easy, and leads to clean and concise tests.
Do or do not, there is no try
So, we want to run tests that use a database against an actual database. How do we do that?
The first step is to have the database in a container. Let’s spawn one, and throw in a docker-compose.yml for good measure.
version: "2.1"
services:
  mysql:
    image: mysql:5.7
    ports:
      - "3306:3306"
Now, running our tests is a simple matter of running docker-compose up before the tests, and using localhost:3306 as the database address. Dead simple. A little too naive, however.
For example, exporting ports that way opens the door to a world of conflicts, ad hoc conventions, etc., which in the long run would be hard to maintain. One solution is to run the tests from another container, linked to the mysql one, so they can reach the database over the containers’ network. For this, we simply add an app container to our docker-compose.yml.
version: "2.1"
services:
  app:
    image: golang:1.10
    links:
      - mysql
  mysql:
    image: mysql:5.7
Now, the tests must use mysql:3306 as the address, and the command for running the tests becomes docker-compose run --rm app /usr/bin/env bash -c "go test". Better on the operations side, but not something we want to type every time we run tests, configure CI, etc.
For simplicity’s sake, let’s put that in a Makefile. And while we’re at it, add
a build command too so we can also build in the container.
exec = docker-compose run --rm app /usr/bin/env bash -c

.PHONY: build
build::
	${exec} "go build"

.PHONY: test
test::
	${exec} "go test"
Much better. Running the tests is back to a simple make test, the containers are spawned as needed without intervention, multiple projects can coexist without conflict, and using Make (or any other build tool) probably integrates with any complex workflow. All is well, our job here is done.
Until we actually run the tests…
# make test
[…]
--- FAIL: TestApp (0.00s)
<autogenerated>:1: mysqltest: dial tcp 192.168.0.2:3306: getsockopt: connection refused
[…]
What is happening here? When it creates the containers, Docker is smart enough to wait until the mysql container is running before starting the app container, but most databases need a little warmup before being ready. So when the test code tries to connect to MySQL, the database is still initializing and cannot accept the connection.
One solution is to use a tool like https://github.com/jwilder/dockerize as the app container entrypoint. Dockerize will wait until the configured port is ready before running the container command. There is a little issue here: the golang container doesn’t include dockerize.
Containers to the rescue! The simplest solution is to build a custom image that includes it.
FROM golang:1.10.0
RUN curl -sSL "https://github.com/jwilder/dockerize/releases/download/v0.5.0/dockerize-linux-amd64-v0.5.0.tar.gz" | tar -xz -C /usr/local/bin
COPY entrypoint.sh /usr/local/bin/
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
CMD ["/usr/bin/env", "bash"]
With entrypoint.sh being a file along these lines. (It can probably be replaced by something shorter in the docker-compose.yml’s entrypoint option.)
#!/bin/bash
exec "$@"
Then, you can change the docker-compose.yml file to use your custom image (we will call it custom-golang) and dockerize.
version: "2.1"
services:
  app:
    image: custom-golang:1.10
    links:
      - mysql
    entrypoint: dockerize -timeout 20s -wait tcp://mysql:3306 entrypoint.sh
  mysql:
    image: mysql:5.7
OK, we have a database container ready for action; it’s time to code! Let’s do something that resembles an actual test.
func TestFoo(t *testing.T) {
	// Prepare the connection to the database container. multiStatements
	// lets us send several statements in a single Exec call.
	db, err := sql.Open("mysql", "root:@tcp(mysql:3306)/?multiStatements=true")
	if err != nil {
		t.Fatal("connecting to database server:", err)
	}
	defer db.Close()

	// Create the database and tables necessary.
	_, err = db.Exec(`
		CREATE DATABASE app;
		USE app;
		CREATE TABLE foo ( id INTEGER UNSIGNED PRIMARY KEY );
	`)
	if err != nil {
		t.Fatal("initializing database:", err)
	}

	// Call the tested function.
	result := foo(db)

	// Check the result.
	expected := "bar"
	if result != expected {
		t.Errorf("unexpected output: wanted %s, got %s", expected, result)
	}
}
While encouraging, this code has a number of problems to address before it is reliable. The first is that it will only work once: the created database is persistent, and the test will fail if the database already exists. Nor will we be able to run tests in parallel.
We could add a DROP statement before the database creation, but that would only solve half of the problem. A better solution is to generate a random name before running the query and use it as the database name.
name, _ := random.Alpha(10)
_, err = db.Exec(fmt.Sprintf(
	"CREATE DATABASE `%[1]s`; USE `%[1]s`; CREATE TABLE foo ( id INTEGER UNSIGNED PRIMARY KEY );",
	name,
))
Good enough. Although a little rough, this solution works and solves all the problems at hand. We can refine it further by moving the related code into a dedicated helper, so the test itself is cleared of unnecessary code and each test can use it independently.
func Spawn(t *testing.T, address, schema string) *sql.DB {
	// Prepare the connection to the database container. multiStatements
	// lets us send several statements in a single Exec call.
	db, err := sql.Open("mysql", fmt.Sprintf("root:@tcp(%[1]s)/?multiStatements=true", address))
	if err != nil {
		t.Fatal("connecting to database server:", err)
	}

	// Create the database and tables necessary.
	name, _ := random.Alpha(10)
	_, err = db.Exec(fmt.Sprintf("CREATE DATABASE `%[1]s`; USE `%[1]s`;", name))
	if err != nil {
		t.Fatal("initializing database:", err)
	}

	// Load the schema.
	_, err = db.Exec(schema)
	if err != nil {
		t.Fatal("loading schema:", err)
	}

	// Return the database handle.
	return db
}
func TestFoo(t *testing.T) {
	// Create a fresh database for use in this test.
	db := Spawn(t, "mysql:3306", "CREATE TABLE foo ( id INTEGER UNSIGNED PRIMARY KEY )")
	defer db.Close()

	// Call the tested function.
	result := foo(db)

	// Check the result.
	expected := "bar"
	if result != expected {
		t.Errorf("unexpected output: wanted %s, got %s", expected, result)
	}
}
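A word on random.Alpha: it stands for any small routine producing a random identifier. A possible sketch (the name alpha is ours, not from a particular library) could look like this:

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// alpha returns n random lowercase ASCII letters, good enough for a
// throwaway database name that won't collide between test runs.
func alpha(n int) (string, error) {
	const letters = "abcdefghijklmnopqrstuvwxyz"
	b := make([]byte, n)
	for i := range b {
		idx, err := rand.Int(rand.Reader, big.NewInt(int64(len(letters))))
		if err != nil {
			return "", err
		}
		b[i] = letters[idx.Int64()]
	}
	return string(b), nil
}

func main() {
	name, _ := alpha(10)
	fmt.Println(len(name)) // prints "10"
}
```

Starting with a letter (rather than a digit) keeps the name a valid unquoted identifier in most databases.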
Another big improvement would be to load the database creation queries directly
from a file, so the schema and fixtures can be shared between different tests
and won’t pollute the test code.
func Spawn(t *testing.T, address string, fixtures ...string) *sql.DB {
	// Prepare the connection to the database container. multiStatements
	// lets us send several statements in a single Exec call.
	db, err := sql.Open("mysql", fmt.Sprintf("root:@tcp(%[1]s)/?multiStatements=true", address))
	if err != nil {
		t.Fatal("connecting to database server:", err)
	}

	// Create the database and tables necessary.
	name, _ := random.Alpha(10)
	_, err = db.Exec(fmt.Sprintf("CREATE DATABASE `%[1]s`; USE `%[1]s`;", name))
	if err != nil {
		t.Fatal("initializing database:", err)
	}

	// Load the fixtures.
	for _, fixture := range fixtures {
		Load(t, db, fixture)
	}

	// Return the database handle.
	return db
}

func Load(t *testing.T, db *sql.DB, fixture string) {
	// Read the fixture file.
	raw, err := ioutil.ReadFile(fixture)
	if err != nil {
		t.Fatalf("reading fixture %s: %s", fixture, err)
	}

	// Execute its content.
	_, err = db.Exec(string(raw))
	if err != nil {
		t.Fatalf("loading fixture %s: %s", fixture, err)
	}
}
func TestFoo(t *testing.T) {
	db := Spawn(t, "mysql:3306", "testdata/schema.sql")
	defer db.Close()

	// Call the tested function.
	result := foo(db)

	// Check the result.
	expected := "bar"
	if result != expected {
		t.Errorf("unexpected output: wanted %s, got %s", expected, result)
	}
}
Conclusion
This testing method can be refined a bit more by adding a few tricks that won’t
be covered here, like cleaning the database after a successful test, or adding
templating to the fixtures. It can be adapted to almost any kind of database,
or even other types of dependencies like message brokers, other services, etc.
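As a taste of the fixture-templating trick, fixtures could be run through text/template before being executed, so one SQL file serves several tests (renderFixture is our own name, a sketch rather than our actual helper):

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderFixture expands a fixture through text/template, so the same
// SQL file can be parameterized per test (table names, values, etc.).
func renderFixture(raw string, data any) (string, error) {
	tmpl, err := template.New("fixture").Parse(raw)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	sql, _ := renderFixture("INSERT INTO foo VALUES ( {{.ID}} );", map[string]any{"ID": 42})
	fmt.Println(sql) // prints: INSERT INTO foo VALUES ( 42 );
}
```

The rendered string would then be passed to db.Exec exactly like a plain fixture.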
Among the downsides is the fact that it is slower than something like a mock. Spinning up the container is cheap but not negligible, and loading huge fixtures can take a while, but all things considered it is often a small price to pay for the correctness and actual usefulness of the resulting tests.
Feel free to reach out if you want more details, have questions, or just want
to chat. I would love to hear your opinion on the subject.
Et voilà !