Using Mutation Testing to Weed Out Fake Unit Tests

Code kata: using mutation testing to improve quality of unit tests.

Summary

This is a set of exercises that will demonstrate

that passing unit tests and high unit test coverage numbers may give a false sense of security when the tests are of low quality
how to identify problem spots using mutation testing and common test smells
how to fix these problems

What Is a Code Kata?

A code kata is an exercise in programming which helps programmers hone their skills through practice. A code kata is usually set up as a series of unit tests, which fail. Your task is to write code to make them pass. The idea is inspired by the Japanese concept of kata in the martial arts. Just like in the martial arts, you can repeat a kata multiple times to make improvements to your solutions.

Please note that this kata is a little different - all the tests initially pass. Don't worry, all the same ideas mentioned above still apply here. We will improve the tests, in the process making them fail, and then we will fix the code to make the tests pass, but by then they will be good tests.

Running This Kata

To build this kata you will need

Java 17 or newer
Maven 3.6.1 or newer
an IDE of your choice

There are two modules in this project

kata - contains the exercises, including the domain and test classes described below. You should be working with this module.
solutions - contains solutions to the exercises in the kata as well as explanations of the test smells (see the "Unit Test Smells" section below). There is more than one way to solve the kata, so your solutions may not look exactly like the ones in this module. In fact, you may find ways to improve on the solutions here.

The domain for the kata is made up of two classes: Company and Employee.

Run all the unit tests in the mtk.domain.CompanyTest class. They should all pass. Check the test coverage metrics using either maven output or a coverage reporting function in your IDE test runner. The coverage should be close to 100%. Good news: there are tests, they all pass, and they cover all of our business logic. Looks like the software is ready to ship!

Unfortunately, that would be a terrible idea, as the code is full of bugs. To prove it, take a look at the mtk.CompanyRunner class, which contains some simple business logic in its main() method. Run mtk.CompanyRunner.main() and look at the console output. Does it look right? How can we have all these bugs despite having all these tests?
Run the unit tests with mutations. Mutations will be introduced in your code by PIT, a mutation testing tool.
1. Enable the pitest maven profile for the project. This profile is bound to the test phase of the maven lifecycle.
2. Run the test task in the module kata. (To run it from the command line with the profile activated, execute the mvn test -P pitest command.) With the profile enabled, this task will invoke the PIT framework to first introduce changes in the application code and then execute tests.
3. Inspect the results. The results are written in HTML format into a file in the target/pit-reports/YYYYMMDDHHMI directory. Open this file in a browser; you should see quite a bit of red. Red means that some of the code mutations survived, that is, they were not caught by the unit tests, so our unit tests do not in fact test what they are supposed to.
Fix the test smells. Each test in the test class exhibits one or more test smells. Going through the tests one by one, fix the smell and make sure the test actually does what it is supposed to. To help you, the comments in some of the test methods explicitly say what smell is present there. Once you remove the smell, the test should start failing. This is a good thing, because now we have tests that actually validate the behavior of our software.
Fix the business logic to make the tests pass. Look at the comments in the code; they may explain the intended behavior (which does not mean that the method as written behaves as intended).
Kill all mutants! The tests that have been fixed this way should catch the mutations introduced by PIT. When all the tests (and the logic under test) are fixed, no mutations should survive. The end state should be passing tests, dead mutants, and no smells.

The rest of this document offers some general pointers, which may come in handy if you are new to unit testing.

Unit Tests - Necessary, But Not Sufficient...

...to build confidence in the software system under test. While our focus here is on unit tests, it helps to put them in a broader context. The table below lists common types of tests.

The meaning of the columns:

Category - A category of tests
Purpose - Why is this kind of testing needed
Who - Roles involved in creation of tests and validating test results
Tools - Example of tools supporting this type of testing

Category	Purpose	Who	Tools
Unit	Validate a unit of behavior at the low (code) level, focusing on a small part of the system (e.g., a method)	Dev	JUnit
Acceptance	Validate that the business logic is implemented as specified for a given scenario	Dev, User	FitNesse
Mutation	Ensure quality of unit and acceptance tests	Dev	PITest
Integration	Detect issues in interactions between modules of the system	Tech Ops, Dev
User Acceptance	Have the users certify that the system as a whole is operating as expected	Dev, User
Production Mirror	Test system under load identical to production	Tech Ops
Chaos Engineering	Test system resiliency by failure injection into infrastructure (service processes, networks, clients, etc.)	Tech Ops	Chaos Monkey
Breakpoint	Determine the maximum amount of load that the system can support	Tech Ops	The Grinder

Unit Test Best Practices

These are some of the practices to follow to ensure that the unit tests are effective, easy to maintain, and easy to execute:

Automated - require no human involvement to determine the outcome
Focused - each test method tests one scenario
Complete - test the edge cases, try to cover all meaningfully different scenarios
Well named - test method name describes the scenario being tested
Fast - the relevant tests execute in a few seconds or faster
Independent - no external dependencies, no dependencies on other tests
Test the behavior, not the implementation

Unit Test Smells

These are the signs that there is possibly something wrong with the test - either because the test itself is not well written, or the code under test is not test friendly (which probably means that this code is not well factored):

No assertions
Irrelevant assertions
Use of Mocks
Expected results are calculated rather than explicitly specified
Test code reuse (that is, reuse of test logic; test utilities are fine)
Test data reuse
"Flickering" tests (tests with nondeterministic behavior)
Interdependencies between tests (e.g., execution order)
Long running tests
@Ignore'd or commented out tests

These smells often come together. For example, sharing test data can lead to tests' success depending on the execution order.
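
The interplay between shared test data and execution order can be sketched as follows. This is only an illustration: the SharedDataSmell class and its staff list are hypothetical and are not part of the kata's Company or Employee classes, and the "tests" are plain methods returning true on success so that the block runs without JUnit on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from the kata itself.
final class SharedDataSmell {
    // Smell: one mutable fixture shared by every test.
    static final List<String> SHARED_STAFF = new ArrayList<>(List.of("Ann", "Bob"));

    // This "test" mutates the shared fixture as a side effect.
    static boolean hiringAddsOneEmployee() {
        SHARED_STAFF.add("Cid");
        return SHARED_STAFF.size() == 3;
    }

    // This "test" silently relies on the fixture still being in its initial state,
    // so it passes only if it runs before hiringAddsOneEmployee().
    static boolean staffStartsWithTwoEmployees() {
        return SHARED_STAFF.size() == 2;
    }

    // Fix: every test builds its own data.
    static List<String> freshStaff() {
        return new ArrayList<>(List.of("Ann", "Bob"));
    }

    static boolean hiringAddsOneEmployeeIndependent() {
        List<String> staff = freshStaff();
        staff.add("Cid");
        return staff.size() == 3;
    }

    static boolean staffStartsWithTwoEmployeesIndependent() {
        return freshStaff().size() == 2;
    }
}
```

With the shared list, staffStartsWithTwoEmployees() succeeds or fails depending on whether the hiring test ran first. The independent variants give the same answer in any order.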

Unit Test Quality

How can we measure the quality of the unit tests in a system? One metric, which is used broadly, is test coverage. Some things to keep in mind:

Test coverage
- Percentage of lines of code, methods, and classes covered by tests
- Does not guarantee the covered code is actually tested
- Does identify the code that is definitely not tested
So having a coverage target is not entirely pointless, but...
...don't optimize just for coverage
How do we make sure that the tests actually test? Mutation Testing is one way to ensure relevance of unit tests.
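
To see how covered code can still be untested, consider the sketch below. PayrollMath.raise is a hypothetical method with a deliberately planted bug; it is not taken from the kata. The "tests" are plain methods returning true on success, standing in for JUnit test methods so that the block has no dependencies.

```java
// Hypothetical code under test; it is NOT part of the kata's domain classes.
final class PayrollMath {
    // Intended behavior: raise the salary by the given percentage.
    // Deliberate bug: the percentage is added as an absolute amount.
    static long raise(long salary, long percent) {
        return salary + percent;
    }
}

final class CoverageDemo {
    // Executes every line of raise(), so a coverage tool would report it as
    // fully covered, yet nothing is verified: this passes unless raise() throws.
    static boolean coveringTestWithoutAssertions() {
        PayrollMath.raise(1000, 10);
        return true;
    }

    // Also fully "covers" raise(), but the expected value is calculated with
    // the same logic as the code under test, so the bug is mirrored.
    static boolean calculatedExpectationTest() {
        long salary = 1000;
        long percent = 10;
        long expected = salary + percent;
        return PayrollMath.raise(salary, percent) == expected;
    }

    // The expected value is stated explicitly: 10% of 1000 is 100.
    // This test fails against the buggy raise(), which is what we want.
    static boolean explicitExpectationTest() {
        return PayrollMath.raise(1000, 10) == 1100;
    }
}
```

All three methods produce the same coverage number for raise(), but only the last one notices that the method is wrong.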

Mutation Testing

Mutation testing is a way to validate the quality of unit tests. It means introducing changes in the code and observing the behavior of the unit tests. Assuming that all the tests were passing before the mutation, either some of the unit tests will start failing (good) or all the tests will keep on passing (bad). The latter scenario means that the unit tests do not really validate the outcomes of the code under test: the results have, for all intents and purposes, become random, yet all the tests pass.
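
The idea can be imitated by hand in a few lines. The sketch below is not how PIT works internally (PIT generates and runs the mutants for you); it only shows, for a hypothetical eligibility rule that is not part of the kata, what it means for a mutant to survive a weak test suite and to be killed by a stronger one. The mutant here imitates a conditional boundary change, one kind of change a mutation testing tool may make.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration; the rule and all names are assumptions.
final class MutationDemo {
    // Rule under test: eligible after 5 or more years of service.
    static final IntPredicate ORIGINAL = years -> years >= 5;

    // Hand-written mutant: the boundary is changed from >= to >.
    static final IntPredicate MUTANT = years -> years > 5;

    // Weak suite: checks only values far away from the boundary.
    static boolean weakSuitePasses(IntPredicate eligible) {
        return eligible.test(10) && !eligible.test(1);
    }

    // Stronger suite: also pins down the behavior at the boundary.
    static boolean strongSuitePasses(IntPredicate eligible) {
        return weakSuitePasses(eligible) && eligible.test(5) && !eligible.test(4);
    }

    public static void main(String[] args) {
        // A suite that passes on the mutant lets the mutant survive.
        System.out.println("weak suite on original:   " + weakSuitePasses(ORIGINAL));
        System.out.println("weak suite on mutant:     " + weakSuitePasses(MUTANT));
        System.out.println("strong suite on original: " + strongSuitePasses(ORIGINAL));
        System.out.println("strong suite on mutant:   " + strongSuitePasses(MUTANT));
    }
}
```

The weak suite passes on both the original and the mutant, so the mutant survives. The strong suite passes on the original and fails on the mutant, so the mutant is killed.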

Test Driven Development

Adopting Test Driven Development (TDD) will result in better tests, better interfaces, less unnecessary code, and a more confident and steady development process. Just follow these steps:

Write a test
- Take the user's perspective: "What is the API that would make my job the easiest?"
- Think small increments
Make the test pass
- Do whatever it takes: Duplication? Fine! Hardcoding the expected result? Fine!
Refactor
- Remove duplication
Repeat for all meaningfully different scenarios
Reap the benefits
- Almost all code is tested
- You know when to stop coding
- User friendly interfaces
- Well factored code
- Just enough abstraction
- Just enough code
- Develop with confidence!
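
One pass through the steps above might look like the following sketch. The full-name scenario and every name in it are made up for illustration and do not come from the kata; the scenarios are plain methods returning true on success rather than JUnit tests, so that the block runs on its own.

```java
import java.util.function.BinaryOperator;

// Hypothetical illustration of two TDD increments.
final class TddSteps {
    // Scenario 1, written first: "Ada" and "Lovelace" give "Ada Lovelace".
    static boolean firstScenario(BinaryOperator<String> fullName) {
        return "Ada Lovelace".equals(fullName.apply("Ada", "Lovelace"));
    }

    // Make the test pass by whatever means: hardcoding is fine at this point.
    static String fullNameV1(String first, String last) {
        return "Ada Lovelace";
    }

    // Scenario 2, a meaningfully different input, exposes the hardcoding.
    static boolean secondScenario(BinaryOperator<String> fullName) {
        return "Alan Turing".equals(fullName.apply("Alan", "Turing"));
    }

    // Generalize just enough to make both scenarios pass.
    static String fullNameV2(String first, String last) {
        return first + " " + last;
    }
}
```

Here fullNameV1 satisfies the first scenario but fails the second, which is the signal to write fullNameV2. Once both scenarios pass and there is no duplication left to remove, that increment is done.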

Sounds too good to be true? The secret is that TDD requires a lot of discipline from its practitioners: working in tiny increments, diligently following the steps above, and not cutting corners. Without the discipline, you are likely to end up with tests (and code under test) of the usual "quality".

Useful Links

JUnit
PITest

vmzakharov / mutate-test-kata
