Specification Testing for Move

Aptos Labs
7 min readMay 22, 2024

by Eiger and Aptos Labs

Mutation testing is a powerful technique to evaluate and improve the quality of unit tests, going beyond the capabilities of code-coverage tools. Recent research has shown that similar techniques can be applied to evaluate and improve the quality of formal specifications. To help further increase the assurance levels of Move smart contracts on Aptos, we are releasing the beta versions of two open-source tools: move-mutator and move-spec-test, available here. Another tool, move-mutation-test, which applies mutation results for measuring test suite quality, is under preparation.

The rest of this blog delves deeper into these tools and why you might want to use them.

Introduction
Mutation Testing: A Quick Background
Application to Specification Testing
Brief Case Study
What’s Next
Some Relevant References and Related Work

Introduction

With the Move Prover tool, smart contracts written in the Move programming language can be formally proven correct with respect to user-written specifications. The Move Prover uses state-of-the-art techniques to provide one of the strongest forms of assurance that can be obtained for modern software, compared to other good software assurance practices like fuzzing, high-coverage testing, static analysis, and linters.

However, the assurance provided by the Prover is only as good as the user-provided specifications. If the specifications are weak, incorrect, or have blind spots, then proving that the program conforms to the specifications could provide a false sense of correctness and security.

To help developers find and understand such issues in their specifications, Eiger, with inputs from Aptos Labs, has developed two tools, openly available here:

  • move-mutator to mutate Move code, and
  • move-spec-test to check if we can still prove the specifications despite the mutations to the Move code.

There are very few programming languages and tools out there that match this functionality, so we are excited to announce the beta-release of these open-source tools to garner developer feedback from the community. We plan to continually improve the tools in a feedback-driven manner, so we would love to hear your experience in using the tool.

We expect using these tools would help build more robust specifications, increasing the overall assurance of the Move smart contracts on the Aptos blockchain.

Mutation Testing: A Quick Background

Consider the following Move code:

public fun inc_and_check(x: &mut u64): bool {
*x = *x + 1;
x > 42
}

#[test]
fun test_it() {
let x = 3;
inc_and_check(&x);
assert!(x == 4, 0);
}

Here, the function inc_and_check does two things: (i) increment x, and (ii) check if the incremented value of x is greater than 42. The test function test_it however, checks for only one of these two behaviors, i.e, (i), the incrementing behavior.

If we run a code coverage tool, we find that inc_and_check has full coverage, because all lines of this function are run by the test. However, not all aspects of inc_and_check have been tested; e.g., one can change x > 42 to x < 42, and the tests would still pass. Therefore, the testing of inc_and_check, while having full coverage, has a blind spot.

Mutation testing is a technique whose goal is to automatically find these kinds of blind spots in tests. It accomplishes this by introducing random (but controlled) mutations in the source code (e.g., replace x < 42 by x > 42), and then checking if all the tests still pass.

Each actual mutation of the source code is called a mutant. If any of the tests fail on a mutant, then the mutant is killed by the tests, because the tests were good enough to catch the code change. If a large number of mutants survive, then it could be an indicator that the tests have some blind spots. In this case, we could likely improve the test by adding an assertion about the return value of inc_and_check.

The developer would typically use their subjective and contextual judgment to figure out whether this is a worthwhile improvement (similar to how a coverage tool points to lines of code not covered by tests, but the developer needs to figure out whether adding tests to get this extra coverage is a worthwhile effort).

The move-mutator tool applies random mutations to Move programs and creates a directory with mutants (each mutant has one random mutation, and can be diffed with the original code). We then use these mutants for testing the quality of specifications, as we will see below. As future work, we plan to use the move-mutator tool for assessing the quality of unit tests, which is the more typical application of mutation testing, as described above.

The move-mutator tool is architected to make it easy to add new kinds of mutations, and we anticipate adding more kinds of mutations over time. Some of the existing mutations are:

  • Replace one binary operator with another (e.g., use of + would be replaced with -).
  • Replace one literal with another (e.g., true would be replaced with false).
  • Replace break and continue with each other or delete them.
  • Replace conditions in if expressions with constant boolean values.
  • Delete certain kinds of statements.
  • Swap the arguments of binary operators (e.g., expression a-b would be replaced with b-a).

Application to Specification Testing

As alluded to in the introduction, one of the weakest links in a system with formally proven code is the quality of the specifications (i.e., what is being proven). The Move specifications are no exception. Authoring good specifications remains somewhat of an art: the developer has to balance how abstract the specification is versus how complete it is.

An ideal specification needs to be complete enough to capture all intentions from stakeholders and yet abstract enough to allow flexibility in implementation choices.

— “Finding Specification Blind Spots via Fuzz Testing”, Ji and Xu, IEEE Security and Privacy 2023.

Our move-spec-test tool applies ideas from mutation testing to find specification blind spots. It first runs the move-mutator tool to generate mutants of the original Move code. Then, it runs the Move Prover for each mutant. A mutant is killed if the Move Prover is unable to prove that the mutant conforms to the untouched specifications.

If a large number of mutants survive, then it is indicative that the specifications are potentially insufficient for that code. A developer can then decide if this is acceptable or if the specifications need improvement, based on the additional context they have about the code and its usage in a real system.

Brief Case Study

We have run the move-spec-test tool on a few real-world Move programs with specifications. One example is the Diem Payment Network, which was chosen because it is a fairly large Move program with extensive formal specifications, and we often use it for testing the Move Prover. It was developed at Meta in Move for the once planned Meta payment network. We manually triaged a small number of surviving mutants resulting from this exercise to do a brief case study. As a result, we found issues with some of the specifications involved, e.g., with them being incomplete, imprecise, or verification locally turned off for certain specifications. These issues were validated by specification experts at Aptos Labs as “very useful issues to find out about these specifications”.

As the specifications and the code involved in the above discussion are fairly complicated, we will instead use a very simple example to showcase the usefulness of move-spec-test.

Consider a Move function:

fun div(x: u128, y: u128): u128 {
assert!(y != 0, 42);
x / y
}

And a corresponding Move specification:

spec div {
aborts_if y == 0 with 42;
}

The move-spec-test tool shows that the following mutant (with comments manually added to call out attention, not by the tool) survives the specification.

fun div(x: u128, y: u128): u128 {
assert!(y != 0, 42);
x % y // mutated line, `/` replaced with `%`
}

The reason for the survival of this mutant is that the specification only checks for the abort behavior of the div function, not the functional behavior. Both / and % operators have the same abort behavior, so the tool showed that the operator replacement had no effect on proving the specification!

If we want to make the specification more robust to mutations and implementation changes, we can update it to include the functional behavior as well:

spec div {
aborts_if y == 0 with 0;
ensures result == x / y;
}

What’s Next

We have some exciting future work planned in relation to these tools:

  • Driven by user feedback, improve the user experience to expedite triaging issues with specifications.
  • Implement more kinds of mutations.
  • Improve the performance of the tools.
  • Apply mutation testing in the traditional setting, i.e., to test the quality of the tests. Our modular architecture, where move-mutator is an independent tool, has been explicitly designed keeping this future capability in mind.

Some Relevant References and Related Work

Eiger is a core contributor to various blockchain technologies. Their teams boast years of experience, collaborating seamlessly to build robust solutions. With a focus on excellence in crypto, web3, and blockchain development, Eiger is a trusted partner to elevate and scale your projects effectively.

--

--

Aptos Labs

Aptos Labs is a premier Web3 studio of engineers, researchers, strategists, designers, and dreamers building on Aptos, the Layer 1 blockchain.