Merge pull request #1418 from HackTricks-wiki/update_Use_mutation_testing_to_find_the_bugs_your_tests_d_20250918_124237

Use mutation testing to find the bugs your tests don't catch
2025-10-10 18:36:50 +00:00 · 2025-09-30 21:17:37 +02:00 · 2025-09-30 21:17:37 +02:00 · ea5b9fdb1d
commit ea5b9fdb1d
parent 773ac4128f 401de9774b
4 changed files with 274 additions and 0 deletions
--- a/resolve_searchindex_conflicts.sh
+++ b/resolve_searchindex_conflicts.sh
@@ -0,0 +1,139 @@
 #!/bin/bash
 # Script to resolve searchindex.js conflicts by accepting master branch version
 # This script is designed to handle merge conflicts that occur when PRs become
 # desynchronized due to the auto-generated searchindex.js file.
 # 
 # The searchindex.js file is automatically generated by the build process and
 # frequently causes conflicts when multiple PRs are waiting to be merged.
 # This script automatically resolves those conflicts by accepting the master
 # branch version of the file.
 #
 # Usage: resolve_searchindex_conflicts.sh <pr_number> <head_branch> <base_branch>
 set -euo pipefail
 # Validate arguments
 if [ $# -ne 3 ]; then
    echo "Usage: $0 <pr_number> <head_branch> <base_branch>"
    exit 1
 fi
 PR_NUMBER="$1"
 HEAD_BRANCH="$2"
 BASE_BRANCH="$3"
 # Validate required environment variables
 if [ -z "${GITHUB_REPOSITORY:-}" ]; then
    echo "Error: GITHUB_REPOSITORY environment variable is required"
    exit 1
 fi
 if [ -z "${GH_TOKEN:-}" ]; then
    echo "Error: GH_TOKEN environment variable is required"
    exit 1
 fi
 echo "Resolving conflicts for PR #$PR_NUMBER (branch: $HEAD_BRANCH -> $BASE_BRANCH)"
 # Get current directory for safety
 ORIGINAL_DIR=$(pwd)
 # Create a temporary directory for the operation
 TEMP_DIR=$(mktemp -d)
 echo "Working in temporary directory: $TEMP_DIR"
 cleanup() {
    echo "Cleaning up..."
    cd "$ORIGINAL_DIR"
    rm -rf "$TEMP_DIR"
 }
 trap cleanup EXIT
 # Clone the repository to the temp directory
 echo "Cloning repository..."
 cd "$TEMP_DIR"
 gh repo clone "$GITHUB_REPOSITORY" . --branch "$HEAD_BRANCH"
 # Configure git
 git config user.email "action@github.com"
 git config user.name "GitHub Action"
 # Fetch all branches
 git fetch origin
 # Make sure we're on the correct branch
 git checkout "$HEAD_BRANCH"
 # Try to merge the base branch
 echo "Attempting to merge $BASE_BRANCH into $HEAD_BRANCH..."
 if git merge "origin/$BASE_BRANCH" --no-edit; then
    echo "No conflicts found, merge successful"
    # Push the updated branch
    echo "Pushing merged branch..."
    git push origin "$HEAD_BRANCH"
    exit 0
 fi
 # Check what files have conflicts
 echo "Checking for conflicts..."
 conflicted_files=$(git diff --name-only --diff-filter=U)
 echo "Conflicted files: $conflicted_files"
 # Check if searchindex.js is the only conflict or if conflicts are only in acceptable files
 acceptable_conflicts=true
 searchindex_conflict=false
 for file in $conflicted_files; do
    case "$file" in
        "searchindex.js")
            searchindex_conflict=true
            echo "Found searchindex.js conflict (acceptable)"
            ;;
        *)
            echo "Found unacceptable conflict in: $file"
            acceptable_conflicts=false
            ;;
    esac
 done
 if [ "$acceptable_conflicts" = false ]; then
    echo "Cannot auto-resolve: conflicts found in files other than searchindex.js"
    git merge --abort
    exit 1
 fi
 if [ "$searchindex_conflict" = false ]; then
    echo "No searchindex.js conflicts found, but merge failed for unknown reason"
    # No merge may be in progress here, so do not let a failed abort mask the exit code below
    git merge --abort || true
    exit 1
 fi
 echo "Resolving searchindex.js conflict by accepting $BASE_BRANCH version..."
 # Accept the base branch version of searchindex.js (--theirs refers to the branch being merged in)
 git checkout --theirs searchindex.js
 git add searchindex.js
 # Check if there are any other staged changes from the merge
 staged_files=$(git diff --cached --name-only || true)
 echo "Staged files after resolution: $staged_files"
 # Complete the merge
 if git commit --no-edit; then
    echo "Successfully resolved merge conflicts"
    # Push the updated branch
    echo "Pushing resolved branch..."
    if git push origin "$HEAD_BRANCH"; then
        echo "Successfully pushed resolved branch"
        exit 0
    else
        echo "Failed to push resolved branch"
        exit 1
    fi
 else
    echo "Failed to commit merge resolution"
    exit 1
 fi
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -83,6 +83,7 @@
  - [Basic Python](generic-methodologies-and-resources/python/basic-python.md)
 - [Threat Modeling](generic-methodologies-and-resources/threat-modeling.md)
 - [Blockchain & Crypto](blockchain/blockchain-and-crypto-currencies/README.md)
  - [Mutation Testing With Slither](blockchain/smart-contract-security/mutation-testing-with-slither.md)
  - [Defi/AMM Hook Precision](blockchain/blockchain-and-crypto-currencies/defi-amm-hook-precision.md)
 - [Lua Sandbox Escape](generic-methodologies-and-resources/lua/bypass-lua-sandboxes/README.md)
--- a/src/blockchain/blockchain-and-crypto-currencies/README.md
+++ b/src/blockchain/blockchain-and-crypto-currencies/README.md
@@ -176,6 +176,14 @@ Transactions in Ethereum involve a sender and a recipient, which can be either u
 These practices and mechanisms are foundational for anyone looking to engage with cryptocurrencies while prioritizing privacy and security.
 ## Smart Contract Security
 - Mutation testing to find blind spots in test suites:
 {{#ref}}
 ../smart-contract-security/mutation-testing-with-slither.md
 {{#endref}}
 ## References
 - [https://en.wikipedia.org/wiki/Proof_of_stake](https://en.wikipedia.org/wiki/Proof_of_stake)
--- a/src/blockchain/smart-contract-security/mutation-testing-with-slither.md
+++ b/src/blockchain/smart-contract-security/mutation-testing-with-slither.md
@@ -0,0 +1,126 @@
 # Mutation Testing for Solidity with Slither (slither-mutate)
 {{#include ../../../banners/hacktricks-training.md}}
 Mutation testing "tests your tests" by systematically introducing small changes (mutants) into your Solidity code and re-running your test suite. If a test fails, the mutant is killed. If the tests still pass, the mutant survives, revealing a blind spot in your test suite that line/branch coverage cannot detect.
 Key idea: Coverage shows code was executed; mutation testing shows whether behavior is actually asserted.
 ## Why coverage can deceive
 Consider this simple threshold check:
 ```solidity
 function verifyMinimumDeposit(uint256 deposit) public pure returns (bool) {
    if (deposit >= 1 ether) {
        return true;
    } else {
        return false;
    }
 }
 ```
 Unit tests that check only one value below and one value above the threshold can reach 100% line/branch coverage without ever asserting the equality boundary (`deposit == 1 ether`). A change from `>=` to `>` would still pass such tests, and so would a refactor to `deposit >= 2 ether` whenever the above-threshold test value is at least 2 ether, silently breaking protocol logic.
 Mutation testing exposes this gap by mutating the condition and checking whether your tests fail.
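The gap can be reproduced outside Solidity. The following is a minimal sketch that models the threshold check in bash integer arithmetic (amounts in wei); every function name here is hypothetical and exists only for illustration. It runs a "weak" suite (one value below, one above) and a "strong" suite (adds the equality boundary) against the original condition and a `>=` → `>` mutant.

```shell
#!/usr/bin/env bash
# Hypothetical bash model of the Solidity check above; amounts are in wei.
ETHER=1000000000000000000   # 1 ether = 10^18 wei (fits in 64-bit arithmetic)

# Original condition: deposit >= 1 ether
verify_original() { (( $1 >= ETHER )); }

# Mutant: `>=` replaced by `>`
verify_mutant() { (( $1 > ETHER )); }

# Weak suite: one value below and one value above the threshold.
weak_suite() {
    local check="$1"
    ! "$check" $(( ETHER / 2 )) && "$check" $(( 2 * ETHER ))
}

# Strong suite: the weak suite plus the equality boundary.
strong_suite() {
    local check="$1"
    weak_suite "$check" && "$check" "$ETHER"
}

# Run both suites against both implementations and print the verdicts.
for candidate in verify_original verify_mutant; do
    for suite in weak_suite strong_suite; do
        if "$suite" "$candidate"; then verdict="passes"; else verdict="fails"; fi
        echo "$suite vs $candidate: $verdict"
    done
done
```

The weak suite passes for both implementations, so the mutant survives it; only the strong suite, which asserts the boundary value, fails on the mutant and kills it.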
 ## Common Solidity mutation operators
 Slither’s mutation engine applies many small, semantics-changing edits, such as:
 - Operator replacement: `+` ↔ `-`, `*` ↔ `/`, etc.
 - Assignment replacement: `+=` → `=`, `-=` → `=`
 - Constant replacement: non-zero → `0`, `true` ↔ `false`
 - Condition negation/replacement inside `if`/loops
 - Comment out whole lines (CR: Comment Replacement)
 - Replace a line with `revert()`
 - Data type swaps: e.g., `int128` → `int64`
 Goal: Kill 100% of generated mutants, or justify survivors with clear reasoning.
 ## Running mutation testing with slither-mutate
 Requirements: Slither v0.10.2+.
 - List options and mutators:
 ```bash
 slither-mutate --help
 slither-mutate --list-mutators
 ```
 - Foundry example (capture results and keep a full log):
 ```bash
 slither-mutate ./src/contracts --test-cmd="forge test" &> >(tee mutation.results)
 ```
 - If you don’t use Foundry, set `--test-cmd` to the command that runs your tests (e.g., `npx hardhat test`, `npm test`).
 Artifacts and reports are stored in `./mutation_campaign` by default. Uncaught (surviving) mutants are copied there for inspection.
 ### Understanding the output
 Report lines look like:
 ```text
 INFO:Slither-Mutate:Mutating contract ContractName
 INFO:Slither-Mutate:[CR] Line 123: 'original line' ==> '//original line' --> UNCAUGHT
 ```
 - The tag in brackets is the mutator alias (e.g., `CR` = Comment Replacement).
 - `UNCAUGHT` means tests passed under the mutated behavior → missing assertion.
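For a quick overview of a saved log such as `mutation.results`, survivors can be tallied per mutator alias. This is a minimal sketch that assumes report lines have exactly the shape shown above; `count_survivors` is a hypothetical helper name, not part of Slither.

```shell
# Count surviving mutants per mutator alias in a saved slither-mutate log.
# Assumption: survivor lines look like
#   INFO:Slither-Mutate:[CR] Line 123: '...' ==> '...' --> UNCAUGHT
count_survivors() {
    awk '
        # Only lines marked UNCAUGHT; the first [XX] tag on the line is the alias.
        /--> UNCAUGHT/ && match($0, /\[[A-Za-z]+\]/) {
            alias = substr($0, RSTART + 1, RLENGTH - 2)
            count[alias]++
        }
        END { for (a in count) print count[a], a }
    ' "$1" | sort -rn
}
```

Usage: `count_survivors mutation.results` prints one `<count> <alias>` pair per line, most frequent first. If the log contains color codes or a different layout, the patterns need adjusting.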
 ## Reducing runtime: prioritize impactful mutants
 Mutation campaigns can take hours or days. Tips to reduce cost:
 - Scope: Start with critical contracts/directories only, then expand.
 - Prioritize mutators: If a high-priority mutant on a line survives (e.g., entire line commented), you can skip lower-priority variants for that line.
 - Parallelize tests if your runner allows it; cache dependencies/builds.
 - Fail-fast: stop early when a change clearly demonstrates an assertion gap.
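The scoping tip can be sketched as a loop that runs one campaign per directory, most critical first, and keeps a separate log for each. The function name and the `MUTATE_CMD` override are assumptions made for this example (the override only exists so the loop can be dry-run with `echo`); the directories in the usage line are hypothetical.

```shell
# Run one mutation campaign per directory, in the order given.
# MUTATE_CMD is a hypothetical override for dry runs; it defaults to slither-mutate.
run_scoped_campaigns() {
    local test_cmd="$1"
    shift
    local dir
    for dir in "$@"; do
        echo "== Mutating $dir"
        # Keep one log per directory so survivors can be triaged separately.
        "${MUTATE_CMD:-slither-mutate}" "$dir" --test-cmd="$test_cmd" 2>&1 \
            | tee "mutation.$(basename "$dir").results"
    done
}
```

Usage: `run_scoped_campaigns "forge test" ./src/contracts/core ./src/contracts/periphery`. Because every run writes to the same default `./mutation_campaign` directory, it may be worth copying that directory aside between runs.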
 ## Triage workflow for surviving mutants
 1) Inspect the mutated line and behavior.
   - Reproduce locally by applying the mutated line and running a focused test.
 2) Strengthen tests to assert state, not only return values.
   - Add equality-boundary checks (e.g., test threshold `==`).
   - Assert post-conditions: balances, total supply, authorization effects, and emitted events.
 3) Replace overly permissive mocks with realistic behavior.
   - Ensure mocks enforce transfers, failure paths, and event emissions that occur on-chain.
 4) Add invariants for fuzz tests.
   - E.g., conservation of value, non-negative balances, authorization invariants, monotonic supply where applicable.
 5) Re-run slither-mutate until survivors are killed or explicitly justified.
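Step 1 can be scripted for `CR` survivors. The sketch below comments out one line of a source file, runs whatever test command is passed in, and then restores the file. `reproduce_cr_mutant` is a hypothetical helper, and it prefixes `//` at the start of the line, which is assumed to be equivalent to the mutation shown in the report.

```shell
# Reproduce a CR (comment replacement) survivor locally.
# Usage: reproduce_cr_mutant <file> <line-number> <test command...>
# Note: run this on a clean working tree; an interrupted run leaves the file mutated.
reproduce_cr_mutant() {
    local file="$1" line="$2"
    shift 2
    local backup status=0
    backup="$(mktemp)"
    cp "$file" "$backup"
    # Prefix the chosen line with `//`, mimicking the CR mutator.
    sed "${line}s|^|//|" "$backup" > "$file"
    # Run the focused test command and remember whether it failed.
    "$@" || status=$?
    # Always restore the original source.
    cp "$backup" "$file"
    rm -f "$backup"
    if [ "$status" -eq 0 ]; then
        echo "UNCAUGHT: tests pass with line $line commented out"
    else
        echo "CAUGHT: tests fail with line $line commented out"
    fi
}
```

Usage, with a hypothetical file and test contract: `reproduce_cr_mutant src/contracts/Vault.sol 33 forge test --match-contract VaultTest`. Once the strengthened test reports `CAUGHT`, the full campaign can be re-run.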
 ## Case study: revealing missing state assertions (Arkis protocol)
 A mutation campaign during an audit of the Arkis DeFi protocol surfaced survivors like:
 ```text
 INFO:Slither-Mutate:[CR] Line 33: 'cmdsToExecute.last().value = _cmd.value' ==> '//cmdsToExecute.last().value = _cmd.value' --> UNCAUGHT
 ```
 Commenting out the assignment didn’t break the tests, which shows that the suite never asserted the resulting state. Root cause: the code trusted a user-controlled `_cmd.value` instead of validating actual token transfers. An attacker could desynchronize expected vs. actual transfers to drain funds. Result: a high-severity risk to protocol solvency.
 Guidance: Treat survivors that affect value transfers, accounting, or access control as high-risk until killed.
 ## Practical checklist
 - Run a targeted campaign:
  - `slither-mutate ./src/contracts --test-cmd="forge test"`
 - Triage survivors and write tests/invariants that would fail under the mutated behavior.
 - Assert balances, supply, authorizations, and events.
 - Add boundary tests (`==`, overflows/underflows, zero-address, zero-amount, empty arrays).
 - Replace unrealistic mocks; simulate failure modes.
 - Iterate until all mutants are killed or justified with comments and rationale.
 ## References
 - [Use mutation testing to find the bugs your tests don't catch (Trail of Bits)](https://blog.trailofbits.com/2025/09/18/use-mutation-testing-to-find-the-bugs-your-tests-dont-catch/)
 - [Arkis DeFi Prime Brokerage Security Review (Appendix C)](https://github.com/trailofbits/publications/blob/master/reviews/2024-12-arkis-defi-prime-brokerage-securityreview.pdf)
 - [Slither (GitHub)](https://github.com/crytic/slither)
 {{#include ../../../banners/hacktricks-training.md}}