Integration Test Suite¶
The instar project includes a Python-based integration test suite that verifies
instar info produces output identical to qemu-img info, that instar check
correctly detects structural corruption in QCOW2 images, that instar compare
produces byte-for-byte identical output to qemu-img compare, and that
instar convert produces output identical to qemu-img convert. Since instar
aims to be a drop-in replacement, any difference in output is considered a bug.
Architecture¶
The test suite uses:
- testtools - Extended unittest framework with better assertions
- testscenarios - Parameterized test scenarios
- stestr - Parallel test runner with result storage
Tests compare instar output against either:
1. Live qemu-img output (for safe images - info, compare, convert)
2. Stored expected output files (for malicious images)
Test Categories¶
Safe Images (test_info_safe.py)¶
Tests against known-safe disk images. These run qemu-img directly and
compare outputs character-for-character.
Malicious Images (test_info_malicious.py)¶
Tests against images designed to exploit vulnerabilities (e.g., backing file
references to /etc/passwd). These use pre-stored expected output files
instead of running qemu-img, since running qemu-img on malicious images
defeats the security purpose of instar.
Check Validation Tests (test_check_formats.py)¶
Tests for the instar check operation:
- Format detection: Verifies check correctly identifies QCOW2, VMDK, VHD formats
- Corrupt images: Tests against deliberately corrupt format headers (VMDK, VHDX, VHD)
- QCOW2 structural validation: Uses 4 script-generated QCOW2 images (one clean baseline and three corrupt):
  - Clean baseline (should pass with 0 errors)
  - Overlapping clusters (two L2 entries pointing to same host cluster)
  - Refcount-zero (referenced cluster with refcount=0)
  - Leaked cluster (refcount>0 but no L2 reference)
- Unsafe quirks mode: Verifies non-QCOW2 formats are treated as raw with
--unsafe-quirks
Corrupt QCOW2 test images are generated by
instar-testdata/custom/check-validation/create-corrupt-images.py, which creates
images with qemu-img/qemu-io and then surgically corrupts specific QCOW2
structures via binary manipulation.
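The overlapping-clusters case can be illustrated with a minimal sketch. This is not the generator script itself (which lives in the private testdata repository); it only assumes the QCOW2 on-disk convention that an L2 entry is a big-endian 64-bit word with the COPIED flag in bit 63. The function name and the host offsets are hypothetical.

```python
import struct

L2_ENTRY_SIZE = 8  # each QCOW2 L2 entry is a big-endian 64-bit word


def duplicate_l2_entry(l2_table: bytearray, src: int, dst: int) -> None:
    """Overwrite entry `dst` with entry `src` so both map the same host cluster."""
    (entry,) = struct.unpack_from(">Q", l2_table, src * L2_ENTRY_SIZE)
    struct.pack_into(">Q", l2_table, dst * L2_ENTRY_SIZE, entry)


# Stand-in L2 table with two allocated entries (hypothetical host offsets).
COPIED = 1 << 63
table = bytearray(4 * L2_ENTRY_SIZE)
struct.pack_into(">Q", table, 0, COPIED | 0x50000)
struct.pack_into(">Q", table, 8, COPIED | 0x60000)

duplicate_l2_entry(table, src=0, dst=1)
entries = struct.unpack(">4Q", table)
```

After the call, entries 0 and 1 reference the same host cluster while the refcount structures still describe the old layout, which is the kind of inconsistency a structural check is expected to flag.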
Repair Tests (test_check_repair.py)¶
Tests for instar check --repair, codifying the phase-7-verified
behaviour against the corrupt-fixture matrix (each on a tempdir
copy, never the committed fixture):
- Leaks/all tiers repair to clean: leaked + --repair=leaks, and refcount-zero / refcount-too-high / stale-copied + --repair=all, are asserted qemu-img check-clean afterward with the surviving data patterns still readable via qemu-io read -P.
- Refuse paths stay byte-identical: corrupt-bit-set, snapshot-leak, and compressed-leak are sha256-identical after repair, with the snapshot and compressed data intact.
- Overlapping is a safe partial repair: the genuine leak is reclaimed, the structural overlap remains, no new error classes appear, and instar exits 2 with repair-incomplete.
- CLI: --repair + --chain rejection, clean-image no-op, qcow2-only not-supported on a raw image, and idempotence.
- Repaired counters (TestRepairCounters): the per-class repaired-leaks / repaired-refcounts / repaired-corruptions counts reach the host over the CheckResultMessage protobuf and render in both the JSON and human --repair output, and are omitted on a read-only check so its schema is unchanged.
The structural tests assert post-repair state (qemu-img
check-clean, data reads, byte-identity) rather than instar's exit
code, since the exit code still reflects the detected corruption
after a clean repair; the repaired-counter output is asserted
separately by TestRepairCounters now that those counts travel on
the guest→host wire.
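The byte-identity assertion on a tempdir copy can be sketched as follows. The helper names are hypothetical and the operation is passed in as a callable, so this shows the pattern rather than the suite's actual code.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large images are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged_after(fixture: Path, operation) -> bool:
    """Run `operation` on a tempdir copy and report whether the copy is byte-identical."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / fixture.name
        shutil.copyfile(fixture, work)
        before = sha256_of(work)
        operation(work)  # e.g. a subprocess call to the tool under test
        return sha256_of(work) == before
```

Because the operation only ever sees the copy, the committed fixture is never modified, whatever the tool under test does.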
Compare Tests (test_compare.py)¶
Tests for the instar compare operation, cross-validated against qemu-img
compare:
Raw-vs-raw (TestCompareRawIdentical, TestCompareRawDifferent,
TestCompareRawSizeMismatch, TestCompareRawJson):
- Identical images: Self-compare and two identical files
- Different content: Mismatch at offset 0 and at mid-file offsets
- Size mismatch: Non-strict (zeros = identical), non-strict (non-zero =
differs), and strict mode (always fails on size difference)
- JSON output: Validates identical, first-mismatch-offset,
total-bytes-compared, and size-mismatch fields
QCOW2-vs-raw (TestCompareQcow2VsRaw):
- Identical content across formats (including all-zeros)
- Different content reports correct mismatch offset
- Cross-validated against qemu-img compare
QCOW2-vs-QCOW2 (TestCompareQcow2VsQcow2):
- Identical and different content between two QCOW2 images
- Virtual size mismatch handling
- Cross-validated against qemu-img compare
Compressed QCOW2 (TestCompareQcow2Compressed):
- Compressed QCOW2 vs raw with same content (zlib decompression)
- Compressed vs uncompressed QCOW2 with same content
- Cross-validated against qemu-img compare
Backing chains (TestCompareBackingChain):
- QCOW2 overlay with raw backing file vs flattened raw (identical)
- QCOW2 overlay vs different raw (mismatch detected)
- Deep chain (3-level: top -> mid -> base) vs flattened raw (identical)
- Two different QCOW2 backing chains with same virtual content (identical)
- All scenarios cross-validated against qemu-img compare
Test images are created at runtime using qemu-img create, qemu-io write,
qemu-img convert -c (for compressed), and qemu-img create -b (for backing
chains), so no external testdata is needed.
qemu-img cross-validation: Every scenario verifies byte-for-byte
identical stdout and matching exit codes with qemu-img compare.
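A minimal sketch of that cross-validation check, assuming nothing about the suite's real helpers: both commands are run, and the pair (exit code, stdout bytes) must match exactly. The stand-in commands below use the Python interpreter so the sketch runs without either tool installed.

```python
import subprocess
import sys


def run(cmd):
    """Run a command and return (exit code, stdout bytes)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stdout


def same_result(candidate_cmd, reference_cmd):
    """True when both commands agree on exit code and on stdout, byte for byte."""
    return run(candidate_cmd) == run(reference_cmd)


# Stand-ins for the two tools. In a real test these would be argument lists
# such as ["instar", "compare", a, b] and ["qemu-img", "compare", a, b].
ok = [sys.executable, "-c", "print('Images are identical.')"]
bad = [sys.executable, "-c", "print('Content mismatch'); raise SystemExit(1)"]
```

Comparing bytes rather than decoded text avoids hiding encoding or trailing-whitespace differences.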
Convert Tests (test_convert.py)¶
Tests for the instar convert operation (QCOW2 to raw), cross-validated
against qemu-img convert:
Basic conversion (TestConvertBasicQcow2ToRaw):
- Empty QCOW2 image to raw
- QCOW2 with written data to raw
- Output size matches virtual size
- All cross-validated against qemu-img convert
Compressed QCOW2 (TestConvertCompressed):
- Compressed QCOW2 to raw (zlib decompression)
- Compared against original raw source
Backing chains (TestConvertBackingChain):
- QCOW2 overlay with raw backing flattened to raw
- Deep chain (3-level: top -> mid -> base) flattened to raw
- Cross-validated against qemu-img convert
Raw passthrough (TestConvertRawToRaw):
- Raw to raw identity conversion
Error handling (TestConvertErrors):
- Unsupported output format rejected
- Nonexistent input file rejected
Manifest images (TestConvertManifestImages):
- Converts real-world QCOW2 images from the test manifest
- Cross-validates against qemu-img convert output
- Skips images with cluster_size > 64KB (unsupported)
- Skips images whose virtual_size exceeds available temp space
Resize Tests (test_resize.py)¶
Tests for the instar resize operation, structured around six
surfaces totalling 114 tests:
Schema-drift tripwire
(TestResizeBaselineMatrix.test_resize_cases_match_baselines):
- Walks instar-testdata/expected-outputs/resize-info-json/<target>/<version>/
and asserts the on-disk case set matches the in-test RESIZE_CASES
mirror. Catches drift between this mirror and the testdata
generator.
Cross-version baseline matrix (TestResizeBaselineMatrix):
- Per-(target, case) factory diffs instar create → instar resize
output against the phase-10 qemu-img info JSON baseline.
- ~22 active cases for qcow2 + raw; ~20 skipped (vmdk/vhd/vhdx where
qemu rejects, -no-shrink rejection cases that surface 6 covers,
and KNOWN_RESIZE_DIVERGENCES carry-forwards).
Live cross-validation (TestResizeCrossValidation):
- 7 curated qcow2 + raw cases comparing instar end-to-end against
the system qemu-img via instar info on both outputs.
Round-trip check (TestResizeRoundTripCheck):
- instar create → resize → check for ~30 non-raw cases.
- Catches resize-emitter regressions that produce files qemu-img
info accepts but instar check flags.
Internal consistency for vmdk/vpc/vhdx (TestResizeConsistency):
- 14 cases for the formats qemu can't resize, verified via
instar info virtual-size match + instar check.
Targeted error paths (TestResizeErrorPaths):
- 9 fixed tests pinning the host CLI rejection contracts:
shrink-without-flag for raw + qcow2, subtractive-size without
--shrink, invalid size strings, --preallocation=metadata on
raw, --preallocation=falloc + --shrink, --object,
--image-opts.
Wall clock: ~32 s serial / ~10 s with make test-container's
--concurrency 4.
Snapshot Tests (test_snapshot.py)¶
Tests for the instar snapshot subcommand (phase 11 of
PLAN-snapshot), 94 tests across five families:
- List matrix (human): per baselined image, TZ=UTC instar snapshot -l byte-equals the host-resolved profile baseline from instar-testdata/expected-outputs/snapshot-list-human/ (80 qemu-img versions captured; instar tracks the modern ≥9.0 layout); plus the bare-filename-defaults-to-list check.
- List goldens (JSON): --output=json byte-equals the instar-side self-baselines in tests/golden/snapshot-list/, plus a structural vmstate cross-check and a QMP-key schema test.
- Mutation round-trips: create / delete / apply on tempdir copies with post-op qemu-img check clean, content verified via qemu-img compare, and structural behaviour on the name-collision / duplicate-name / cap-boundary fixtures.
- Error paths: qcow2-only enforcement across raw / vmdk / vhdx, feature-gate refusals (zstd, dirty bit, external data file, LUKS), not-found exit codes, -U + mutating refusal, --image-opts rejection.
- Empty table: empty stdout + exit 0; JSON emits [].
The suite is the CI regression net. The snapshot shell
harnesses (tools/snapshot-{create,delete,apply}-{matrix,
refusals}.sh and tools/snapshot-cli-parity.sh — seven
scripts, 241 assertions) are the live differential layer:
byte-identity against the host qemu-img from identical
inputs, run with make snapshot-harnesses (requires a built
instar and /dev/kvm; CI runs it in the functional-tests
workflow's snapshot-harnesses job). They overlap by design.
Adversarial Image Tests (test_adversarial.py)¶
Tests verifying that instar safely handles malicious and malformed images
without crashing, hanging, or consuming excessive resources. Uses the
run_adversarial() helper in base.py which enforces timeouts (hang
detection), memory limits via RLIMIT_AS (resource exhaustion), and signal
checks (crash detection).
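The three safeguards can be sketched with the standard library. This is an illustration only, not the run_adversarial() implementation: the function name run_limited, its parameters, and the outcome labels are assumptions, and the resource module is POSIX-only.

```python
import resource
import subprocess
import sys


def run_limited(cmd, timeout_s=10.0, mem_bytes=1 << 30):
    """Run cmd under a timeout and an address-space cap; classify the outcome."""

    def cap_memory():
        # Runs in the child just before exec. Only the soft limit is lowered,
        # and never above an existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = mem_bytes if hard == resource.RLIM_INFINITY else min(mem_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
            preexec_fn=cap_memory,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    if proc.returncode < 0:
        return "crash"  # negative return code: the child was killed by a signal
    return "ok" if proc.returncode == 0 else "rejected"
```

A non-zero exit is treated as an acceptable rejection here; only a timeout or a signal counts as a failure of the safety property.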
Phase 1 (CVE-adjacent attacks):
- Compression bombs: Zlib and ZSTD compressed QCOW2 images with extreme expansion ratios. Verifies decompression buffer bounds are enforced and output files stay small.
- Circular backing chains: 2-level cycle (A→B→A), 3-level cycle (A→B→C→A), and self-referencing (A→A). Verifies chain discovery detects the cycle and rejects it.
- Deep backing chains: Chains at 16 levels (device limit) and 17 levels (exceeds limit). Verifies depth enforcement and correct rejection.
- Integer overflow: L1 table size near u32::MAX, L1 size = 0, cluster_bits below minimum (8) and above maximum (22). Verifies checked arithmetic prevents undefined behavior.
Phase 2 (boundary value cases):
- Refcount order edges: refcount_order = 7 (128-bit, invalid) and 255 (extreme value). Verifies clamping or rejection.
- Oversized virtual size: 1 petabyte and u64::MAX virtual sizes. Verifies info reports the size and check doesn't allocate based on virtual size alone.
- VMDK grain size: Zero and huge (2^63) grain sizes. Verifies checked_mul prevents division by zero and overflow.
- VHDX conflicting headers: Dual headers with different sequence numbers and valid CRC-32C checksums.
- BAT beyond EOF: VHD and VHDX images with BAT entries pointing past end of file. Verifies I/O error handling.
Phase 3 (format confusion):
- Polyglot files: QCOW2 magic header with VMDK descriptor body, and QCOW2 magic with ELF binary content. Verifies format detection works (magic wins) and structural validation catches inconsistencies.
- Truncated headers: QCOW2 v2 header cut at 32 bytes, VMDK with only 8 bytes (magic + version), VHD footer at 48 bytes. Verifies all operations fail gracefully with no crash.
- VMDK descriptor attacks: Null bytes in descriptor, multiple extent declarations, and inflated 1MB descriptor size claim. Verifies parser handles adversarial text safely.
Test images are generated by scripts in instar-testdata/scripts/ (the private
testdata repository). Scripts that generate adversarial or CVE-reproducer images
must always be placed in instar-testdata, never in the public instar repository.
All generated images live in instar-testdata/custom/audit/.
Security Tests (test_security.py)¶
Tests verifying instar's security properties:
- Backing file references are detected but not followed
- External data file references are reported but not read
- VMDK descriptor extent paths are not accessed
Running Tests¶
Make Targets¶
# Create Python virtual environment (first time only)
make test-venv
# Run safe tests (default, suitable for development)
make test
# Run CI-suitable tests
make test-ci
# Run all tests including malicious images (explicit opt-in)
make test-malicious
# Run with verbose output (useful for debugging diffs)
make test-report
# Clean test artifacts
make clean-tests
Direct stestr Usage¶
cd tests
source .venv/bin/activate
# Run all tests
stestr run
# Run specific test module
stestr run test_info_safe
# Run with verbose output
stestr run --serial -- --verbose
# List available tests
stestr list
CI job layout and the partition guard¶
On a pull request the integration suite is split across several jobs
(in .github/workflows/functional-tests.yml) so the long-running
convert tests can run in parallel with everything else. Each job
selects a subset with stestr regex filters defined in the Makefile:
| Job | Make target | Selection |
|---|---|---|
| integration-core | test-container-core | everything except test_convert., test_compare., test_info_malicious. (the catch-all) |
| integration-convert-qcow2 | test-container-convert-qcow2 | test_convert./test_compare. minus Vhd |
| integration-convert-vhd | test-container-convert-vhd | test_convert.TestConvert.*Vhd |
| oslo-crossval-master | (inline) | test_oslo_crossval re-run against oslo.utils master |
test_info_malicious.py is intentionally excluded from every PR job
and runs only via make test-malicious.
Because integration-core is exclude-based, it silently absorbs
any new test_*.py module — convenient, but it means a future
refactor (e.g. turning it into an include-list like the convert jobs)
could drop a whole module from CI with no test failure. The
class-level convert split has a subtler gap: a test_convert class
containing Vhd but not matching TestConvert.*Vhd would be excluded
by the qcow2 job and missed by the vhd job.
The test-partition CI job guards against both. It runs
tools/ci/check-test-partition.sh, which enumerates the suite with
stestr list, reads the actual job selectors from the Makefile and
workflow (no duplicated copy to drift), and fails if any test is run
by zero jobs. The only hand-maintained input is the allowlist of
intentional exclusions in tools/ci/check-test-partition.py
(currently just the malicious suite); each entry carries a documented
reason. Run it locally with:
Its own logic is unit-tested (stdlib only, no venv) via:
When you add a new integration job or change a test-container selector, the guard validates the new partition automatically; if it reports an orphan, either add the test to a job or (rarely) to the documented allowlist.
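The orphan check itself reduces to a small set computation. The sketch below is not the guard's code: the selector shape (one include and one optional exclude regex per job), the function name, and the sample test ids are all assumptions chosen to mirror the job table above.

```python
import re


def orphans(test_ids, jobs, allowlist=()):
    """Return test ids that no job selects and that are not allowlisted."""
    missed = []
    for test_id in test_ids:
        if any(re.search(pattern, test_id) for pattern in allowlist):
            continue
        selected = any(
            re.search(job["include"], test_id)
            and not (job.get("exclude") and re.search(job["exclude"], test_id))
            for job in jobs
        )
        if not selected:
            missed.append(test_id)
    return missed


# Hypothetical selectors loosely modelled on the job table.
jobs = [
    {"include": r".", "exclude": r"test_(convert|compare|info_malicious)\."},
    {"include": r"test_(convert|compare)\.", "exclude": r"Vhd"},
    {"include": r"test_convert\.TestConvert.*Vhd"},
]
tests = [
    "test_info_safe.TestInfo.test_cirros",
    "test_convert.TestConvertBasicQcow2ToRaw.test_empty",
    "test_convert.TestConvertVhd.test_fixed",
    "test_convert.VhdEdgeCases.test_footer",  # contains Vhd, misses TestConvert.*Vhd
    "test_info_malicious.TestInfo.test_passwd",
]
found = orphans(tests, jobs, allowlist=[r"^test_info_malicious\."])
```

The made-up VhdEdgeCases class reproduces the class-level gap described earlier: the qcow2 job excludes it, the vhd job's pattern does not match it, and it ends up in found.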
Test Image Manifest¶
Test images are defined in tests/manifest.json:
{
"id": "cirros-qcow2",
"path": "downloaded/cirros/cirros-0.6.3-x86_64-disk.img",
"format": "qcow2",
"safety": "safe",
"run_in_ci": true,
"description": "CirrOS minimal cloud image",
"tags": ["qcow2", "cloud-image"]
}
Manifest Fields¶
| Field | Description |
|---|---|
| id | Unique identifier for the test image |
| path | Path relative to testdata root |
| format | Expected disk format (qcow2, vmdk, vhd, etc.) |
| safety | safe, caution, or malicious |
| run_in_ci | Whether to include in CI test runs |
| unsafe_quirks_required | If true, requires --unsafe-quirks flag for qemu-img compatibility |
| description | Human-readable description |
| tags | Searchable tags for filtering |
| expected_override | Path to expected output file (for malicious images) |
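As an illustration of how these fields might drive test selection, here is a minimal sketch. It assumes the manifest is a JSON array of entries shaped like the example above; the top-level layout, the random-garbage entry, and both helper functions are hypothetical.

```python
import json

# Assumption: the manifest is a JSON array of entries like the documented example.
MANIFEST_TEXT = """
[
  {"id": "cirros-qcow2", "format": "qcow2", "safety": "safe", "run_in_ci": true},
  {"id": "qcow2-backing-passwd", "format": "qcow2", "safety": "malicious",
   "run_in_ci": false,
   "expected_override": "expected_outputs/qcow2_backing_etc_passwd.txt"},
  {"id": "random-garbage", "format": "raw", "safety": "safe", "run_in_ci": true,
   "unsafe_quirks_required": true}
]
"""


def select(entries, safety="safe", ci_only=True):
    """Pick entries by safety level, optionally keeping only CI-enabled ones."""
    return [
        e for e in entries
        if e["safety"] == safety and (e.get("run_in_ci", False) or not ci_only)
    ]


def info_args(entry):
    """Build the argument list for an info run, adding --unsafe-quirks when flagged."""
    args = ["info"]
    if entry.get("unsafe_quirks_required", False):
        args.append("--unsafe-quirks")
    return args


entries = json.loads(MANIFEST_TEXT)
safe_ids = [e["id"] for e in select(entries)]
```

Optional fields are read with a default so entries that omit unsafe_quirks_required or run_in_ci still load.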
Unsafe Quirks Testing¶
Images marked with unsafe_quirks_required: true do not have valid format
headers or partition tables. In default (secure) mode, instar rejects these
files as "unknown format" rather than accepting them as raw images.
To test qemu-img compatibility for these images, use --unsafe-quirks:
# Default mode: rejects files without valid structure
instar info random-garbage.raw
# Error: Unknown format (no valid disk image header or partition table)
# Unsafe quirks mode: matches qemu-img behavior
instar info --unsafe-quirks random-garbage.raw
# file format: raw
See configuration.md and quirks.md for details on safe vs unsafe quirks.
Test Data Location¶
Test images are stored in a separate repository (instar-testdata) to keep
the main repository small. The location is resolved in order:
1. INSTAR_TESTDATA_PATH environment variable
2. ../instar-testdata (sibling directory)
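The lookup order can be sketched as a small function. This assumes the sibling path is taken relative to the repository root, which the text does not state; the function name and its parameters are hypothetical.

```python
import os
from pathlib import Path


def resolve_testdata(repo_root, environ=None):
    """Return the testdata directory: env override first, then the sibling checkout."""
    environ = os.environ if environ is None else environ
    override = environ.get("INSTAR_TESTDATA_PATH")
    if override:
        return Path(override)
    # Assumption: "../instar-testdata" is relative to the repository root.
    return Path(repo_root).parent / "instar-testdata"
```

Passing the environment as a parameter keeps the function easy to test without mutating os.environ.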
Expected Output Overrides¶
For malicious images where running qemu-img would be dangerous, store the
expected output in tests/expected_outputs/:
Reference this file in the manifest:
{
"id": "qcow2-backing-passwd",
"expected_override": "expected_outputs/qcow2_backing_etc_passwd.txt"
}
Image Notes¶
The docs/image_notes/ directory documents which test images exposed
specific quirks or implementation details. When a test image reveals
unexpected qemu-img behavior that requires compatibility work, create a
markdown file documenting:
- The specific values that revealed the behavior
- How qemu-img handles the case
- How instar now handles it
- Links to relevant quirks documentation
See Image Notes for existing documentation.
Adding New Test Images¶
- Add the image to instar-testdata repository
- Add entry to tests/manifest.json
- For safe images: add scenario to test_info_safe.py
- For malicious images:
  - Create expected output file in tests/expected_outputs/
  - Add scenario to test_info_malicious.py
- If the image exposes new quirks: create docs/image_notes/<image-id>.md
Output Comparison¶
The test suite performs exact string comparison. On failure, it shows:
- Unified diff with whitespace made visible
  - ␣ for trailing spaces
  - → for tabs
  - ↵ for trailing newlines
- Raw repr() of both outputs for debugging
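A whitespace-visible diff of this kind can be built on difflib. The sketch below is an assumption about the approach, not the suite's code; the helper names are hypothetical and the glyph mapping follows the list above.

```python
import difflib


def visible(line):
    """Mark trailing spaces, tabs, and a trailing newline with visible glyphs."""
    newline = line.endswith("\n")
    body = line[:-1] if newline else line
    stripped = body.rstrip(" ")
    body = stripped + "␣" * (len(body) - len(stripped))
    body = body.replace("\t", "→")
    return body + ("↵" if newline else "")


def visible_diff(expected, actual):
    """Unified diff of two outputs with whitespace made visible."""
    exp = [visible(line) for line in expected.splitlines(keepends=True)]
    act = [visible(line) for line in actual.splitlines(keepends=True)]
    return "\n".join(
        difflib.unified_diff(exp, act, "expected", "actual", lineterm="")
    )


report = visible_diff("file format: raw\n", "file format: raw \n")
```

Here the two inputs differ only by one trailing space, which an ordinary diff would render as two visually identical lines.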
Environment Variables¶
| Variable | Description |
|---|---|
| INSTAR_TESTDATA_PATH | Override default testdata location |
| INSTAR_BINARY_PATH | Override default instar binary location |
Differential Fuzzing¶
The project includes a differential fuzzer (scripts/differential-fuzz.py)
that compares instar against qemu-img on randomly generated images. This is
Phase 3 of the security audit plan (PLAN-audit.md).
How it works¶
For each iteration the fuzzer:
- Picks a random seed (logged for reproducibility).
- Generates a random disk image with qemu-img create, varying format (qcow2, raw, vmdk, vpc), virtual size (1M-1G), cluster size, compression, and data patterns (zeros, random, sparse, MBR).
- Creates separate copies for instar and qemu-img.
- Runs a random chain of 2-4 operations (info, check, convert, compressed convert, create, measure, resize, rebase, commit, map, snapshot) against both tools.
- Compares outputs at each stage: exit codes, normalized JSON info output, and converted file content (SHA-256 of raw-flattened output).
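Seed-driven planning is what makes such a run replayable. The sketch below assumes a single seeded generator drives every iteration; the fuzzer's actual seeding scheme is not described here, and the function name and operation labels are illustrative.

```python
import random

OPERATIONS = [
    "info", "check", "convert", "compressed-convert", "create", "measure",
    "resize", "rebase", "commit", "map", "snapshot",
]


def plan_iterations(seed, iterations):
    """Derive every iteration's operation chain from one seeded generator."""
    rng = random.Random(seed)
    plans = []
    for _ in range(iterations):
        length = rng.randint(2, 4)  # chain of 2-4 operations
        plans.append([rng.choice(OPERATIONS) for _ in range(length)])
    return plans


first = plan_iterations(seed=42, iterations=5)
again = plan_iterations(seed=42, iterations=5)
```

Under that assumption, replaying a seed with fewer iterations yields a prefix of the longer run, so an iteration can only be reached by replaying everything before it.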
The op_snapshot arm (phase 13 of PLAN-snapshot) generates a
fresh qcow2, then applies an identical random chain of
snapshot -c / -d / -a and qemu-io write elements to the
instar and qemu-img copies, asserting byte-identity of the
whole image after every chain element (the qemu side runs with
file.discard=ignore; dates and dead padding bytes are
normalized per docs/quirks.md). Its first runs surfaced a real
multibyte list-padding bug and delete's missing surviving-L2
COPIED refresh, both since fixed.
The op_repair arm (phase 10 of PLAN-check-repair) builds a fresh
qcow2 with known data, injects one random qcow2 corruption
(refcount-zero / too-high / leaked / stale-copied / overlapping),
forks the corrupt file to two byte-identical copies, and repairs
one with instar check --repair and the other with qemu-img check
-r at a random tier. A three-tier oracle asserts: safety
(unconditional — instar must never produce check-errors or raise
the corruption/leak count above the original, even when it
refuses); convergence (all tier only, where the two tools
have matching scope — when instar claims a complete repair via
repair-incomplete == false, the image must be qemu-img
check-clean); and data equivalence (when both reach clean, the
raw-flattened guest data must match). instar's deliberate
refuse/partial behaviour is recorded as conservative, never a
divergence; the leaks tier is excluded from convergence because it
is intentionally narrower than qemu-img -r leaks (see
quirks.md).
Known quirks (see quirks.md) are excluded from comparison: disk size fields and format-specific metadata.
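Normalization before comparison might look like the following sketch. The ignored key names are an assumption based on the field names in qemu-img info JSON output (actual-size for disk size, format-specific for per-format metadata); the authoritative exclusion list is the one in quirks.md.

```python
import json

# Assumption: these are the info JSON keys covered by the quirk exclusions.
IGNORED_KEYS = {"actual-size", "format-specific"}


def normalize(info_json):
    """Parse info JSON and drop the fields excluded from comparison."""
    data = json.loads(info_json)
    return {k: v for k, v in data.items() if k not in IGNORED_KEYS}


a = '{"format": "qcow2", "virtual-size": 1048576, "actual-size": 200704}'
b = '{"virtual-size": 1048576, "format": "qcow2", "actual-size": 196616}'
```

Comparing parsed dictionaries rather than raw text also makes the check insensitive to key order and whitespace.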
libyal cross-validation¶
When libyal tools are installed in the environment (libvmdk-utils,
libvhdi-utils, libqcow-utils), the fuzzer adds two additional
comparison layers:
- Info cross-check: Parsed fields from vmdkinfo, vhdiinfo, and qcowinfo (virtual size, format version, cluster size, etc.) are compared against instar's JSON output for the same image.
- Parse-success consistency: For each format, if the libyal tool successfully parses the image, instar check should report no errors (and vice versa). Disagreements are flagged as divergences.
This closes the gap where VMDK/VHD/VHDX had no differential reference
for check validation (qemu-img check only supports QCOW2), and
provides a third independent opinion for QCOW2. libyal tools are
optional — the fuzzer degrades gracefully when they are unavailable.
Running locally¶
python3 scripts/differential-fuzz.py \
--instar src/target/release/instar \
--iterations 100 \
--seed 42 \
--log-dir ./fuzz-logs
CI integration¶
The fuzzer runs automatically via .github/workflows/differential-fuzz.yml
at three tiers:
| Trigger | Iterations | When |
|---|---|---|
| pull_request | 100 | PR changes fuzzer script or workflow |
| push to develop | 200 | Post-merge smoke test |
| schedule | 1000 | Nightly at 02:00 UTC |
| workflow_dispatch | configurable | Manual trigger |
On failure, the workflow uploads logs as artifacts and auto-files GitHub
Issues with the security-audit label, including the seed, iteration,
image attributes, and a reproduction command.
Reproducing a divergence¶
Each divergence report includes the seed and iteration number. To reproduce:
python3 scripts/differential-fuzz.py \
--instar src/target/release/instar \
--iterations <ITERATION + 1> \
--seed <SEED> \
--fail-fast
Phase 6: Coverage-Guided Fuzzing¶
Coverage-guided fuzzing uses cargo-fuzz (libFuzzer) to exercise the
no_std parser crates directly, without the full VMM/KVM stack. A
mock CallTable backed by fuzzer input provides the I/O layer,
allowing libFuzzer to explore malformed input space that differential
fuzzing (Phase 3) cannot reach.
Fuzz targets¶
27 targets across the parser and planner crates, organized in
src/fuzz/:
| Target | Crate | Type |
|---|---|---|
| fuzz_format_detect | shared | Buffer-based |
| fuzz_qcow2_header | qcow2 | Buffer-based |
| fuzz_qcow2_l1l2 | qcow2 | CallTable |
| fuzz_qcow2_refcount | qcow2 | CallTable |
| fuzz_qcow2_decompress | qcow2 | CallTable |
| fuzz_vmdk_header | vmdk | Buffer-based |
| fuzz_vmdk_grain | vmdk | CallTable |
| fuzz_vhd_footer | vhd | Buffer-based |
| fuzz_vhd_bat | vhd | CallTable |
| fuzz_vhdx_header | vhdx | Buffer-based |
| fuzz_vhdx_metadata | vhdx | CallTable |
| fuzz_raw_partition | raw | Buffer-based |
| fuzz_luks_header | luks | Buffer-based |
| fuzz_measure_calc | measure | Buffer-based |
| fuzz_measure_scan | all parsers | CallTable |
| fuzz_create_emitters | create | Buffer-based |
| fuzz_resize_planners | resize | Buffer-based |
| fuzz_rebase_planners | rebase | Buffer-based |
| fuzz_commit_planners | commit | Buffer-based |
| fuzz_amend_planners | amend | Buffer-based |
| fuzz_map_iter | all parsers | CallTable |
| fuzz_snapshot_parse | qcow2 | CallTable |
| fuzz_snapshot_refcount | snapshot | Buffer-based |
| fuzz_check_repair | check | Buffer-based |
| fuzz_dd_window | dd | Buffer-based |
| fuzz_chs_rounded_size | dd | Buffer-based |
| fuzz_dd_read | dd | CallTable |
Buffer-based targets call parser functions that take &[u8]
directly (e.g. QcowHeader::parse(data)). CallTable targets
use the mock CallTable from src/fuzz/src/lib.rs to simulate
sector-based I/O from the fuzzer input.
The two snapshot targets (phase 12 of PLAN-snapshot) cover the
streaming snapshot-table parser (fuzz_snapshot_parse drives
for_each_snapshot_entry and the planner converter against
adversarial qcow2 fragments) and the mutator primitives
(fuzz_snapshot_refcount dispatches one of seven ops per exec —
refcount increment/decrement/swap, the COPIED-flag walker, the
contiguous allocator, the flag helpers, and the
table-serialisation round-trip — asserting semantic invariants
such as precheck-never-mutates, inc/dec byte-identity, and
allocator containment).
fuzz_check_repair (phase 9 of PLAN-check-repair) dispatches one
of four crates/check repair planners per exec — leak
reclamation, count accumulation, refcount correction, and COPIED
reconciliation — asserting sub-byte-masked containment (co-resident
refcount entries in a shared byte are preserved at widths 1/2/4),
raised/lowered/freed tally correctness, the overflow→AmbiguousCorruption
and bounds→MisalignedAccess error classifications, idempotence,
and the "correct generalises reclaim" cross-check. The COPIED
walker's deep invariants are delegated to fuzz_snapshot_refcount.
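The sub-byte containment invariant can be illustrated in a few lines. This is a Python model, not the Rust planner code; it assumes the first entry of a byte occupies the low-order bits, and OverflowError stands in for whatever error class the real code returns. All function names are hypothetical.

```python
def get_refcount(table, index, bits):
    """Read entry `index` from a packed refcount array of 1-, 2- or 4-bit entries."""
    per_byte = 8 // bits
    shift = bits * (index % per_byte)  # assumption: low-order bits hold the first entry
    return (table[index // per_byte] >> shift) & ((1 << bits) - 1)


def set_refcount(table, index, bits, value):
    """Write one entry, masking so neighbours in the same byte are untouched."""
    mask = (1 << bits) - 1
    if not 0 <= value <= mask:
        raise OverflowError("refcount does not fit in %d bits" % bits)
    per_byte = 8 // bits
    shift = bits * (index % per_byte)
    byte = table[index // per_byte]
    table[index // per_byte] = (byte & ~(mask << shift) & 0xFF) | (value << shift)


def changed_entries(bits):
    """Zero entry 5 in an all-ones table and list every entry whose value changed."""
    table = bytearray(b"\xff" * 4)
    count = 32 // bits
    before = [get_refcount(table, i, bits) for i in range(count)]
    set_refcount(table, 5, bits, 0)
    after = [get_refcount(table, i, bits) for i in range(count)]
    return [i for i, (x, y) in enumerate(zip(before, after)) if x != y]
```

For each of the widths 1, 2 and 4, only entry 5 should appear in the result; any other index would mean a write leaked into a co-resident entry.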
Running locally¶
# Inside the instar-build container:
cd src/fuzz
# Run a single target for 60 seconds
cargo fuzz run fuzz_qcow2_header -- -max_total_time=60
# Run with specific corpus
cargo fuzz run fuzz_qcow2_header corpus/fuzz_qcow2_header/
# Minimize a crash
cargo fuzz tmin fuzz_qcow2_header artifacts/fuzz_qcow2_header/<crash>
# Generate coverage report
cargo fuzz coverage fuzz_qcow2_header
Corpus seeding¶
The seed corpus is extracted from instar-testdata using:
This copies test images into per-target corpus directories under
src/fuzz/corpus/, filtered by format. Header-only targets receive
truncated copies. Hand-crafted minimal valid inputs are also generated
for each format.
It additionally restores the accumulated corpus pushed by prior
nightly runs. Each nightly run pushes coverage-increasing inputs to
instar-testdata/custom/fuzz-corpus/<target>/; on the next run those
entries are copied straight back into corpus/<target>/ by target
name (restore_pushed_corpus()), so coverage compounds across runs.
This matters most for the targets whose inputs are not recognizable
image formats — the window-math, CHS-geometry and planner fuzzers —
which would otherwise re-seed cold every night because format-based
routing cannot place their entries. Entries are content-addressed, so
restoration is idempotent.
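Why content addressing makes restoration idempotent can be shown with a short sketch. It is not restore_pushed_corpus() itself: the function name, the directory arguments, and the use of SHA-256 as the naming hash are assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def restore_corpus(pushed_dir, corpus_dir):
    """Copy pushed entries into the local corpus, skipping content already present."""
    corpus_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for entry in sorted(pushed_dir.iterdir()):
        if not entry.is_file():
            continue
        # Assumption: entries are named by a SHA-256 digest of their content.
        name = hashlib.sha256(entry.read_bytes()).hexdigest()
        target = corpus_dir / name
        if not target.exists():
            shutil.copyfile(entry, target)
            copied += 1
    return copied
```

Since the destination name depends only on the bytes, a second restore finds every target already present and copies nothing, and duplicate inputs collapse to one file.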
CI integration¶
The CI workflow (.github/workflows/coverage-fuzz.yml) runs:
- Nightly at 04:00 UTC, all targets, with tiered per-target
durations (see below).
- PR validation: single-target smoke test when fuzz/parser code
changes.
- Post-merge (push to develop): 15s per target.
- Manual dispatch: configurable duration and target selection.
Tiered nightly durations¶
The nightly run has a fixed wall-clock budget (450 min, inside the
480 min job timeout). Rather than splitting it evenly, the run plan is
computed by tools/ci/fuzz-tier.sh: the fast-saturating targets (pure
window math, CHS rounding, and the planner/emitter crates) take a short
fixed slice (300s — they reach steady coverage in well under a minute),
and the deep parser/format targets split the remainder. With the
current 27 targets that gives the 17 deep targets ~24 min each versus
~17 min under an even split.
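The quoted figures follow from simple arithmetic, sketched here under the assumption that targets run one after another within the budget. This is not fuzz-tier.sh; the function and parameter names are hypothetical.

```python
def tier_plan(budget_min, total_targets, deep_targets, fast_slice_s=300):
    """Return (deep seconds per target, even-split seconds per target)."""
    budget_s = budget_min * 60
    fast_targets = total_targets - deep_targets
    # Fast-tier targets take a fixed slice; deep targets share what is left.
    deep_share_s = (budget_s - fast_targets * fast_slice_s) / deep_targets
    even_share_s = budget_s / total_targets
    return deep_share_s, even_share_s


deep_s, even_s = tier_plan(budget_min=450, total_targets=27, deep_targets=17)
print(f"deep tier: {deep_s / 60:.1f} min, even split: {even_s / 60:.1f} min")
# prints: deep tier: 23.5 min, even split: 16.7 min
```

With 10 fast targets at 300 s each, 50 of the 450 minutes go to the fast tier and the remaining 400 are divided among the 17 deep targets.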
This is one of two levers for keeping per-target time useful as the
target count grows; the other is corpus persistence (see Corpus
seeding). When the deep-tier share computed by fuzz-tier.sh falls
to the fast-tier floor (~300s), stop cutting time and shard the targets
across multiple CI jobs instead. Sharding only adds real throughput
if the self-hosted runner pool has spare physical cores during the
nightly window, since each libFuzzer target pins a core — confirm core
availability before adding jobs.
Crashes are minimized with cargo fuzz tmin and filed as GitHub
Issues with the security-audit label immediately when found. New
corpus entries are pushed to instar-testdata/custom/fuzz-corpus/
after nightly runs, and restored by target name on the next run so
coverage compounds.
Automated bug fixes¶
The CI workflow (.github/workflows/fuzz-autofix.yml) runs daily
at 06:00 UTC and picks up open security-audit issues. It invokes
Claude Code (30-turn limit) to diagnose and fix the crash, then
verifies the fix by rebuilding and running core tests. Two attempts
per issue; failed issues are labelled autofix-failed for human
attention. Complexity guardrails prevent runaway fixes (max 3 files,
no cross-crate changes, no new dependencies).
Related Documentation¶
- Format Coverage - Comparison with oslo.utils format_inspector
- Format Detection Safety - Security model for format auto-detection
- Security Analysis - CVE analysis and threat model