Integration Test Suite

The instar project includes a Python-based integration test suite that verifies instar behaves as a faithful drop-in replacement for qemu-img: instar info must produce output identical to qemu-img info, instar check must correctly detect structural corruption in QCOW2 images, and instar compare and instar convert must produce byte-for-byte identical output to qemu-img compare and qemu-img convert. Any difference in output is considered a bug.

Architecture

The test suite uses:

  • testtools - Extended unittest framework with better assertions
  • testscenarios - Parameterized test scenarios
  • stestr - Parallel test runner with result storage

Tests compare instar output against one of two references:

  1. Live qemu-img output (for safe images - info, compare, convert)
  2. Stored expected output files (for malicious images)
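The reference selection can be sketched in a few lines. The helper below is illustrative, not the suite's actual API; the manifest fields (safety, path, expected_override) are the ones described later in this page:

```python
# Illustrative sketch of reference-output selection (not the suite's actual code).
import subprocess
from pathlib import Path

def reference_output(entry: dict, testdata: Path) -> str:
    """Return the output instar must match for one manifest entry."""
    if entry["safety"] == "malicious":
        # Never run qemu-img on a malicious image: use the stored file.
        return (testdata / entry["expected_override"]).read_text()
    # Safe image: capture live qemu-img output.
    proc = subprocess.run(
        ["qemu-img", "info", str(testdata / entry["path"])],
        capture_output=True, text=True, check=True)
    return proc.stdout
```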

Test Categories

Safe Images (test_info_safe.py)

Tests against known-safe disk images. These run qemu-img directly and compare outputs character-for-character.

Malicious Images (test_info_malicious.py)

Tests against images designed to exploit vulnerabilities (e.g., backing file references to /etc/passwd). These use pre-stored expected output files instead of running qemu-img, since running qemu-img on malicious images defeats the security purpose of instar.

Check Validation Tests (test_check_formats.py)

Tests for the instar check operation:

  • Format detection: Verifies check correctly identifies QCOW2, VMDK, and VHD formats
  • Corrupt images: Tests against deliberately corrupt format headers (VMDK, VHDX, VHD)
  • QCOW2 structural validation: Uses 4 script-generated corrupt QCOW2 images:
      • Clean baseline (should pass with 0 errors)
      • Overlapping clusters (two L2 entries pointing to the same host cluster)
      • Refcount-zero (referenced cluster with refcount=0)
      • Leaked cluster (refcount>0 but no L2 reference)
  • Unsafe quirks mode: Verifies non-QCOW2 formats are treated as raw with --unsafe-quirks

Corrupt QCOW2 test images are generated by instar-testdata/custom/check-validation/create-corrupt-images.py, which creates images with qemu-img/qemu-io and then surgically corrupts specific QCOW2 structures via binary manipulation.
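The surgical corruption amounts to parsing just enough of the header to locate an on-disk structure, then overwriting bytes in place. A stdlib sketch of the technique (field offsets follow the QCOW2 header layout; this is illustrative, not the actual logic of create-corrupt-images.py):

```python
# Sketch of header-guided binary corruption in the style of
# create-corrupt-images.py (illustrative, not the script's actual code).
import struct

QCOW2_MAGIC = 0x514649FB  # b'QFI\xfb'

def parse_qcow2_header(data: bytes) -> dict:
    """Read the fields needed to locate on-disk structures (all big-endian,
    at the offsets defined by the QCOW2 header layout)."""
    magic, version = struct.unpack_from(">II", data, 0)
    assert magic == QCOW2_MAGIC, "not a QCOW2 image"
    (cluster_bits,) = struct.unpack_from(">I", data, 20)
    (l1_table_offset,) = struct.unpack_from(">Q", data, 40)
    (refcount_table_offset,) = struct.unpack_from(">Q", data, 48)
    return {"version": version,
            "cluster_size": 1 << cluster_bits,
            "l1_table_offset": l1_table_offset,
            "refcount_table_offset": refcount_table_offset}

def zero_first_refcount_table_entry(image: bytearray) -> None:
    """Crude analogue of the refcount-zero case: zero one 8-byte refcount
    table entry so the clusters it covered lose their refcounts."""
    hdr = parse_qcow2_header(bytes(image))
    off = hdr["refcount_table_offset"]
    image[off:off + 8] = bytes(8)
```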

Compare Tests (test_compare.py)

Tests for the instar compare operation, cross-validated against qemu-img compare:

Raw-vs-raw (TestCompareRawIdentical, TestCompareRawDifferent, TestCompareRawSizeMismatch, TestCompareRawJson):

  • Identical images: Self-compare and two identical files
  • Different content: Mismatch at offset 0 and at mid-file offsets
  • Size mismatch: Non-strict (zeros = identical), non-strict (non-zero = differs), and strict mode (always fails on size difference)
  • JSON output: Validates identical, first-mismatch-offset, total-bytes-compared, and size-mismatch fields

QCOW2-vs-raw (TestCompareQcow2VsRaw):

  • Identical content across formats (including all-zeros)
  • Different content reports correct mismatch offset
  • Cross-validated against qemu-img compare

QCOW2-vs-QCOW2 (TestCompareQcow2VsQcow2):

  • Identical and different content between two QCOW2 images
  • Virtual size mismatch handling
  • Cross-validated against qemu-img compare

Compressed QCOW2 (TestCompareQcow2Compressed):

  • Compressed QCOW2 vs raw with same content (zlib decompression)
  • Compressed vs uncompressed QCOW2 with same content
  • Cross-validated against qemu-img compare

Backing chains (TestCompareBackingChain):

  • QCOW2 overlay with raw backing file vs flattened raw (identical)
  • QCOW2 overlay vs different raw (mismatch detected)
  • Deep chain (3-level: top -> mid -> base) vs flattened raw (identical)
  • Two different QCOW2 backing chains with same virtual content (identical)
  • All scenarios cross-validated against qemu-img compare

Test images are created at runtime using qemu-img create, qemu-io write, qemu-img convert -c (for compressed), and qemu-img create -b (for backing chains), so no external testdata is needed.

qemu-img cross-validation: Every scenario verifies byte-for-byte identical stdout and matching exit codes with qemu-img compare.
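That check reduces to running both binaries with identical arguments and demanding identical observable behavior. A sketch of such a helper (names are illustrative, not the suite's actual base.py API), demonstrated with a stand-in command:

```python
# Illustrative cross-validation helper (not the suite's actual base.py API).
import subprocess
import sys

def capture(argv):
    """Run a command and return (exit_code, stdout) for comparison."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout

def assert_cross_validated(instar_argv, qemu_argv):
    """Fail unless both tools agree byte-for-byte on stdout and exit code."""
    ours, theirs = capture(instar_argv), capture(qemu_argv)
    if ours != theirs:
        raise AssertionError(
            f"divergence:\n  instar:   {ours!r}\n  qemu-img: {theirs!r}")

# Stand-in for e.g. ["instar", "compare", a, b] vs ["qemu-img", "compare", a, b]:
assert_cross_validated([sys.executable, "-c", "print('identical')"],
                       [sys.executable, "-c", "print('identical')"])
```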

Convert Tests (test_convert.py)

Tests for the instar convert operation (QCOW2 to raw), cross-validated against qemu-img convert:

Basic conversion (TestConvertBasicQcow2ToRaw):

  • Empty QCOW2 image to raw
  • QCOW2 with written data to raw
  • Output size matches virtual size
  • All cross-validated against qemu-img convert

Compressed QCOW2 (TestConvertCompressed):

  • Compressed QCOW2 to raw (zlib decompression)
  • Compared against original raw source

Backing chains (TestConvertBackingChain):

  • QCOW2 overlay with raw backing flattened to raw
  • Deep chain (3-level: top -> mid -> base) flattened to raw
  • Cross-validated against qemu-img convert

Raw passthrough (TestConvertRawToRaw):

  • Raw to raw identity conversion

Error handling (TestConvertErrors):

  • Unsupported output format rejected
  • Nonexistent input file rejected

Manifest images (TestConvertManifestImages):

  • Converts real-world QCOW2 images from the test manifest
  • Cross-validates against qemu-img convert output
  • Skips images with cluster_size > 64KB (unsupported)
  • Skips images whose virtual_size exceeds available temp space

Adversarial Image Tests (test_adversarial.py)

Tests verifying that instar safely handles malicious and malformed images without crashing, hanging, or consuming excessive resources. Uses the run_adversarial() helper in base.py which enforces timeouts (hang detection), memory limits via RLIMIT_AS (resource exhaustion), and signal checks (crash detection).
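A minimal sketch of what such a helper enforces (an illustrative re-implementation, not base.py's actual code; the RLIMIT_AS cap makes this POSIX-only):

```python
# Illustrative re-implementation of an adversarial runner (POSIX-only;
# not the actual run_adversarial() from base.py).
import resource
import subprocess

def run_adversarial(argv, timeout_s=10, mem_bytes=512 * 1024 * 1024):
    """Run a command with a wall-clock timeout and an address-space cap."""
    def limit_memory():
        # Applied in the child just before exec: caps total virtual memory.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
    try:
        proc = subprocess.run(argv, capture_output=True,
                              timeout=timeout_s, preexec_fn=limit_memory)
    except subprocess.TimeoutExpired:
        # Hang detection: the child was killed after the timeout.
        return {"hang": True, "crash": False, "returncode": None}
    # On POSIX a negative return code means the child died from a signal.
    return {"hang": False, "crash": proc.returncode < 0,
            "returncode": proc.returncode}
```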

Phase 1 — CVE-adjacent attacks:

  • Compression bombs: Zlib and ZSTD compressed QCOW2 images with extreme expansion ratios. Verifies decompression buffer bounds are enforced and output files stay small.
  • Circular backing chains: 2-level cycle (A→B→A), 3-level cycle (A→B→C→A), and self-referencing (A→A). Verifies chain discovery detects the cycle and rejects it.
  • Deep backing chains: Chains at 16 levels (device limit) and 17 levels (exceeds limit). Verifies depth enforcement and correct rejection.
  • Integer overflow: L1 table size near u32::MAX, L1 size = 0, cluster_bits below minimum (8) and above maximum (22). Verifies checked arithmetic prevents undefined behavior.

Phase 2 — boundary value cases:

  • Refcount order edges: refcount_order = 7 (128-bit, invalid) and 255 (extreme value). Verifies clamping or rejection.
  • Oversized virtual size: 1 petabyte and u64::MAX virtual sizes. Verifies info reports the size and check doesn't allocate based on virtual size alone.
  • VMDK grain size: Zero and huge (2^63) grain sizes. Verifies checked_mul prevents division by zero and overflow.
  • VHDX conflicting headers: Dual headers with different sequence numbers and valid CRC-32C checksums.
  • BAT beyond EOF: VHD and VHDX images with BAT entries pointing past end of file. Verifies I/O error handling.

Phase 3 — format confusion:

  • Polyglot files: QCOW2 magic header with VMDK descriptor body, and QCOW2 magic with ELF binary content. Verifies format detection works (magic wins) and structural validation catches inconsistencies.
  • Truncated headers: QCOW2 v2 header cut at 32 bytes, VMDK with only 8 bytes (magic + version), VHD footer at 48 bytes. Verifies all operations fail gracefully with no crash.
  • VMDK descriptor attacks: Null bytes in descriptor, multiple extent declarations, and inflated 1MB descriptor size claim. Verifies parser handles adversarial text safely.

Test images are generated by scripts in instar-testdata/scripts/ (the private testdata repository). Scripts that generate adversarial or CVE-reproducer images must always be placed in instar-testdata, never in the public instar repository.

All generated images live in instar-testdata/custom/audit/.

Security Tests (test_security.py)

Tests verifying instar's security properties:

  • Backing file references are detected but not followed
  • External data file references are reported but not read
  • VMDK descriptor extent paths are not accessed

Running Tests

Make Targets

# Create Python virtual environment (first time only)
make test-venv

# Run safe tests (default, suitable for development)
make test

# Run CI-suitable tests
make test-ci

# Run all tests including malicious images (explicit opt-in)
make test-malicious

# Run with verbose output (useful for debugging diffs)
make test-report

# Clean test artifacts
make clean-tests

Direct stestr Usage

cd tests
source .venv/bin/activate

# Run all tests
stestr run

# Run specific test module
stestr run test_info_safe

# Run with verbose output
stestr run --serial -- --verbose

# List available tests
stestr list

Test Image Manifest

Test images are defined in tests/manifest.json:

{
    "id": "cirros-qcow2",
    "path": "downloaded/cirros/cirros-0.6.3-x86_64-disk.img",
    "format": "qcow2",
    "safety": "safe",
    "run_in_ci": true,
    "description": "CirrOS minimal cloud image",
    "tags": ["qcow2", "cloud-image"]
}

Manifest Fields

Field                    Description
id                       Unique identifier for the test image
path                     Path relative to testdata root
format                   Expected disk format (qcow2, vmdk, vhd, etc.)
safety                   safe, caution, or malicious
run_in_ci                Whether to include in CI test runs
unsafe_quirks_required   If true, requires --unsafe-quirks flag for qemu-img compatibility
description              Human-readable description
tags                     Searchable tags for filtering
expected_override        Path to expected output file (for malicious images)
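Scenario selection from the manifest can be sketched as a simple filter over these fields (the loader name is illustrative, not the suite's actual API):

```python
# Illustrative manifest filter using the fields described above.
import json

def ci_safe_scenarios(manifest_text: str):
    """Yield (id, entry) for safe images that are included in CI runs."""
    for entry in json.loads(manifest_text):
        if entry.get("safety") == "safe" and entry.get("run_in_ci"):
            yield entry["id"], entry
```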

Unsafe Quirks Testing

Images marked with unsafe_quirks_required: true do not have valid format headers or partition tables. In default (secure) mode, instar rejects these files as "unknown format" rather than accepting them as raw images.

To test qemu-img compatibility for these images, use --unsafe-quirks:

# Default mode: rejects files without valid structure
instar info random-garbage.raw
# Error: Unknown format (no valid disk image header or partition table)

# Unsafe quirks mode: matches qemu-img behavior
instar info --unsafe-quirks random-garbage.raw
# file format: raw

See configuration.md and quirks.md for details on safe vs unsafe quirks.

Test Data Location

Test images are stored in a separate repository (instar-testdata) to keep the main repository small. The location is resolved in order:

  1. INSTAR_TESTDATA_PATH environment variable
  2. ../instar-testdata (sibling directory)
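The resolution order above can be sketched as (helper name illustrative):

```python
# Illustrative sketch of the testdata resolution order.
import os
from pathlib import Path

def resolve_testdata(repo_root: Path) -> Path:
    """INSTAR_TESTDATA_PATH wins; otherwise fall back to the sibling checkout."""
    override = os.environ.get("INSTAR_TESTDATA_PATH")
    if override:
        return Path(override)
    return repo_root.parent / "instar-testdata"
```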

Expected Output Overrides

For malicious images where running qemu-img would be dangerous, store the expected output in tests/expected_outputs/:

tests/expected_outputs/
└── qcow2_backing_etc_passwd.txt

Reference this file in the manifest:

{
    "id": "qcow2-backing-passwd",
    "expected_override": "expected_outputs/qcow2_backing_etc_passwd.txt"
}

Image Notes

The docs/image_notes/ directory documents which test images exposed specific quirks or implementation details. When a test image reveals unexpected qemu-img behavior that requires compatibility work, create a markdown file documenting:

  • The specific values that revealed the behavior
  • How qemu-img handles the case
  • How instar now handles it
  • Links to relevant quirks documentation

See Image Notes for existing documentation.

Adding New Test Images

  1. Add the image to the instar-testdata repository
  2. Add an entry to tests/manifest.json
  3. For safe images: add a scenario to test_info_safe.py
  4. For malicious images:
       • Create an expected output file in tests/expected_outputs/
       • Add a scenario to test_info_malicious.py
  5. If the image exposes new quirks: create docs/image_notes/<image-id>.md

Output Comparison

The test suite performs exact string comparison. On failure, it shows:

  • A unified diff with whitespace made visible (markers for trailing spaces, tabs, and trailing newlines)
  • Raw repr() of both outputs for debugging
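A whitespace-visible diff can be built with difflib; the marker glyphs below are illustrative stand-ins, not necessarily the suite's actual choices:

```python
# Illustrative whitespace-visible diff (marker glyphs are stand-ins).
import difflib

def visible(line: str) -> str:
    """Replace tabs and trailing spaces with printable markers."""
    stripped = line.rstrip(" ")
    trailing = len(line) - len(stripped)
    return (stripped + "·" * trailing).replace("\t", "→")

def visible_diff(expected: str, actual: str) -> str:
    """Unified diff of two outputs with whitespace made visible."""
    return "\n".join(difflib.unified_diff(
        [visible(l) for l in expected.splitlines()],
        [visible(l) for l in actual.splitlines()],
        fromfile="qemu-img", tofile="instar", lineterm=""))
```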

Environment Variables

Variable               Description
INSTAR_TESTDATA_PATH   Override default testdata location
INSTAR_BINARY_PATH     Override default instar binary location

Differential Fuzzing

The project includes a differential fuzzer (scripts/differential-fuzz.py) that compares instar against qemu-img on randomly generated images. This is Phase 3 of the security audit plan (PLAN-audit.md).

How it works

For each iteration the fuzzer:

  1. Picks a random seed (logged for reproducibility).
  2. Generates a random disk image with qemu-img create, varying format (qcow2, raw, vmdk, vpc), virtual size (1M-1G), cluster size, compression, and data patterns (zeros, random, sparse, MBR).
  3. Creates separate copies for instar and qemu-img.
  4. Runs a random chain of 2-4 operations (info, check, convert, compressed convert) against both tools.
  5. Compares outputs at each stage: exit codes, normalized JSON info output, and converted file content (SHA-256 of raw-flattened output).
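The comparison at step 5 can be sketched as follows; the quirk field name below is an assumption (the authoritative exclusion list lives in quirks.md):

```python
# Illustrative per-stage comparison helpers ("actual-size" as a quirk
# field is an assumption; see quirks.md for the real exclusions).
import hashlib
import json

QUIRK_FIELDS = {"actual-size"}  # host-dependent disk size: excluded

def normalize_info(raw_json: str) -> dict:
    """Parse JSON info output and drop known-quirk fields before comparing."""
    return {k: v for k, v in json.loads(raw_json).items()
            if k not in QUIRK_FIELDS}

def content_hash(path) -> str:
    """SHA-256 of the raw-flattened converted output."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()
```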

Known quirks (see quirks.md) are excluded from comparison: disk size fields and format-specific metadata.

libyal cross-validation

When libyal tools are installed in the environment (libvmdk-utils, libvhdi-utils, libqcow-utils), the fuzzer adds two additional comparison layers:

  1. Info cross-check: Parsed fields from vmdkinfo, vhdiinfo, and qcowinfo (virtual size, format version, cluster size, etc.) are compared against instar's JSON output for the same image.
  2. Parse-success consistency: For each format, if the libyal tool successfully parses the image, instar check should report no errors (and vice versa). Disagreements are flagged as divergences.

This closes the gap where VMDK/VHD/VHDX had no differential reference for check validation (qemu-img check only supports QCOW2), and provides a third independent opinion for QCOW2. libyal tools are optional — the fuzzer degrades gracefully when they are unavailable.
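The info cross-check boils down to scraping key/value lines from a libyal tool's text output and comparing the numbers. The "Media size" line format below is an assumption about qcowinfo/vmdkinfo output, not a verified field name:

```python
# Illustrative libyal info scraper (the "Media size" line format is an
# assumption about qcowinfo/vmdkinfo output).
import re

def parse_info_fields(text: str) -> dict:
    """Collect 'Key: value' pairs from a libyal *info tool's output."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:\t]+?)\s*:\s+(\S.*)", line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields

def media_size_bytes(fields: dict):
    """Extract the byte count from e.g. 'Media size: 1.0 GiB (1073741824 bytes)'."""
    m = re.search(r"(\d+)\s*bytes", fields.get("Media size", ""))
    return int(m.group(1)) if m else None
```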

Running locally

python3 scripts/differential-fuzz.py \
    --instar src/target/release/instar \
    --iterations 100 \
    --seed 42 \
    --log-dir ./fuzz-logs

CI integration

The fuzzer runs automatically via .github/workflows/differential-fuzz.yml at three tiers:

Trigger             Iterations     When
pull_request        100            PR changes fuzzer script or workflow
push to develop     200            Post-merge smoke test
schedule            1000           Nightly at 02:00 UTC
workflow_dispatch   configurable   Manual trigger

On failure, the workflow uploads logs as artifacts and auto-files GitHub Issues with the security-audit label, including the seed, iteration, image attributes, and a reproduction command.

Reproducing a divergence

Each divergence report includes the seed and iteration number. To reproduce:

python3 scripts/differential-fuzz.py \
    --instar src/target/release/instar \
    --iterations <ITERATION + 1> \
    --seed <SEED> \
    --fail-fast

Phase 6: Coverage-Guided Fuzzing

Coverage-guided fuzzing uses cargo-fuzz (libFuzzer) to exercise the no_std parser crates directly, without the full VMM/KVM stack. A mock CallTable backed by fuzzer input provides the I/O layer, allowing libFuzzer to explore malformed input space that differential fuzzing (Phase 3) cannot reach.

Fuzz targets

13 targets across all parser crates, organized in src/fuzz/:

Target                  Crate    Type
fuzz_format_detect      shared   Buffer-based
fuzz_qcow2_header       qcow2    Buffer-based
fuzz_qcow2_l1l2         qcow2    CallTable
fuzz_qcow2_refcount     qcow2    CallTable
fuzz_qcow2_decompress   qcow2    CallTable
fuzz_vmdk_header        vmdk     Buffer-based
fuzz_vmdk_grain         vmdk     CallTable
fuzz_vhd_footer         vhd      Buffer-based
fuzz_vhd_bat            vhd      CallTable
fuzz_vhdx_header        vhdx     Buffer-based
fuzz_vhdx_metadata      vhdx     CallTable
fuzz_raw_partition      raw      Buffer-based
fuzz_luks_header        luks     Buffer-based

Buffer-based targets call parser functions that take &[u8] directly (e.g. QcowHeader::parse(data)). CallTable targets use the mock CallTable from src/fuzz/src/lib.rs to simulate sector-based I/O from the fuzzer input.

Running locally

# Inside the instar-build container:
cd src/fuzz

# Run a single target for 60 seconds
cargo fuzz run fuzz_qcow2_header -- -max_total_time=60

# Run with specific corpus
cargo fuzz run fuzz_qcow2_header corpus/fuzz_qcow2_header/

# Minimize a crash
cargo fuzz tmin fuzz_qcow2_header artifacts/fuzz_qcow2_header/<crash>

# Generate coverage report
cargo fuzz coverage fuzz_qcow2_header

Corpus seeding

The seed corpus is extracted from instar-testdata using:

python3 scripts/extract-fuzz-corpus.py --testdata /path/to/instar-testdata

This copies test images into per-target corpus directories under src/fuzz/corpus/, filtered by format. Header-only targets receive truncated copies. Hand-crafted minimal valid inputs are also generated for each format.

CI integration

The CI workflow (.github/workflows/coverage-fuzz.yml) runs:

  • Nightly: 1 hour per target at 04:00 UTC
  • PR validation: 60-second smoke test when fuzz/parser code changes
  • Manual dispatch: configurable duration and target selection

Crashes are minimized with cargo fuzz tmin and filed as GitHub Issues with the security-audit label immediately when found. New corpus entries are pushed to instar-testdata/custom/fuzz-corpus/ after nightly runs.

Automated bug fixes

The CI workflow (.github/workflows/fuzz-autofix.yml) runs daily at 06:00 UTC and picks up open security-audit issues. It invokes Claude Code (30-turn limit) to diagnose and fix the crash, then verifies the fix by rebuilding and running core tests. Two attempts per issue; failed issues are labelled autofix-failed for human attention. Complexity guardrails prevent runaway fixes (max 3 files, no cross-crate changes, no new dependencies).
