Phase 3: host VMM subcommand + clap surface
Master plan: PLAN-map.md · Previous phase: PLAN-map-phase-02-guest-op.md
Status: Complete
MapArgs clap struct and Commands::Map(MapArgs) arm
shipped in src/vmm/src/main.rs; run_map writes
MapConfig per-field at OPERATION_CONFIG_ADDR, attaches
the source read-only as input device 0, runs the vCPU loop,
and decodes streamed MapExtentMessage records into a
placeholder renderer (replaced in phase 4). Host-side guards
reject --image-opts, VMDK monolithicFlat descriptors via
peek_is_vmdk_descriptor, and invalid sector sizes
(non-power-of-2 or outside [512, MAX_SECTOR_SIZE]).
Mission
Ship the host-side CLI surface for the map operation. After
phase 3, instar map FILENAME parses arguments, launches
the map.bin guest binary from phase 2 with a populated
MapConfig, consumes the streamed MapExtentMessage records
and the terminating MapResultMessage, and emits valid
human / JSON output. The polish to byte-for-byte qemu-img
parity (column widths, JSON field ordering, exact hex
formatting, streaming JSON array writer) lands in phase 4.
The wire from CLI → guest → host renderer is the whole deliverable: command parses, guest runs, host prints something. The "something" is correct in structure and content; whether it is byte-identical to qemu-img is phase 4's problem.
Why this is its own phase
Phase 2 left a guest binary that can be launched but no host
caller. Phase 3 plugs the binary into the existing clap
dispatch table and the Commands enum, populates the
MapConfig byte layout from a MapArgs struct, threads the
serial-channel consumer to recognise the two new payload
variants, and gates the polished renderer behind a small
placeholder so phase 4 can swap in the qemu-img-compatible
formatter without changing any of the plumbing.
Bundling the CLI + minimal-render together avoids an
intermediate state in which instar map parses but does
nothing visible — the renderer is small enough to land here
without overflowing the phase. Splitting the renderer out as
phase 4 lets the byte-for-byte work (and its testdata
sweep) be a self-contained polish step.
Architecture
CLI surface
MapArgs mirrors qemu-img map's surface plus an
instar-specific --sector-size:
```rust
#[derive(Args, Debug)]
struct MapArgs {
    /// Source image file. Required.
    input: String,

    /// Source format override (rare; usually auto-detected).
    /// Accepted for parity with qemu-img -f.
    #[arg(short = 'f', long = "format")]
    source_format: Option<String>,

    /// Output format: human (default) or json.
    #[arg(long, default_value = "human", value_parser = ["human", "json"])]
    output: String,

    /// Start emission at this virtual byte offset. Accepts
    /// K/M/G/T suffixes (parsed by parse_memory_size).
    /// Default: 0 (start of image).
    #[arg(long = "start-offset")]
    start_offset: Option<String>,

    /// Stop emission after this many virtual bytes from
    /// --start-offset. Accepts K/M/G/T suffixes. Default:
    /// emit to end of image.
    #[arg(long = "max-length")]
    max_length: Option<String>,

    /// Sector size for source I/O. Default: 65536. Not
    /// part of qemu-img's surface; instar-specific.
    #[arg(long, default_value = "65536")]
    sector_size: u32,

    /// Refused for parity-rejection: qemu-img's --image-opts
    /// descriptor-based source specification is deferred.
    /// Documented in docs/quirks.md.
    #[arg(long = "image-opts")]
    image_opts: bool,
}
```
Commands::Map(MapArgs) is appended to the enum after
Commands::Commit(CommitArgs). The dispatch arm in main()
goes to run_map(args, verbose).
run_map orchestration
The function mirrors run_measure's structure
(src/vmm/src/main.rs:8856) with these changes:
- No --size mode: map always reads a source image. Reject args.input.is_empty() (clap already enforces the positional). No size-mode stub file plumbing.
- Refuse --image-opts early with this exact message: "map: --image-opts is not supported (instar accepts FILENAME directly; see docs/quirks.md)".
- Validate sector_size (power of two, 512 ≤ N ≤ MAX_SECTOR_SIZE).
- Resolve window bytes: start_offset and max_length are each parsed via the existing parse_memory_size helper. Unset values map to 0 (the guest treats max_length == 0 as "emit to end").
- Refuse --start-offset >= source_size with a clear error message: qemu-img returns an error too, and the guest's clip_to_window would silently emit nothing.
- Build the MapConfig byte layout and write it field-by-field to OPERATION_CONFIG_ADDR via guest_mem.write_obj. Magic 0x4D41505F. Field offsets: 0 magic, 4 flags, 8 sector_size, 12 input_device_count (1), 16 start_offset (u64), 24 max_length (u64), 32..64 reserved (zero).
- Load core.bin + map.bin via get_binary_path / load_guest_binary.
- KVM / VM / guest memory setup identical to run_measure (no output device; one read-only input).
- Source attach: open args.input with BackingStore::open(_, true, None, false) (read-only), wrap in VirtioBlockDevice, attach as device 0.
- vCPU + serial setup identical to run_measure.
- Run loop:
  - On MapExtent payload: push the record into a Vec<MapExtentMessage> for the renderer. (Streaming JSON writer is phase 4.)
  - On MapResult payload: set map_error = result.error, mark map_result_seen = true, store the result clone in Option<MapResultMessage>.
  - Other payloads: log via format_message when verbose.
- After the loop: pass the collected extents + result to print_map_result(&extents, &result, &args.output).
- Error mapping: if the guest reports ERROR_HAS_BACKING, exit 1 with a stderr message pointing at the chain follow-up. ERROR_INVALID_OPTION, ERROR_INVALID_SOURCE, ERROR_IO get matching stderr messages.
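The sector-size guard and the config write from the list above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the shipped run_map: the MAX_SECTOR_SIZE value is a placeholder, the helper names validate_sector_size and encode_map_config are hypothetical, little-endian encoding is assumed, and the real code writes each field through guest_mem.write_obj instead of filling a local buffer.

```rust
// Placeholder value for illustration only; the real constant lives in the codebase.
const MAX_SECTOR_SIZE: u32 = 1 << 20;
const MAP_CONFIG_MAGIC: u32 = 0x4D41_505F; // "MAP_"
const MAP_CONFIG_SIZE: usize = 64;

/// Hypothetical helper: accept only powers of two in [512, MAX_SECTOR_SIZE].
fn validate_sector_size(n: u32) -> Result<(), String> {
    if !n.is_power_of_two() || n < 512 || n > MAX_SECTOR_SIZE {
        return Err(format!(
            "map: invalid sector size {} (need a power of two in [512, {}])",
            n, MAX_SECTOR_SIZE
        ));
    }
    Ok(())
}

/// Hypothetical helper: lay the fields out at the offsets listed above.
/// Little-endian is an assumption; bytes 32..64 stay zero (reserved).
fn encode_map_config(sector_size: u32, start_offset: u64, max_length: u64) -> [u8; MAP_CONFIG_SIZE] {
    let mut buf = [0u8; MAP_CONFIG_SIZE];
    buf[0..4].copy_from_slice(&MAP_CONFIG_MAGIC.to_le_bytes()); // magic
    buf[4..8].copy_from_slice(&0u32.to_le_bytes()); // flags
    buf[8..12].copy_from_slice(&sector_size.to_le_bytes()); // sector_size
    buf[12..16].copy_from_slice(&1u32.to_le_bytes()); // input_device_count (always 1 in v1)
    buf[16..24].copy_from_slice(&start_offset.to_le_bytes()); // start_offset
    buf[24..32].copy_from_slice(&max_length.to_le_bytes()); // max_length (0 = emit to end)
    buf
}
```

Encoding into one buffer keeps the offsets visible in a single place, which makes the cross-check against the MapConfig struct in src/shared/src/lib.rs easier to review.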
Renderer (phase 3 placeholder)
print_map_result produces valid output of both formats
but does not chase byte-for-byte qemu-img parity. Phase 4
replaces this function with the polished formatter.
```rust
fn print_map_result(
    extents: &[guest_::MapExtentMessage],
    result: &guest_::MapResultMessage,
    output_format: &str,
) {
    // A non-OK result goes to stderr only; nothing is written to stdout.
    if result.error != MAP_RESULT_ERROR_OK {
        let msg = match result.error {
            MAP_RESULT_ERROR_INVALID_SOURCE => "map: source format unrecognised",
            MAP_RESULT_ERROR_INVALID_OPTION => "map: invalid config",
            MAP_RESULT_ERROR_HAS_BACKING => {
                "map: source has a backing/parent reference; \
                 chain composition is deferred (see PLAN-map.md)"
            }
            MAP_RESULT_ERROR_IO => "map: I/O failure walking the source",
            _ => "map: unknown error",
        };
        eprintln!("{}", msg);
        return;
    }
    if output_format == "json" {
        print_map_result_json(extents);
    } else {
        print_map_result_human(extents);
    }
}
```
Phase 3 human renderer (will be replaced in phase 4 to match qemu-img's column widths exactly):
Phase 3 JSON renderer (will be replaced in phase 4 to match qemu-img's whitespace + field ordering exactly):
```json
[
  { "start": 0, "length": 1048576, "depth": 0, "present": true,
    "zero": false, "data": true, "offset": 327680 },
  ...
]
```
Mapping from the state string to the present / zero /
data triple matches qemu-img:
| state | present | zero | data | offset emitted |
|---|---|---|---|---|
| "hole" | false | true | false | no |
| "zero" | true | true | false | no |
| "data" | true | false | true | yes |
depth is always 0 in v1 (chain composition deferred).
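The state table can be expressed as one lookup plus a small formatter. The sketch below is illustrative only: the Extent struct, its field names, and the helpers state_flags and render_extent_json are hypothetical, and the spacing of the emitted object is not meant to match qemu-img (that polish is phase 4).

```rust
/// Hypothetical host-side view of one extent; field names are assumptions.
struct Extent<'a> {
    start: u64,
    length: u64,
    state: &'a str,
    file_offset: u64,
}

/// Returns (present, zero, data, emit_offset) per the table above,
/// or None for a state string the table does not list.
fn state_flags(state: &str) -> Option<(bool, bool, bool, bool)> {
    match state {
        "hole" => Some((false, true, false, false)),
        "zero" => Some((true, true, false, false)),
        "data" => Some((true, false, true, true)),
        _ => None,
    }
}

/// Renders one JSON object; "offset" appears only for data extents.
/// depth is hard-coded to 0 because chain composition is deferred in v1.
fn render_extent_json(e: &Extent) -> Option<String> {
    let (present, zero, data, emit_offset) = state_flags(e.state)?;
    let mut s = format!(
        "{{ \"start\": {}, \"length\": {}, \"depth\": 0, \"present\": {}, \"zero\": {}, \"data\": {}",
        e.start, e.length, present, zero, data
    );
    if emit_offset {
        s.push_str(&format!(", \"offset\": {}", e.file_offset));
    }
    s.push_str(" }");
    Some(s)
}
```

Returning None for an unknown state keeps the decision about how to report a malformed record with the caller instead of burying it in the formatter.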
Host-side constants
Mirror the measure pattern: declare top-of-file constants that name the magic / error values so call sites don't repeat raw hex literals:
```rust
const MAP_CONFIG_MAGIC: u32 = 0x4D41505F; // "MAP_"
const MAP_RESULT_MAGIC: u32 = 0x4D505253; // "MPRS"
const MAP_RESULT_ERROR_OK: u32 = 0;
const MAP_RESULT_ERROR_INVALID_SOURCE: u32 = 1;
const MAP_RESULT_ERROR_INVALID_OPTION: u32 = 2;
const MAP_RESULT_ERROR_HAS_BACKING: u32 = 3;
const MAP_RESULT_ERROR_IO: u32 = 4;
```
These go in the same constant block as
MEASURE_CONFIG_MAGIC / MEASURE_RESULT_*.
format_message already handles the two new payloads
Step 2b of phase 2 added the MapExtent and MapResult
arms to format_message. No further work needed for the
debug-log path; phase 3 just routes MapResult through the
explicit renderer instead of the debug logger and
accumulates MapExtent payloads in a Vec.
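The routing described here (extent: push; result: store; anything else: debug log) might look like the sketch below. The Payload enum, the message field sets, and consume_map_payloads are stand-ins invented for illustration; the real message types come from the shared crate, and the real loop runs inside the vCPU exit handler.

```rust
// Stand-in message types; the real ones carry more fields.
#[derive(Debug, Clone, PartialEq)]
struct MapExtentMessage {
    start: u64,
    length: u64,
}

#[derive(Debug, Clone, PartialEq)]
struct MapResultMessage {
    error: u32,
}

// Hypothetical decoded serial payload.
enum Payload {
    MapExtent(MapExtentMessage),
    MapResult(MapResultMessage),
    Other(String),
}

/// Drains a payload stream: every extent is buffered, the result is stored,
/// and nothing breaks out of the loop early.
fn consume_map_payloads<I>(payloads: I, verbose: bool) -> (Vec<MapExtentMessage>, Option<MapResultMessage>)
where
    I: IntoIterator<Item = Payload>,
{
    let mut extents = Vec::new();
    let mut result = None;
    for payload in payloads {
        match payload {
            Payload::MapExtent(extent) => extents.push(extent),
            Payload::MapResult(r) => result = Some(r),
            Payload::Other(text) => {
                if verbose {
                    // Stands in for the format_message debug path.
                    eprintln!("guest: {}", text);
                }
            }
        }
    }
    (extents, result)
}
```

Because the result only sets a value and never ends the loop, an extent that arrives after it is still buffered, which is the ordering-independent behaviour the plan asks for.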
Open questions
- Source-format override (-f): phase 3 accepts the flag but ignores it (matches measure's behaviour). The guest re-detects the format from the first sector regardless. Recommendation: keep the flag in the surface for parity with qemu-img, and silently ignore it for now; phase 4 may surface a warning if -f disagrees with the detected format.
- --start-offset alignment: qemu-img map silently clamps --start-offset to a cluster boundary on output (the extent that contains the offset is emitted in full, starting from the cluster boundary). instar's guest-side clip_to_window clips at the byte level, which can produce a leading partial extent that qemu-img would not. Recommendation: phase 3 accepts this divergence and documents it in docs/quirks.md. The qemu-img semantic can be replicated later if any consumer complains.
- Empty / zero-extent output: if the source is zero bytes, the guest emits one MapResult and no extents. The renderer should produce [] (JSON) or just the header row (human). Edge case worth a unit test.
- Source-file metadata fast paths: qemu-img map opens the file via the block-driver layer; instar opens via BackingStore::open. Both surface the same byte stream to the guest. No semantic difference.
- Renderer streaming: phase 3 buffers extents into a Vec<MapExtentMessage>. For a maximally-fragmented 1 TiB qcow2 source emitting 17 M extents, the buffer reaches ~3 GiB of host memory. Acceptable for phase 3 (the buffer is one-shot and the OS will swap if needed) but not for production. Recommendation: phase 4 swaps in a streaming writer that emits each extent as it arrives, bringing host memory back to constant. Phase 3 keeps the Vec for renderer simplicity; the test cases stay small enough not to OOM.
- --output=json with the buffered renderer: the buffered Vec naturally drives the standard "open [, join with commas, close ]" pattern. Phase 4's streaming writer will produce identical output by tracking a first_extent boolean in the message handler inside the vCPU loop.
- VMDK monolithicFlat source rejection: phase 3 should refuse these the same way measure does (peek_is_vmdk_descriptor). The phase 2 guest refuses them through VmdkState::init's natural binary-header rejection, but the resulting ERROR_INVALID_SOURCE is less helpful than a host-side pre-check pointing at qemu-img map as an escape hatch. Recommendation: include the pre-check in phase 3 (one if-statement, copies the measure pattern).
- get_input_device_count for chain mode: phase 3 hard-codes input_device_count = 1. The chain follow-up will lift this to N; for v1 the host enforces the same invariant the guest checks.
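The --output=json item above claims that a streaming writer with a first_extent flag can reproduce the buffered "open, join, close" output. A minimal sketch of that equivalence follows; StreamingArray and render_buffered are hypothetical names, the items are pre-rendered strings, and the exact whitespace is an assumption and not the phase 4 format.

```rust
use std::io::{self, Write};

/// Buffered form: open "[", join with commas, close "]".
fn render_buffered(items: &[String]) -> String {
    if items.is_empty() {
        return "[]\n".to_string();
    }
    format!("[\n{}\n]\n", items.join(",\n"))
}

/// Streaming form: constant memory, one boolean of state.
struct StreamingArray<W: Write> {
    out: W,
    first_extent: bool,
}

impl<W: Write> StreamingArray<W> {
    fn begin(mut out: W) -> io::Result<StreamingArray<W>> {
        out.write_all(b"[")?;
        Ok(StreamingArray { out, first_extent: true })
    }

    /// Writes the separator that belongs before this item, then the item.
    fn push(&mut self, item: &str) -> io::Result<()> {
        let sep: &[u8] = if self.first_extent { b"\n" } else { b",\n" };
        self.out.write_all(sep)?;
        self.out.write_all(item.as_bytes())?;
        self.first_extent = false;
        Ok(())
    }

    /// Closes the array; an array that never saw an item closes as "[]".
    fn finish(mut self) -> io::Result<W> {
        let tail: &[u8] = if self.first_extent { b"]\n" } else { b"\n]\n" };
        self.out.write_all(tail)?;
        Ok(self.out)
    }
}
```

Writing the separator before each item, not after, is what lets the writer avoid a trailing comma without knowing in advance which extent is the last one.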
Execution
| Step | Effort | Model | Isolation | Brief for sub-agent |
|---|---|---|---|---|
| 3a | medium | sonnet | none | Add MapArgs struct after CommitArgs in src/vmm/src/main.rs per the schema above (input, source_format, output, start_offset, max_length, sector_size, image_opts). Add Map(MapArgs) to the Commands enum after Commit(CommitArgs). Add the dispatch arm Commands::Map(args) => run_map(args, verbose), to the match in main() (line 3130 region). Add a stub fn run_map(_args: MapArgs, _verbose: bool) -> Result<(), Box<dyn std::error::Error>> { Err("map: not yet implemented".into()) } so the module compiles. Add the MAP_CONFIG_MAGIC / MAP_RESULT_MAGIC / MAP_RESULT_ERROR_* top-of-file constants per the schema above. Run make instar, make lint, make test-rust. instar map --help should produce the expected surface. Touches only src/vmm/src/main.rs. |
| 3b | high | opus | none | Implement the body of run_map per the "run_map orchestration" section. Validate sector_size, refuse --image-opts, refuse VMDK monolithicFlat sources via peek_is_vmdk_descriptor (same pattern as run_measure), refuse --start-offset >= file size, parse start_offset / max_length via parse_memory_size. Write the MapConfig byte layout at OPERATION_CONFIG_ADDR using per-field write_obj calls at known offsets (cross-check against the MapConfig struct in src/shared/src/lib.rs). KVM / VM / guest memory setup identical to run_measure; source attach via BackingStore::open(path, true, None, false) + VirtioBlockDevice::new(..., read_only=true). Run the vCPU loop, push every MapExtent payload into a Vec<MapExtentMessage>, stash the MapResult payload into an Option<MapResultMessage>. After the loop, call print_map_result(&extents, &result, &args.output). Error path: if the result has a non-ok error, the renderer prints to stderr and run_map returns Err(...) so the process exits non-zero. High effort because: this binds together the KVM plumbing, the byte-layout write, the streaming message consumer, and the error-mapping table. Subtle bugs (wrong config offset, wrong message arm, missing source-attach permission) produce silent wrong output in phase 6 integration tests. |
| 3c | medium | sonnet | none | Implement print_map_result, print_map_result_human, and print_map_result_json per the "Renderer (phase 3 placeholder)" section. Use the state-to-(present, zero, data) translation table above. Phase 3 output is valid and correct but not byte-for-byte qemu-img compatible — phase 4 polishes. Add ≥6 unit tests inside #[cfg(test)] mod map_renderer_tests in src/vmm/src/main.rs (the existing test module convention): empty extents list renders empty JSON array + header-only human; all-Hole extents render correctly; all-Data extents with file_offset render correctly; mixed states; large file_offset (verify hex formatting); error path produces no stdout output. Run make lint, make test-rust. Touches only src/vmm/src/main.rs. |
| 3d | low | sonnet | none | Update ARCHITECTURE.md to add a one-paragraph entry under the host-CLI surface section noting the new instar map subcommand: mirrors qemu-img map (FILENAME, -f, --output, --start-offset, --max-length), single-image v1, refuses --image-opts and VMDK monolithicFlat, renders a placeholder human / JSON output in phase 3 with the byte-for-byte polish in phase 4. Update CHANGELOG.md Unreleased / Added with one line citing the new subcommand. Run pre-commit run --all-files. |
Total: 4 commits.
Out of scope for phase 3
- Byte-for-byte qemu-img output parity (phase 4).
- Streaming JSON array writer (phase 4 — phase 3 buffers).
- Backing-chain composition (master-plan follow-up).
- Snapshot-targeted mapping (-l SNAPSHOT, master-plan future work).
- --image-opts acceptance (rejected here; future work).
- Cross-version baseline generation (phase 5).
- Integration tests against real testdata images (phase 6).
- Fuzz target updates (phase 7).
- output-profile infrastructure additions for map (phase 5 if baselines reveal version drift).
Success criteria
- instar map --help produces the documented surface (FILENAME, -f, --output, --start-offset, --max-length, --sector-size, --image-opts).
- Commands::Map(MapArgs) lands in the clap enum.
- run_map orchestrates the guest launch end-to-end and consumes streamed extents + the summary into a renderer.
- print_map_result produces valid human-readable tables for --output=human and valid JSON arrays for --output=json.
- Error paths (HAS_BACKING, INVALID_SOURCE, INVALID_OPTION, IO) print a clear stderr message and exit non-zero.
- make instar builds the full toolchain (the placeholder run_map stub from 3a is replaced by 3b).
- make lint clean.
- make test-rust passes; mod map_renderer_tests adds ≥6 new tests.
- pre-commit run --all-files clean.
- ARCHITECTURE.md and CHANGELOG.md updated.
- Running target/release/instar map fixtures/*.qcow2 produces non-error output (informally verified during 3b; formal coverage lands in phase 6).
Risks and mitigations
- MapConfig byte-layout drift between host and guest. The host writes the struct via per-field write_obj calls at hard-coded offsets; the guest reads via a *const MapConfig cast. If the offsets don't agree, the guest reads garbage. Mitigation: step 3b's brief directs the sub-agent to cross-check every offset against MapConfig in src/shared/src/lib.rs. A field-by-field comment block lists the offsets and field widths. (Same approach as run_measure.)
- Streaming consumer drops MapExtentMessage before the result. The vCPU loop must push every extent into the buffer regardless of order; only the result is the trigger to render. Mitigation: step 3b's brief enforces the pattern "extent → push; result → store + flag" with no early break.
- Renderer divergence from qemu-img. Phase 3 ships an unpolished renderer. Mitigation: phase 4 has the full byte-for-byte polish and the testdata sweep; phase 3 just needs valid + correct output. Document the placeholder status in the function's doc comment so phase 4 finds it.
- Empty source / zero virtual size. The renderer must not crash on an empty extents vector or a virtual_size == 0 result. Mitigation: step 3c's tests cover both cases.
- --start-offset overshoot. If the user passes --start-offset larger than the file's virtual size, the guest's clip_to_window silently emits nothing. Phase 3 catches this host-side and returns a clear error. Mitigation: step 3b's brief includes the pre-check. The 3c unit tests do not cover this host-side check, because it fires before the renderer runs.
- Large-extent buffering host-side. A pathologically fragmented source emits millions of extents. Phase 3 buffers them all. Mitigation: phase 6 / 7 cover this with bounded corpora; phase 4 replaces the buffer with a streaming writer.
- VMDK descriptor source UX. The guest will reject multi-extent VMDK sources via VmdkState::init's binary-header parse failure, surfacing ERROR_INVALID_SOURCE. The user sees "map: source format unrecognised". Mitigation: the host pre-check (peek_is_vmdk_descriptor) produces a more helpful error pointing at qemu-img.
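One way to back the byte-layout cross-check from the first risk with the compiler is to assert field offsets at build time. The sketch below is an assumption-laden illustration: it declares a local #[repr(C)] mirror of MapConfig (the real struct lives in src/shared/src/lib.rs and may differ, for example in how the reserved bytes are typed), and it relies on std::mem::offset_of!, which needs Rust 1.77 or newer.

```rust
use std::mem::{offset_of, size_of};

// Local mirror for illustration; the reserved field's element type is a guess.
#[repr(C)]
struct MapConfig {
    magic: u32,
    flags: u32,
    sector_size: u32,
    input_device_count: u32,
    start_offset: u64,
    max_length: u64,
    reserved: [u8; 32],
}

// Compile-time guard: the build fails if any offset drifts from the plan.
const _: () = {
    assert!(offset_of!(MapConfig, magic) == 0);
    assert!(offset_of!(MapConfig, flags) == 4);
    assert!(offset_of!(MapConfig, sector_size) == 8);
    assert!(offset_of!(MapConfig, input_device_count) == 12);
    assert!(offset_of!(MapConfig, start_offset) == 16);
    assert!(offset_of!(MapConfig, max_length) == 24);
    assert!(offset_of!(MapConfig, reserved) == 32);
    assert!(size_of::<MapConfig>() == 64);
};

/// Runtime view of the same offsets, handy for a unit test or a debug dump.
fn map_config_offsets() -> [usize; 7] {
    [
        offset_of!(MapConfig, magic),
        offset_of!(MapConfig, flags),
        offset_of!(MapConfig, sector_size),
        offset_of!(MapConfig, input_device_count),
        offset_of!(MapConfig, start_offset),
        offset_of!(MapConfig, max_length),
        offset_of!(MapConfig, reserved),
    ]
}
```

If the host imported the shared struct directly, the same assertions would catch drift introduced on the guest side, and the hard-coded offsets in run_map could be replaced by offset_of! expressions.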
Back brief
Before executing any step, the executing agent should
back-brief: which file is being edited (almost always
src/vmm/src/main.rs), which existing function is the
closest template (run_measure for run_map,
print_measure_result for print_map_result, MeasureArgs
for MapArgs), and which boundary writes use raw memory
casts (the per-field MapConfig write). The reviewer should
verify no step bleeds into phase 4 (output polish), phase 5
(baselines), phase 6 (integration tests), or phase 7
(fuzz).