Bug-report trigger-time snapshot (2026-04-23)
Situation
Today the bug-report flow takes the screenshot at submit
time: the user hits F12, fills in the type + description,
clicks Capture, and (for Display reports) drags out a region;
generate_bug_report at ryll/src/app.rs:923 then reads the
live surface pixels and encodes them.
This breaks for short-lived display artefacts. The user saw
"static" on the macbook on 2026-04-23, filled the form, and
by the time they submitted the glitch had already left the
screen. screenshot.png in
ryll-bugreport-2026-04-23T08-22-47Z.zip shows a clean
kernel-boot console, not the static the report is about.
The ring-buffer pcap covers the right time range, but without a timestamp marker the analyst has no reliable way to locate "when the user noticed it" in a long capture.
Mission
Capture bug-report evidence at the moment the user opens the dialog, not at submit time. Record a trigger timestamp in metadata so the pcap can be correlated. Keep the submit-time surface as a secondary image when the user does region selection, so before/after is visible.
Design
When to trigger a capture
A trigger event is any of:
- F12 keypress (ryll/src/app.rs:1244).
- Status-bar bug-report button (ryll/src/app.rs:1437).
- USB menu bug-report shortcut (ryll/src/app.rs:1783).
- Channel-error auto-open (ryll/src/app.rs:2063).
The pedantic observer path (bugreport.rs:995 /
write_pedantic) fires its own report with no user dialog;
it does not need the trigger-time mechanism because the
observer firing is itself the moment of interest. It still
writes the new metadata fields so the schema stays uniform.
What to capture at trigger time
Only when the report type is (or defaults to) Display:
- Pick the largest surface (same rule as generate_bug_report uses today).
- Clone its RGBA pixels + (width, height).
- Record SystemTime::now() and self.traffic.elapsed().as_secs_f64().
- Hand the clone to a std::thread::spawn worker that PNG-encodes it via bugreport::encode_png and drops the output into an Arc<Mutex<Option<Vec<u8>>>> stored on the app.
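The steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in app.rs: TriggerSnapshot, PngSlot and take_trigger_snapshot are hypothetical names, and the encode parameter stands in for bugreport::encode_png, whose real signature is not stated in this plan.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Instant, SystemTime};

/// Shared slot the worker fills with the encoded PNG bytes.
pub type PngSlot = Arc<Mutex<Option<Vec<u8>>>>;

/// Hypothetical record of one trigger event (names are illustrative).
pub struct TriggerSnapshot {
    pub triggered_at: SystemTime,
    pub triggered_uptime_secs: f64,
    pub png: PngSlot,
}

/// Stamp the time, clone the surface, and encode off the UI thread.
/// `encode` is an assumed stand-in for `bugreport::encode_png`.
pub fn take_trigger_snapshot(
    rgba: &[u8],
    width: u32,
    height: u32,
    session_start: Instant,
    encode: fn(&[u8], u32, u32) -> Vec<u8>,
) -> TriggerSnapshot {
    // Record both clocks before doing any copying.
    let triggered_at = SystemTime::now();
    let triggered_uptime_secs = session_start.elapsed().as_secs_f64();

    let pixels = rgba.to_vec(); // the one full-size copy
    let png: PngSlot = Arc::new(Mutex::new(None));
    let slot = Arc::clone(&png);

    // The JoinHandle is dropped immediately, which detaches the worker.
    thread::spawn(move || {
        let bytes = encode(&pixels, width, height);
        if let Ok(mut guard) = slot.lock() {
            *guard = Some(bytes);
        }
    });

    TriggerSnapshot { triggered_at, triggered_uptime_secs, png }
}
```

The worker only takes the lock for the final store, so the UI thread never waits on the encode itself.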
Non-Display report types skip the snapshot entirely (space saving — operator's call). They still record the trigger timestamps.
If the user changes the report type in the dialog after opening it, the snapshot taken at open-time may no longer be appropriate:
- Opened as Display → switched to Input: discard the snapshot worker and its output on submit (don't include in zip).
- Opened as Input → switched to Display: we've missed the moment. Fall back to encoding the current surface at submit time, and mark this in metadata (triggered_uptime_secs: null, perhaps, or an explicit trigger_snapshot: "late" field; to be decided in Phase 2).
Simplest implementation: always take the snapshot on open, regardless of selected type, and only use it for Display submissions. Cheap and covers the type-switch edge case.
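Under the "always snapshot on open" approach, the submit-time decision reduces to a small match. The sketch below is an assumption about how that could look: ReportType, ScreenshotPlan and plan_screenshot are hypothetical names, and only the two report types named in this plan are modelled.

```rust
/// Report types named in this plan; any others are left out of the sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReportType {
    Display,
    Input,
}

/// What to do about screenshot.png at submit time.
#[derive(Debug, PartialEq)]
pub enum ScreenshotPlan {
    /// Non-Display report: write no PNG and discard any snapshot.
    Skip,
    /// Display report with a finished open-time snapshot.
    UseTrigger(Vec<u8>),
    /// Display report with no snapshot available: encode the live surface.
    EncodeAtSubmit,
}

/// Decide from the type selected at submit, not the type at open.
pub fn plan_screenshot(
    submitted_as: ReportType,
    trigger_png: Option<Vec<u8>>,
) -> ScreenshotPlan {
    match (submitted_as, trigger_png) {
        (ReportType::Display, Some(png)) => ScreenshotPlan::UseTrigger(png),
        (ReportType::Display, None) => ScreenshotPlan::EncodeAtSubmit,
        (ReportType::Input, _) => ScreenshotPlan::Skip,
    }
}
```

Because the snapshot is taken regardless of the initial type, the Input → Display switch lands in the UseTrigger arm rather than needing a "late" marker.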
What goes into the zip
Display reports:
- screenshot.png: moment-of-trigger surface. This is the substantive behavioural change.
- screenshot-region.png: region-crop of the submit-time surface, only when region selection produced a non-empty rectangle. Shows what was on screen when the user finished describing the issue.
- region stays in metadata.json as today (coordinates apply to both images if the surface size didn't change; otherwise to screenshot-region.png only).
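The region crop for screenshot-region.png could look something like the sketch below. It assumes a tightly packed RGBA buffer (4 bytes per pixel, no row padding) and a hypothetical Region type in surface pixel coordinates; neither is confirmed by this plan. Returning None for an empty rectangle matches the "only when region selection produced a non-empty rectangle" rule.

```rust
/// Hypothetical region rectangle in surface pixel coordinates.
#[derive(Clone, Copy, Debug)]
pub struct Region {
    pub x: u32,
    pub y: u32,
    pub w: u32,
    pub h: u32,
}

/// Crop a tightly packed RGBA buffer to `region`, clamped to the surface.
/// Returns the cropped pixels plus their width and height, or None when
/// the buffer size is inconsistent or the clamped rectangle is empty.
pub fn crop_rgba(
    rgba: &[u8],
    width: u32,
    height: u32,
    region: Region,
) -> Option<(Vec<u8>, u32, u32)> {
    if rgba.len() != width as usize * height as usize * 4 {
        return None;
    }
    let x0 = region.x.min(width);
    let y0 = region.y.min(height);
    let x1 = region.x.saturating_add(region.w).min(width);
    let y1 = region.y.saturating_add(region.h).min(height);
    if x1 <= x0 || y1 <= y0 {
        return None;
    }
    let (cw, ch) = (x1 - x0, y1 - y0);
    let mut out = Vec::with_capacity(cw as usize * ch as usize * 4);
    for row in y0..y1 {
        // Copy one cropped scanline at a time.
        let start = (row as usize * width as usize + x0 as usize) * 4;
        let end = start + cw as usize * 4;
        out.extend_from_slice(&rgba[start..end]);
    }
    Some((out, cw, ch))
}
```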
Non-Display reports: no PNGs (unchanged).
Metadata schema additions
metadata.json gains:
- triggered_at: ISO 8601 timestamp of when the dialog opened (for pedantic reports, same as timestamp).
- triggered_uptime_secs: session uptime at trigger.
- submitted_uptime_secs: session uptime at zip write. Existing session_uptime_secs is renamed to this for clarity, with a deprecation note. (Or keep session_uptime_secs as submit time and add only triggered_*; less churn; decide in Phase 1.)
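As an illustration of the three fields, here is a hand-rolled sketch of how they might be held and rendered. TriggerTiming and its methods are hypothetical, the ISO 8601 string is assumed to be formatted elsewhere, and the real code may well use a serialisation library instead of format!.

```rust
/// Hypothetical in-memory form of the new timing fields.
pub struct TriggerTiming {
    /// ISO 8601 timestamp, assumed to be formatted by the caller.
    pub triggered_at: String,
    pub triggered_uptime_secs: f64,
    pub submitted_uptime_secs: f64,
}

impl TriggerTiming {
    /// How long the dialog was open, clamped at zero.
    pub fn dialog_open_secs(&self) -> f64 {
        (self.submitted_uptime_secs - self.triggered_uptime_secs).max(0.0)
    }

    /// Render the fields as a JSON object. `triggered_at` is not
    /// escaped, which is acceptable only for a well-formed timestamp.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"triggered_at\":\"{}\",\"triggered_uptime_secs\":{:.3},\"submitted_uptime_secs\":{:.3}}}",
            self.triggered_at, self.triggered_uptime_secs, self.submitted_uptime_secs
        )
    }
}
```

The difference between the two uptimes tells the analyst how stale a submit-time image would have been.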
Fallback path
If the background PNG encode hasn't finished by submit time
(user clicked Capture faster than we encoded), fall back to
encoding the current surface synchronously at submit, same
as today. Log at debug. This keeps the code path bounded
and removes any risk of the UI thread blocking on
Arc<Mutex<Option<Vec<u8>>>>.
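One way to keep the submit path from ever waiting on the worker is Mutex::try_lock, as in this sketch. The helper name take_png_if_ready is hypothetical; a None result means "fall back to encoding the current surface".

```rust
use std::sync::{Arc, Mutex};

/// Take the encoded PNG if the worker has already stored it.
/// Never blocks: a contended or poisoned lock is treated the same as
/// "not ready", and the caller encodes synchronously instead.
pub fn take_png_if_ready(slot: &Arc<Mutex<Option<Vec<u8>>>>) -> Option<Vec<u8>> {
    match slot.try_lock() {
        Ok(mut guard) => guard.take(),
        Err(_) => None,
    }
}
```

Using take() leaves None behind, so a second submit from the same dialog cannot reuse stale bytes.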
Cancel path
On dialog cancel (show_bug_dialog = false with no
dialog_action = Some(true)): drop the app's Arc. The worker
holds its own clone, so it finishes encoding and the output is
freed when that last clone is dropped. No explicit
cancellation primitive needed.
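The ownership argument can be checked with a Weak probe. The sketch below is a standalone demonstration, not project code: the worker body is a stand-in for the PNG encode, and the join exists only to make the final check deterministic (the plan itself never joins).

```rust
use std::sync::{Arc, Mutex, Weak};
use std::thread;

/// Returns true if the shared slot was freed once both the app-side
/// handle and the worker's clone had been dropped.
pub fn slot_freed_after_cancel() -> bool {
    let app_slot: Arc<Mutex<Option<Vec<u8>>>> = Arc::new(Mutex::new(None));
    let probe: Weak<Mutex<Option<Vec<u8>>>> = Arc::downgrade(&app_slot);

    let worker_slot = Arc::clone(&app_slot);
    let worker = thread::spawn(move || {
        // Stand-in for the PNG encode.
        *worker_slot.lock().unwrap() = Some(vec![0u8; 1024]);
    });

    drop(app_slot); // dialog cancelled: the app lets go of its handle
    worker.join().unwrap(); // joined only so the check is deterministic

    // With no strong references left, the slot and its bytes are gone.
    probe.upgrade().is_none()
}
```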
RAM / disk cost
A 1024×768 RGBA buffer is ~3 MB. A 2560×1600 is ~16 MB. These sit in RAM only from trigger until the worker finishes encoding — on a modern CPU PNG-encoding ~16 MB is well under a second, during which the user is typing a description. After that we hold only the compressed bytes (tens to hundreds of KB).
No temp file needed; if profiling later shows the clone cost matters we can revisit. The trade-off was explicitly discussed with the operator on 2026-04-23.
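The buffer sizes quoted above follow from 4 bytes per pixel. A one-line helper (hypothetical name) makes the arithmetic explicit:

```rust
/// Size in bytes of a tightly packed RGBA surface (4 bytes per pixel).
pub const fn rgba_bytes(width: u64, height: u64) -> u64 {
    width * height * 4
}
```

For 1024×768 this gives 3,145,728 bytes, and for 2560×1600 it gives 16,384,000 bytes, matching the ~3 MB and ~16 MB figures.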
Thread choice
std::thread::spawn, not tokio::task::spawn_blocking. PNG
encoding is CPU-bound, has no async dependencies, and should
stay out of the tokio reactor that drives the SPICE
channels. The JoinHandle is dropped straight away, which
detaches the thread; the thread exits when encoding returns.
Open questions
- Do we rename session_uptime_secs or add new fields and keep it as submit time? Leaning "add new fields" (older bug reports in the wild will still parse).
- Should screenshot-region.png be a crop of the submit-time surface (what I've proposed), or a cropped overlay on top of screenshot.png? The operator's original wording was "include that as a second image"; reading that literally means a separate image of the current surface, cropped to the region. Confirm in back-brief.
- Where does the Arc<Mutex<Option<Vec<u8>>>> live on struct App? Natural home is next to the other bug-report UI fields at ryll/src/app.rs:251-260.
- Does the traffic viewer (PLAN-bug-reports-phase-06-traffic-viewer.md) need to know about the new fields? Probably just for a "trigger" marker on the timeline: nice-to-have, not in scope here.
Execution
Phased plan. Each phase lands as its own commit (or a small stack of commits if the phase ends up larger than expected) and its own PR. Phases run sequentially — phase 2 depends on phase 1's metadata, phase 3 depends on phase 2's snapshot plumbing, phase 4 documents the finished behaviour.
| Phase | Plan | Status |
|---|---|---|
| 1. Metadata plumbing | PLAN-bugreport-trigger-snapshot-phase-01-metadata.md | Complete |
| 2. Trigger snapshot | PLAN-bugreport-trigger-snapshot-phase-02-snapshot.md | Complete |
| 3. Region image | PLAN-bugreport-trigger-snapshot-phase-03-region.md | Complete |
| 4. Docs | PLAN-bugreport-trigger-snapshot-phase-04-docs.md | Complete |
Phase 2 is the risky one (cross-thread lifetime management around the Arc and the egui update loop); its detail plan will call for worktree isolation so we can throw the experiment away if the approach doesn't land cleanly. The other phases are well-understood and can work directly in the tree.
Success criteria
- pre-commit run --all-files passes after each commit.
- make test passes.
- A display bug report captured during a transient artefact shows the artefact in screenshot.png, not the post-artefact clean surface.
- metadata.json contains a usable triggered_uptime_secs the analyst can subtract 2-3 seconds from to locate the event in the pcap.
- Non-Display reports are no larger than they were before.
- README and ARCHITECTURE describe the new behaviour so future contributors don't have to re-derive it.
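The "subtract 2-3 seconds" step can be written as a small helper. This sketch assumes pcap timestamps are expressed on the same session-uptime clock as triggered_uptime_secs, which this plan does not confirm; pcap_search_window is a hypothetical name.

```rust
/// Window of session uptime, in seconds, to inspect in the pcap:
/// from `lead_secs` before the trigger (clamped at zero) up to the
/// trigger itself.
pub fn pcap_search_window(triggered_uptime_secs: f64, lead_secs: f64) -> (f64, f64) {
    let start = (triggered_uptime_secs - lead_secs).max(0.0);
    (start, triggered_uptime_secs)
}
```

The lead time covers the user's reaction delay between noticing the artefact and pressing F12.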
Future work
- Traffic-viewer marker for the trigger timestamp so the user can jump straight to the right point in ryll --capture playback.
- Optional: capture a short video loop (last N frames) at trigger time. Probably over-engineering unless the static comes back and the still snapshot still isn't enough.
Bugs fixed during this work
To be filled in during implementation.
Documentation index maintenance
Master plan — add a row under Master plans in
docs/plans/index.md dated 2026-04-23 with links to each
phase plan file, and add a line to order.yml so the plan
appears in the documentation navigation bar. Phase files
should not be added to order.yml.