# Interface Theory Notes

## The opacity problem with voice-driven development

Voice is rich and fast for expressing intent, but it hides the artifact. You cannot skim a diff by listening. Each response says "done" but you don't see what was actually changed.

**Risk**: accumulated invisible drift. Each individual change seems correct in isolation, but over time the architecture moves away from what you would have chosen had you read it. The AI makes reasonable local decisions that conflict with your broader intentions.

## The interleaving principle

Voice and visual interfaces are complementary, not alternatives:

- **Voice**: intent, direction, questions, routing. Fast, natural, hands-free.
- **Visual**: verification, review, architectural oversight. Necessary to catch drift.

Neither alone is sufficient. A purely voice-driven workflow loses quality control. A purely visual workflow loses speed and naturalness.

## Practical mitigation

- **Commit checkpoints**: review diffs visually before committing, even if the work was directed by voice
- **Periodic code review**: especially before publishing or sharing, to catch architectural drift early
- **Voice for high-level, visual for detail**: use voice to say what you want, then verify the implementation matches your intent
- **Ask for plans before changes**: "tell me what you're going to do" before "do it", which keeps intent explicit and reviewable

## Git workflow for voice-driven development

To make drift visible and reversible:

- Assistant works on a **dedicated branch**, not directly on main
- **Frequent small commits** per change, which are much easier to review than one large diff covering a whole session
- User reviews the branch visually, cherry-picks or merges what they agree with
- Keeps the user in control of what actually lands in the canonical history

This pairs naturally with the voice interface: voice for direction, branch history for transparency.

---

## Related

This applies to any AI-assisted workflow, not just voice.
The opacity of LLM output is an inherent property. Voice just amplifies it by removing the reading step entirely.
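
The Git workflow described earlier in these notes could be sketched roughly as follows. This is a minimal illustration, not a prescribed tool: the `Commit` record, the `plan_review` helper, and the branch name `assistant/session-1` are all hypothetical names introduced here. The sketch only builds the `git` commands for a per-commit review followed by a cherry-pick of the approved commits; it does not run them, so the user stays in control of what lands on `main`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """One small commit made by the assistant (hypothetical record)."""
    sha: str
    summary: str
    approved: bool  # set by the user after reading the diff visually


def plan_review(branch: str, commits: list[Commit], base: str = "main") -> list[list[str]]:
    """Build the git commands for reviewing and landing an assistant branch.

    Nothing is executed here; the caller decides whether to run each command.
    """
    commands: list[list[str]] = [
        # List the commits on the assistant branch that are not on the base, oldest first.
        ["git", "log", "--reverse", "--oneline", f"{base}..{branch}"],
    ]
    # One diff per commit, so each change is reviewed in isolation.
    for commit in commits:
        commands.append(["git", "show", "--stat", "--patch", commit.sha])

    # Only commits the user approved are carried over to the base branch.
    approved = [commit.sha for commit in commits if commit.approved]
    if approved:
        commands.append(["git", "switch", base])
        commands.append(["git", "cherry-pick", *approved])
    return commands


if __name__ == "__main__":
    session = [
        Commit("abc1234", "Add parser", approved=True),
        Commit("def5678", "Rewrite cache layer", approved=False),
    ]
    for command in plan_review("assistant/session-1", session):
        print(" ".join(command))
```

Keeping the plan as data rather than executing it mirrors the "ask for plans before changes" mitigation: the commands can be read, edited, or discarded before anything touches the canonical history.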