# claude-remote — Plan

## Purpose

Send text from the host to the Claude Code instance running inside a Docker container
by locating the correct Konsole window and using xdotool to focus it and paste the input.

## Current Status

Core paste/focus machinery works. The remaining problem is **reliably identifying the correct Konsole window**.

---

## Window Detection: What We Tried and Why It Failed

### Approach 1: `_NET_WM_PID` via xprop

Walk up the process tree from the `docker-compose` process to find the Konsole ancestor,
then call `xprop -id <window> _NET_WM_PID` on each window to find the one whose PID matches.

**Why it failed:** Konsole runs as a single process managing multiple windows, so all windows
owned by the same Konsole instance report the same `_NET_WM_PID`. With 15+ Konsole windows
open, scanning the list from `get_net_client_list()` returns the first matching window, which is rarely the right one.

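
For reference, the process-tree walk described above can be sketched as follows. This is a minimal illustration, not the code in `find-window.js`: the names `parseStat`, `findAncestor`, and `defaultReadStat` are hypothetical, and it assumes a Linux `/proc` layout. It also shows why the result is ambiguous: the walk yields one Konsole PID, and every window of that Konsole instance shares it.

```javascript
const fs = require('node:fs');

// Parse one /proc/<pid>/stat line. The command name sits in parentheses and may
// itself contain spaces or ')', so split on the LAST ')' rather than on spaces.
function parseStat(line) {
  const open = line.indexOf('(');
  const close = line.lastIndexOf(')');
  const rest = line.slice(close + 2).split(' '); // rest[0] = state, rest[1] = ppid
  return {
    pid: Number(line.slice(0, open).trim()),
    comm: line.slice(open + 1, close),
    ppid: Number(rest[1]),
  };
}

// Walk parent links until a process named `targetComm` is found.
// `readStat` is injectable so the walk can be exercised without a real /proc.
function findAncestor(pid, targetComm, readStat = defaultReadStat) {
  let current = pid;
  while (current > 1) {
    const stat = readStat(current);
    if (!stat) return null;
    if (stat.comm === targetComm) return stat.pid;
    current = stat.ppid;
  }
  return null;
}

// Default reader: returns null if the process is gone or /proc is unavailable.
function defaultReadStat(pid) {
  try {
    return parseStat(fs.readFileSync(`/proc/${pid}/stat`, 'utf8'));
  } catch {
    return null;
  }
}
```

Calling `findAncestor(pid, 'konsole')` with the PID of the `docker-compose` process returns the Konsole PID or `null`; it cannot say which of that process's windows hosts the session.
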
### Approach 2: Read `/proc/<pid>/environ` for `WINDOWID`

Konsole sets `WINDOWID` in the environment of each shell it spawns. That variable is
inherited by child processes (bash → sudo → docker → docker-compose), so reading it from
the right process in the chain would give us the exact window ID without any ambiguity.

**Why it failed:** `/proc/<pid>/environ` is not readable for processes owned by other
users (or protected by `ptrace_scope`). In this environment, permission is denied even for
same-user processes.

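
As a sketch of what this approach involves, the snippet below parses the NUL-separated `environ` format and treats a permission error as "no answer" rather than a crash. The function names `parseEnviron` and `readWindowId` are hypothetical and not taken from `find-window.js`.

```javascript
const fs = require('node:fs');

// /proc/<pid>/environ is a sequence of NUL-terminated KEY=VALUE entries.
function parseEnviron(raw) {
  const env = {};
  for (const entry of raw.split('\0')) {
    const eq = entry.indexOf('=');
    if (eq > 0) env[entry.slice(0, eq)] = entry.slice(eq + 1);
  }
  return env;
}

// Returns the WINDOWID of `pid` as a string, or null when the file cannot be
// read (EACCES/EPERM, as observed in this environment), the process does not
// exist (ENOENT), or the variable is unset.
function readWindowId(pid) {
  try {
    const raw = fs.readFileSync(`/proc/${pid}/environ`, 'utf8');
    return parseEnviron(raw).WINDOWID ?? null;
  } catch (err) {
    if (['EACCES', 'EPERM', 'ENOENT'].includes(err.code)) return null;
    throw err;
  }
}
```

In the environment described here, `readWindowId` would return `null` for every process in the chain, which is exactly the failure noted above.
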
### Approach 3: Konsole D-Bus API

Query `org.kde.konsole-<pid>` via `qdbus` to enumerate sessions and find which one owns
the relevant pts/tty.

**Why it was ruled out:** D-Bus is not available in this environment.

---

## Planned Fix: Pass `WINDOWID` into Docker

Modify the docker launch command to pass `WINDOWID` as an environment variable:

```bash
docker compose run --rm -e WINDOWID=$WINDOWID claude-code claude --dangerously-skip-permissions
```

Inside the container, `WINDOWID` is then available to the Claude process. Startup logic in
the container can write this value to a known path on a bind-mounted volume
(e.g. `/workspace/.claude-windowid`).

`find-window.js` on the host reads that file as its primary window detection strategy,
falling back to the existing process-tree approach when the file is absent.

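
A minimal sketch of the host-side lookup, under some assumptions: the helper names `readWindowIdFile` and `findClaudeWindow` are hypothetical, the file is expected to hold a single decimal window ID, and `filePath` is the host-side path of the bind-mounted file (which will differ from the container path). The existing process-tree lookup is passed in as `fallback`.

```javascript
const fs = require('node:fs');

// Read a window ID written by the container. Returns a decimal string such as
// "62914567", or null if the file is missing, empty, or not a plain integer.
function readWindowIdFile(filePath) {
  let text;
  try {
    text = fs.readFileSync(filePath, 'utf8').trim();
  } catch {
    return null;
  }
  return /^\d+$/.test(text) ? text : null;
}

// File first, then the existing process-tree lookup (passed in as `fallback`).
function findClaudeWindow(filePath, fallback) {
  return readWindowIdFile(filePath) ?? fallback();
}
```

Validating the contents keeps a truncated or corrupted file from being handed to xdotool as a window ID. A stale file left over from an earlier run would still pass this check, which is one reason to clean the file up on container exit.
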
### Steps to implement

1. Update the docker-compose config (or launch script) to pass `-e WINDOWID=$WINDOWID`
2. Add startup logic inside the container to write `$WINDOWID` to `/workspace/.claude-windowid`
3. Update `find_claude_window` in [find-window.js](find-window.js) to check that file first
4. Clean up the file on container exit (optional)

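
Steps 2 and 4 could look roughly like this on the container side. This is a sketch that assumes the startup logic runs as a small Node script before Claude starts; a shell entrypoint would work just as well. The names `writeWindowIdFile` and `removeWindowIdFile` are hypothetical.

```javascript
const fs = require('node:fs');

// Write WINDOWID from `env` to `filePath`. Returns true if a file was written,
// false if WINDOWID is unset or not a plain integer (nothing is written then).
function writeWindowIdFile(env, filePath) {
  const id = env.WINDOWID;
  if (!id || !/^\d+$/.test(id)) return false;
  fs.writeFileSync(filePath, `${id}\n`);
  return true;
}

// Optional cleanup (step 4): remove the file; a missing file is not an error.
function removeWindowIdFile(filePath) {
  fs.rmSync(filePath, { force: true });
}

// Hypothetical wiring in a container startup script:
//   writeWindowIdFile(process.env, '/workspace/.claude-windowid');
//   process.on('exit', () => removeWindowIdFile('/workspace/.claude-windowid'));
```
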
---

## Future: Replace xdotool with Direct PTY Write

The current xdotool approach has two significant drawbacks:
- **Focus stealing:** every paste steals window focus from whatever the user is doing,
  which is especially disruptive when input arrives from background sources like mail-buddy
- **Fragility:** depends on X11, clipboard state, and window geometry

Docker containers launched with `-t` are backed by a `/dev/pts/X` device on the host.
The idea is that writing directly to that device would send input to the container's Claude
process with no window manager involvement at all, making it completely invisible to the desktop.

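**Approach:**
1. Find the pts device by reading the `/proc/<container_pid>/fd/0` symlink on the host
   (it points to e.g. `/dev/pts/7`)
2. Write the text plus a newline to that device, e.g. `echo "text" > /dev/pts/7`. This needs
   verifying first: a plain write to the slave side of a PTY is normally displayed by the
   terminal as output rather than delivered to the foreground process as input, so real
   input injection may need the `TIOCSTI` ioctl (restricted on recent kernels) or access
   to the PTY master instead
3. May require membership in the `tty` group or a small sudo helper for write permission

If it works, this would make [claude-remote](claude-remote.mjs) dramatically simpler and
eliminate focus stealing entirely.
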
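
Step 1 can be sketched on its own. The helper names `isPtsPath` and `resolvePtsDevice` are hypothetical. The sketch stops before the write on purpose, because a plain write to the slave side of a PTY is typically shown as terminal output rather than delivered as input, so the delivery mechanism still has to be confirmed. Note also that the symlink target is reported as seen by the inspected process, so for a process inside the container it may name a device in the container's own devpts instance rather than a host device.

```javascript
const fs = require('node:fs');

// True for paths of the form /dev/pts/<number>.
function isPtsPath(p) {
  return /^\/dev\/pts\/\d+$/.test(p);
}

// Resolve the terminal behind a process's stdin by reading the fd/0 symlink.
// Returns e.g. "/dev/pts/7", or null if stdin is not a pts or cannot be read
// (no such process, no permission, or no /proc on this system).
function resolvePtsDevice(pid) {
  try {
    const target = fs.readlinkSync(`/proc/${pid}/fd/0`);
    return isPtsPath(target) ? target : null;
  } catch {
    return null;
  }
}
```
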
**Important caveat: keep PTY input short.**
Writing large payloads (e.g. full email bodies) directly to the PTY risks having bytes
interpreted as terminal control sequences, hitting line-length limits, and overflowing input buffers.
The PTY should carry only short trigger commands like `check email` or `new message from mikael`.
Actual message content should be fetched by Claude via a CCC action, keeping the PTY
as a lightweight signalling channel only.

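
One way to enforce the caveat is to validate every trigger before it goes anywhere near the terminal. The sketch below is illustrative only: `buildTrigger` and `MAX_TRIGGER_LENGTH` are hypothetical names, and the 80-character limit is an arbitrary assumption.

```javascript
// Hypothetical limit; adjust to whatever the trigger vocabulary needs.
const MAX_TRIGGER_LENGTH = 80;

// Validate a short trigger command and return it with a trailing newline.
// Throws if the text is empty, too long, or contains anything other than
// printable ASCII (which excludes ESC, newlines, and other control bytes).
function buildTrigger(text) {
  if (text.length === 0 || text.length > MAX_TRIGGER_LENGTH) {
    throw new RangeError(`trigger must be 1-${MAX_TRIGGER_LENGTH} characters`);
  }
  if (!/^[\x20-\x7e]+$/.test(text)) {
    throw new TypeError('trigger may contain printable ASCII only');
  }
  return `${text}\n`;
}
```

With this in place, `buildTrigger('check email')` yields the command plus a newline, while an email body containing escape sequences or line breaks is rejected before it can reach the terminal.
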