Replace the three former modal/dialog booleans (dialog_killer, cancel_on_modal, allow_modal) and the
runtime escape hatch doNotCancelOnModalityStateChange() with one modal enum parameter plus a small
set of composable McpScriptContext methods. The enum has three profiles:
- `smart_non_modal` (default): sweep leftover modal dialogs (deepest-first) → require non-modal (fail + screenshot if one survives) → commit + save documents + refresh VFS → wait for indexing → run with a monitor that closes + FAILS the call on a modal appearing mid-run. The safe choice for any PSI / editing / build / test work.
- `non_modal`: assert a non-modal IDE at the start only (fail + screenshot if modal); no sweep, sync, smart-mode wait, or monitor. For "I need a non-modal start but will prep myself."
- `unleashed`: no sweep / checks / validation; run against whatever state exists. Trivial, hardcoded, non-PSI IDE actions only.
Fine control (callable from any mode): closeModalDialogs(): Int, monitorAndCloseModalDialogs(),
allowModalDialog(), syncDocuments(), waitForSmartMode(). The profiles are just sugar over these.
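To make "the profiles are just sugar" concrete, here is a minimal sketch that models each profile as a sequence of calls on a stand-in context. `ModalMode`, `ScriptContextSketch`, `RecordingContext`, `prepare` and the local `requireNonModal` gate are hypothetical names for this note, not the plugin's real types, and the VFS refresh is assumed to be folded into `syncDocuments()`.

```kotlin
// Hypothetical stand-ins; the real McpScriptContext lives in the plugin.
enum class ModalMode { SMART_NON_MODAL, NON_MODAL, UNLEASHED }

interface ScriptContextSketch {
    fun closeModalDialogs(): Int
    fun monitorAndCloseModalDialogs()
    fun allowModalDialog()
    fun syncDocuments()
    fun waitForSmartMode()
}

// Records calls instead of touching an IDE, so the ordering is observable.
class RecordingContext : ScriptContextSketch {
    val calls = mutableListOf<String>()
    override fun closeModalDialogs(): Int { calls += "closeModalDialogs"; return 0 }
    override fun monitorAndCloseModalDialogs() { calls += "monitorAndCloseModalDialogs" }
    override fun allowModalDialog() { calls += "allowModalDialog" }
    override fun syncDocuments() { calls += "syncDocuments" }
    override fun waitForSmartMode() { calls += "waitForSmartMode" }
}

// Pre-flight per profile. `requireNonModal` models the start gate, which is
// assumed to be tool-internal here (it is not one of the five context methods).
fun prepare(mode: ModalMode, ctx: RecordingContext) {
    fun requireNonModal() { ctx.calls += "requireNonModal" }
    when (mode) {
        ModalMode.SMART_NON_MODAL -> {
            ctx.closeModalDialogs()            // sweep leftovers
            requireNonModal()                  // gate
            ctx.syncDocuments()                // commit + save (+ VFS refresh, assumed)
            ctx.waitForSmartMode()             // wait for indexing
            ctx.monitorAndCloseModalDialogs()  // during-run monitor
        }
        ModalMode.NON_MODAL -> requireNonModal()
        ModalMode.UNLEASHED -> Unit
    }
}

fun main() {
    for (mode in ModalMode.values()) {
        val ctx = RecordingContext()
        prepare(mode, ctx)
        println("$mode -> ${ctx.calls}")
    }
}
```

A `non_modal` or `unleashed` script that wants part of the default behavior would call the same context methods itself, in the same order.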
The booleans were co-dependent and contradictory, not orthogonal (see
docs/exec-code-options-redesign.md Revision 4's full 8-row S×A×R table):
- `dialog_killer` ∧ `cancel_on_modal` were AND-ed; `cancel_on_modal` was vestigial (never in schema, always true, its monitor already removed).
- `allow_modal` (tolerate a modal) ∧ a "close-and-fail" stance form a genuine contradiction: one says proceed, the other fails. Yet the booleans let you set both.
- `doNotCancelOnModalityStateChange()` was misnamed (it no longer cancels execution, only stops the killer) and was a runtime hatch where a request-time stance belongs.
One enum makes the contradictory combinations unrepresentable, collapses the real 3-state intent space
into a single named choice, and moves every genuinely-independent action onto explicit context methods that
unleashed / non_modal code can opt into on demand. Net surface: 1 enum + 5 context methods, zero
co-dependent combos. Behavior table: docs/exec-code-options-redesign.md (LOCKED, Revision 5 + final
refinements); canonical agent-facing reference: prompts/src/main/prompts/skill/execute-code-tool-description.md.
- Deliberate gap: there is no "close a mid-run dialog and keep going" mode. `smart_non_modal` closes a mid-run modal and fails. A script that must tolerate dialogs popping up has to use `unleashed` + `closeModalDialogs()` and accept no PSI-consistency guarantees. This is documented in `execute-code-tool-description.md`; revisit only if a real caller needs the close-and-continue stance.
- The corpus claimed `waitForSmartMode()` "runs automatically before your script" unconditionally. That is true only under the default `smart_non_modal`; it is skipped under `non_modal` / `unleashed`. Added a short qualifier in `ide/overview.md`, `lsp/overview.md`, `prompt/skill.md`, `skill/coding-with-intellij-intro.md`, `skill/coding-with-intellij-threading.md`, and `ide/find-duplicates.md`. The unqualified phrasing was the same over-promise flagged in the redesign doc's quorum notes.
- `mcp-steroid-info.md` (server system prompt) had no mention of modality at all; added a brief default-behavior note pointing at the canonical tool description.
- Old `dialog_killer` / `allow_modal` / `cancel_on_modal` / `doNotCancelOnModalityStateChange` references had already been purged from the corpus by the prior commit; this pass confirmed none remain.
The original mcp-5 notes below describe the history of the npx-kt monitor
work. Current code has since moved on:
- The user-facing tool is `devrig`; Gradle module names remain `:npx-kt` and `:npx`.
- The active Kotlin package is `com.jonnyzzz.mcpSteroid.devrig`.
- The npm TypeScript MCP proxy and Kotlin attic implementation were removed.
- Remaining `Npx*` names are bridge protocol names for the IDE-side `/npx/v1/*` routes, not active npm proxy code.
Process notes and friction encountered while implementing the npx-kt
project-monitoring service (push-style HTTP, JSON PID markers).
- PID file format: replace the existing `.<pid>.mcp-steroid` text file with a single-file JSON document (`schema=1`, `pid`, `mcpUrl`, `ide`, `plugin`, `createdAt`). Rejected: sibling-file or hybrid layouts (extra surfaces to keep in sync).
- Streaming: `application/x-ndjson`, one complete JSON object per line. Rejected: SSE (extra framing for no benefit; we don't need named event types here).
- Event payload: full snapshot every emit (`{type:"snapshot", seq, projects:[...]}`). Consumer state stays trivial: replace, don't merge. Rejected: delta events (more consumer state, more edge cases on missed messages / reconnect).
- npx-kt wiring: instantiate the new monitoring services in `Main.kt` alongside the new stdio MCP server. Legacy proxy path (`legacyProxyMain`, `ServerRegistry`, `NpxBeacon`) is left alone.
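The "replace, don't merge" decision can be sketched as follows. This is an illustration only: `Snapshot` and `ProjectsView` are hypothetical names, JSON decoding is omitted, and project entries are reduced to plain names because the real payload shape is elided above.

```kotlin
// Hypothetical envelope model. `seq` and `projects` follow the snapshot shape
// quoted above; project entries are reduced to plain names for this sketch.
data class Snapshot(val seq: Long, val projects: List<String>)

// Consumer state is the latest snapshot only: replace, never merge.
class ProjectsView {
    var current: Snapshot? = null
        private set

    fun onSnapshot(next: Snapshot) {
        current = next
    }
}

fun main() {
    val view = ProjectsView()
    view.onSnapshot(Snapshot(seq = 1, projects = listOf("alpha", "beta")))
    // seq = 2 is lost during a reconnect; nothing needs to be replayed.
    view.onSnapshot(Snapshot(seq = 3, projects = listOf("beta", "gamma")))
    println(view.current?.projects) // [beta, gamma]
}
```

Because every emit is a full snapshot, a missed message costs nothing once the next one arrives, which is the edge case the rejected delta design would have had to handle.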
- Existing IDE marker file is consumed by three readers (`npx-kt::Utils.kt::scanMarkers`, the npm-distributed `npx/` TS proxy, `:test-helper:NpxProxyInstaller`). The TS reader is out of scope for this branch; the Kotlin readers are updated in lockstep with the writer.
- `NoLargeInlineStringsTest` and the `mcp-steroid://` URI lint rules don't apply here: no prompt content, no `mcp-steroid://` URIs added.
Twelve commits on top of main, intentionally split so each is reviewable in
isolation:
1. `mcp-5: seed branch with IMPROVEMENTS.md`
2. `mcp-5: pid marker file is now schema-versioned JSON`
3. `mcp-5: legacy npx-kt + test-helper consume the JSON pid marker`
4. `ij-plugin: NDJSON projects-stream endpoint + ProjectsStreamService`
5. `npx-kt: IDE monitoring stack — discovery + per-IDE NDJSON consumer`
6. `npx-kt: tests for IdeDiscoveryService + IdeMonitorService roundtrip`
7. `mcp-5: close out IMPROVEMENTS.md with test summary + follow-up list`
8. `mcp-5: pid marker carries the IDE's MCP port + bearer token`
9. `mcp-5: log self-review findings + port/token addition in IMPROVEMENTS`
10. `mcp-5: attach IntelliJ's bundled MCP server to pid marker via optional descriptor`
11. `mcp-5: log IntelliJ HTTP-server research + optional-descriptor pattern in IMPROVEMENTS`
12. `npx-kt: active port-scan discovery of IntelliJ-family IDEs`
Test coverage:
- `:mcp-steroid-server:test`: `PidMarkerTest` (6: roundtrip, pretty-print includes new port + token fields, forward-compat unknown fields, required-field rejection, filename contract, legacy marker without port/token falls back to defaults).
- `:npx-kt:test`: `MarkerScanTest` (7), `IdeDiscoveryServiceTest` (4), `IdeMonitorServiceTest` (3: roundtrip snapshots, multi-snapshot updates, Authorization header sent / suppressed), legacy `StdioServerProtocolTest` (61, untouched).
- `:ij-plugin:test`: `NpxProjectsStreamRouteTest` (4: initial snapshot, flow update, periodic ping, client-info parse with future-field tolerance), full pre-existing suite still green.
- PidMarker omits the MCP server's port and bearer token. The IDE already owns both (`SteroidsMcpServer.port` and `NpxBridgeService.token`); without them on the marker, npx-kt must parse the URL and has no way to authenticate. Addressed by commit 8: the new `port: Int` + `token: String` fields are optional (defaults `0` / `""`) so older markers still decode. npx-kt's `IdeMonitorService` now sets `Authorization: Bearer <token>` when the marker carries a non-empty token.
- `IdeMonitorService` does not detect when a marker is rewritten with a different `mcpUrl` for the same pid. Workers are keyed by pid; if an IDE restarts its MCP server on a different port within the same process, the worker keeps reconnecting to the old URL. Discovery polls the file every 2 s and picks up the new `DiscoveredIde` value, but the orchestrator's `if (workers.containsKey(pid)) continue` skips respawn. Filed as a follow-up; not load-bearing for the current open/close push goal.
- The `/projects/stream` route is not yet auth-gated. With the token now on the marker, the IDE can enforce `NpxBridgeService.isAuthorized()` on the projects-stream route whenever it wants; `IdeMonitorService` already sends the header. Not in this branch to keep the behaviour change focused.
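The optional-field defaults and the header rule from the first finding can be sketched like this. `PidMarkerSketch` and `authorizationHeader` are hypothetical names, the example URL and token are made up, and JSON decoding is left out; only the default values and the "non-empty token" rule come from the notes above.

```kotlin
// Hypothetical mirror of some marker fields named in this note; the real class
// is decoded from JSON, which is omitted here.
data class PidMarkerSketch(
    val schema: Int = 1,
    val pid: Long,
    val mcpUrl: String,
    val port: Int = 0,      // default when an older marker lacks the field
    val token: String = "", // default when an older marker lacks the field
)

// Returns the Authorization header value, or null when the marker has no token.
fun authorizationHeader(marker: PidMarkerSketch): String? =
    marker.token.takeIf { it.isNotEmpty() }?.let { "Bearer $it" }

fun main() {
    // Example values are invented for illustration.
    val legacy = PidMarkerSketch(pid = 1234, mcpUrl = "http://127.0.0.1:6315/mcp")
    val current = legacy.copy(port = 6315, token = "s3cret")
    println(authorizationHeader(legacy))  // null
    println(authorizationHeader(current)) // Bearer s3cret
}
```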
- `docs/intellij-builtin-servers.md` catalogues both the platform's always-on Netty HTTP server (REST under `/api/*`: `about`, `file`, `settings`, `installPlugin`, `toolbox`, `projectSet`, `logs`, `startUpMeasurement`, plus plugin-provided handlers) and the optional MCP Server plugin (`com.intellij.mcpServer`). Use the doc before adding any cross-process integration that talks to the IDE outside of the `mcp-steroid` ktor server.
- MCP Server plugin is bundled in IDEA 2025.3+ but off by default. Default port 64342, bound to 127.0.0.1, exposes `/sse` (and `/stream` in 2026.1+). Force-enable system properties: `-Didea.mcp.server.force.enable=true`, `-Didea.mcp.server.force.port=<int>`.
- Optional dependency wiring (no reflection). We expose the bundled MCP server's endpoint shape on `PidMarker.intellijMcpServer` via the canonical IntelliJ optional-plugin pattern: `bundledPlugin("com.intellij.mcpServer")` in Gradle for compile access, `<depends optional="true" config-file="mcpServer-integration.xml">` in `plugin.xml`, and `mcpServer-integration.xml` registering `IntelliJMcpServerProbeImpl` only when the dep is satisfied. When the dep is missing the class is never loaded, so there's no `NoClassDefFoundError` window and no reflection involved.
- API version skew. The 253 bundle of `McpServerService` exposes `isRunning`, `getPort`, `getServerSseUrl`; `getServerStreamUrl` was added later. The probe derives the streamable HTTP URL from the SSE URL (same listener, sibling path) so the marker carries both. If the `/stream` endpoint isn't live on an older bundle, the client observes that and falls back to SSE.
- Why active scan, on top of marker discovery? The `.<pid>.mcp-steroid` marker only fires for IDEs that have the `mcp-steroid` plugin installed and started. Active port scanning finds any JetBrains IDE running on localhost (vanilla IntelliJ, PyCharm without our plugin, etc.) by probing `/api/about` on the IntelliJ Platform's known port ranges.
- Default scan ranges: `63342..63361` (Netty built-in HTTP server, the platform picks the first free port in that 20-port window) and `64342..64361` (bundled MCP Server plugin's `DEFAULT_MCP_PORT + 19` fallback range).
- Threading model: a fixed-size daemon-thread pool named `mcp-steroid-port-scan-<n>` is wrapped as a `CoroutineDispatcher` via `ExecutorService.asCoroutineDispatcher()`. Probes are launched with `async(scanDispatcher)` + `awaitAll()`. This keeps a slow TCP connect on one port from stalling the stdio MCP server's dispatcher or the marker discovery's polling.
- Failure modes are normal: connection-refused on a port (no IDE listening) and JSON-200-without-IDE-fields (a non-IDE web server happens to share the port) both filter to `null` without propagating. The scan is a probe, not a contract: a non-IDE port is not an error.
- Shutdown discipline: `IntelliJPortDiscovery` implements `Closeable`. `Main.kt` calls `close()` after cancelling the scan loop; the executor is also drained inside the `start { … }` job's `finally` block (on `NonCancellable`) so in-flight probes don't leak when the parent scope is cancelled.
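The threading model and the "failure is normal" rule above might look roughly like the sketch below. It assumes the `kotlinx-coroutines-core` dependency is on the classpath. `probe` and `scan` are hypothetical names, and the probe is deliberately simplified to a TCP connect; the real one issues a request to `/api/about` and inspects the JSON body.

```kotlin
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking
import java.io.IOException
import java.net.InetSocketAddress
import java.net.Socket
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger

// Ranges quoted in the notes above.
val DEFAULT_SCAN_PORTS: List<Int> = (63342..63361) + (64342..64361)

// Simplified probe: TCP connect only (an assumption of this sketch).
fun probe(port: Int, timeoutMs: Int = 200): Int? =
    try {
        Socket().use { it.connect(InetSocketAddress("127.0.0.1", port), timeoutMs) }
        port
    } catch (e: IOException) {
        null // refused or timed out: a normal outcome, not an error
    }

suspend fun scan(ports: List<Int>, threads: Int = 8): List<Int> {
    val counter = AtomicInteger()
    // Fixed-size pool of named daemon threads, wrapped as a dispatcher.
    val pool = Executors.newFixedThreadPool(threads) { r ->
        Thread(r, "mcp-steroid-port-scan-${counter.incrementAndGet()}").apply { isDaemon = true }
    }
    val scanDispatcher = pool.asCoroutineDispatcher()
    try {
        return coroutineScope {
            ports.map { port -> async(scanDispatcher) { probe(port) } }
                .awaitAll()
                .filterNotNull()
        }
    } finally {
        scanDispatcher.close() // shuts the underlying executor down
    }
}

fun main(): Unit = runBlocking {
    println("listening: ${scan(DEFAULT_SCAN_PORTS)}")
}
```

Running the probes on their own dispatcher means a slow connect only occupies a scan thread, never the caller's dispatcher.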
- The port-discovery output is currently informational only: it isn't consumed by `IdeMonitorService` (which still streams projects only from `mcp-steroid`-aware IDEs). Future work: cross-reference the two flows so the monitor can also surface "IntelliJ detected at :63344 but no `mcp-steroid` plugin loaded" states.
- We don't yet probe the MCP server plugin's `/sse` endpoint directly to confirm it's enabled. The `/api/about` probe only tells us the IDE itself is alive. A second pass on the bundled MCP server port range that does a HEAD on `/sse` would close that gap.
- The npm-distributed `npx/` TypeScript proxy still parses the legacy text format. Updating it to consume the JSON marker (and the new streaming endpoint) is a separate piece of work: different language, different deploy pipeline.
- The monitoring stack does not yet feed back into `legacyProxyMain`'s `ServerRegistry`. Replacing the polling refresh loop with the push-based state from `IdeMonitorService` is the natural next step but was kept out of this branch to keep changesets small.
- Reconnect-on-half-open: `IdeMonitorService` reconnects on stream close, but does not yet treat "no envelope received in N×ping" as a hint to proactively drop and reconnect. Trivial to add once we have telemetry on how often the IDE actually pings under load.
- Forward/backward compat is universal: `ignoreUnknownKeys = true` applies to every decoder we touch in this branch (JSON marker file, NDJSON wire frames, any request/response body). PidMarker already does this; the npx-kt monitor and the IDE-side stream parsers must follow suit.
- Liveness: IDE emits a `ping` envelope on the projects stream every N seconds (target 5s) so the monitor can distinguish "no project changes" from "TCP socket silently dead". Reading a `ping` resets a stale-watchdog on the consumer; missing it past `N * 3` triggers a reconnect.
- Client identification: npx-kt announces itself to the IDE on connect (clientId, clientPid, clientVersion, platform/arch). Cleanest fit is a `POST /npx/v1/projects/stream` whose request body carries the client-info JSON; the response keeps the streaming NDJSON shape. IDE logs the announcement and includes `clientInstanceId` on the streamed envelopes for traceability.
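A minimal sketch of the liveness rule above, with time passed in explicitly so the logic can be checked without a clock. `StaleWatchdog` is a hypothetical name, and resetting on every envelope (not only on `ping`) is an assumption of this sketch.

```kotlin
// Hypothetical consumer-side watchdog; times are passed in so the logic is testable.
class StaleWatchdog(private val pingIntervalMs: Long, private val missedPings: Int = 3) {
    private var lastEnvelopeAtMs: Long = 0

    // Call for every envelope read from the stream (assumed: snapshot or ping).
    fun onEnvelope(nowMs: Long) {
        lastEnvelopeAtMs = nowMs
    }

    // True once nothing has arrived for longer than N * 3.
    fun shouldReconnect(nowMs: Long): Boolean =
        nowMs - lastEnvelopeAtMs > pingIntervalMs * missedPings
}

fun main() {
    val watchdog = StaleWatchdog(pingIntervalMs = 5_000)
    watchdog.onEnvelope(nowMs = 0)
    println(watchdog.shouldReconnect(nowMs = 14_000)) // false
    println(watchdog.shouldReconnect(nowMs = 15_001)) // true
}
```

With the 5 s target interval, a silently dead socket would be detected after roughly 15 s without any envelope.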
- `monitorAndCloseModalDialogs()` is missing from the "prep yourself" method lists in three places, so agents under `non_modal`/`unleashed` can't discover the during-run monitor.
  - In `ExecuteCodeTool.kt` the schema string reads: `do those yourself via the context methods (syncDocuments, waitForSmartMode, closeModalDialogs).` → change to `(syncDocuments, waitForSmartMode, closeModalDialogs, monitorAndCloseModalDialogs, allowModalDialog)`.
  - In the `NON_MODAL` enum KDoc: ``The script prepares what it needs via [McpScriptContext] (`closeModalDialogs`, `syncDocuments`, `waitForSmartMode`, ...).`` → spell out all five and drop the `...`: ``(`closeModalDialogs`, `monitorAndCloseModalDialogs`, `allowModalDialog`, `syncDocuments`, `waitForSmartMode`)``.
  - In `mcp-steroid-info.md`: ``Finer control (`closeModalDialogs()`, `syncDocuments()`, `waitForSmartMode()`, `allowModalDialog()`) lives in the script-context methods.`` → add `monitorAndCloseModalDialogs()` to the list.
  - Rationale: the article body lists all five context methods, but the three reference surfaces an agent is most likely to read first each omit the one method that lets `non_modal` code reproduce `smart_non_modal`'s during-run protection, so a `non_modal` user has no path to the monitor. A trailing `...` is not a discoverable API.
- `allowModalDialog()` has no documented scope/duration, so agents can't tell if one call covers one dialog, all subsequent dialogs, or the rest of the run.
  - Article line: ``- `allowModalDialog()` — suspend that watcher so a dialog your script opens **on purpose** is left alone (call it just before opening the dialog).`` → append a scope sentence, e.g. `It suppresses the close-and-fail watcher for the remainder of the call (it does not re-arm); after it the monitor no longer guards against unexpected modals.` (Confirm the actual semantics against `McpScriptContextImpl` before wording; the redesign doc itself never pins down whether the suppression is one-shot or run-long.)
  - Rationale: an agent opening two dialogs in sequence, or wanting the monitor back after its own dialog closes, has no way to reason about behavior from the current text. The same ambiguity is in the enum KDoc and schema string (`call allowModalDialog() from the script first`), which both imply a per-dialog "first" without saying so.
- The "refresh VFS" step in `smart_non_modal` collides with the standalone "VFS refresh before and after every call" section: it reads as two different, possibly redundant refreshes.
  - The `smart_non_modal` row/KDoc/schema all list `refresh VFS` as a pre-flight step, while the later section states `MCP Steroid schedules two refreshes for you` on every call regardless of mode.
  - Suggested fix in `execute-code-tool-description.md`: in the Modality section add one clause: ``(the before/after VFS refreshes below run in every mode; what `smart_non_modal` adds on top is the commit + save of documents via `syncDocuments()`)``.
  - Rationale: as written, an agent cannot tell whether `unleashed` skips the VFS auto-refresh (it does not) or whether `smart_non_modal` does a third refresh. Separating "always-on VFS refresh" from "mode-gated commit+save" removes the apparent contradiction.
- No mode documents post-flight document sync: an agent that edits a `Document` under `smart_non_modal` and returns may leave it uncommitted/unsaved.
  - The article documents only the pre-flight `syncDocuments()` for `smart_non_modal` and a tail VFS refresh; it never says whether `smart_non_modal` re-runs commit+save after the body (earlier redesign revisions had this as step 6; Revision 5's equivalence list dropped it).
  - Suggested fix: add one line to the `smart_non_modal` description stating whether a post-flight commit+save runs, e.g. `Documents your script edits are committed + saved again after the body returns (post-flight) before the VFS refresh.` Or, if it does not, state that explicitly so agents know to call `syncDocuments()` at the end of an editing script.
  - Rationale: the threading table already warns `You still need ...commitAllDocuments() inside your script if the same script both writes and reads back PSI`, but that is about intra-script reads; it does not answer whether edits survive to disk after the call under the default mode. This is the single most load-bearing unknown for editing scripts.
- Schema-string `smart_non_modal` omits the diagnostic-capture detail that the KDoc and article both promise.
  - Schema string: `a modal that appears mid-run is closed and the run FAILS (if your script opens a dialog on purpose, call allowModalDialog() from the script first).` → add the capture, matching the KDoc/article: `...is closed and the run FAILS (a screenshot + thread dump are captured; if your script opens a dialog on purpose, call allowModalDialog() first).`
  - Rationale: the screenshot+thread-dump capture is the agent's primary debugging signal when a run fails on an unexpected modal. The enum KDoc and the article table both state it; the schema string, the surface most clients render inline, drops it, so a schema-only reader doesn't know failure output includes a screenshot to inspect.
Added a new "## Reviewer suggestions" section with five concrete suggestions (context-method list gaps across 3 files, `allowModalDialog()` scope, VFS-refresh double-documentation, missing post-flight-sync wording, and schema↔KDoc diagnostic-capture parity).
- `waitForSmartMode()` timeout outcome is undefined: agents don't know if a long indexing pass fails the script or is just skipped.
  - Article/KDoc: `waitForSmartMode() — wait for indexing; asserts non-modal (fails on a modal).` → change to `(fails on a modal or if the internal deadlock-safety timeout is reached)`.
  - Rationale: The redesign doc identifies the timeout as a "deadlock safety net," but neither the agent-facing tool description nor the KDoc states that hitting this limit is a fatal error. Knowing it fails helps agents decide whether to wait or use `smartReadAction {}` for best-effort reads.
- `unleashed` mode has no modal-safe way to commit or save documents: `syncDocuments()` is unusable when running under a modal.
  - `syncDocuments()` asserts non-modal and fails. For `unleashed` scripts (which specifically run under modals, e.g. to test dialog state), this means they cannot use the standard helper to flush their logs or edits to disk.
  - Suggested fix: Add `saveAllDocuments()` (no assert) or a `force` parameter to `syncDocuments(force: Boolean = false)` that skips the modal assert for the `saveAllDocuments` and `refreshVfs` portions.
  - Rationale: Scripts running under a modal still need a way to persist their results to the VFS/disk before finishing.
- `closeModalDialogs()` returns a low-signal `Int` count: agents cannot tell what they closed without a heavy screenshot-analysis turn.
  - `closeModalDialogs(): Int` → change to `closeModalDialogs(): List<String>` (returning dialog titles or class names) or explicitly document that it logs closed titles to the console automatically.
  - Rationale: High-signal text feedback (e.g. "Closed 'Extract Method' dialog") is much cheaper for an agent to process than fetching and analyzing a screenshot to confirm it nuked the right thing.
- Missing standalone `assertNonModal()` context method: `non_modal` scripts have no way to perform a manual gate check without side effects.
  - The `non_modal` profile is defined as "assert non-modal + nothing else," but this check is currently only exposed to scripts as a side effect of `syncDocuments()` or `waitForSmartMode()`.
  - Suggested fix: Add `assertNonModal()` to `McpScriptContext` (fails with screenshot if modal).
  - Rationale: Completes the composable context-API model by exposing the "gate policy" logic as an independent method, allowing `unleashed` scripts to check state without triggering a VFS sync or indexing wait.
- Idempotency of `monitorAndCloseModalDialogs()` is undocumented, which complicates logic in complex/multi-block scripts.
  - Suggested fix: Explicitly state in the KDoc and article that calling `monitorAndCloseModalDialogs()` is a no-op if the monitor is already active.
  - Rationale: Simplifies script design by allowing agents to "ensure monitoring is on" before sensitive operations without worrying about double-registering listeners or throwing errors.
- The top-level `modal` wording still frames indexing as "modality", even though the enum also controls preparation steps.
  - In `ExecuteCodeTool.kt` enum KDoc: ``How `steroid_execute_code` treats IDE modality (modal dialogs / indexing) around the script.`` → change to ``How `steroid_execute_code` prepares the IDE and handles modal dialogs around the script.``
  - In the schema string: `How to treat IDE modality around the script.` → change to `IDE preparation and modal-dialog policy for the script.`
  - Rationale: indexing is not modal-dialog handling, and the default also commits/saves documents, so schema-only readers need a broader but more precise frame.
- `non_modal` reads like a whole-run guarantee, but the documented behavior only checks the start state.
  - Article row: ``Require a non-modal IDE at the start (fail with a screenshot if modal); do **nothing** else — no sweep, no commit, no indexing wait. **Not sufficient for PSI/editing** unless you call `syncDocuments()` / `waitForSmartMode()` yourself.`` → change to ``Require a non-modal IDE at the start (fail with a screenshot if modal); do **nothing** else — no sweep, no commit, no indexing wait, and no during-run monitor. The guarantee is start-only: modals appearing later are ignored unless you call `monitorAndCloseModalDialogs()`. **Not sufficient for PSI/editing** unless you call `syncDocuments()` / `waitForSmartMode()` yourself.``
  - Schema string fragment: `'non_modal': only assert a non-modal IDE at the start (fail with a screenshot if modal) and do NOTHING else — no dialog sweep, no commit, no indexing wait;` → change to `'non_modal': only assert a non-modal IDE at the start (fail with a screenshot if modal) and do NOTHING else — no dialog sweep, no commit, no indexing wait, no during-run monitor; later modals are ignored unless the script calls monitorAndCloseModalDialogs();`
  - Rationale: agents may pick `non_modal` expecting protection against modals for the entire script, but it is only the initial gate.
- `monitorAndCloseModalDialogs()` does not clearly distinguish monitoring from the immediate sweep.
  - KDoc: `Start watching for modal dialogs for the rest of the execution. When one appears it is closed` → change to `Start watching for modal dialogs for the rest of the execution. This does not perform an immediate sweep; call closeModalDialogs() first if you need to handle a dialog already on screen. When a modal dialog is detected it is closed`
  - Article line: ``- `monitorAndCloseModalDialogs()` — start a watcher for the rest of the run: a modal that appears is closed`` → change to ``- `monitorAndCloseModalDialogs()` — start a watcher for the rest of the run; it does not perform an immediate sweep, so call `closeModalDialogs()` first for dialogs already on screen. A modal detected by the watcher is closed``
  - Rationale: `unleashed` scripts with an existing modal need to know that starting the monitor is not the same as calling the one-shot cleaner.
- `closeModalDialogs()` diagnostic wording is ambiguous about per-dialog versus per-sweep artifacts.
  - Context KDoc: `Captures a diagnostic screenshot and a thread dump (recorded with the execution) before closing.` → change to `Captures one thread dump for the sweep and a diagnostic screenshot before each dialog is closed (recorded with the execution).`
  - Article line: `close all showing modal dialogs (deepest-first), capturing a screenshot + thread dump first` → change to `close all showing modal dialogs (deepest-first), capturing one thread dump for the sweep and a screenshot before each dialog is closed`
  - Rationale: this matches the current implementation shape and prevents agents from expecting either one screenshot for the whole sweep or one thread dump per dialog.
Added four Codex reviewer suggestions covering modal framing, start-only non_modal, monitor-vs-sweep behavior, and close-dialog diagnostic wording.
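The monitor-versus-sweep distinction, together with the idempotency suggested earlier for `monitorAndCloseModalDialogs()`, can be illustrated with a toy model. `DialogControlSketch` and its fields are hypothetical, and the idempotent behavior is the suggested semantics, not something the current implementation is confirmed to provide.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for the dialog-related context methods.
class DialogControlSketch(initiallyOpen: List<String>) {
    private val open = initiallyOpen.toMutableList()
    private val monitoring = AtomicBoolean(false)
    var watcherRegistrations = 0
        private set

    // One-shot sweep: closes what is on screen now and reports how many.
    fun closeModalDialogs(): Int {
        val closed = open.size
        open.clear()
        return closed
    }

    // Starts the watcher for dialogs that appear later. It does not sweep, and a
    // repeated call is a no-op (the suggested idempotency, assumed here).
    fun monitorAndCloseModalDialogs() {
        if (monitoring.compareAndSet(false, true)) watcherRegistrations++
    }

    fun openDialogs(): List<String> = open.toList()
}

fun main() {
    val ctx = DialogControlSketch(listOf("Extract Method"))
    ctx.monitorAndCloseModalDialogs()
    println(ctx.openDialogs())        // [Extract Method]
    println(ctx.closeModalDialogs())  // 1
    ctx.monitorAndCloseModalDialogs()
    println(ctx.watcherRegistrations) // 1
}
```

Under this model an `unleashed` script that starts with a dialog on screen needs both calls: the sweep for what is already there, the monitor for what comes later.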
- The `smart_non_modal` descriptions promise "wait for indexing" but never carry the point-in-time / `smartReadAction` caveat; only the standalone context-method bullet does, and a default-mode agent never reads that bullet.
  - The caveat exists only at `execute-code-tool-description.md` line 82–83 (``waitForSmartMode() — ... Point-in-time only — still use `smartReadAction { }` for index-dependent reads.``). The three places that describe the default's automatic smart-mode wait all omit it:
    - Enum KDoc (`ExecuteCodeTool.kt`): `wait for indexing (smart mode), then run with the modal-dialog monitor active` → `wait for indexing (smart mode; point-in-time — still use smartReadAction { } for index-dependent reads), then run with the modal-dialog monitor active`.
    - Article table row: `commit + save documents, refresh the VFS, wait for indexing — then run` → `...refresh the VFS, wait for indexing (point-in-time only; index-dependent reads still need smartReadAction { }) — then run`.
    - Schema string: `commit+save documents, refresh VFS, wait for indexing, then run while watching for modals` → `...refresh VFS, wait for indexing (point-in-time — use smartReadAction { } for index reads), then run while watching for modals`.
  - Rationale: the redesign doc explicitly required "wait_for_smart_mode description must not over-promise (point-in-time; smartReadAction still needed)" (`docs/exec-code-options-redesign.md` line 290). Because `smart_non_modal` runs the wait for the agent, the agent has no reason to read the context-method bullet where the caveat currently lives, so the one surface they do read (the default's own description) is exactly where the over-promise survives.
- `smart_non_modal`'s automatic "wait for indexing" is dumb→smart-mode only; it does NOT await external-system (Gradle/Maven) configuration, so a PSI query on a freshly-opened Gradle project can still race import, and the repo's own guidance already prefers `awaitConfiguration`.
  - Add a note to the Modality section of `execute-code-tool-description.md` after the table: `The default's indexing wait is dumb→smart-mode only. On a freshly-opened or re-synced Gradle/Maven project the real readiness boundary is project configuration, not smart mode — call Observation.awaitConfiguration(project) yourself (see mcp-steroid://skill/execute-code-gradle) before index-dependent reads; smart_non_modal does not await it.`
  - Rationale: `docs/CLAUDE.md` and the arena recipes already codify "prefer `Observation.awaitConfiguration(project)` + `smartReadAction(project)` over `waitForSmartMode()` for indexed reads" (regression `IntelliJThisLoggerLookupTest`). An agent trusting the default to "wait for indexing" before a `ReferencesSearch` / `FilenameIndex` call on a just-opened Gradle project gets stale/empty results; nothing in the modal docs warns of this, and the default's reassuring "wait for indexing" phrasing actively hides it.
- No guidance tells agents the default is correct for read-only scripts. The visible "commit + save documents" step invites defensive downgrading to `non_modal`, which silently drops the smart-mode wait their indexed reads depend on.
  - Add one line to the Modality section: `For a read-only script (navigation, find-references, inspection report) keep smart_non_modal: the commit + save step is a no-op when nothing is dirty, and you still get the smart-mode wait and during-run monitor. Dropping to non_modal to "skip the write prep" is a mistake — it removes the indexing wait, and index-dependent reads then race dumb mode.`
  - Rationale: an agent reading that the default "commits + saves documents" for a pure `ReferencesSearch` will reasonably assume `non_modal` is the leaner, correct choice and switch, losing the exact `waitForSmartMode()` that makes indexed reads reliable (the failure mode in #2). The docs say the default "is right for almost everything" but never close the loop that "almost everything" includes read-only work, or explain why the write-side prep is free.
- Under `smart_non_modal` a call can fail before the script body ever runs (gate, bounded commit guard, bounded smart-mode guard), but nothing tells the agent that, so a pre-flight failure gets debugged as a bug in the agent's Kotlin.
  - Add to the Modality section: `Note: under smart_non_modal the call can FAIL before your script body runs — a modal surviving the initial sweep (gate fail + screenshot), or the bounded commit / smart-mode pre-flight step hitting its deadlock-safety timeout. Such a failure is not a bug in your code; check the screenshot / error text before rewriting the script.`
  - Rationale: the redesign doc gives `smart_non_modal` a multi-step pre-flight (sweep → gate → commit guard via `withTimeout → ToolCallErrorException` → bounded `waitForSmartMode()`), each an independent failure point that runs before the body. The agent-facing surfaces describe these as setup the tool does for you, with no hint they can fail standalone, so the natural response to a failure is to edit the (innocent) script and burn a retry turn, rather than inspect the captured diagnostics.
Added four new suggestions distinct from the earlier Claude/Gemini/Codex passes: default-mode point-in-time caveat parity, the awaitConfiguration gap for external-system projects, read-only "keep the default" guidance, and pre-flight failure attribution.
- Explicitly distinguish "Modal Dialog" from "Modality State" in failure triggers.
  - Target: `ExecuteCodeTool.kt` schema/KDoc and `execute-code-tool-description.md`.
  - Change: "fail ... if a modal survives" → "fail ... if a modal dialog survives (non-dialog modality like background progress is tolerated but skips prep steps like sync/wait)."
  - Rationale: Revision 4 of the design doc clarifies that progress-only modality is tolerated. However, the agent-facing docs use the broad term "modal," which technically includes background indexing. Clarifying that only `DialogWrapper` instances trigger a fatal failure prevents agents from fearing background tasks will arbitrarily break their scripts.
- Include `project.save()` in `syncDocuments()` or add a standalone `saveProject()` context method.
  - Target: `McpScriptContext` API and `execute-code-tool-description.md`.
  - Change: Expand `syncDocuments()` to include `project.save()` or add `saveProject()`.
  - Rationale: Currently, `syncDocuments()` focuses on `Document` and `PSI` persistence. For scripts that modify project structure (adding modules, changing libraries, or editing `.idea` files), saving documents is insufficient. Ensuring project-level settings are flushed to disk is essential for subsequent external tools (like `grep` or `Bash`) to see the updated state.
- Gate failure (`smart_non_modal` / `non_modal`) should capture a Thread Dump for consistency.
  - Target: `ExecuteCodeTool.kt` schema and `ModalMode` KDoc.
  - Change: "fail with a screenshot" → "fail with a screenshot + thread dump."
  - Rationale: The during-run monitor already captures both. The initial gate failure often occurs because a modal dialog is stuck due to a background process or a deadlock. Providing the thread dump at the gate prevents a diagnostic "blind spot" when the IDE is already in a bad state before the script starts.
- Provide a scoped `withModalDialogAllowed { ... }` context method.
  - Target: `McpScriptContext` API and `execute-code-tool-description.md`.
  - Change: Add a lambda-based `withModalDialogAllowed { ... }` helper.
  - Rationale: `allowModalDialog()` currently leaves it ambiguous whether the suppression is one-shot or run-long (as noted by Claude). A scoped version is idiomatically safer for Kotlin scripts, ensuring the monitor is re-armed automatically after the intended interaction even if the block throws an exception.
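
A minimal sketch of the scoped helper suggested above, assuming the suppression is a simple flag on the monitor. `ModalMonitor` and its `suppressed` field are hypothetical stand-ins for illustration, not the real `McpScriptContext`; a real version would likely be a `suspend` function.

```kotlin
// Hypothetical stand-in for the monitor state; not the real McpScriptContext.
class ModalMonitor {
    var suppressed = false
        private set

    // Assumed run-long behaviour of allowModalDialog(): suppression stays on.
    fun allowModalDialog() { suppressed = true }

    // Scoped variant: suppression lasts only for the block and is restored
    // in `finally`, so the monitor re-arms even when the block throws.
    fun <T> withModalDialogAllowed(block: () -> T): T {
        val previous = suppressed
        suppressed = true
        try {
            return block()
        } finally {
            suppressed = previous
        }
    }
}

fun main() {
    val monitor = ModalMonitor()
    val seenInside = monitor.withModalDialogAllowed { monitor.suppressed }
    println("inside=$seenInside after=${monitor.suppressed}")

    // The block fails, yet the previous (armed) state is restored.
    runCatching { monitor.withModalDialogAllowed { error("dialog interaction failed") } }
    println("after failure=${monitor.suppressed}")
}
```

Restoring the previous value rather than hard-coding `false` keeps nested scopes well-behaved, and keeps the helper harmless when a script has already called `allowModalDialog()` for the whole run.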
- `non_modal` tells agents to self-prep with `closeModalDialogs()`, but the body never runs if a modal exists at the initial gate.
  - Schema string fragment: `do those yourself via the context methods (syncDocuments, waitForSmartMode, closeModalDialogs)` → `do document/index prep yourself via context methods (syncDocuments, waitForSmartMode); if you may need to close an already-open modal from the script, use modal=unleashed and call closeModalDialogs() first.`
  - Article row fragment: `"I need a non-modal IDE but will manage commits / indexing / dialogs myself."` → `"I need a clean non-modal start and will manage commits / indexing / later dialogs myself."`
  - Rationale: `non_modal` fails before user code on an existing modal, so listing `closeModalDialogs()` as a way to prepare that mode is misleading for the exact leftover-dialog case agents would try to handle.
- `unleashed` is documented as only "trivial" work, but the locked behavior table uses it for intentional modal-dialog workflows.
  - Article row: `Trivial / hardcoded IDE actions only. NOT for PSI or code-editing flows (no consistency guarantees).` → `Intentional modal-dialog workflows (open/inspect/screenshot/close a dialog yourself) and trivial hardcoded IDE actions. NOT for PSI or code-editing flows (no consistency guarantees).`
  - Schema string fragment: `for trivial / hardcoded IDE actions ONLY, never for PSI/editing.` → `for intentional modal-dialog workflows or trivial / hardcoded IDE actions ONLY, never for PSI/editing.`
  - Rationale: without a positive modal-dialog example, agents may avoid the one mode that the design explicitly requires for tests and UI-management scripts where a modal must survive long enough to inspect.
- `McpScriptContext` top-level KDoc still says `waitForSmartMode()` is automatic without naming the default mode.
  - Context KDoc: `waitForSmartMode() is called automatically before your script starts.` → `Under the default modal=smart_non_modal profile, waitForSmartMode() is called automatically before your script body starts; other modal modes must call it explicitly if they need it.`
  - Quick-reference comment: `// waitForSmartMode() is called automatically before your script starts` → `// Under modal=smart_non_modal, waitForSmartMode() is called automatically before your script body starts`
  - Rationale: this stale context API doc contradicts the new `non_modal` / `unleashed` semantics and can mislead agents reading generated context docs instead of the tool article.
- The MCP server instruction text says the default "monitors for modals" but omits that the monitor closes them and fails the call.
  - `mcp-steroid-info.md` sentence: `then monitors for modals during the run.` → `then closes any modal dialog that appears during the run and fails the call with diagnostics.`
  - Rationale: "monitors" sounds observational; the default is actively destructive/failing, which is the key fact an agent needs before running UI actions under `smart_non_modal`.

Added four Codex (GPT-5) suggestions covering non_modal self-prep wording, legitimate unleashed modal workflows, stale context KDoc auto-wait wording, and active monitor behavior in the server text.
- The "Quick Start" bullet, the most-read surface in the whole article, describes `smart_non_modal`'s pre-flight but stops at "all before your script", omitting the during-run monitor and its fail-the-call behavior.
  - `execute-code-tool-description.md` lines 53–55: ``With the default `modal=smart_non_modal`, leftover modal dialogs are closed, the IDE is required non-modal, documents are committed/saved + VFS refreshed, and `waitForSmartMode()` runs — all before your script. See "Modality (the `modal` option)" below.`` → append the run-time half: ``... runs — all before your script; then a monitor watches the run and closes any modal that appears mid-script and FAILS the call (call `allowModalDialog()` first if you open one on purpose). See "Modality (the `modal` option)" below.``
  - Rationale: an agent that opens a dialog mid-script under the default gets a failed call, which is the single most surprising behavior of the mode. The Quick Start is where agents calibrate expectations before reading the table; today it presents `smart_non_modal` as purely pre-flight setup, so the mid-run fail reads as an inexplicable error rather than documented behavior. (Distinct from the earlier "pre-flight can fail before the body" suggestion: that one is about steps before the body; this is the during-run monitor missing from Quick Start.)
- "fail with a screenshot" / "screenshot + thread dump captured" appears ~10 times across all three surfaces, but nothing tells the agent HOW to retrieve the captured artifacts: agents are promised a debugging signal with no path to it.
  - Every mode description and context-method bullet promises a screenshot/thread dump on failure (e.g. article line 69 `the call **fails with a screenshot**`, line 78 `screenshot + thread dump captured`), but no surface states whether they arrive inline in the tool-call error payload, as a file path, or require a follow-up `steroid_take_screenshot` call.
  - Suggested fix: add one line to the Modality section of `execute-code-tool-description.md`, e.g. `When a call fails on a modal, the captured screenshot and thread dump are returned in the tool-call error payload — read them there before retrying; you do not need a separate steroid_take_screenshot call.` (Confirm the actual delivery mechanism against `ScriptExecutor.kt` / the failure-result builder before finalizing the wording.)
  - Rationale: the capture is repeatedly sold as the agent's primary diagnostic, but a signal the agent can't locate is worthless. Without this line, an agent that hits a gate/monitor failure either re-runs blind or burns a turn calling `steroid_take_screenshot` (which captures current state, not the state at failure). This is the missing other half of every "fail with a screenshot" promise.
- The relationship between the `timeout` request parameter and `smart_non_modal`'s bounded pre-flight guards (commit + `waitForSmartMode`) is undocumented: an agent cannot tell whether `timeout` covers the pre-flight or only the script body.
  - `ExecuteCodeTool.kt` line 104: `"Execution timeout in seconds (default: $defaultTimeoutSeconds, configurable via mcp.steroid.execution.timeout registry key)"` → clarify scope, e.g. `"Execution timeout in seconds for your script body (default: $defaultTimeoutSeconds, configurable via mcp.steroid.execution.timeout registry key). smart_non_modal's pre-flight commit and smart-mode waits have their own internal deadlock-safety bounds and are not governed by this value."` (verify the actual scoping in `ScriptExecutor.kt` first).
  - Rationale: the redesign doc gives the commit and smart-mode steps their own `withTimeout → ToolCallErrorException` bounds independent of the user `timeout`. An agent that lowers `timeout` expecting a fast bail-out on a slow-indexing project will still wait out the internal smart-mode bound, and an agent debugging a "timed out" failure can't tell whether its body or the pre-flight blew the budget. Naming the boundary makes the failure attributable.
- `non_modal`'s "Use it for" cell is self-referential ("I need a non-modal IDE but will manage … myself"), giving no concrete task, unlike `smart_non_modal` ("PSI / code-editing / build / test") and `unleashed` ("trivial / hardcoded IDE actions").
  - `execute-code-tool-description.md` line 70 "Use it for" cell: `"I need a non-modal IDE but will manage commits / indexing / dialogs myself."` → give a real example: `A non-PSI read that only needs a stable non-modal start and no commit/index prep — e.g. reading run-configuration or VCS-status state — where smart_non_modal's commit + smart-mode wait would be wasted work.`
  - Rationale: every other mode anchors the choice to a concrete task shape; `non_modal`'s cell just restates the mechanism, so an agent weighing it against the default has nothing to pattern-match its task against and defaults back to `smart_non_modal` (or, worse, picks `non_modal` for editing because "I'll manage it myself" sounds capable). A concrete "when this and not the default" example is what makes the three-way choice actionable.

Added a second-pass set of four suggestions distinct from all prior iterations: Quick Start omits the during-run monitor/fail, no surface explains how to retrieve the captured screenshot/thread dump, the timeout param vs pre-flight-bound scope is undocumented, and non_modal's "Use it for" cell lacks a concrete task example.
- Add `isModal(): Boolean` and `isSmartMode(): Boolean` context methods for non-fatal state probing.
  - Rationale: Complements the existing `assertNonModal()` suggestion. Essential for `unleashed` or `non_modal` scripts to perform safe conditional branching (e.g., "if smart then refactor else log-and-skip") without triggering the fatal assertions built into `syncDocuments()` or `waitForSmartMode()`.
- Add an `awaitDispose: Boolean = true` parameter to `closeModalDialogs()`.
  - Rationale: Closing a dialog in IntelliJ is often an asynchronous `dispose()` call. A script calling `closeModalDialogs()` immediately followed by `syncDocuments()` might still hit the "modal survives" gate if the IDE hasn't finished clearing the modality stack. Awaiting disposal makes the sequence deterministic and prevents race-condition failures.
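
One way to picture what "awaiting disposal" could mean is a bounded wait on a modality probe. The sketch below is illustrative only: `awaitNonModal` and the `isModal` lambda are hypothetical names, and a real implementation inside the plugin would presumably suspend with `delay` instead of blocking with `Thread.sleep`.

```kotlin
// Hypothetical helper: polls `isModal` until it reports false or the deadline passes.
// Returns true if the IDE became non-modal in time.
fun awaitNonModal(
    timeoutMs: Long,
    pollMs: Long = 50,
    isModal: () -> Boolean,
): Boolean {
    val deadline = System.nanoTime() + timeoutMs * 1_000_000
    while (isModal()) {
        if (System.nanoTime() >= deadline) return false
        Thread.sleep(pollMs)
    }
    return true
}

fun main() {
    // Simulated dialog whose dispose() completes after three probes.
    var probesLeft = 3
    val cleared = awaitNonModal(timeoutMs = 2_000, pollMs = 10) { probesLeft-- > 0 }
    println("cleared=$cleared")

    // A dialog that never goes away hits the deadline instead of hanging.
    val stuck = awaitNonModal(timeoutMs = 100, pollMs = 10) { true }
    println("stuck cleared=$stuck")
}
```

Bounding the wait matters here for the same reason it does for the pre-flight steps: a dialog that refuses to dispose should surface as a failure, not as a hung call.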
- Add a `projectOnly: Boolean = false` parameter to `syncDocuments()`.
  - Rationale: In multi-project IDE setups, `FileDocumentManager.getInstance().saveAllDocuments()` (which `syncDocuments()` likely uses) is a global, expensive operation that flushes every open project. Allowing agents to scope the sync to the current `project` significantly improves performance and reduces disk I/O for scripts working in a single workspace.
- Refactor the `modal` parameter schema description in `ExecuteCodeTool.kt` for brevity.
  - Rationale: The current 20-line description bloats the tool-definition context sent to agents and makes CLI `help` output difficult to scan. Move the detailed mode-by-mode prose to the `mcp-steroid://skill/execute-code-tool-description` prompt/article and keep the JSON schema description to a 3-5 line summary with a pointer to the full docs.
- `waitForSmartMode()` should return the duration waited (in milliseconds) or a boolean indicating if it waited.
  - Rationale: Provides high-signal performance telemetry. A script that sees a 0ms wait (or `false`) knows the indices are already "warm" and can proceed with heavy PSI queries immediately; a long wait informs the agent that the project is "cold" or resource-heavy.
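
As a rough illustration of the duration-returning shape, the sketch below measures how long a readiness probe takes to turn true. `waitForSmartModeTimed` and the `isSmart` lambda are hypothetical; the real `waitForSmartMode()` is a suspending context method and would also need the timeout bound discussed elsewhere in this review.

```kotlin
// Hypothetical shape: block until `isSmart` is true and return the wait in milliseconds.
// Unbounded on purpose to keep the sketch short; a real version would carry a timeout.
fun waitForSmartModeTimed(pollMs: Long = 10, isSmart: () -> Boolean): Long {
    val start = System.nanoTime()
    while (!isSmart()) Thread.sleep(pollMs)
    return (System.nanoTime() - start) / 1_000_000
}

fun main() {
    // Warm project: already smart, so no sleep happens at all.
    val warm = waitForSmartModeTimed { true }

    // Cold project: indexing "finishes" after three probes.
    var probesLeft = 3
    val cold = waitForSmartModeTimed { probesLeft-- <= 0 }

    println("warm wait: ${warm}ms, cold wait: ${cold}ms")
    if (cold > warm) println("project was cold; consider lighter PSI queries first")
}
```

A script can branch on the returned value, or simply log it, without any extra probing call.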
- `non_modal` currently gets a hidden post-flight `syncDocuments()`, contradicting the locked "assert-only" profile.
  - In `ScriptExecutor.kt`, change:
    ```kotlin
    // Post-flight: re-sync to disk iff we are non-modal NOW (a fresh read — the body may have
    // opened or closed a modal). Skipped for `unleashed` (no disk-consistency contract).
    if (exec.modal != ModalMode.UNLEASHED && !isModalEdt()) {
    ```
    to:
    ```kotlin
    // Post-flight: re-sync to disk only for `smart_non_modal`, whose profile promises the
    // document-consistency contract. `non_modal` is intentionally start-gate-only.
    if (exec.modal == ModalMode.SMART_NON_MODAL && !isModalEdt()) {
    ```
  - Rationale: the design table says `non_modal` does no sweep, sync, smart-mode wait, or monitor; silently syncing after the body makes the mode more stateful than its schema/KDoc/article promise.
- `waitForSmartMode()` is still unbounded in implementation, despite the locked design calling it bounded.
  - In `McpScriptContextImpl.kt`, add a timeout constant after `SYNC_DOCUMENTS_TIMEOUT`:
    ```kotlin
    /** Deadlock guard for [waitForSmartMode] when indexing never reaches smart mode. */
    private val WAIT_FOR_SMART_MODE_TIMEOUT = 60.seconds
    ```
    and wrap the existing `suspendCancellableCoroutine { ... }` body:
    ```kotlin
    try {
        suspendCancellableCoroutine { cont ->
    ```
    to:
    ```kotlin
    try {
        withTimeout(WAIT_FOR_SMART_MODE_TIMEOUT) {
            suspendCancellableCoroutine { cont ->
    ```
    with a matching `catch (e: TimeoutCancellationException)` that captures `waitForSmartMode-timeout` and throws a `ToolCallErrorException`.
  - Rationale: agents are told the smart-mode wait is bounded; an unbounded wait can hang before the script body and makes the new `modal` default less predictable.
- The default empty modal sweep captures a thread dump even when there is nothing to close.
  - In `McpScriptContextImpl.closeModalDialogs()`, change:
    ```kotlin
    captureThreadDump("closeModalDialogs")
    val found = dialogWindowsLookup().withDialogWindows(project) { it.size }
    // killProjectDialogs captures a screenshot before closing each dialog (VisionService).
    dialogKiller().killProjectDialogs(
    ```
    to:
    ```kotlin
    val found = dialogWindowsLookup().withDialogWindows(project) { it.size }
    if (found == 0) return 0
    captureThreadDump("closeModalDialogs")
    // killProjectDialogs captures a screenshot before closing each dialog (VisionService).
    dialogKiller().killProjectDialogs(
    ```
  - Rationale: `smart_non_modal` calls this on every execution; empty, healthy runs should not attach diagnostic thread dumps when the docs say diagnostics are captured before closing dialogs.
- The monitor docs sound event-driven, but the implementation polls once per second.
  - In `McpScriptContext.kt`, change `Start watching for modal dialogs for the rest of the execution. When one appears it is closed` to `Poll for showing modal dialogs for the rest of the execution. A modal dialog still showing at a poll tick is closed`.
  - In `execute-code-tool-description.md`, change ``- `monitorAndCloseModalDialogs()` — start a watcher for the rest of the run: a modal that appears is closed`` to ``- `monitorAndCloseModalDialogs()` — poll for showing modal dialogs for the rest of the run; a modal still showing at a poll tick is closed``.
  - Rationale: a brief dialog that opens and closes between 1s checks will not be observed; the wording should set agent expectations to the actual polling semantics.

Added four second-pass Codex suggestions covering hidden non_modal post-sync, the missing smart-mode timeout, empty-sweep diagnostics, and polling-vs-event wording.
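
To make the polling-versus-event point concrete, here is a small model of which poll ticks would observe a dialog. It is a sketch under stated assumptions: `DialogWindow` and `observedTicks` are hypothetical names, ticks are assumed to fire at exact multiples of the poll interval starting one interval into the run, and the 1s interval is taken from the suggestion above.

```kotlin
// Hypothetical model: a dialog is visible during [openMs, closeMs).
data class DialogWindow(val openMs: Long, val closeMs: Long)

// Returns the poll ticks (in ms) at which the dialog would be observed showing.
fun observedTicks(dialog: DialogWindow, runMs: Long, pollMs: Long = 1_000): List<Long> =
    (pollMs..runMs step pollMs).filter { tick -> tick >= dialog.openMs && tick < dialog.closeMs }

fun main() {
    // Opens at 1.2s, closes at 1.7s: falls entirely between the 1s and 2s ticks.
    println(observedTicks(DialogWindow(1_200, 1_700), runMs = 5_000)) // []
    // Still showing at the 2s tick, so the monitor would see and close it there.
    println(observedTicks(DialogWindow(1_200, 2_500), runMs = 5_000)) // [2000]
}
```

The first case is the one the reworded docs need to cover: under polling, a short-lived dialog neither gets closed nor fails the call.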