ADR 0106: Hybrid Codebase Index - integration into CascadeIDE, freshness and Semantic Map
Status: Accepted · In progress
Date: 2026-05-07 (updated 2026-05-08)
Related ADRs
| ADR | Role |
|---|---|
| 0105 | Core kernel and MCP; this ADR covers the CascadeIDE side (DAL/CCU/DataBus, freshness, Semantic Map) |
| 0102 | Data Acquisition Layer - boundary of external interfaces and adapters |
| 0097 | Cockpit Computing Units (CCU; analogue of LRU Unit) - a layer between transport, meaning and channel |
| 0099 | IDE DataBus - Typed Events and State Projections |
| 0098 | Semantics is primary; document and repository - projections (Semantic-First) |
| 0039 | Workspace navigation - multiple views and "current file + related" |
| 0053 | Intent map and control flow on PFD (control flow) |
| 0056 | Semantic Map adoption of Skia composition pipeline |
| 0067 | Graph-backed surfaces - a general contract for a family of graph screens |
| 0079 | IDS (Ide Display System) - IDE overlay pipeline, orthogonal to CDS |
Outside ADR
| Document | Role |
|---|---|
| hybrid-codebase-index | Kernel and MCP host repository |
Implementation snapshot
| Element | Meaning |
|---|---|
| — | in-proc orchestrator, UI settings, MFD HIS |
Summary
- `HybridCodebaseIndex.Core` runs in-proc in CascadeIDE; the MCP exe is for external hosts.
- The integration fits into DAL → Application → CCU → DataBus (0102, 0097, 0099).
- Covers index freshness, UI settings, and the boundary with Semantic Map (0053).
Decision
The hybrid-codebase-index tool is designed as a shared library (`HybridCodebaseIndex.Core`) with a thin MCP host (`HybridCodebaseIndex.Mcp`) on top of it, the same pattern as agent-notes (`AgentNotes.Core` + exe). CascadeIDE primarily hosts the kernel in-proc (a `ProjectReference` to Core from the hybrid-codebase-index repository): a single process shared with the editor, cheap reindex/search calls, and the same `databasePath` that the external MCP expects when working from Cursor. The separately published MCP exe remains for external hosts and for the isolation scenario. I/O placement and life cycle belong to the cockpit layers described below, not to a "god object" `MainWindowViewModel`.
Fit within the CascadeIDE architecture
The built-in integration (in-proc, or a child process launched by the IDE) must fit the accepted architecture:
- DAL (0102): workspace traversal, reading the files to be indexed and, where needed, network/process calls for embeddings and other external I/O, in the spirit of `Features/<slice>/DataAcquisition/`, without pushing raw I/O into the VM.
- `Application` orchestrators: the `reindex`/`search` scenarios, configuration from `settings.toml`, and the wiring watcher ↔ index core ↔ life cycle of the SQLite files.
- CCU (0097): folding the search result (FTS + vec, index version, hit metadata) into stable DTOs for channels and, where needed, snapshots (top-N, explain), without turning the CCU into a second "index engine".
- File-change / index-progress events and UI subscriptions: semantically, via the IDE DataBus (0099).
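The CCU bullet above can be illustrated with a minimal sketch. It is written in Python purely for brevity and is not the CascadeIDE (.NET) code; `RawHit`, `HitDto`, `SearchSnapshotDto` and `to_snapshot` are hypothetical names, and the `hit_kind` values `"fts"` and `"vec"` are placeholders (the real values are defined in ADR 0105). The point it tries to show is that the CCU only packs hits the core has already ranked and never re-scores them.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RawHit:
    """Hypothetical shape of a hit as returned by the index core."""
    path: str
    start_line: int
    end_line: int
    score: float
    hit_kind: str  # placeholder values; the real set lives in ADR 0105


@dataclass(frozen=True)
class HitDto:
    """Stable, UI-facing projection of a hit (hypothetical)."""
    path: str
    line_range: tuple[int, int]
    hit_kind: str


@dataclass(frozen=True)
class SearchSnapshotDto:
    index_format_version: int
    hits: tuple[HitDto, ...]


def to_snapshot(raw_hits: Iterable[RawHit],
                index_format_version: int,
                top_n: int = 5) -> SearchSnapshotDto:
    """Pack already-ranked hits into an immutable DTO; no re-scoring here."""
    packed = tuple(
        HitDto(h.path, (h.start_line, h.end_line), h.hit_kind)
        for h in list(raw_hits)[:top_n]
    )
    return SearchSnapshotDto(index_format_version, packed)
```

Keeping the DTOs frozen means a channel can hold a snapshot without worrying that a later search mutates it.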
Freshness in the IDE
With frequent saves, the index for `.cs` / `.axaml` and related file types should be updated cheaply and incrementally (content hash, rebuilding only the affected chunks), without the UX lag of a "full reindex for each keypress". Mechanically, this is either a debounced scenario call from the orchestrator, or a single watcher thread aligned with the MCP contract for the case where an agent watches the same `databasePath`.
For the motivation, see ADR 0105 § watchouts freshness; this ADR specifies the implementation in CIDE.
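As a rough illustration of "debounce plus hash-based incremental update", the sketch below (Python for brevity, not the Core API) collects saves, waits for a quiet period, and then re-chunks only files whose content hash changed. `IncrementalReindexer`, `on_saved` and `tick` are hypothetical names, the 0.5 s debounce is an assumed value, and time is passed in explicitly so the behaviour is deterministic.

```python
import hashlib


class IncrementalReindexer:
    """Hypothetical orchestrator-side helper: debounce saves, skip unchanged files."""

    def __init__(self, debounce_seconds=0.5):
        self.debounce_seconds = debounce_seconds  # assumed value, not from the ADR
        self._hashes = {}    # path -> sha256 of the last indexed content
        self._pending = {}   # path -> latest saved content awaiting indexing
        self._last_save = None

    def on_saved(self, path, content, now):
        """Record a save; later saves of the same file overwrite earlier ones."""
        self._pending[path] = content
        self._last_save = now

    def tick(self, now):
        """Flush pending saves once the quiet period elapsed.

        Returns the paths whose chunks would actually be rebuilt.
        """
        if self._last_save is None or now - self._last_save < self.debounce_seconds:
            return []
        changed = []
        for path, content in self._pending.items():
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if self._hashes.get(path) != digest:
                self._hashes[path] = digest
                changed.append(path)  # a real core would re-chunk this file here
        self._pending.clear()
        self._last_save = None
        return changed
```

A burst of saves within the debounce window therefore results in a single pass, and a save that does not change the content results in no index work at all.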
Semantic Map and layer B (boundary)
Semantic Map is a graph-backed surface (intents, control flow, Skia pipeline). The hybrid index (layer B in ADR 0105 terminology) is not the canonical Semantic Map graph and does not replace CFG/Roslyn symbolic truth in C#. It provides orientation: top hits, paths, ranges in files, the index version, and optional input for map declutter, with an explicit `hit_kind` in the DTO (0105 § hit_kind).
0097 § P3 records the candidate `SemanticMapInputSnapshot`: layer B, after normalization via the CCU (0097 § semantic border map), defines the content of that input for graph-backed surfaces. For what HCI does and does not give the map (orientation vs graph, UI, non-goals), see ADR 0113.
Composition workflow in the product
Integrating the "Hybrid search → Roslyn accuracy" scenario (item 5 of the ADR 0105 roadmap) into UI/IDE orchestration: hints for the next step (go-to-def, usages, diagnostics), without conflating `hit_kind` with symbolic truth.
Persistence and synchronization with the orchestrator
- Model: `CascadeIdeSettings.HybridIndex` → TOML `[hybrid_index]` (the general CascadeIDE user settings file, see `SettingsService`; sample: `docs/samples/settings.toml`).
- Clone / Is: changes to the HCI section participate in the "Save to disk" detector (`SaveSettingsIfChanged`).
- After the parameters are changed through the main window UI, `ApplyHybridCodebaseIndexOrchestrationForCurrentSolution` is called: enabling the watcher, debounce, `scope_mode`, honoring the `mcp_only` mode with `pause_when_mcp_stdio_host`, and changing `index_dir` via an in-proc rebuild of `CodebaseIndexService` and reinstalling the watchers.
- The open/change solution scenario additionally does a `Poke` on `auto_reindex_on_solution_open` (as before, on `SolutionPath` change).
- Updating the INDEX/HCI page in the MFD: the `HybridIndexStateChanged` event in the IDE DataBus; the subscription is made on first navigation to the page (see `MainWindowViewModel.EnvironmentReadiness`).
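The last bullet (a typed event plus a subscription made only on first navigation) might look roughly like the sketch below. It is Python for brevity and not the IDE DataBus API; only the event name `HybridIndexStateChanged` comes from this ADR, while its fields, `DataBus`, `IndexPage` and `on_navigated_to` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridIndexStateChanged:
    """Typed event; both fields are hypothetical placeholders."""
    state: str
    indexed_files: int


class DataBus:
    """Minimal stand-in for a typed event bus (not the real IDE DataBus)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        # Copy the list so handlers may subscribe during dispatch.
        for handler in list(self._handlers[type(event)]):
            handler(event)


class IndexPage:
    """Hypothetical MFD page that subscribes lazily, on first navigation."""

    def __init__(self, bus):
        self._bus = bus
        self._subscribed = False
        self.last_state = None

    def on_navigated_to(self):
        if not self._subscribed:  # subscribe once, however often the page opens
            self._bus.subscribe(HybridIndexStateChanged, self._on_state)
            self._subscribed = True

    def _on_state(self, event):
        self.last_state = event
```

One consequence of the lazy subscription is that events published before the first visit are not seen by the page, so a real page would likely also query the current index state when it opens.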
Agent/Development Operational Notes
A short checklist (maintained in `docs/agent-hci-cascadeide-notes-v1.md`): where the TOML lives, how to avoid confusing the in-proc database path with that of the external MCP, how `index_dir` fits in, and how to confirm that the index is alive via the MFD or the MCP `codebase_index_status`.
Rollout (sketch, CIDE only)
The order is indicative: the first steps anchor the library in the IDE, followed by the life cycle and channels.
1. Connect `HybridCodebaseIndex.Core` to the CascadeIDE solution.
   `ProjectReference` to Core (submodule, or NuGet once a package is published; a build-level decision). One index API per IDE process; the response field contract matches the one described for the MCP tools (0105: layer B, `hit_kind`, format version).
2. Orchestrator + DAL.
   Forwarding `workspace_root` and, optionally, the solution path, as in 0105 § area sketch; file reading and kernel calls stay outside the VM boundary, in the spirit of 0102. One SQLite database per (workspace, solution scope) pair, in the same directory that MCP uses under the same configuration.
3. Freshness: saves → debounced incremental reindex.
   Subscription to document saves (or a single watcher consistent with the IDE policy), debounce in the `Application` orchestrator, incremental reindex via Core. Progress and "index updated" events go to the IDE DataBus (0099).
4. CCU and channels.
   Folding the search result / index status into stable DTOs for IDE Health and, where needed, a separate "Index / Orientation" channel, without duplicating index logic in the VM (0097).
5. (Optional, in parallel or later) Run the published `HybridCodebaseIndex.Mcp.exe` as a child process: parity with Cursor, restart isolation, or a stopgap until Core is built into the sln. This does not replace step 1 for the main editor UX.
6. (Optional) Wire up `SemanticMapInputSnapshot` once the DTO and the CCU boundary have stabilized (0097 § semantic map).
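The "one SQLite per (workspace, solution scope) pair" rule from the orchestrator step can be sketched as a small registry keyed by a normalized pair. This is a Python illustration under stated assumptions, not the Core API: `IndexRegistry`, `key` and the `open_index` callback are hypothetical, and how the real database path is derived is not specified here.

```python
import os


class IndexRegistry:
    """Hypothetical registry: at most one index handle per (workspace, solution) pair."""

    def __init__(self, open_index):
        # open_index(workspace_root, solution_path) -> handle; supplied by the caller.
        self._open_index = open_index
        self._handles = {}

    @staticmethod
    def key(workspace_root, solution_path=None):
        """Normalize both paths so equivalent spellings map to the same index."""
        root = os.path.normcase(os.path.normpath(workspace_root))
        sln = (os.path.normcase(os.path.normpath(solution_path))
               if solution_path else None)
        return (root, sln)

    def get(self, workspace_root, solution_path=None):
        k = self.key(workspace_root, solution_path)
        if k not in self._handles:
            self._handles[k] = self._open_index(*k)
        return self._handles[k]
```

Normalizing before lookup is what prevents two handles (and two watchers) from being opened for the same workspace when paths differ only in spelling.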
Consequences
- No index logic is duplicated in the VM: the CCU only packs results; there is one kernel (Core), and MCP is the transport for external calls.
- The index format version (`indexFormatVersion` / status) must be consistent between the Core build embedded in CIDE and the optional MCP exe whenever an agent uses both paths against the same workspace: an explicit check at startup, or an explicit incompatibility error.
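A startup check of this kind could be as small as the sketch below (Python for brevity, not the Core API). `check_index_format` and `IndexFormatMismatch` are hypothetical names, and representing the version as an integer is an assumption.

```python
from typing import Optional


class IndexFormatMismatch(RuntimeError):
    """Raised when the in-proc Core and the MCP exe disagree on the index format."""


def check_index_format(core_version: int, mcp_version: Optional[int]) -> None:
    """Fail fast if both paths target the same workspace with different formats.

    mcp_version is None when no external MCP exe is in use.
    """
    if mcp_version is None:
        return
    if core_version != mcp_version:
        raise IndexFormatMismatch(
            f"index format mismatch: in-proc Core={core_version}, "
            f"MCP exe={mcp_version}"
        )
```

Raising at startup, rather than on the first search, keeps a mismatched pair from silently reading or writing the same database.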