How RecoverLand is built
A full A–Z map of every script, what it owns, how it talks to its neighbours, and which thread runs it. One page to understand the plugin without reading a single Python file.
Vision
RecoverLand replaces a server-side audit trigger with a client-side capture pipeline driven by QGIS events. Because the plugin must work where no DB server runs (Shapefile, GeoPackage, SpatiaLite, Memory) and where the user has no DBA rights (most PostgreSQL setups), the trigger’s guarantees must be rebuilt in Python: atomicity, identity, durability, schema typing, concurrent access.
The result is six functional layers, each with a single responsibility. Domain logic stays pure Python (testable in isolation). Qt and QGIS APIs are confined to the outer rings. SQLite holds the journal. A handful of background threads keep the UI fluid.
Capture: listens to QGIS commit signals, snapshots feature state before commit, computes the diff after commit, and emits one AuditEvent per change.
Journal: a single SQLite file per project. Append-only event stream with five companion tables (sessions, datasources, aliases, settings, schema version).
Restore: two modes (event-based and temporal), both plan-then-execute. STRICT applies via the QGIS editing buffer with rollback. BEST_EFFORT applies directly per entity.
Health: integrity check at startup, WAL checkpoint, pending events recovered from disk, disk-space watchdog, retention purge with VACUUM.
The six layers
Each box is a layer, each layer has one role. Dependencies flow downward only: the UI knows about workflows, workflows know about core domain, core domain knows about infrastructure. Nothing flows back up.
| Layer | Owns | Forbidden |
|---|---|---|
| L1 Entry | Plugin lifecycle, signal wiring, backend bootstrap | UI rendering, SQL writes |
| L2 UI | Widgets, dialogs, user interaction | Direct DB I/O, blocking work > 50 ms |
| L3 Threads | Async reads, stats, fetch | QGIS layer mutation (main thread only) |
| L4 Restore orchestration | Chunking, progress, cancellation | Feature-level matching logic |
| L5 Core domain | All business logic, all SQL, all snapshot logic | Qt widgets, blocking sleeps |
| L6 Infrastructure | Compat shims, logging, safety asserts | Domain knowledge |
Module dependency graph
The full plugin in one frame. Each node is a Python file. Each line is a real import.
Threading model
Four threads run side by side. The UI thread owns all QGIS object mutations. The writer thread owns all SQL writes. Reader threads only do SELECT. No thread is shared across responsibilities.
| Thread | Owns | Started by |
|---|---|---|
| UI (main) | Widgets, QGIS layer reads/writes, signal handlers, QTimer-driven restore chunks | QGIS itself |
| Writer | Drains the event queue, batches INSERTs, runs WAL checkpoints, recovers pending events | WriteQueue.start() from recover.py |
| Stats | One-shot aggregate queries for the smart bar (debounced 300 ms) | journal_stats_thread |
| Search | One-shot paginated SELECT with light projection (no BLOBs) | local_search_thread |
Capture pipeline
From a user click in QGIS to a persisted row in SQLite. The path crosses two threads (UI and writer) and seven modules. Every step has a clear contract.
| Step | Module | Cost | Failure mode |
|---|---|---|---|
| 1 Signal | QGIS itself | O(1) | Some providers do not fire all signals (WFS-T, memory) → support_policy classifies them |
| 2 Snapshot | edit_tracker + edit_buffer | O(N edited features) | Buffer cap 10 000 features / 200 MB → flush + WARNING log |
| 3 Serialize | serialization + geometry_utils | O(attributes + geom size) | Unknown QVariant types log a warning, fall back to string |
| 4 Fingerprint | identity | O(1) per feature | Weak identity on shapefile FID → classified MEDIUM, restore is best-effort |
| 5 Enqueue | write_queue | O(1) non-blocking | Queue full → pending JSON sidecar, replayed at next startup |
| 6 Persist | sqlite_schema + SQLite | O(batch size) | SQLite locked → 3 retries → pending JSON sidecar |
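Steps 5 and 6 both fall back to a pending JSON sidecar when an event cannot reach SQLite. The sketch below illustrates that idea with the standard library only. The function names, the one-object-per-line file layout, and the `sidecar` path parameter are assumptions made for illustration, not the plugin's actual code.

```python
import json
import queue
from pathlib import Path


def enqueue_event(q: queue.Queue, event: dict, sidecar: Path) -> bool:
    """Try a non-blocking enqueue; spill to the sidecar file if the queue is full.

    Returns True when the event went into the queue, False when it was spilled.
    """
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        # Assumption: one JSON object per line, appended to the sidecar.
        with sidecar.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(event) + "\n")
        return False


def replay_sidecar(sidecar: Path) -> list:
    """Read spilled events back (startup recovery) and remove the sidecar."""
    if not sidecar.exists():
        return []
    with sidecar.open(encoding="utf-8") as fh:
        events = [json.loads(line) for line in fh if line.strip()]
    sidecar.unlink()
    return events
```

The caller never blocks: a full queue costs one small file append, and the spilled events can be re-submitted at the next startup.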
Restore pipeline
Restore is the hardest part of the plugin. It must reconstruct a past state in a present world that may have drifted. Two entry modes fork into one common applier.
_find_by_snapshot has six fallback levels (FID, PK, attrs full, attrs+geom, lenient match, max-FID heuristic). Each level was added because one provider broke the previous one. This is where 90% of the historical bugs live (see the tech debt section).
Module catalog — Entry & lifecycle
Two files. Together they own everything that happens between "QGIS starts the plugin" and "QGIS unloads the plugin".
| File | Lines | Owns |
|---|---|---|
| __init__.py | 30 | QGIS plugin factory. Compiles translations if needed. Returns RecoverPlugin. |
| recover.py | 538 | Detects duplicate installs, opens the journal, starts the writer queue, instantiates the edit tracker, wires QGIS project signals (layersAdded, cleared, readProject), schedules orphan cleanup, periodic disk-space check, status bar widget. |
Module catalog — UI surface
Everything Qt. The dialog is the single point of entry to the plugin from a user perspective. The status bar widget is the always-visible indicator.
| File | Lines | Role |
|---|---|---|
| recover_dialog.py | 2 797 | Monolith. Mixes widget construction, restore orchestration, geometry preview lifecycle, smart bar wiring, state machine. Largest tech debt. |
| journal_info_bar.py | 236 | Smart bar at the top of the dialog: per-operation tile counters, color-coded health pill. |
| journal_maintenance.py | 309 | Maintenance dialog: retention config, manual purge, async VACUUM, integrity check, export. |
| status_bar_widget.py | 93 | Persistent indicator in the QGIS status bar (left-click toggles tracking, right-click opens dialog). |
| themed_action_icon.py | 123 | SVG toolbar icon recoloured to match QGIS light/dark theme at runtime. |
| widgets/themed_logo.py | 186 | Animated themed logo for the dialog header. |
| widgets/time_slider.py | 161 | Cutoff date slider for the temporal Rewind mode. |
| widgets/restore_mode_selector.py | 80 | Switch between Mode A (event) and Mode B (temporal). |
| widgets/restore_preflight_dialog.py | 94 | Confirmation dialog showing the plan summary before apply. |
| widgets/toggle_switch.py | 64 | iOS-style toggle switch used in several panels. |
Module catalog — Background threads
Each thread has a single job and lives for one operation. None of them mutate QGIS objects.
| File | Lines | Role |
|---|---|---|
| local_search_thread.py | 74 | Runs search_events() in a worker thread. Emits result via Qt signal. |
| journal_stats_thread.py | 118 | Debounced (300 ms) aggregate query for the smart bar. |
| version_fetch_thread.py | 104 | Fetches post-cutoff events for the Rewind preview. |
| qgs_task_support.py | 64 | Abstraction over QThread / QgsTask. Same API on QGIS 3.40 (Qt5) and 4.x (Qt6). |
Module catalog — Contracts & types
The vocabulary of the plugin. Pure data shapes and enums. Zero QGIS, zero Qt. Anything that crosses a layer boundary is one of these types.
| File | Lines | Owns |
|---|---|---|
| core/constants.py | 2 | Single constant PLUGIN_NAME. |
| core/audit_backend.py | 76 | Defines AuditEvent (21 fields), SearchCriteria, SearchResult, RestoreReport, and the abstract AuditBackend interface. |
| core/restore_contracts.py | 164 | Enums (RestoreMode, ConflictPolicy, AtomicityPolicy, PreflightVerdict), PlannedAction, RestorePlan, PreflightReport, volume limits, COMPENSATORY_OPS matrix. |
| core/support_policy.py | 136 | Per-provider capture/restore matrix. Decides if a layer is FULL/PARTIAL/INFO support and STRONG/MEDIUM/WEAK identity. |
| core/audit_field_policy.py | 51 | Single source of truth for "audit metadata" field names (date_modif, updated_at, gid, etc.). Used by capture, delta, and restore so they agree on what to ignore. |
Module catalog — Identity & data
Everything that turns a live QGIS feature into stable bytes (and back). The reliability of every restore depends on these five modules agreeing on field and geometry comparisons.
| File | Lines | Owns |
|---|---|---|
| core/identity.py | 165 | Datasource and feature fingerprints. Normalizes PostgreSQL / MSSQL / Oracle / OGR URIs into canonical strings. Hashes identity to a stable SHA. |
| core/user_identity.py | 68 | Resolves the current user name: plugin config → RECOVERLAND_USER env → OS login → QGIS profile → unknown. |
| core/serialization.py | 189 | QVariant ⇄ JSON-safe values. compute_update_delta() for the old/new attribute diff. iter_mapped_attributes() applies the field mapping at restore time. |
| core/geometry_utils.py | 188 | WKB ⇆ QgsGeometry. Comparison, feature matching, CRS extraction, provider geometry probing. |
| core/schema_drift.py | 142 | Compares the field schema captured at audit time with the current layer schema. Produces matched / missing / added / type-changed report. |
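The old/new attribute diff produced by compute_update_delta() can be pictured as a dictionary comparison that skips audit metadata fields. The sketch below is a hedged illustration on plain dicts. The function name `update_delta`, its return shape, and the two-name `AUDIT_FIELDS` subset are assumptions, and the real function works on serialized QGIS values.

```python
from typing import Any, Dict, FrozenSet, Tuple

# Assumption: a small illustrative subset of the audit metadata field names.
AUDIT_FIELDS: FrozenSet[str] = frozenset({"date_modif", "updated_at"})


def update_delta(
    old: Dict[str, Any],
    new: Dict[str, Any],
    ignored: FrozenSet[str] = AUDIT_FIELDS,
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (old_value, new_value)} for every field whose value changed.

    Fields listed in `ignored` never appear in the result, so a commit that
    only touches a timestamp column yields an empty delta.
    """
    delta: Dict[str, Tuple[Any, Any]] = {}
    for name in sorted(set(old) | set(new)):
        if name in ignored:
            continue
        before, after = old.get(name), new.get(name)
        if before != after:
            delta[name] = (before, after)
    return delta
```

Keeping the ignore list in one place is what lets capture, delta, and restore agree on which fields are noise.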
Module catalog — Capture path
The path that turns QGIS edits into rows in SQLite. Spans two threads.
| File | Lines | Role |
|---|---|---|
| core/edit_buffer.py | 213 | In-memory feature snapshots per session per layer. Bounded at 10 000 features / 200 MB. Only the first snapshot per feature is kept (the pre-edit state). |
| core/edit_tracker.py | 802 | Core of capture. Connects to six QGIS signals per layer, snapshots before commit, builds AuditEvents after commit, hands them to the write queue. |
| core/write_queue.py | 244 | Dedicated writer thread (RecoverLand-Writer). Bounded queue (50k events). Batch executemany of 500 rows. 3-retry policy on transient errors. Passive WAL checkpoint every 60 s. If queue overflows: pending JSON sidecar on disk. |
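The writer's batch-and-retry behaviour can be sketched without a thread, as two plain functions a writer loop would call. This is a minimal illustration under stated assumptions: the `events` table and its three columns are hypothetical, the backoff delay is invented, and only the batch size (500) and retry count (3) come from the table above.

```python
import queue
import sqlite3
import time

BATCH_SIZE = 500   # batch size stated in the table above
MAX_RETRIES = 3    # retry count stated in the table above


def drain_batch(q: queue.Queue, limit: int = BATCH_SIZE) -> list:
    """Pull up to `limit` queued rows without blocking."""
    rows = []
    while len(rows) < limit:
        try:
            rows.append(q.get_nowait())
        except queue.Empty:
            break
    return rows


def write_batch(conn: sqlite3.Connection, rows: list, retries: int = MAX_RETRIES) -> bool:
    """Insert one batch in a single transaction, retrying on operational errors."""
    if not rows:
        return True
    for attempt in range(retries):
        try:
            with conn:  # commits on success, rolls back on exception
                conn.executemany(
                    # Hypothetical table layout, for illustration only.
                    "INSERT INTO events (layer, op, payload) VALUES (?, ?, ?)",
                    rows,
                )
            return True
        except sqlite3.OperationalError:
            time.sleep(0.05 * (attempt + 1))  # hypothetical linear backoff
    return False  # the caller would spill these rows to the pending sidecar
```

Writing a whole batch inside one transaction means a failed attempt leaves no partial rows behind, so a retry or a spill to disk cannot duplicate events.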
Module catalog — Storage & registry
The SQLite layer: schema, opening/closing, file location, settings persistence, datasource bookkeeping.
| File | Lines | Role |
|---|---|---|
| core/sqlite_schema.py | 269 | DDL of the six tables, ten indexes. PRAGMAs (WAL, mmap, cache, busy_timeout). Migration ladder v1 → v5. Schema version table. |
| core/journal_manager.py | 324 | Locates or creates the SQLite file. Saved project → .recoverland/ next to the .qgz. Unsaved → QGIS profile under a content hash. PID-based file lock against duplicate QGIS instances. Read-only connections for worker threads. |
| core/sqlite_backend.py | 54 | Facade implementing AuditBackend. Delegates writes to write_queue, reads to search_service. |
| core/local_settings.py | 87 | Per-project settings persisted in backend_settings: retention days, max events, capture toggle, user override. |
| core/datasource_registry.py | 223 | Stores URI / provider / authcfg / CRS / geometry type at first commit. Used at restore time to recreate a layer when it is not currently loaded. Resolves DB credentials via QGIS saved connections (passwords never persisted). |
| core/datasource_alias.py | 125 | Links an old fingerprint to a new one when a layer moves (path change, renamed DB, provider switch). Transitive resolution bounded to 8 hops to prevent cycles. |
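Bounded transitive alias resolution can be sketched as a loop over an old-to-new mapping. The function name `resolve_alias` and the in-memory dict are assumptions for illustration (the plugin stores aliases in SQLite). Only the 8-hop bound comes from the table above.

```python
from typing import Dict

MAX_HOPS = 8  # bound stated in the table above


def resolve_alias(fingerprint: str, aliases: Dict[str, str], max_hops: int = MAX_HOPS) -> str:
    """Follow old -> new alias links, stopping after max_hops or on a repeat."""
    current = fingerprint
    seen = {current}
    for _ in range(max_hops):
        nxt = aliases.get(current)
        if nxt is None or nxt in seen:
            break  # end of chain, or a cycle was detected
        seen.add(nxt)
        current = nxt
    return current
```

The hop limit guarantees termination even if the alias table is corrupted. The `seen` set is an extra guard added in this sketch so that a short cycle stops early.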
Module catalog — Read & search
The query side of the journal. All reads are paginated and bounded. The lightweight projection skips BLOBs when the UI does not need them.
| File | Lines | Role |
|---|---|---|
| core/search_service.py | 272 | Paginated search with multi-criteria filtering. Lightweight mode strips geometry BLOBs. count_events, get_event_by_id, get_distinct_layers, get_distinct_users, summarize_scope. |
| core/event_stream_repository.py | 161 | Temporal queries for restore: entity stream, events after cutoff (DESC for reverse replay), count-only variants. All bounded by MAX_EVENTS_PER_RESTORE. |
| core/journal_audit.py | 143 | Single-query introspection: top N users, top N layers, per-operation counts, time range. Zero QGIS, safe for workers. |
| core/layer_stats_cache.py | 95 | Cache of min/max dates and operation types per datasource. Built in one GROUP BY. Thread-safe for reads after build. |
Module catalog — Restore engine
The largest functional area. Five files share the work of turning past events into a present-day layer mutation, plus one helper for previews and one for in-canvas geometry display.
| File | Lines | Role |
|---|---|---|
| core/rewind_dedup.py | 229 | Receives N events post-cutoff. Filters trace events and invalidated ones. Eliminates user events already compensated by a trace. Collapses INSERT+DELETE chains on the same entity. Pure deterministic logic, zero QGIS. |
| core/restore_planner.py | 203 | Builds the RestorePlan. Mode A iterates selected events. Mode B calls the stream repository then the dedup. Runs the preflight (volume / drift / coverage). Pure data output. |
| core/restore_executor.py | 624 | Applies a plan on a QGIS layer. Checks provider capabilities (AddFeatures, DeleteFeatures, ChangeAttributeValues, ChangeGeometries). Two strategies: STRICT (editing buffer + rollback) and BEST_EFFORT (per-entity direct). |
| core/restore_service.py | 823 | Feature-by-feature primitives: re-insert deleted, revert updated, delete inserted. Hosts _find_by_snapshot with six fallback levels. Builds restore trace events (restored_from_event_id). Undo support. |
| core/workflow_service.py | 181 | Groups events by datasource fingerprint, finds the target layer in the project, orchestrates per-group restore and per-group undo. Cleans up temporary layers added during restore. |
| core/restore_preview.py | 77 | Formats RestorePlan and PreflightReport into a human-readable summary for the confirmation dialog. Zero QGIS. |
| core/geometry_preview.py | 76 | Displays the captured geometry of an audit event on the QGIS canvas as a QgsRubberBand. One preview at a time, cleaned on dialog close. |
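The fallback idea behind _find_by_snapshot can be illustrated as an ordered list of matchers tried in turn. The sketch below is not the plugin's implementation: it shows only three of the six levels, models features as plain dicts, and the matcher names and the "ambiguous means no match" rule are assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

Feature = Dict[str, object]
Matcher = Callable[[Feature, List[Feature]], Optional[Feature]]


def by_fid(snap: Feature, candidates: List[Feature]) -> Optional[Feature]:
    """Level 1: match on the provider feature id."""
    if snap.get("fid") is None:
        return None
    return next((f for f in candidates if f.get("fid") == snap["fid"]), None)


def by_pk(snap: Feature, candidates: List[Feature]) -> Optional[Feature]:
    """Level 2: match on a primary key value, when the snapshot has one."""
    if snap.get("pk") is None:
        return None
    return next((f for f in candidates if f.get("pk") == snap["pk"]), None)


def by_attrs(snap: Feature, candidates: List[Feature]) -> Optional[Feature]:
    """Level 3: match on the full attribute dict, only if the hit is unique."""
    hits = [f for f in candidates if f.get("attrs") == snap.get("attrs")]
    return hits[0] if len(hits) == 1 else None  # ambiguous: refuse to guess


# Ordered from strongest to weakest identity (illustrative subset).
FALLBACKS: List[Tuple[str, Matcher]] = [("fid", by_fid), ("pk", by_pk), ("attrs", by_attrs)]


def find_by_snapshot(
    snap: Feature,
    candidates: List[Feature],
    levels: List[Tuple[str, Matcher]] = FALLBACKS,
) -> Optional[Tuple[str, Feature]]:
    """Return (level_name, feature) from the first level that matches, else None."""
    for name, matcher in levels:
        hit = matcher(snap, candidates)
        if hit is not None:
            return name, hit
    return None
```

Returning the level name alongside the feature lets the caller log which rule fired, which helps when a weaker level matches a feature a stronger one missed.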
Module catalog — Health & maintenance
The plugin watches its own state: integrity at startup, disk space periodically, journal size against thresholds, retention purge with VACUUM. All operations are bounded and logged.
| File | Lines | Role |
|---|---|---|
| core/health_monitor.py | 206 | Evaluates HEALTHY / INFO / WARNING / CRITICAL based on size, event count, age. Produces translated user messages and remediation suggestions. |
| core/disk_monitor.py | 68 | Free-disk check on the journal volume. Triggers tracking disable below the critical threshold. |
| core/integrity.py | 261 | Startup integrity check: PRAGMA integrity_check, WAL checkpoint, schema version verification. Reads recoverland_pending.json (events that did not reach SQLite on the last run) and replays them. |
| core/retention.py | 187 | Purge by age and by volume. 5k-row batches. Async VACUUM under a mutex. Defaults: 365 days, 1M events max. |
| core/db_maintenance.py | 71 | Periodic ANALYZE, quick integrity check, grouped WAL checkpoint. Safe with concurrent readers. |
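A purge by age in fixed-size batches can be sketched with the standard sqlite3 module. The `events` table, its `created_at` column, and the function name are hypothetical. Only the 5k batch size comes from the table above.

```python
import sqlite3

PURGE_BATCH = 5000  # batch size stated in the table above


def purge_older_than(conn: sqlite3.Connection, cutoff_iso: str, batch: int = PURGE_BATCH) -> int:
    """Delete rows older than cutoff_iso in small transactions; return the total deleted."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                # Hypothetical schema: ISO-8601 text timestamps compare correctly as strings.
                "DELETE FROM events WHERE rowid IN ("
                "SELECT rowid FROM events WHERE created_at < ? LIMIT ?)",
                (cutoff_iso, batch),
            )
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Short transactions keep the write lock brief, so readers and the writer thread are not starved while a large purge runs.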
Module catalog — Infrastructure
The cross-cutting layer. Every other module may use these. None of them know about the domain.
| File | Lines | Role |
|---|---|---|
| compat.py | 254 | Single source for Qt5 / Qt6 and QGIS 3.40 / 4.x divergence. All Qt.X, Qgis.X, QgsWkbTypes.Y, QgsVectorDataProvider.Capability.Z go through here. Direct access elsewhere is forbidden. |
| core/logger.py | 116 | Rotating file logger (5 × 5 MB) in the QGIS profile directory + QgsMessageLog mirror. flog(), qlog(), timed_op() context manager for elapsed-ms tracking, generate_trace_id() for correlation. |
| core/sql_safety.py | 28 | Defense-in-depth assertion. Any f-string SQL fragment passes through assert_safe_fragment(), which rejects unsafe characters. Values are always parameterised separately. |
| core/observability.py | 262 | CycleStats accumulator for restore/rewind cycles (raw / deduped / planned / applied / skipped / failed / elapsed_ms). log_cycle_summary() emits one summary line plus anomaly lines. log_state_transition() tracks critical flags. assert_invariant() escalates to CRITICAL on violation. |
| core/time_format.py | 97 | Relative ("3 hours ago"), short absolute ("May 12 14:30"), full ISO. compute_history_span() for the smart bar. |
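The defense-in-depth check in sql_safety can be pictured as a whitelist test on any SQL fragment that is interpolated into a query string. The sketch below reuses the name assert_safe_fragment for readability, but the character whitelist and the ValueError are assumptions, not the plugin's actual rule.

```python
import re

# Assumption: identifiers, placeholders, and basic comparison syntax only.
# Quotes, semicolons, and dashes are deliberately excluded.
_SAFE_FRAGMENT = re.compile(r"[A-Za-z0-9_., ()=<>?]*")


def assert_safe_fragment(fragment: str) -> str:
    """Return the fragment unchanged if every character is whitelisted, else raise."""
    if not _SAFE_FRAGMENT.fullmatch(fragment):
        raise ValueError(f"unsafe SQL fragment: {fragment!r}")
    return fragment
```

A fragment such as "layer_name = ? AND op = ?" passes, while anything carrying a quote or a statement separator is rejected before it reaches SQLite. Values still travel as bound parameters, so this check is a second line of defense and not the primary one.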
Tech debt map
The architecture is healthy on most axes. The debt is concentrated in three exact places. Naming them here makes them easier to attack later.
recover_dialog.py: 2 797 lines, five responsibilities (widget build, restore orchestration, geometry preview, smart bar, state machine). Every restore fix touches this file because the orchestration logic lives inside the UI.
Feature matching: the single decision "which live feature matches this snapshot?" lives in both restore_service.py and restore_executor.py with subtly different rules. Most historical bugs (RW-11 to RW-19) come from this duplication.
The core package: 191 symbols re-exported. Any module can write "from .core import X", so the real dependency graph is invisible without grep.
Contracts, persistence, capture, compat layer, observability and 30+ small core modules pass the "one file = one responsibility" rule and stay under 300 lines.