Smart local audit, delta storage, surgical restore: travel back in time through your QGIS edits
Every commit (attribute, geometry, deletion, insertion) is intercepted and recorded without any manual action.
Filter by layer, operation type, and time window. Up to 500 events per page with full attribute reconstruction.
Individual or batch selection. Schema compatibility check before restore. Rollback on error.
SQLite only. No server, no network connection, no database configuration required.
Journal size, event count, and disk space are monitored continuously. Alerts appear before a problem becomes critical.
Retention policy, purge by age or session, async VACUUM, integrity check, and journal export from a single panel.
What RecoverLand does
- Captures every committed edit on monitored editable layers (attributes, geometry, INSERT, UPDATE, DELETE)
- Delta storage: only the attributes that actually changed are recorded (10–50x volume reduction versus full-row snapshots)
- Smart edit buffer: merges multiple changes on the same feature into one net event before writing
- Automatically filters layer audit fields (date_modif, updated_by, etc.) so metadata-only changes are not recorded
- Paginated history search by layer, operation type, time window, and user
- Selective restore with schema compatibility report before applying
- Schema drift detection between the saved state and the current layer structure
- Datasource alias resolver: reconcile fingerprints after a project path move without losing history
- Writer lock: refuses to open the journal when another live QGIS instance is already writing to it (PID-based, Windows and POSIX)
- Asynchronous batch writes in a dedicated thread (zero UI blocking)
- Startup integrity check (WAL checkpoint, pending-event recovery, schema version validation)
- Automatic user identification from: plugin config → env variable RECOVERLAND_USER → OS login → QGIS profile name
- Live health monitoring with thresholds on journal size (50 MB / 200 MB / 500 MB) and event count
- Disk space monitoring: auto-disable tracking at critical threshold (100 MB free), warning at 500 MB
- Maintenance dialog: retention policy, purge by age or session, async VACUUM, integrity check, journal export
- Orphan journal cleanup: unsaved-project journals older than 30 days are automatically removed
- Works with all editable formats: GeoPackage, Shapefile, PostGIS, SpatiaLite, GeoJSON, CSV, FlatGeobuf, MS SQL Server, Oracle Spatial
What RecoverLand does not do
- Does not replace a versioning system (Git, GeoGig)
- Does not back up source files, only feature-level changes
- Does not work on non-editable layers (WMS, rasters, virtual layers)
- Does not capture changes made outside QGIS (direct file editing, external scripts)
- Does not synchronize between workstations
- Cannot guarantee restore on weak-identity formats (CSV, repacked shapefiles)
Architecture
The plugin is organized in independent layers. Each layer has a single responsibility and communicates with others through defined interfaces.
Main components
| Component | Role | File |
|---|---|---|
| EditTracker | Listens to QGIS signals (beforeCommitChanges, afterCommitChanges, afterRollBack) and generates audit events | core/edit_tracker.py |
| EditSessionBuffer | In-memory snapshot buffer. Merges multiple changes on the same feature into a single net event before commit | core/edit_buffer.py |
| WriteQueue | Thread-safe queue that writes events to SQLite in batches of 500. Zero UI blocking. | core/write_queue.py |
| JournalManager | Resolves journal path (project folder or QGIS profile), opens/creates the SQLite file, manages read/write connections and orphan cleanup | core/journal_manager.py |
| SearchService | Paginated search and attribute reconstruction from the journal | core/search_service.py |
| RestoreService | Compatibility check, selective restore with per-batch isolation and rollback on error | core/restore_service.py |
| JournalAudit | Consolidated read-only introspection: operation counts, top users, top layers, time range | core/journal_audit.py |
| DatasourceAlias | Reconciles datasource fingerprints across project moves via a transitive alias table | core/datasource_alias.py |
| HealthMonitor | Evaluates journal health (size, event count, disk space) and produces actionable messages for the UI | core/health_monitor.py |
| DiskMonitor | Periodically checks free disk space on the journal volume; disables tracking at the critical threshold | core/disk_monitor.py |
| SupportPolicy | Defines capture/restore support level and identity strength for each QGIS provider | core/support_policy.py |
| AuditFieldPolicy | Identifies and filters layer audit metadata fields so they are never included in delta events | core/audit_field_policy.py |
| LocalSettings | Persists per-project configuration (retention policy, user override, capture options) inside the journal's backend_settings table | core/local_settings.py |
Local trust boundary
The design goal is simple: make the recovery system visible, local, and bounded. Users can understand where data lives, who writes it, and what RecoverLand does not touch.
The journal is a plain SQLite file stored next to the project or inside the QGIS profile for unsaved projects. It can be located, copied, exported, and inspected.
RecoverLand records events and lightweight restore traces, not full duplicated datasets or uncontrolled background replicas of the project.
SQLite Journal
The journal is a single SQLite file per QGIS project. It uses WAL mode (Write-Ahead Logging) to allow concurrent reads and writes without blocking.
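As a rough illustration of what opening a journal in WAL mode involves, here is a minimal sketch using Python's standard `sqlite3` module. The function name `open_journal` is hypothetical and is not taken from the plugin's code.

```python
import sqlite3


def open_journal(path: str) -> sqlite3.Connection:
    """Open (or create) a SQLite file and switch it to WAL mode.

    Hypothetical helper for illustration only.
    """
    conn = sqlite3.connect(path)
    # The PRAGMA returns one row holding the journal mode now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"WAL mode not available, got {mode!r}")
    return conn
```

WAL mode is persistent: once set, the file stays in WAL mode for later connections. It requires a file on disk, so an in-memory database reports `memory` instead.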
Journal location
| Situation | Path |
|---|---|
| Saved project at C:/projects/site.qgz | C:/projects/.recoverland/recoverland_audit.sqlite |
| Unsaved project | [QGIS profile]/recoverland/audit/audit_ |
Unsaved-project journals older than 30 days that no longer correspond to an open project are automatically removed at startup (orphan cleanup).
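The saved-project rule in the table can be sketched as a small path helper. This is an illustration under assumptions: the function name is hypothetical, and only the saved-project case is covered because the file name pattern for unsaved projects is not fully specified here.

```python
from pathlib import Path


def journal_path_for_saved_project(project_file: str) -> Path:
    """Return the journal path next to a saved project file.

    Hypothetical helper: mirrors the documented layout
    <project folder>/.recoverland/recoverland_audit.sqlite
    """
    project_dir = Path(project_file).parent
    return project_dir / ".recoverland" / "recoverland_audit.sqlite"
```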
Schema of the audit_event table
Each row represents a single committed change on one feature.
| Column | Type | Role |
|---|---|---|
| event_id | INTEGER PK | Auto-incremented unique identifier |
| project_fingerprint | TEXT | Identifies the source QGIS project |
| datasource_fingerprint | TEXT | Uniquely identifies the source layer |
| feature_identity_json | TEXT | Identifies the feature (FID + primary key when available) |
| entity_fingerprint | TEXT | Canonical restore key such as pk:id=42 or fid:7 |
| operation_type | TEXT | INSERT, UPDATE, or DELETE |
| attributes_json | TEXT | Attribute state before the change (delta or full snapshot depending on operation) |
| geometry_wkb | BLOB | Geometry in WKB format |
| new_geometry_wkb | BLOB | Post-edit geometry when the event schema provides forward geometry data |
| user_name | TEXT | User who made the change (auto-resolved) |
| created_at | TEXT | UTC timestamp of the event |
| restored_from_event_id | INTEGER | Reference used by lightweight restore trace events |
| event_schema_version | INTEGER | Schema version of the event payload written by the plugin |
Simplified schema. The real table also contains: layer_id_snapshot, layer_name_snapshot, provider_type, geometry_type, crs_authid, field_schema_json, and session_id.
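For readers who want to inspect a journal by hand, the simplified schema above roughly corresponds to the following DDL. This is a sketch built only from the documented columns; constraints, indexes, and the extra columns of the real table are not reproduced, and the constant name `AUDIT_EVENT_DDL` is made up for the example.

```python
import sqlite3

# Sketch of the simplified audit_event table (documented columns only).
AUDIT_EVENT_DDL = """
CREATE TABLE IF NOT EXISTS audit_event (
    event_id               INTEGER PRIMARY KEY AUTOINCREMENT,
    project_fingerprint    TEXT,
    datasource_fingerprint TEXT,
    feature_identity_json  TEXT,
    entity_fingerprint     TEXT,
    operation_type         TEXT,     -- INSERT, UPDATE, or DELETE
    attributes_json        TEXT,
    geometry_wkb           BLOB,
    new_geometry_wkb       BLOB,
    user_name              TEXT,
    created_at             TEXT,     -- UTC timestamp
    restored_from_event_id INTEGER,
    event_schema_version   INTEGER
)
"""


def create_demo_journal() -> sqlite3.Connection:
    """Create an in-memory database holding the simplified table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(AUDIT_EVENT_DDL)
    return conn
```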
Local data model
The local database is intentionally compact. A few tables cover settings, datasource registry, edit sessions, and the event stream. The center of gravity is audit_event, which stores the history at feature level.
| Table | Responsibility | Why users should care |
|---|---|---|
| audit_event | Immutable event stream of edits and restore traces | Main recovery history, scoped at feature level |
| audit_session | Groups edits made in the same session | Makes large operations explainable and purgeable by session |
| datasource_registry | Remembers how to find the original layer source | Helps reconnect the history to the right layer later |
| backend_settings | Stores local project settings | Avoids hidden system-wide configuration |
| schema_version | Tracks database migrations | Prevents silent schema drift inside the journal |
Automatic capture
Capture is transparent. The user edits layers normally in QGIS. RecoverLand intercepts edit signals at commit time.
When capture triggers
| User action | QGIS signal | Result in the journal |
|---|---|---|
| Modify an attribute and save | beforeCommitChanges + afterCommitChanges | UPDATE event with the previous state |
| Move a geometry and save | Same | UPDATE event with the previous geometry |
| Delete a feature and save | Same | DELETE event with the full state before deletion |
| Add a feature and save | Same | INSERT event with the state after creation |
| Cancel edit (rollback) | afterRollBack | Nothing (pending snapshots are discarded) |
Smart edit buffer
The edit buffer (EditSessionBuffer) is the working memory of RecoverLand. It captures the initial state of every modified feature at the start of the edit session and computes the real net effect at commit time.
Why a buffer?
A user may edit the same feature multiple times before saving. Without a buffer, each micro-change would generate a separate event. The buffer merges everything into one event representing the actual delta between the initial state and the final state.
Special cases handled by the buffer
| Scenario | Result in the journal |
|---|---|
| Modify then revert to original value | No event (no-change detected) |
| Insert then delete before commit | No event (net effect is null) |
| Modify an attribute + move geometry | A single UPDATE event with attribute delta + previous geometry |
| Session rollback | Buffer flushed, nothing written |
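The net-effect logic in the table can be sketched in a few lines. This is not the plugin's EditSessionBuffer: the class name, method names, and event tuples are hypothetical, geometry is left out, and the UPDATE event carries the whole initial state rather than a delta to keep the example short.

```python
class NetEffectBuffer:
    """Toy buffer that keeps the first and last known state per feature."""

    def __init__(self):
        self._initial = {}  # fid -> attrs at first touch (None if inserted in this session)
        self._current = {}  # fid -> latest attrs (None if deleted)

    def record_insert(self, fid, attrs):
        self._initial.setdefault(fid, None)
        self._current[fid] = dict(attrs)

    def record_update(self, fid, before, after):
        # Only the very first "before" state is remembered.
        self._initial.setdefault(fid, dict(before))
        self._current[fid] = dict(after)

    def record_delete(self, fid, before):
        self._initial.setdefault(fid, dict(before))
        self._current[fid] = None

    def rollback(self):
        """Discard everything, as on a session rollback."""
        self._initial.clear()
        self._current.clear()

    def net_events(self):
        """Return one (operation, fid, state) tuple per feature with a real change."""
        events = []
        for fid, initial in self._initial.items():
            final = self._current[fid]
            if initial is None and final is None:
                continue  # inserted then deleted: net effect is null
            if initial is None:
                events.append(("INSERT", fid, final))
            elif final is None:
                events.append(("DELETE", fid, initial))
            elif initial != final:
                events.append(("UPDATE", fid, initial))
            # initial == final: modified then reverted, nothing to record
        return events
```

Keeping only the first and last state per feature is what makes the reverted edit and the insert-then-delete cases disappear without any special handling.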
Audit field filtering
Layer metadata fields (date_modif, updated_by, updated_at, etc.) are automatically excluded from the delta by the AuditFieldPolicy module. If a commit only touches these fields, it is treated as a no-change and nothing is written. This filtering is applied at three levels: recording, reading, and restore.
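A minimal sketch of this filtering, assuming a fixed set of field names and case-insensitive matching (both are assumptions; the real AuditFieldPolicy may detect audit fields differently):

```python
# Assumed list: only the three names mentioned in this document.
AUDIT_FIELDS = {"date_modif", "updated_by", "updated_at"}


def strip_audit_fields(changes: dict) -> dict:
    """Return the changes without layer audit metadata fields."""
    return {
        name: value
        for name, value in changes.items()
        if name.lower() not in AUDIT_FIELDS
    }


def is_metadata_only(changes: dict) -> bool:
    """True when a commit touched nothing but audit fields."""
    return not strip_audit_fields(changes)
```

With this sketch, `is_metadata_only({"updated_at": "2024-01-01"})` is true, so such a commit would be treated as a no-change.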
Delta storage
A naive audit would store the entire row on every change. On a 50-column table, changing one field would store 50 values. RecoverLand uses a differentiated strategy:
| Operation | Strategy | Content stored |
|---|---|---|
| DELETE | Full snapshot | All attributes + WKB geometry (the feature disappears; everything is needed to recreate it) |
| UPDATE | Delta only | Previous values of only the changed attributes + previous geometry if it moved |
| INSERT | Full snapshot | Feature state after creation (enables "undo insert") |
Storage saving
Full-row audit: 50 attributes stored for every UPDATE, even if only one changed. High volume, noisy preview.
Delta storage: only the attributes that actually changed are stored. 10–50x volume reduction. The preview directly shows "what changed".
Delta format
The attributes_json field changes semantics depending on the operation:
DELETE: {"all_attributes": {"name": "Dupont", "age": 42, "city": "Paris"}}
UPDATE: {"changed_only": {"city": {"old": "Paris", "new": "Lyon"}}}
INSERT: {"all_attributes": {"name": "Martin", "age": 35, "city": "Lyon"}}
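The three payload shapes above can be produced by a small helper like the one below. It is a sketch under assumptions: the function name is hypothetical, attributes are plain JSON-compatible values, and binary or non-finite values are not handled.

```python
import json


def build_attributes_json(operation: str, before: dict, after: dict) -> str:
    """Build an attributes_json payload in the documented shape.

    before/after are attribute dicts; pass {} for the side that does not exist.
    """
    if operation == "DELETE":
        return json.dumps({"all_attributes": before})
    if operation == "INSERT":
        return json.dumps({"all_attributes": after})
    if operation == "UPDATE":
        # Keep only the attributes whose value actually changed.
        changed = {
            name: {"old": before.get(name), "new": new_value}
            for name, new_value in after.items()
            if before.get(name) != new_value
        }
        return json.dumps({"changed_only": changed})
    raise ValueError(f"unknown operation: {operation!r}")
```

For a feature whose only change is `city` going from Paris to Lyon, the UPDATE payload contains the single `city` entry with its old and new value.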
b64: prefix. NaN and Infinity are stored as null with a flag.
Identification system
The system uses three levels of identification to precisely locate each feature in each layer of each project.
Layer fingerprint
The fingerprint is the core piece of the identification system. It distinguishes two layers that share the same name but come from different sources.
How it is computed
Format: provider::normalized_source
| Layer type | Fingerprint example |
|---|---|
| Shapefile | ogr::c:/data/plots/vegetation.shp |
| GeoPackage | ogr::c:/data/communes.gpkg|layername=buildings |
| PostGIS | postgres::host=10.241.228.107 port=5432 dbname=MYDB schema=public table=cables |
| SpatiaLite | spatialite::c:/data/local.sqlite|layername=roads |
| GeoJSON | ogr::c:/export/zones.geojson |
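To make the `provider::normalized_source` format concrete, here is a sketch of one possible normalization. The exact rules used by the plugin are not documented here; this version assumes that file-based sources get forward slashes and a lowercased path, while database connection strings are kept as they are (the PostGIS example keeps `dbname=MYDB` in upper case).

```python
# Assumption: providers whose source starts with a file path.
FILE_PROVIDERS = {"ogr", "spatialite"}


def layer_fingerprint(provider: str, source: str) -> str:
    """Return a fingerprint of the form provider::normalized_source."""
    source = source.strip()
    if provider in FILE_PROVIDERS:
        # Normalize only the path part; keep "|layername=..." untouched.
        path, sep, rest = source.partition("|")
        path = path.replace("\\", "/").lower()
        source = path + sep + rest
    return f"{provider}::{source}"
```

For example, `layer_fingerprint("ogr", r"C:\Data\communes.gpkg|layername=buildings")` yields the GeoPackage fingerprint shown in the table.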
Smart restore without actor multiplication
RecoverLand does not restore "by name" and does not create a new actor for every historical row. It restores against the current entity stream, grouped by entity_fingerprint, and writes only lightweight trace records for auditability.
| Mechanism | Implementation intent | User-facing effect |
|---|---|---|
| entity_fingerprint | Canonical key like pk:id=42 or fid:7 | The same feature is followed across its event stream instead of being treated as unrelated rows |
| FID pre-resolution cache | Batch lookup before restore | Fewer repeated scans, more predictable restore time |
| Tracker suppression | Capture is disabled while restore runs | No feedback loop, no accidental re-audit inflation |
| Trace event | Reference to the original event via restored_from_event_id | Restore remains auditable without cloning the full event history |
| Default search filters | Trace events excluded from standard totals and listings | The user sees the business history first, not technical bookkeeping noise |
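The canonical key and the grouping of events per entity can be illustrated as follows. Function names and the event dict layout are hypothetical; only the `pk:id=42` and `fid:7` key shapes come from this document.

```python
from collections import defaultdict


def entity_fingerprint(fid, pk_field=None, pk_value=None) -> str:
    """Prefer the primary key when one is known, else fall back to the FID."""
    if pk_field is not None and pk_value is not None:
        return f"pk:{pk_field}={pk_value}"
    return f"fid:{fid}"


def group_by_entity(events):
    """Group event dicts by their entity_fingerprint, preserving order."""
    streams = defaultdict(list)
    for event in events:
        streams[event["entity_fingerprint"]].append(event)
    return dict(streams)
```

Grouping by this key is what lets a restore follow one feature through several events instead of handling each row in isolation.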
Identity strength
Not all formats offer the same reliability for identifying a feature. RecoverLand qualifies each format with an identity strength level defined in support_policy.py.
| Strength | Meaning | Formats | Restore risk |
|---|---|---|---|
| STRONG | Stable primary key, does not change between sessions | GeoPackage, SpatiaLite, PostgreSQL, Oracle, MS SQL Server, FlatGeobuf | Minimal |
| MEDIUM | FID generally stable but can change after certain operations | Shapefile, GeoJSON | Possible if the file has been re-exported or repacked |
| WEAK | No primary key; FID = row number | CSV, delimited text, Excel | High: row number shifts as soon as a row is added or deleted |
| NONE | No stable identifier; data is not persisted | Memory layer | Restore impossible (capture is informational only) |
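One way to picture this policy is a lookup table from format to strength. The sketch below uses the format names from the table as keys; the real support_policy.py presumably keys on QGIS provider and driver information, and the conservative default for unknown formats is an assumption.

```python
from enum import Enum


class IdentityStrength(Enum):
    STRONG = "strong"
    MEDIUM = "medium"
    WEAK = "weak"
    NONE = "none"


# Keys are the format names used in the table above (illustrative only).
FORMAT_STRENGTH = {
    "GeoPackage": IdentityStrength.STRONG,
    "SpatiaLite": IdentityStrength.STRONG,
    "PostgreSQL": IdentityStrength.STRONG,
    "Oracle": IdentityStrength.STRONG,
    "MS SQL Server": IdentityStrength.STRONG,
    "FlatGeobuf": IdentityStrength.STRONG,
    "Shapefile": IdentityStrength.MEDIUM,
    "GeoJSON": IdentityStrength.MEDIUM,
    "CSV": IdentityStrength.WEAK,
    "Excel": IdentityStrength.WEAK,
    "Memory layer": IdentityStrength.NONE,
}


def identity_strength(fmt: str) -> IdentityStrength:
    # Assumption: unknown formats are treated as WEAK to stay conservative.
    return FORMAT_STRENGTH.get(fmt, IdentityStrength.WEAK)


def can_restore(fmt: str) -> bool:
    """Restore is impossible only when there is no stable identifier at all."""
    return identity_strength(fmt) is not IdentityStrength.NONE
```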
Schema drift
Between the time a change is recorded and the time a user tries to restore it, the layer structure may have changed. The schema_drift.py module detects these drifts before any restore.
What the module compares
Each event stores the field schema at capture time (field_schema_json). Before restore, this historical schema is compared against the current layer schema:
| Drift type | Impact | Plugin action |
|---|---|---|
| Field added since capture | The snapshot does not contain this field | Field is left at its default value |
| Field removed since capture | The snapshot contains a field that no longer exists | Value is ignored with a warning |
| Field type changed | Type incompatibility | Conversion attempted; refused if impossible |
| Field renamed | Old name not found | Treated as deleted field + new field |
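The comparison itself reduces to set operations on field names plus a type check. The sketch below assumes each schema is a simple mapping of field name to type name; the actual layout of field_schema_json is not documented here and the function name is hypothetical.

```python
def detect_drift(captured: dict, current: dict) -> dict:
    """Compare a captured field schema with the current one.

    Both arguments map field name -> type name (assumed layout).
    """
    captured_names = set(captured)
    current_names = set(current)
    return {
        "added": sorted(current_names - captured_names),
        "removed": sorted(captured_names - current_names),
        "type_changed": sorted(
            name
            for name in captured_names & current_names
            if captured[name] != current[name]
        ),
    }
```

A renamed field shows up as one entry in `removed` and one in `added`, which matches the "deleted field + new field" treatment in the table.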
Integrity and recovery
The integrity.py module verifies journal health at plugin startup and after any incident.
Startup checks
| Check | Action on failure |
|---|---|
| PRAGMA integrity_check | User notification if corruption detected; offer to rename the file and create a fresh journal |
| WAL file check | Automatic checkpoint to consolidate pending writes |
| schema_version table | Interrupted migration detected; resume or alert |
| Recovery file (recoverland_pending.json) | Re-integration of pending events lost during a previous crash |
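The first two checks map directly onto SQLite pragmas. The sketch below shows how they can be run with the standard `sqlite3` module; the function name and return shape are hypothetical, and the schema_version and recovery-file checks are not covered.

```python
import sqlite3


def run_startup_checks(conn: sqlite3.Connection) -> dict:
    """Run an integrity check, then checkpoint the WAL file."""
    # integrity_check returns a single row containing "ok" when the file is sound.
    first_row = conn.execute("PRAGMA integrity_check").fetchone()[0]
    # wal_checkpoint returns (busy, wal_frames, checkpointed_frames).
    busy, _wal_frames, _done = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE)"
    ).fetchone()
    return {
        "integrity_ok": first_row == "ok",
        "checkpoint_completed": busy == 0,
    }
```

If `integrity_ok` is false, the documented reaction is to notify the user and offer to set the damaged file aside; that part is left out of the sketch.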
Asynchronous writes
Events are written to SQLite by a dedicated thread (WriteQueue). The UI thread never waits for the write to complete. The queue uses batches of 500 events per transaction to maximize throughput.
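The pattern described here is a classic producer/consumer queue. The sketch below is not the plugin's WriteQueue: the class name, the reduced two-column table, and the `max_pending` bound are assumptions chosen to keep the example self-contained.

```python
import queue
import sqlite3
import threading


class BatchWriter:
    """Writes queued events to SQLite from a dedicated thread, in batches."""

    _STOP = object()  # sentinel that tells the writer thread to finish

    def __init__(self, db_path, batch_size=500, max_pending=10_000):
        self._db_path = db_path
        self._batch_size = batch_size
        self._queue = queue.Queue(maxsize=max_pending)  # bounded on purpose
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, operation_type, attributes_json):
        # Returns at once unless the queue is full (simple back-pressure).
        self._queue.put((operation_type, attributes_json))

    def close(self):
        self._queue.put(self._STOP)
        self._thread.join()

    def _run(self):
        # The connection is created in the writer thread that uses it.
        conn = sqlite3.connect(self._db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS audit_event ("
            "event_id INTEGER PRIMARY KEY, operation_type TEXT, attributes_json TEXT)"
        )
        stopping = False
        while not stopping:
            batch = []
            item = self._queue.get()  # block until there is work
            while True:
                if item is self._STOP:
                    stopping = True
                    break
                batch.append(item)
                if len(batch) >= self._batch_size:
                    break
                try:
                    item = self._queue.get_nowait()
                except queue.Empty:
                    break
            if batch:
                with conn:  # one transaction per batch
                    conn.executemany(
                        "INSERT INTO audit_event (operation_type, attributes_json) "
                        "VALUES (?, ?)",
                        batch,
                    )
        conn.close()
```

Grouping up to 500 inserts in one transaction avoids paying the commit cost per event, which is where most of the throughput gain comes from.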
Retention and purge
The journal grows over time. The retention.py module provides four mechanisms to control its size:
- Delete events older than a configurable threshold. The default retention policy is 365 days (up to 10 years).
- Automatically delete the oldest events once the event count exceeds a configurable maximum (default: 1,000,000).
- Delete all events from a specific edit session. Useful to clean up test sessions or mass imports.
- Reclaim disk space after a purge. VACUUM runs in a background thread so the UI stays responsive.
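As an illustration of the age-based purge, the sketch below deletes rows older than a cutoff and then reclaims space. It assumes `created_at` holds ISO 8601 UTC strings, so that text comparison matches chronological order; the function names are hypothetical and the VACUUM here runs synchronously rather than in a background thread.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_older_than(conn: sqlite3.Connection, days: int = 365, now=None) -> int:
    """Delete events older than `days` and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).isoformat()
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            "DELETE FROM audit_event WHERE created_at < ?", (cutoff,)
        )
    return cursor.rowcount


def reclaim_space(conn: sqlite3.Connection) -> None:
    """VACUUM must run outside of any open transaction."""
    conn.execute("VACUUM")
```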
Health monitoring
RecoverLand continuously evaluates journal health and disk space. Alerts appear in the info bar at the top of the main dialog and can be clicked to open the maintenance panel.
Journal health thresholds
| Level | Size trigger | Count trigger | Action |
|---|---|---|---|
| Healthy | < 50 MB | < 100,000 events | No message |
| Info | 50–200 MB | 100,000–500,000 events | Informational notice |
| Warning | 200–500 MB | 500,000–1,000,000 events | Suggestion to purge old events |
| Critical | > 500 MB | > 1,000,000 events | Purge strongly recommended |
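The threshold table can be read as a small classification function. Two points are assumptions in this sketch: the overall level is the worse of the size and count levels, and a value sitting exactly on a shared boundary (for example 200 MB) is assigned to the higher level.

```python
LEVELS = ["Healthy", "Info", "Warning", "Critical"]


def _level(value, info, warning, critical) -> int:
    if value > critical:
        return 3
    if value >= warning:
        return 2
    if value >= info:
        return 1
    return 0


def journal_health(size_mb: float, event_count: int) -> str:
    """Classify journal health from its size and event count."""
    size_level = _level(size_mb, 50, 200, 500)
    count_level = _level(event_count, 100_000, 500_000, 1_000_000)
    return LEVELS[max(size_level, count_level)]
```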
Disk space thresholds
| Level | Free space | Action |
|---|---|---|
| Warning | < 500 MB free | Warning displayed in the info bar |
| Critical | < 100 MB free | Tracking automatically disabled to prevent data loss |
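Free space on the journal volume can be read with `shutil.disk_usage` from the standard library. The sketch below applies the two documented thresholds; the function names are hypothetical and 1 MB is taken as 1024 × 1024 bytes, which is an assumption.

```python
import shutil

MB = 1024 * 1024  # assumption: binary megabytes


def classify_free_space(free_bytes: int, warning_mb=500, critical_mb=100) -> str:
    """Map free space to 'ok', 'warning', or 'critical'."""
    free_mb = free_bytes / MB
    if free_mb < critical_mb:
        return "critical"  # tracking would be disabled at this point
    if free_mb < warning_mb:
        return "warning"
    return "ok"


def check_journal_volume(journal_dir: str) -> str:
    """Classify the free space of the volume holding journal_dir."""
    return classify_free_space(shutil.disk_usage(journal_dir).free)
```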
Memory management
RecoverLand is designed to operate safely even on large datasets.
Edit buffer cap
The EditSessionBuffer holds feature snapshots in RAM during an edit session. Beyond 10,000 features or 200 MB of snapshot data, snapshots are flushed to a temporary SQLite staging table to avoid memory overflow. This is transparent to the user.
Paginated search
Search results are never fully loaded into memory. The SearchService uses paginated SQL queries (up to 500 events per page) to keep RAM usage flat regardless of journal size.
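A paginated query of this kind can be sketched with `LIMIT` and `OFFSET`. Whether the plugin uses offset or keyset pagination is not stated here, so treat this as one possible approach; the function name and the selected columns are illustrative.

```python
import sqlite3

MAX_PAGE_SIZE = 500


def search_page(conn, page: int, page_size: int = MAX_PAGE_SIZE, operation_type=None):
    """Return one page of events, newest first."""
    page_size = min(page_size, MAX_PAGE_SIZE)  # never load more than 500 rows
    sql = "SELECT event_id, operation_type, created_at FROM audit_event"
    params = []
    if operation_type is not None:
        sql += " WHERE operation_type = ?"
        params.append(operation_type)
    sql += " ORDER BY event_id DESC LIMIT ? OFFSET ?"
    params += [page_size, page * page_size]
    return conn.execute(sql, params).fetchall()
```

Only the requested page is materialized in Python, so memory use depends on the page size rather than on the number of events in the journal.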
Write queue
The WriteQueue is bounded. If the writer thread cannot keep up (e.g., very intensive editing), back-pressure is applied to prevent unbounded memory growth.
Restore isolation
Restore operations process features in isolated batches. If one feature fails, the others are still processed. No single large transaction locks the database for a long time.
Event lifecycle
If the user rolls back instead of committing, the in-RAM snapshot is simply discarded. Nothing is written to the journal.
User identification
Every event carries a user_name resolved automatically by the user_identity.py module. Sources are tested in order:
RECOVERLAND_USER → OS login (os.getlogin) → QGIS profile name → "unknown"
The first non-empty source is used. No event is ever written with an empty user_name.
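The fallback chain can be sketched as a list of sources tried in order. The function name and parameters are hypothetical; the plugin-config override is placed first because the feature list at the top of this document names it as the first source, and the QGIS profile name is passed in as a plain string so the example runs without QGIS.

```python
import os


def resolve_user_name(config_user=None, profile_name=None, environ=None) -> str:
    """Return the first non-empty user name from the ordered sources."""
    environ = os.environ if environ is None else environ

    def os_login():
        try:
            return os.getlogin()
        except OSError:
            # os.getlogin fails when there is no controlling terminal.
            return ""

    sources = (
        lambda: config_user,                      # plugin config override
        lambda: environ.get("RECOVERLAND_USER"),  # environment variable
        os_login,                                 # OS login
        lambda: profile_name,                     # QGIS profile name
    )
    for source in sources:
        value = (source() or "").strip()
        if value:
            return value
    return "unknown"
```

Whitespace-only values are treated as empty in this sketch, so the result is never an empty string.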
Project saving and journal relocation
If a QGIS project transitions from unsaved to saved, the journal is relocated from the QGIS profile directory to the .recoverland/ subfolder next to the .qgz file. Events captured before saving are migrated automatically.