Journal maintenance & diagnostics
How RecoverLand keeps its audit journal healthy, compact and reliable. Every purge is explicit, every compaction is confirmed, zero data loss.
Philosophy
The audit journal grows forever by design: every INSERT, UPDATE and DELETE on every tracked layer produces an immutable event. Without maintenance, a journal used daily grows at 1–5 MB/day depending on edit intensity.
RecoverLand takes a conservative stance:
Events are never purged silently. Every delete requires user confirmation or an explicit auto-purge policy.
The user can export a full backup before any purge. Export uses sqlite3.backup() — crash-safe, lock-free.
Size, event count, disk space, schema version and SQLite integrity are checked and reported in one panel.
VACUUM runs in a dedicated thread. Purge batches avoid holding the SQLite write lock for more than milliseconds.
Journal lifecycle
From creation to compaction, the journal passes through four phases. Each phase has a clear trigger and a clear operation.
Health monitoring
The health_monitor.py module evaluates journal health from two axes: size and event count. Each axis has three escalation levels. The worst of both wins.
should_disable_tracking = True. The plugin stops capturing new events to avoid filling the disk and corrupting both the journal and the QGIS project file.| Level | UI feedback | Action proposed |
|---|---|---|
| HEALTHY | Green label “Sain” | None |
| INFO | Blue temporal coverage message | None |
| WARNING | Orange warning + size + count | Open maintenance, configure retention |
| CRITICAL | Red alert + suggestion text | Purge immediately + VACUUM |
Integrity check
The check_journal_integrity() function runs four sequential checks and returns an IntegrityResult NamedTuple. It can be triggered manually from the maintenance dialog or runs at startup.
recoverland_pending.json. At next startup, integrity check validates each event (required keys, known keys, valid operation_type) and re-inserts the good ones. Invalid events are kept in the file with a WARNING log.Retention policy
The retention policy has two axes: time-based (retention_days) and count-based (max_events). Both are user-configurable in the maintenance dialog and persisted in QgsSettings.
| Setting | Key | Default | Range |
|---|---|---|---|
| Conservation | RecoverLand/retention_days | 365 jours | 7 – 3650 |
| Maximum | RecoverLand/max_events | 1 000 000 | 10 000 – 10 000 000 |
| Auto-purge | RecoverLand/auto_purge | False | On / Off |
Purge engine
The purge engine in retention.py deletes events that fall outside the retention policy. It operates in batches of 5 000 to avoid holding the SQLite write lock for extended periods.
Orphan trace purge
After a time-based purge, some restore trace events may reference deleted user events. The _purge_orphan_traces() function cleans these:
DELETE FROM audit_event WHERE restored_from_event_id IS NOT NULL AND restored_from_event_id NOT IN (SELECT event_id FROM audit_event WHERE restored_from_event_id IS NULL)
Session-level purge
purge_by_session(session_id) deletes all events from a specific editing session. Used for test cleanup and debug scenarios.
VACUUM & compaction
SQLite DELETE does not reclaim disk space—it marks pages as free for reuse. Only VACUUM physically rebuilds the file. Because VACUUM rewrites the entire database, it must run in a dedicated thread.
| Step | Operation | Why |
|---|---|---|
| 1 | wal_checkpoint(TRUNCATE) | Flush all WAL pages into the main DB file before VACUUM |
| 2 | VACUUM | Rebuild the database file, releasing freed pages to the OS |
| 3 | Callback | Notify UI on main thread (Qt thread safety via QTimer.singleShot(0)) |
_vacuum_lock.acquire(blocking=False). No crash, no deadlock.Export (backup)
The export uses Python’s sqlite3.Connection.backup() API, which performs a page-level copy that is:
- Crash-safe: the copy is atomic at the page level
- Lock-free for readers: uses a separate read connection
- Consistent: the backup is a point-in-time snapshot
backup() API handles this atomically inside SQLite.Journal analysis
The diagnostic engine (journal_diagnostics.py, BL-OPT-08) runs read-only queries to measure storage patterns and identify optimization potential. It produces a JournalDiagnosticReport displayed in a monospace text dialog.
Metrics explained
| Metric | Meaning | Optimization signal |
|---|---|---|
| Entity distribution | How many events per entity. “51+” bucket means entities with long update chains. | High count in 51+ → UPDATE coalescing would save space (BL-OPT-01) |
| Geometry duplicates | Events where geometry_wkb == new_geometry_wkb (attribute-only UPDATE stored both BLOBs) | Many duplicates → skip BLOB on attr-only UPDATEs (BL-OPT-02) |
| Schema JSON redundancy | Total bytes in field_schema_json vs. distinct schemas | High dedup savings → normalize to lookup table (BL-OPT-03) |
| Invalidated traces | Restore trace events already marked invalid | > 0 → auto-purge them during maintenance (BL-OPT-07) |
| INSERT+DELETE pairs | Same entity created and deleted in same session | > 0 → offline compaction can annul them (BL-OPT-04) |
Thread safety
The maintenance dialog interacts with three connection types. Each has clear threading constraints.
Invariants
The maintenance system guarantees these properties at all times:
No event is deleted without (a) user confirmation or (b) explicit auto-purge policy. Purge count is logged with flog() at INFO level.
If a crash occurs mid-write, unsaved events survive in recoverland_pending.json and are recovered at next startup.
Purge in 5k batches. VACUUM in dedicated thread with non-blocking lock acquire. WriteQueue can always interleave.
Export is a page-level consistent snapshot. No manual file copy. No WAL corruption risk.
journal_maintenance.py (UI dialog) → core/retention.py (purge + VACUUM) · core/integrity.py (check + recovery) · core/health_monitor.py (thresholds + messages) · core/journal_diagnostics.py (analysis report)