HELIX Labelling Framework – Detailed User Guide

Modular QGIS toolbox for preparing heterogeneous Earth Observation labels as EO-grid-aligned, time-aware, context-aware, uncertainty-aware, ML/AI-ready supervision.

Version 1.0.0 · QGIS 3.34+ / QGIS 4.x · User-facing HTML guide

Core concept

HELIX Labelling Framework does not treat labels as perfect, timeless, isolated class values. It prepares label information so that it can be used consistently with Earth Observation feature grids and with machine-learning training pipelines.

Preflight → Spatial reconstruction → Temporal reconciliation → Helical features → Context risk → Soft targets & weights → Export

Each module can run independently. A simple task can stop after Spatial reconstruction. A richer workflow can continue with context features, temporal matching, helical time features and UST (Uncertainty-aware Supervision Target) outputs.

HELIX uncertainty is label/context/temporal supervision uncertainty. It can use label ambiguity, source agreement, boundary risk, temporal quality and quality priors. EO data quality masks can be added separately as feature-side reliability layers, but they are not the same as label uncertainty.

Typical workflows

I want to... | Use this module | Default output
Inspect available class fields and class values. | Preflight & class schema | helix_class_schema.csv and .json
Align vector or raster labels to an EO grid. | Spatial reconstruction | helix_label_hard.tif
Use multiple label sources and inspect agreement. | Spatial reconstruction with optional stacks | hard label + optional source/agreement/probability stacks
Match labels to EO dates. | Temporal reconciliation | match table and report
Create seasonal/cyclic date features. | Helical / wave features | CSV and optional raster feature stacks
Create boundary and neighbourhood risk layers. | Context & risk features | edge, diversity, entropy, margin, local purity
Create ML-ready soft supervision. | Soft targets & weights | soft targets, uncertainty, training weights

Class schema and string classes

GeoTIFF class rasters store numeric values. If a vector class field contains strings such as building, tree or water, HELIX maps them automatically to stable integer class IDs before rasterisation.

class_id,class_name,source_value,source_type,include,merge_to,priority,quality_q
1,building,building,vector,1,1,1.0,1.0
2,tree,tree,vector,1,2,1.0,1.0
3,water,water,vector,1,3,1.0,1.0

The one-band hard label raster then stores values 1, 2, 3. The schema stores the meaning of those values. Context, UST and Export can read the schema to preserve class names and band descriptions.

Column | Meaning
class_id | Integer value used in hard label rasters and stack band order.
class_name | Readable class name for reports and band descriptions.
source_value | Original attribute value from the vector layer or numeric raster value.
include | 1 = use class, 0 = exclude class.
merge_to | Optional target class ID for class merging.
priority | Reserved source/class priority field for advanced fusion.
quality_q | Class/source quality prior that can be used by UST workflows.
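
The string-to-ID mapping described in this section can be illustrated with a minimal Python sketch. It is not the plugin's own code: the function names build_class_schema and schema_to_csv, the reserved_ids parameter and the choice of alphabetical ordering as the meaning of "stable" are all assumptions made for illustration.

```python
import csv
import io

COLUMNS = ["class_id", "class_name", "source_value", "source_type",
           "include", "merge_to", "priority", "quality_q"]

def build_class_schema(values, reserved_ids=()):
    """Map distinct text values to integer class IDs in alphabetical order.

    Sorting keeps the mapping reproducible across runs (assumed meaning of
    "stable"). IDs listed in reserved_ids are skipped so that text classes
    do not collide with numeric class IDs that are already in use.
    """
    reserved = set(reserved_ids)
    schema = []
    next_id = 1
    for name in sorted(set(values)):
        while next_id in reserved:
            next_id += 1
        schema.append({
            "class_id": next_id, "class_name": name, "source_value": name,
            "source_type": "vector", "include": 1, "merge_to": next_id,
            "priority": 1.0, "quality_q": 1.0,
        })
        next_id += 1
    return schema

def schema_to_csv(schema):
    """Serialise schema rows using the column order of the example above."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(schema)
    return buffer.getvalue()

# Attribute values as they might appear in a vector class field.
schema = build_class_schema(["water", "building", "tree", "tree"])
lookup = {row["source_value"]: row["class_id"] for row in schema}
csv_text = schema_to_csv(schema)
```

With these inputs the lookup assigns building = 1, tree = 2 and water = 3, which matches the example schema. Rasterisation would then burn lookup[value] instead of the text value.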

Module overview

Module | Main task | Can run alone? | Advanced outputs
1 Preflight & class schema | Inspect CRS, fields, raster metadata and class values. | Yes | CSV/JSON class schema, HTML report.
2 Spatial reconstruction | Align labels to the EO/reference grid. | Yes | One-hot class stack, probability/support stack, source stack, coverage, agreement, purity.
3 Temporal reconciliation | Match EO dates to label snapshots or validity windows. | Yes | Tolerance, backtracking, nearest/previous/next/static/valid-window matching.
4 Helical / wave features | Represent seasonal time as cyclic features. | Yes | Fourier harmonics, in-between dates, class × helical interactions.
5 Context & risk | Describe spatial ambiguity and neighbourhood structure. | Yes | Multi-radius per-class support, local purity, and optional class-pair context.
6 Soft targets & weights | Create UST: soft labels, uncertainty, weights. | Yes | Per-class confidence, uncertainty and weights.
7 Export & report | Bundle outputs and metadata. | Yes | File copy bundle and raster statistics.

2 Spatial reconstruction

Spatial reconstruction aligns vector and raster label sources to the selected EO/reference grid. The reference grid defines CRS, pixel size, extent, alignment and NoData logic.

Default output

The default output is a single-band hard class-ID raster:

helix_label_hard.tif
pixel value = class ID

One band can contain many classes. For example, values 1–10 represent ten different classes.

Optional stack outputs

Output | Band logic | Use
helix_spatial_class_stack.tif | one band per class, one-hot 0/1 | Class-wise context, ML stack workflows.
helix_spatial_probabilities.tif | one band per class, source-vote/support fraction; weighted fusion can use schema priority × quality_q | Soft supervision and ambiguity analysis.
helix_spatial_source_labels.tif | one band per input source | Audit multiple sources and source disagreement.
helix_spatial_source_agreement.tif | one global band | UST source-risk input.
helix_spatial_purity.tif | one global band | Top-class dominance/support.

Use the primary vector layer when you want QGIS field dropdowns. Additional vector layers are supported in the advanced section. If the selected class field contains strings, HELIX writes the string-to-ID mapping to the schema and avoids collisions with numeric class IDs. Raster source values can also be remapped through schema source_value → class_id rows.
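
The band logic of the optional stacks can be illustrated per pixel with a short Python sketch. It assumes unweighted voting, a NoData value of 0 and ties resolved in favour of the first class in class_ids. The function names fuse_sources and one_hot are hypothetical, and the plugin's actual fusion rules (including priority × quality_q weighting) may differ.

```python
def fuse_sources(source_labels, class_ids, nodata=0):
    """Fuse the class IDs that several sources report for one pixel.

    Returns (hard_label, support, agreement):
      support   : fraction of valid source votes per class, in class_ids order
      agreement : vote fraction of the winning class (assumed definition)
    Ties go to the class that comes first in class_ids.
    """
    votes = [v for v in source_labels if v != nodata]
    if not votes:
        return nodata, [0.0] * len(class_ids), 0.0
    support = [votes.count(c) / len(votes) for c in class_ids]
    best = max(range(len(class_ids)), key=lambda i: support[i])
    return class_ids[best], support, support[best]

def one_hot(label, class_ids):
    """Encode one hard class ID as a 0/1 vector, one entry per class band."""
    return [1 if label == c else 0 for c in class_ids]

class_ids = [1, 2, 3]
# Three sources: two report class 1, one reports class 2.
hard, support, agreement = fuse_sources([1, 1, 2], class_ids)
bands = one_hot(hard, class_ids)
```

For this pixel the hard label is 1, the one-hot bands are 1, 0, 0, and the support fractions are 2/3, 1/3 and 0. A pixel where all sources agree would have an agreement of 1.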

3 Temporal reconciliation

Temporal reconciliation links EO acquisition dates with label snapshots or validity windows. This avoids pretending that all labels are valid at all times.

Concept | Meaning
Nearest label | Use the closest available label date.
Previous valid label | Use the latest earlier label.
Validity window | Use the label only inside valid-from/valid-to dates.
Backtracking | Allow previous labels to remain valid for a controlled number of days.
Temporal quality | A quality signal that can reduce UST weights when temporal mismatch is high.
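
The nearest and previous matching concepts, together with a day-based tolerance, can be sketched in Python as follows. The function names match_label and temporal_quality, the max_days parameter and the linear decay of temporal quality are assumptions for illustration, not the plugin's documented behaviour. The sketch also treats a label dated on the EO date itself as a valid "previous" label.

```python
from datetime import date

def match_label(eo_date, label_dates, mode="nearest", max_days=None):
    """Return (label_date, gap_days) for one EO date, or (None, None).

    mode="nearest"  : closest label date on either side of the EO date
    mode="previous" : latest label date on or before the EO date
    max_days        : optional tolerance / backtracking limit in days
    """
    if mode == "previous":
        candidates = [d for d in label_dates if d <= eo_date]
    elif mode == "nearest":
        candidates = list(label_dates)
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not candidates:
        return None, None
    best = min(candidates, key=lambda d: abs((eo_date - d).days))
    gap = abs((eo_date - best).days)
    if max_days is not None and gap > max_days:
        return None, None
    return best, gap

def temporal_quality(gap_days, max_days):
    """Assumed form: 1 for a same-day match, falling linearly to 0 at max_days."""
    return max(0.0, 1.0 - gap_days / max_days)

labels = [date(2023, 1, 1), date(2023, 7, 1)]
eo = date(2023, 6, 1)
nearest = match_label(eo, labels, mode="nearest")
previous = match_label(eo, labels, mode="previous")
limited = match_label(eo, labels, mode="previous", max_days=90)
```

Here the nearest label is the July snapshot, 30 days away, while the previous label is the January snapshot, 151 days away. With a 90-day backtracking limit the previous label is rejected and the EO date stays unmatched.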

4 Helical / wave features

Helical features encode cyclic time. Instead of using day-of-year as a linear number, HELIX represents seasonal position with sine/cosine waves:

sin(2π · DOY / year_length), cos(2π · DOY / year_length)

Additional harmonics represent annual, semi-annual and shorter seasonal cycles. The term “helical” reflects cyclic seasonal repetition combined with forward movement through time.
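
The sine/cosine encoding above, extended to higher harmonics, can be sketched in Python. The function names helical_features and feature_distance and the harmonics parameter are illustrative assumptions, and the sketch sets year_length to 365 or 366 depending on the calendar year.

```python
import calendar
import math
from datetime import date

def helical_features(d, harmonics=2):
    """Return [sin_1, cos_1, sin_2, cos_2, ...] for the date's day of year.

    Harmonic k uses the angle k * 2*pi * DOY / year_length, so k=1 is the
    annual cycle and k=2 the semi-annual cycle.
    """
    doy = d.timetuple().tm_yday
    year_length = 366 if calendar.isleap(d.year) else 365
    phase = 2.0 * math.pi * doy / year_length
    features = []
    for k in range(1, harmonics + 1):
        features.append(math.sin(k * phase))
        features.append(math.cos(k * phase))
    return features

def feature_distance(a, b):
    """Euclidean distance between two dates in annual (sin, cos) space."""
    return math.dist(helical_features(a, 1), helical_features(b, 1))

# 31 December and 1 January are 364 apart as DOY numbers,
# but adjacent on the seasonal circle.
new_year_gap = feature_distance(date(2023, 12, 31), date(2024, 1, 1))
half_year_gap = feature_distance(date(2023, 12, 31), date(2023, 7, 1))
```

The distance across the year boundary is close to zero, while the distance to midsummer is close to 2, the diameter of the unit circle. A linear day-of-year feature would rank these two pairs the other way round.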

By default, helical outputs are date/feature bands, not class bands. In advanced mode, HELIX can multiply class/soft-target bands by helical features to create class × time interaction stacks for research workflows. If no class/soft-target stack is available, a single-band hard class-ID raster can be provided and HELIX one-hot encodes it internally.

5 Context & risk features

Context describes whether a label is spatially clear or ambiguous. A clean polygon interior is different from a mixed boundary pixel.

Output | Meaning
helix_context_edge_risk.tif | Boundary/mixed-pixel risk using the main radius.
helix_context_diversity.tif | How often neighbouring pixels differ from the centre class.
helix_context_entropy.tif | Entropy of class probabilities/support.
helix_context_margin.tif | Difference between highest and second-highest class support.
helix_context_class_support_multiradius.tif | Optional one band per class and radius with local neighbourhood support.
helix_context_local_purity_multiradius.tif | One purity/dominance band per radius.
helix_context_class_pair_context.tif | Optional ordered class-pair context, e.g. class A near class B.

Hard-label input works directly. A single-band hard class raster with ten class IDs can produce ten class-support bands per radius. A probability or soft-target stack can also be used directly for soft context.
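
Several of the context layers listed above can be illustrated on a small hard-label grid. The sketch below assumes a square window clipped at the raster edge, a window that includes the centre pixel for class support, and entropy normalised to 0–1 by the logarithm of the class count. The function names are hypothetical and the plugin's exact kernels may differ.

```python
import math

def neighbourhood_support(labels, row, col, radius, class_ids):
    """Fraction of window pixels per class; radius is given in pixels."""
    rows, cols = len(labels), len(labels[0])
    window = [labels[r][c]
              for r in range(max(0, row - radius), min(rows, row + radius + 1))
              for c in range(max(0, col - radius), min(cols, col + radius + 1))]
    return [window.count(k) / len(window) for k in class_ids]

def entropy(support):
    """Shannon entropy of class support, normalised to the range 0-1."""
    if len(support) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in support if p > 0)
    return h / math.log(len(support))

def margin(support):
    """Highest minus second-highest class support."""
    top = sorted(support, reverse=True)
    return top[0] - top[1] if len(top) > 1 else top[0]

def diversity(labels, row, col, radius):
    """Fraction of neighbours (centre excluded) that differ from the centre."""
    rows, cols = len(labels), len(labels[0])
    neighbours = [labels[r][c]
                  for r in range(max(0, row - radius), min(rows, row + radius + 1))
                  for c in range(max(0, col - radius), min(cols, col + radius + 1))
                  if (r, c) != (row, col)]
    return sum(v != labels[row][col] for v in neighbours) / len(neighbours)

# A boundary between class 1 (left) and class 2 (right).
labels = [[1, 1, 2],
          [1, 1, 2],
          [1, 2, 2]]
centre_support = neighbourhood_support(labels, 1, 1, 1, [1, 2])
corner_support = neighbourhood_support(labels, 0, 0, 1, [1, 2])
```

The centre pixel sits on the boundary: its support is 5/9 against 4/9, so the margin is small and the entropy is close to 1. The top-left corner sees only class 1, giving full support, zero entropy and a margin of 1.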

6 Soft targets & weights (UST)

UST means Uncertainty-aware Supervision Target. It creates ML-ready supervision that does not force every pixel to be treated as perfectly certain.

Product | Band logic | Meaning
helix_soft_targets.tif | one band per class | Probabilistic training target.
helix_uncertainty.tif | one global band | Overall per-pixel uncertainty.
helix_training_weights.tif | one global band | How strongly the pixel should influence training.
helix_class_confidence.tif | optional one band per class | Class-wise confidence.
helix_class_uncertainty.tif | optional one band per class | Class-wise uncertainty diagnostics.
helix_class_weights.tif | optional one band per class | Class-wise weighted target support.

alpha = alpha_base + beta_edge · edge_risk + beta_temporal · temporal_risk + beta_source · source_risk + beta_context · context_risk

The soft target is formed by smoothing the hard/probability label using alpha. Overall weights decrease when uncertainty, edge risk, temporal risk (temporal mismatch), source disagreement or context ambiguity increase. If a class schema provides quality_q values, they reduce the effective quality Q and the per-class confidence and weights.
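
The alpha formula can be made concrete with a small Python sketch. The guide does not state the smoothing rule or the weight formula, so the sketch assumes standard label smoothing toward a uniform distribution and a weight of (1 − alpha) · quality_q. The coefficient defaults, the alpha_max cap and all function names are hypothetical values chosen for illustration only.

```python
def smoothing_alpha(edge_risk, temporal_risk, source_risk, context_risk,
                    alpha_base=0.05, beta_edge=0.2, beta_temporal=0.1,
                    beta_source=0.1, beta_context=0.1, alpha_max=0.9):
    """Evaluate the alpha formula; risks are expected in the range 0-1.

    The coefficient defaults and the clamp to [0, alpha_max] are assumptions.
    """
    alpha = (alpha_base
             + beta_edge * edge_risk
             + beta_temporal * temporal_risk
             + beta_source * source_risk
             + beta_context * context_risk)
    return min(max(alpha, 0.0), alpha_max)

def soft_target(class_probs, alpha):
    """Assumed smoothing rule: (1 - alpha) * p + alpha / K for K classes."""
    k = len(class_probs)
    return [(1.0 - alpha) * p + alpha / k for p in class_probs]

def training_weight(alpha, quality_q=1.0):
    """Assumed form: the weight falls with alpha and scales with quality_q."""
    return (1.0 - alpha) * quality_q

# A pixel labelled class 1 of 3, halfway into a boundary zone.
alpha = smoothing_alpha(edge_risk=0.5, temporal_risk=0.0,
                        source_risk=0.0, context_risk=0.0)
target = soft_target([1.0, 0.0, 0.0], alpha)
weight = training_weight(alpha, quality_q=1.0)
```

With these assumed coefficients alpha is 0.15, the soft target becomes roughly 0.90, 0.05, 0.05 and the training weight is 0.85. The target still sums to 1, so it can be used directly with a cross-entropy loss that accepts probabilistic targets.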

Units and conventions

Quantity | Unit / convention
CRS, extent, pixel size | Inherited from the EO/reference grid.
Neighbourhood radius | Pixels.
Temporal tolerance/backtracking | Days.
Probabilities, purity, agreement, uncertainty, Q | Usually 0–1.
Edge risk | 0–100 byte raster or normalized 0–1 internally.
Hard labels | Integer class IDs in one raster band.
Soft/probability/class stacks | One band per class, ordered by the class schema or class ID list.

Output naming

File | Module | Meaning
helix_class_schema.csv/json | Preflight / Spatial | Mapping from source class values to integer IDs and names.
helix_label_hard.tif | Spatial | Single-band class-ID raster.
helix_spatial_class_stack.tif | Spatial optional | One-hot class stack.
helix_spatial_probabilities.tif | Spatial optional | Class support/probability stack.
helix_context_class_support_multiradius.tif | Context optional | Per-class and per-radius local neighbourhood support.
helix_context_class_pair_context.tif | Context optional | Ordered class-pair neighbourhood interactions.
helix_soft_targets.tif | UST | Per-class soft targets.
helix_uncertainty.tif | UST | Overall uncertainty.
helix_training_weights.tif | UST | Overall training weights.
helix_manifest.json | Export | Reproducibility manifest.

Troubleshooting

QGIS shows mixed German and English labels

The plugin-owned labels are English. QGIS may still translate its own interface groups, such as the Advanced section, according to the QGIS application language.

I do not see a class-field dropdown

Select the label layer as Primary vector label layer. QGIS field menus are linked to that primary layer. Additional vector layers are still supported in the advanced section.

My classes are text values, not numbers

Run Preflight or Spatial directly. HELIX maps text values to integer IDs and writes the mapping to helix_class_schema.csv and helix_class_schema.json.

Should Spatial write confidence and uncertainty?

No. Spatial writes spatial alignment products. Context writes edge/neighbourhood risk. UST writes soft targets, uncertainty and weights.

Why not bilinear resampling for class labels?

Labels are categorical. Use nearest neighbour or mode/majority resampling for class labels.
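
A small Python sketch shows why averaging class IDs fails and what mode/majority resampling does instead. It assumes block-wise downsampling by an integer factor and breaks ties toward the lowest class ID. The function name majority_downsample is hypothetical and this is not the resampler that HELIX or QGIS uses internally.

```python
from collections import Counter

def majority_downsample(labels, factor):
    """Aggregate factor x factor blocks of class IDs by their most common value.

    Ties are resolved toward the lowest class ID (an assumption). Rows and
    columns that do not fill a complete block are dropped.
    """
    rows, cols = len(labels), len(labels[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [labels[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            counts = Counter(block)
            top = max(counts.values())
            out_row.append(min(v for v, n in counts.items() if n == top))
        out.append(out_row)
    return out

# Class IDs from the example schema: 1 = building, 2 = tree, 3 = water.
labels = [[1, 1, 3, 3],
          [1, 3, 3, 3],
          [1, 1, 2, 2],
          [1, 1, 2, 3]]
coarse = majority_downsample(labels, 2)

# Averaging a block of two building and two water pixels yields 2.0,
# which reads as "tree" although no tree pixel is present.
mixed_block_mean = sum([1, 3, 1, 3]) / 4
```

The majority result contains only class IDs that occur in the input blocks. An interpolating method such as bilinear resampling behaves like the averaging in the last line and can invent class IDs that were never observed.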