Core concept
The HELIX Labelling Framework does not treat labels as perfect, timeless, isolated class values. Instead, it prepares label information so that it can be used consistently with Earth Observation (EO) feature grids and machine-learning training pipelines.
Each module can run independently. A simple task can stop after Spatial reconstruction. A richer workflow can continue with context features, temporal matching, helical time features and UST (Uncertainty-aware Supervision Target) outputs.
Typical workflows
| I want to... | Use this module | Default output |
|---|---|---|
| Inspect available class fields and class values. | Preflight & class schema | helix_class_schema.csv and .json |
| Align vector or raster labels to an EO grid. | Spatial reconstruction | helix_label_hard.tif |
| Use multiple label sources and inspect agreement. | Spatial reconstruction with optional stacks | hard label + optional source/agreement/probability stacks |
| Match labels to EO dates. | Temporal reconciliation | match table and report |
| Create seasonal/cyclic date features. | Helical / wave features | CSV and optional raster feature stacks |
| Create boundary and neighbourhood risk layers. | Context & risk features | edge, diversity, entropy, margin, local purity |
| Create ML-ready soft supervision. | Soft targets & weights | soft targets, uncertainty, training weights |
Class schema and string classes
GeoTIFF class rasters store numeric values. If a vector class field contains strings such as building, tree or water, HELIX maps them automatically to stable integer class IDs before rasterisation.
class_id,class_name,source_value,source_type,include,merge_to,priority,quality_q
1,building,building,vector,1,1,1.0,1.0
2,tree,tree,vector,1,2,1.0,1.0
3,water,water,vector,1,3,1.0,1.0
The one-band hard label raster then stores values 1, 2, 3. The schema stores the meaning of those values. Context, UST and Export can read the schema to preserve class names and band descriptions.
| Column | Meaning |
|---|---|
| class_id | Integer value used in hard label rasters and stack band order. |
| class_name | Readable class name for reports and band descriptions. |
| source_value | Original attribute value from the vector layer or numeric raster value. |
| include | 1 = use class, 0 = exclude class. |
| merge_to | Optional target class ID for class merging. |
| priority | Reserved source/class priority field for advanced fusion. |
| quality_q | Class/source quality prior that can be used by UST workflows. |
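As a rough illustration of how a downstream script might consume the schema, the sketch below parses the example rows with Python's standard library and builds a lookup from source value to class ID. It assumes that rows with include = 0 are dropped and that a non-empty merge_to overrides class_id; neither rule is confirmed here as HELIX behaviour, and the function name load_value_to_id is hypothetical.

```python
import csv
import io

# Example schema rows copied from the section above.
SCHEMA_CSV = """class_id,class_name,source_value,source_type,include,merge_to,priority,quality_q
1,building,building,vector,1,1,1.0,1.0
2,tree,tree,vector,1,2,1.0,1.0
3,water,water,vector,1,3,1.0,1.0
"""


def load_value_to_id(schema_text):
    """Return {source_value: final class ID} for included classes.

    Assumption (not confirmed by HELIX): a non-empty merge_to replaces
    class_id, and include == 0 removes the class entirely.
    """
    lookup = {}
    for row in csv.DictReader(io.StringIO(schema_text)):
        if row["include"].strip() != "1":
            continue  # excluded class
        target = row["merge_to"].strip() or row["class_id"].strip()
        lookup[row["source_value"]] = int(target)
    return lookup


value_to_id = load_value_to_id(SCHEMA_CSV)
```

With the example rows, each text value maps back to its own ID because every merge_to equals the row's class_id.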
Module overview
| Module | Main task | Can run alone? | Advanced outputs / options |
|---|---|---|---|
| 1 Preflight & class schema | Inspect CRS, fields, raster metadata and class values. | Yes | CSV/JSON class schema, HTML report. |
| 2 Spatial reconstruction | Align labels to the EO/reference grid. | Yes | One-hot class stack, probability/support stack, source stack, coverage, agreement, purity. |
| 3 Temporal reconciliation | Match EO dates to label snapshots or validity windows. | Yes | Tolerance, backtracking, nearest/previous/next/static/valid-window matching. |
| 4 Helical / wave features | Represent seasonal time as cyclic features. | Yes | Fourier harmonics, in-between dates, class × helical interactions. |
| 5 Context & risk | Describe spatial ambiguity and neighbourhood structure. | Yes | Multi-radius per-class support, local purity, and optional class-pair context. |
| 6 Soft targets & weights | Create UST: soft labels, uncertainty, weights. | Yes | Per-class confidence, uncertainty and weights. |
| 7 Export & report | Bundle outputs and metadata. | Yes | File copy bundle and raster statistics. |
2 Spatial reconstruction
Spatial reconstruction aligns vector and raster label sources to the selected EO/reference grid. The reference grid defines CRS, pixel size, extent, alignment and NoData logic.
Default output
The default output is a single-band hard class-ID raster:
helix_label_hard.tif: pixel value = class ID
One band can contain many classes. For example, values 1–10 represent ten different classes.
Optional stack outputs
| Output | Band logic | Use |
|---|---|---|
| helix_spatial_class_stack.tif | one band per class, one-hot 0/1 | Class-wise context, ML stack workflows. |
| helix_spatial_probabilities.tif | one band per class, source-vote/support fraction; weighted fusion can use schema priority × quality_q | Soft supervision and ambiguity analysis. |
| helix_spatial_source_labels.tif | one band per input source | Audit multiple sources and source disagreement. |
| helix_spatial_source_agreement.tif | one global band | UST source-risk input. |
| helix_spatial_purity.tif | one global band | Top-class dominance/support. |
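The band logic of the class stack and the probability stack can be sketched on tiny in-memory grids. The example below is an unweighted illustration only: it assumes 0 is the NoData value, counts one vote per source, and ignores the priority × quality_q weighting. The function names one_hot and vote_fractions are hypothetical, not HELIX functions.

```python
NODATA = 0  # assumed NoData value; HELIX takes NoData logic from the reference grid


def one_hot(hard, class_ids):
    """One 0/1 band per class ID from a single-band class-ID grid."""
    return [[[1 if value == cid else 0 for value in row] for row in hard]
            for cid in class_ids]


def vote_fractions(sources, class_ids, nodata=NODATA):
    """One band per class: fraction of valid sources voting for that class."""
    rows, cols = len(sources[0]), len(sources[0][0])
    stack = [[[0.0] * cols for _ in range(rows)] for _ in class_ids]
    for r in range(rows):
        for c in range(cols):
            # Only sources with a valid label at this pixel get a vote.
            votes = [src[r][c] for src in sources if src[r][c] != nodata]
            for band, cid in enumerate(class_ids):
                if votes:
                    stack[band][r][c] = votes.count(cid) / len(votes)
    return stack


# Two already-aligned label sources on a 2 x 2 grid.
source_a = [[1, 2], [3, 0]]
source_b = [[1, 3], [3, 0]]
class_ids = [1, 2, 3]

class_stack = one_hot(source_a, class_ids)
probabilities = vote_fractions([source_a, source_b], class_ids)
```

In this toy case the two sources disagree on the top-right pixel, so classes 2 and 3 each receive a support of 0.5 there, while pixels where both sources agree receive 1.0 for the shared class.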
Use the primary vector layer when you want QGIS field dropdowns. Additional vector layers are supported in the advanced section. If the selected class field contains strings, HELIX writes the string-to-ID mapping to the schema and avoids collisions with numeric class IDs. Raster source values can also be remapped through schema source_value → class_id rows.
3 Temporal reconciliation
Temporal reconciliation links EO acquisition dates with label snapshots or validity windows. This avoids pretending that all labels are valid at all times.
| Concept | Meaning |
|---|---|
| Nearest label | Use the closest available label date. |
| Previous valid label | Use the latest earlier label. |
| Validity window | Use the label only inside valid-from/valid-to dates. |
| Backtracking | Allow previous labels to remain valid for a controlled number of days. |
| Temporal quality | A quality signal that can reduce UST weights when temporal mismatch is high. |
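The matching concepts in the table can be illustrated with plain dates. This is a minimal sketch, not the HELIX implementation: the function names, the mode strings and the linear decay used for temporal quality are all assumptions made for the example.

```python
from datetime import date


def match_label(eo_date, label_dates, mode="nearest", tolerance_days=None):
    """Return (label_date, gap_in_days), or None if nothing qualifies.

    mode="nearest"  : closest label date on either side.
    mode="previous" : latest label date on or before the EO date.
    tolerance_days  : optional maximum allowed gap in days.
    """
    if mode == "nearest":
        candidates = list(label_dates)
    elif mode == "previous":
        candidates = [d for d in label_dates if d <= eo_date]
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not candidates:
        return None
    best = min(candidates, key=lambda d: abs((eo_date - d).days))
    gap = abs((eo_date - best).days)
    if tolerance_days is not None and gap > tolerance_days:
        return None
    return best, gap


def in_validity_window(eo_date, valid_from, valid_to):
    """True if the EO date lies inside the label's validity window."""
    return valid_from <= eo_date <= valid_to


def temporal_quality(gap_days, tolerance_days):
    """Hypothetical quality signal: 1 at zero gap, 0 at the tolerance."""
    return max(0.0, 1.0 - gap_days / tolerance_days)


labels = [date(2020, 1, 1), date(2021, 1, 1)]
eo = date(2020, 9, 1)
nearest = match_label(eo, labels, mode="nearest")
previous = match_label(eo, labels, mode="previous")
```

Here the nearest label is the 2021 snapshot, whereas the previous-valid rule selects the 2020 snapshot; adding a tolerance of 180 days rejects the previous match because the gap is longer than that.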
4 Helical / wave features
Helical features encode cyclic time. Instead of using day-of-year as a linear number, HELIX represents seasonal position with sine/cosine waves.
Additional harmonics represent annual, semi-annual and shorter seasonal cycles. The term “helical” reflects cyclic seasonal repetition combined with forward movement through time.
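A common way to build such features is a sine/cosine pair per harmonic of the annual cycle. The sketch below shows that general technique; the period of 365.25 days, the default of two harmonics and the function name helical_features are assumptions for illustration, not documented HELIX settings.

```python
import math


def helical_features(day_of_year, n_harmonics=2, period=365.25):
    """Return [(sin, cos), ...] for harmonics k = 1..n_harmonics.

    k = 1 is the annual cycle and k = 2 the semi-annual cycle.
    The period of 365.25 days is an assumed value.
    """
    features = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * day_of_year / period
        features.append((math.sin(angle), math.cos(angle)))
    return features


# A quarter of the way through the year, the annual sine term peaks.
quarter_year = helical_features(365.25 / 4)
```

Because the features depend only on the phase within the period, dates exactly one period apart receive the same values, which is what makes the encoding cyclic rather than linear.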
By default, helical outputs are date/feature bands, not class bands. In advanced mode, HELIX can multiply class/soft-target bands by helical features to create class × time interaction stacks for research workflows. If no class/soft-target stack is available, a single-band hard class-ID raster can be provided and HELIX one-hot encodes it internally.
5 Context & risk features
Context describes whether a label is spatially clear or ambiguous. A clean polygon interior is different from a mixed boundary pixel.
| Output | Meaning |
|---|---|
| helix_context_edge_risk.tif | Boundary/mixed-pixel risk using the main radius. |
| helix_context_diversity.tif | How often neighbouring pixels differ from the centre class. |
| helix_context_entropy.tif | Entropy of class probabilities/support. |
| helix_context_margin.tif | Difference between highest and second-highest class support. |
| helix_context_class_support_multiradius.tif | Optional one band per class and radius with local neighbourhood support. |
| helix_context_local_purity_multiradius.tif | One purity/dominance band per radius. |
| helix_context_class_pair_context.tif | Optional ordered class-pair context, e.g. class A near class B. |
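Three of the measures in the table can be written down compactly. The sketch below is one plausible reading of the table, under stated assumptions: entropy is Shannon entropy divided by the log of the class count so that it falls in 0–1, and diversity uses a square window with the radius given in pixels. The function names are hypothetical.

```python
import math


def normalised_entropy(support):
    """Shannon entropy of class support, scaled to 0-1 (assumed scaling)."""
    total = sum(support)
    if total <= 0 or len(support) < 2:
        return 0.0
    probs = [s / total for s in support if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(support))


def margin(support):
    """Difference between the highest and second-highest class support."""
    ranked = sorted(support, reverse=True)
    return ranked[0] - (ranked[1] if len(ranked) > 1 else 0.0)


def diversity(grid, row, col, radius=1):
    """Fraction of neighbours in a square window that differ from the centre."""
    centre = grid[row][col]
    differ = total = 0
    for r in range(max(0, row - radius), min(len(grid), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(grid[0]), col + radius + 1)):
            if (r, c) == (row, col):
                continue  # skip the centre pixel itself
            total += 1
            differ += grid[r][c] != centre
    return differ / total if total else 0.0


# Class 1 on the left, class 2 along the right-hand column.
labels = [[1, 1, 2],
          [1, 1, 2],
          [1, 1, 2]]
boundary_diversity = diversity(labels, 1, 1)  # centre pixel touches class 2
interior_diversity = diversity(labels, 0, 0)  # corner pixel sees only class 1
```

The centre pixel sits next to the class boundary and therefore scores higher diversity than the corner pixel, which matches the distinction between a clean interior and a mixed boundary.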
6 Soft targets & weights (UST)
UST means Uncertainty-aware Supervision Target. It creates ML-ready supervision that does not force every pixel to be treated as perfectly certain.
| Product | Band logic | Meaning |
|---|---|---|
| helix_soft_targets.tif | one band per class | Probabilistic training target. |
| helix_uncertainty.tif | one global band | Overall per-pixel uncertainty. |
| helix_training_weights.tif | one global band | How strongly the pixel should influence training. |
| helix_class_confidence.tif | optional one band per class | Class-wise confidence. |
| helix_class_uncertainty.tif | optional one band per class | Class-wise uncertainty diagnostics. |
| helix_class_weights.tif | optional one band per class | Class-wise weighted target support. |
The soft target is formed by smoothing the hard or probability label with the smoothing parameter alpha. Overall weights decrease as uncertainty, edge risk, source disagreement, temporal mismatch or context ambiguity increases. If a class schema provides quality_q values, they reduce the effective Q and the per-class confidence and weights.
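The exact UST formulas are not given here, so the following is only a sketch of the general idea using two standard building blocks: uniform label smoothing for the soft target, and a multiplicative penalty for the weight. Both formulas, the default alpha of 0.1 and the function names soften and training_weight are assumptions for illustration.

```python
def soften(probabilities, alpha=0.1):
    """Uniform label smoothing: blend the label with a flat distribution.

    Assumed form: (1 - alpha) * p + alpha / K for K classes.
    """
    n_classes = len(probabilities)
    return [(1.0 - alpha) * p + alpha / n_classes for p in probabilities]


def training_weight(quality, risks):
    """Hypothetical weight: quality prior times (1 - risk) for each risk.

    Every risk is clipped to 0-1, so any increase in a risk term can
    only lower the weight.
    """
    weight = quality
    for risk in risks:
        weight *= 1.0 - min(max(risk, 0.0), 1.0)
    return weight


# A hard label for class 1 out of three classes.
soft = soften([1.0, 0.0, 0.0], alpha=0.3)

# Edge risk 0.5 and temporal mismatch 0.2 with a perfect quality prior.
weight = training_weight(1.0, [0.5, 0.2])
```

With alpha = 0.3 the hard label becomes roughly 0.8 / 0.1 / 0.1, and the two risk terms reduce the weight of an otherwise perfect pixel to about 0.4.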
Units and conventions
| Quantity | Unit / convention |
|---|---|
| CRS, extent, pixel size | Inherited from the EO/reference grid. |
| Neighbourhood radius | Pixels. |
| Temporal tolerance/backtracking | Days. |
| Probabilities, purity, agreement, uncertainty, Q | Usually 0–1. |
| Edge risk | 0–100 in the byte raster; normalised to 0–1 internally. |
| Hard labels | Integer class IDs in one raster band. |
| Soft/probability/class stacks | One band per class, ordered by the class schema or class ID list. |
Output naming
| File | Module | Meaning |
|---|---|---|
| helix_class_schema.csv/json | Preflight / Spatial | Mapping from source class values to integer IDs and names. |
| helix_label_hard.tif | Spatial | Single-band class-ID raster. |
| helix_spatial_class_stack.tif | Spatial optional | One-hot class stack. |
| helix_spatial_probabilities.tif | Spatial optional | Class support/probability stack. |
| helix_context_class_support_multiradius.tif | Context optional | Per-class and per-radius local neighbourhood support. |
| helix_context_class_pair_context.tif | Context optional | Ordered class-pair neighbourhood interactions. |
| helix_soft_targets.tif | UST | Per-class soft targets. |
| helix_uncertainty.tif | UST | Overall uncertainty. |
| helix_training_weights.tif | UST | Overall training weights. |
| helix_manifest.json | Export | Reproducibility manifest. |
Troubleshooting
QGIS shows mixed German and English labels
The plugin-owned labels are English. QGIS may still translate its own interface groups, such as the Advanced section, according to the QGIS application language.
I do not see a class-field dropdown
Select the label layer as the Primary vector label layer. QGIS field menus are linked to that primary layer. Additional vector layers are still supported in the advanced section.
My classes are text values, not numbers
Run Preflight or Spatial directly. HELIX maps text values to integer IDs and writes the mapping to helix_class_schema.csv and helix_class_schema.json.
Should Spatial write confidence and uncertainty?
No. Spatial writes spatial alignment products. Context writes edge/neighbourhood risk. UST writes soft targets, uncertainty and weights.
Why not bilinear resampling for class labels?
Labels are categorical. Use nearest neighbour or mode/majority resampling for class labels.
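A small numeric example makes the problem concrete. Using the class IDs from the schema example (1 = building, 2 = tree, 3 = water), interpolating between a building pixel and a water pixel produces the value 2, which would be read as tree although no tree is present. The helper names below are illustrative only.

```python
from collections import Counter


def linear_interpolate(a, b, t):
    """Linear blend between two values, as bilinear resampling does per axis."""
    return a + t * (b - a)


def majority_resample(values):
    """Mode/majority resampling: return the most frequent class ID."""
    return Counter(values).most_common(1)[0][0]


BUILDING, TREE, WATER = 1, 2, 3

# Halfway between a building pixel and a water pixel.
blended = linear_interpolate(BUILDING, WATER, 0.5)

# Majority vote over four source pixels keeps a class that is really present.
majority = majority_resample([BUILDING, WATER, WATER, WATER])
```

The blended value equals the ID for tree, a class that does not occur in the inputs, while the majority result is water, which does.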