Dynamically updating layers in QGIS with PostGIS views
View here on YouTube: https://www.youtube.com/watch?v=oOhbbEkl4Kg
PostGIS 2.0 is out and the awesomeness continues! You can install PostGIS 2.0 on Ubuntu using packages, which is exactly what I am going to show you here. Read on for details on how to get up and running and do your first simple raster analysis!
Note: You should make good backups first!
Before we begin, you should uninstall your existing PostGIS packages:
sudo dpkg --purge postgis postgresql-9.1-postgis
Then add a new repository and install PostGIS from there (based on this post):
sudo apt-add-repository ppa:sharpie/for-science  # To get GEOS 3.3.2
sudo apt-add-repository ppa:sharpie/postgis-nightly
sudo apt-get update
sudo apt-get install postgresql-9.1-postgis
Next we should create a new template database (optional but recommended).
createdb -E UTF8 template_postgis2
createlang -d template_postgis2 plpgsql
psql -d postgres -c "UPDATE pg_database SET datistemplate='true' WHERE datname='template_postgis2'"
psql -d template_postgis2 -f /usr/share/postgresql/9.1/contrib/postgis-2.0/postgis.sql
psql -d template_postgis2 -f /usr/share/postgresql/9.1/contrib/postgis-2.0/spatial_ref_sys.sql
psql -d template_postgis2 -f /usr/share/postgresql/9.1/contrib/postgis-2.0/rtpostgis.sql
psql -d template_postgis2 -c "GRANT ALL ON geometry_columns TO PUBLIC;"
psql -d template_postgis2 -c "GRANT ALL ON geography_columns TO PUBLIC;"
psql -d template_postgis2 -c "GRANT ALL ON spatial_ref_sys TO PUBLIC;"
createdb training -T template_postgis2
Ok now we can load a raster (see sample data download below):
raster2pgsql -s 4326 srtm_4326.tif | psql training
shp2pgsql -s 4326 -d -g geom -I places.shp places | psql training
Good - now our spatial database is ready to use - and has raster support! Here is a nice example of what you can do. The query looks up the altitude from the SRTM raster for each place listed using the ST_Value function:
select ST_Value(rast, geom, true) from places, srtm_4326;
It should return a single column of elevation values, one for each place.
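If you would also like to see which place each value belongs to, a slightly extended query should do it. This is just a sketch - it assumes the places shapefile has a 'name' attribute, so adjust the column name to match your data:
-- sample the SRTM elevation under each place point
select p.name, ST_Value(s.rast, p.geom, true) as elevation
from places p, srtm_4326 s;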
Further reading: A really good place to start is the Boston GIS cheatsheets - I am really looking forward to exploring all the new features that are available in PostGIS 2.0, thanks to all those involved in building it!
A good world dataset is pretty much essential for most GIS users - as a standard reference dataset, to give context to maps and so on. Anita Graser's recent blog post about the Natural Earth Data project added another nice dataset to my world dataset collection. The Natural Earth dataset (I got the 'full' one) is provided as shapefiles. Of course as a PostGIS fan I'd rather have these in my geospatial database, so I wrote a one-liner in bash to load all the datasets.
First download the full dataset from here, and then extract the contents into a working directory somewhere. Then simply run this line, adjusting your destination database and schema as needed:
for FILE in `find . -name "*.shp"`; do \
  BASE=`basename $FILE .shp`; \
  /usr/lib/postgresql/9.1/bin/shp2pgsql -s 4326 -I $FILE world.$BASE \
  | psql gis; done
Ok that looks like four lines above, but only because I added some line breaks so things would be readable on this blog.
Note that you should replace 'world' with the name of the schema you are importing the data into, and 'gis' with the name of the database you are importing your data into. Thanks for the awesome dataset Natural Earth Team!
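As a quick sanity check after the import, you can ask PostGIS what it registered - something along these lines, assuming the 'world' schema and 'gis' database used above:
-- list the imported layers, their geometry types and SRIDs
select f_table_name, type, srid
from geometry_columns
where f_table_schema = 'world'
order by f_table_name;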
Just a quickie because I always forget this and need to look it up again. If you have a query in PostgreSQL and you want to include a sequential identifier in the output, you can use the very handy ROW_NUMBER() window function. Note that you need PostgreSQL 8.4+ I believe. Here is a quick example:
create view vw_farms as
  select
    ROW_NUMBER() over (order by parcel.farm_name) as id,
    parcel.farm_name,
    parcel.geometry
  from parcel
  where parcel.farm_name != ''
  limit 10;
Which returns something like this:
 id |     farm_name
----+-------------------
  1 | ADDERLEY 66
  2 | ADDERLEY 66
  3 | ADDERLEY 66
  4 | ADDERLEY 66
  5 | ADDERLEY 66
  6 | AFDAKS RIVIER 575
  7 | AFDAKS RIVIER 575
  8 | AFDAKS RIVIER 575
  9 | AFDAKS RIVIER 575
 10 | AFDAKS RIVIER 575
(10 rows)
This function is documented in the Postgresql docs here.
One more thing - if you would like to use the above technique to bring a view into QGIS, and the view does not otherwise meet QGIS' requirement of having a column derived from a primary key or a column with a unique constraint on it, you can cast the row number to int4 like this:
create view vw_farms as
  select
    int4(ROW_NUMBER() over (order by parcel.farm_name)) as id,
    parcel.farm_name,
    parcel.geometry
  from parcel
  where parcel.farm_name != ''
  limit 10;
The last step then is to explicitly tell QGIS to use this column as the primary key. Do this in the PostGIS add layer dialog by selecting the view and then choosing the id field from the combo box in the primary key column, as illustrated in the image below.
I love bash and the GNU tools provided on a Linux system. I am always writing little one-liners that help me do my work. Today I was generating some fixtures for Django (yes, I am finally properly learning to use unit testing in Django). I wanted to know how many records were in each table in my database. Here is the little one-liner I came up with:
for TABLE in $(psql foo-test -c "\dt" | awk '{print $3}'); do \
  psql foo-test -c "select '$TABLE', count(*) from $TABLE;" \
  | grep -A1 "\-\-\-\-" | tail -1; done
It produces output that looks something like this:
 auth_group | 0
 auth_group_permissions | 0
 auth_message | 0
 auth_permission | 273
 auth_user | 366
 auth_user_groups | 0
 auth_user_user_permissions | 0
etc.
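If approximate numbers are good enough, the statistics collector can give you the same overview from a single query. A sketch - note that n_live_tup is an estimate maintained by autovacuum/ANALYZE, not an exact count:
-- approximate row counts for all user tables
select schemaname, relname, n_live_tup
from pg_stat_user_tables
order by relname;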
Ok so I'm feeling a bit left behind - everyone else seems to have jumped over to 9.0 and I have resolutely been sticking to the stable version shipped with Ubuntu or Debian (depending on the machine I am working on). However, a client has a bunch of GIS data in PostgreSQL 9.0 / PostGIS 1.5 and I needed to be able to replicate their environment. In this article I will describe how I got PostgreSQL 8.4 running side by side with PostgreSQL 9.0, both with PostGIS installed.
I will assume you already have postgresql 8.4 and postgis installed via Ubuntu apt...
Add this to your /etc/apt/sources.list (as root):
# ppa for postgresql 9.x
deb http://ppa.launchpad.net/pitti/postgresql/ubuntu natty main
deb-src http://ppa.launchpad.net/pitti/postgresql/ubuntu natty main
Then do:
sudo apt-get update
Note: Although PostgreSQL 9.1 beta packages are available, PostGIS 1.5 won't work with it, so use 9.0.
sudo apt-get install postgresql-contrib-9.0 postgresql-client-9.0 \
  postgresql-server-dev-9.0 postgresql-9.0
You should check that you have the correct pg_config in your path:
pg_config --version
If it is not 9.0, link it:
cd /usr/bin
sudo mv pg_config pg_config.old
sudo ln -s /usr/lib/postgresql/9.0/bin/pg_config .
There are no PostGIS packages available for PostgreSQL 9.0 yet, so we need to build PostGIS from source.
cd ~/dev/cpp
wget -c http://postgis.refractions.net/download/postgis-1.5.2.tar.gz
tar xfz postgis-1.5.2.tar.gz
cd postgis-1.5.2
./configure
You should get something like this (note that it is building against PostgreSQL 9):
PostGIS is now configured for x86_64-unknown-linux-gnu

 -------------- Compiler Info -------------
  C compiler:           gcc -g -O2
  C++ compiler:         g++ -g -O2

 -------------- Dependencies --------------
  GEOS config:          /usr/bin/geos-config
  GEOS version:         3.2.0
  PostgreSQL config:    /usr/bin/pg_config
  PostgreSQL version:   PostgreSQL 9.0.4
  PROJ4 version:        47
  Libxml2 config:       /usr/bin/xml2-config
  Libxml2 version:      2.7.8
  PostGIS debug level:  0

 -------- Documentation Generation --------
  xsltproc:             /usr/bin/xsltproc
  xsl style sheets:
  dblatex:
  convert:              /usr/bin/convert
Now build and install
make
sudo make install
Apt will install 9.0 side by side with 8.x without issue - it just sets it to run on a different port (5433 as opposed to the default 5432). This means you can comfortably use both (at the expense of some disk and processor load on your system). In order to make sure you are using the correct database, always add -p 5433 when you wish to connect to 9.0. Your 8.4 setup should keep running unchanged. The 9.0 install will be 'blank' to begin with (no user-created tables). I always make myself a superuser on the database and create an account that matches my unix login account so that I can use ident authentication for my day to day work:
sudo su - postgres
createuser -p 5433 -d -E -i -l -P -r -s timlinux
Note the -p option to specify that I am connecting to postgresql 9.0! After you have entered your password twice type ```exit``` to exit the postgres user account.
You can create a new template, install the PostGIS functions into each database that needs them separately, or (as I prefer) install them into template1 so that every database you create is automatically spatially enabled. I do this because I very seldom need a non-spatial database, and I like to have PostGIS there by default.
createlang -p 5433 plpgsql template1
psql -p 5433 template1 -f \
  /usr/share/postgresql/9.0/contrib/postgis-1.5/postgis.sql
psql -p 5433 template1 -f \
  /usr/share/postgresql/9.0/contrib/postgis-1.5/spatial_ref_sys.sql
To check that the install worked, connect to the template1 database and run a simple query:
psql -p 5433 template1
select postgis_lib_version();
\q
It should produce output like this:
WARNING: psql version 8.4, server version 9.0.
Some psql features might not work.
Type "help" for help.

template1=# select postgis_lib_version();
 postgis_lib_version
---------------------
 1.5.2
(1 row)
The last thing to do is to create a new database. As you may have noticed above, the 8.4 command line tools are still the default because they have PATH preference, so when working explicitly with PostgreSQL 9.0 it's good to do:
export PATH=/usr/lib/postgresql/9.0/bin:$PATH
in your shell to put postgresql first in your path. A quick test will confirm that it has worked:
createdb --version
createdb (PostgreSQL) 9.0.4
With that set up nicely, let's create our new database:
createdb -p 5433 foo
psql -p 5433 -l
Update: There are packages available for postgis (see comments below). Also, for convenience you can set your pg port to 5433 for your session by doing:
export PGPORT=5433
The GeoNames dataset provides a list of global placenames, their locations and some additional information such as population. It is published under the Creative Commons Attribution 3.0 License, which allows you to use the data for your own purposes. You can download the data by country, or the entire global dataset. In this article, I will walk you through how I downloaded the entire dataset, loaded it into PostgreSQL and added a geometry column so that I could view it in QGIS. Note that you can easily adapt these instructions to a specific country's data.
First up, let's get the data from the geonames download page!
wget -c http://download.geonames.org/export/dump/allCountries.zip
Note the download is around 196 MB, so if you live in an internet backwater like I do, expect it to take a little while. If the download gets disconnected, just rerun the same command again - the '-c' option tells wget to continue where it left off last time.
Once the data is downloaded, unzip it:
unzip allCountries.zip
You should now have a text file called allCountries.txt weighing in at just under 900 MB. Next we can load it into PostgreSQL using a variation of this article. I highly recommend the use of schemas to partition your database into logical units. In the code listings that follow, it is assumed you have a schema called 'world'. If you need to create it, simply do:
create schema world;
from the psql prompt. Since I am only interested in the geoname table at the moment, I simply create that one table in my database:
create table world.geoname (
  geonameid      int,
  name           varchar(200),
  asciiname      varchar(200),
  alternatenames varchar(8000),
  latitude       float,
  longitude      float,
  fclass         char(1),
  fcode          varchar(10),
  country        varchar(2),
  cc2            varchar(60),
  admin1         varchar(20),
  admin2         varchar(80),
  admin3         varchar(20),
  admin4         varchar(20),
  population     bigint,
  elevation      int,
  gtopo30        int,
  timezone       varchar(40),
  moddate        date
);
You will notice that I extended the alternatenames field size from the original tutorial's 4000 characters to 8000 characters in order to accommodate some longer entries that were causing my imports to fail with 4000 chars.
Next we can import the data (also from the psql prompt):
copy world.geoname (geonameid, name, asciiname, alternatenames,
  latitude, longitude, fclass, fcode, country, cc2,
  admin1, admin2, admin3, admin4, population, elevation, gtopo30,
  timezone, moddate)
from '/home/web/allCountries.txt' null as '';
Once again this is similar to the import line used by the original article I used, except I have used a full path to my allCountries.txt file. The import may take a little while depending on the speed of your computer.
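Note that COPY ... FROM a server-side file normally requires superuser rights. If you are loading the data as an ordinary user, psql's client-side \copy variant should work the same way (a sketch using the same columns and path as above; \copy must be given on a single line):
\copy world.geoname (geonameid, name, asciiname, alternatenames, latitude, longitude, fclass, fcode, country, cc2, admin1, admin2, admin3, admin4, population, elevation, gtopo30, timezone, moddate) from '/home/web/allCountries.txt' null as ''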
When it is completed, you should have a bunch of data (~7.5 million records) in your table:
gis=# select count(*) from world.geoname;
  count
---------
 7664712
(1 row)
It is going to be useful to have a unique id for every record - QGIS for one would like to have it - so let's add one (in addition to the geonameid field):
alter table world.geoname add column id serial not null;
CREATE UNIQUE INDEX idx_geoname_id ON world.geoname (id);
Because I know I will be using some other fields as the basis for subset queries etc., I added indexes on those too:
CREATE INDEX idx_geoname_population ON world.geoname (population);
CREATE INDEX idx_geoname_fcode ON world.geoname (fcode);
Ok, that's all great, but there is no geometry column yet, so we can't view this in QGIS easily. So let's GIS-enable the table! First we add a new geometry column:
alter table world.geoname add column the_geom geometry;
Now we populate the geometry column:
update world.geoname set the_geom = st_makepoint(longitude,latitude);
Next we add a constraint to ensure that the column always contains a point record or is null:
alter table world.geoname add constraint enforce_geotype_the_geom
  CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL);
Finally, let's index the table on the geometry field:
CREATE INDEX idx_geoname_the_geom ON world.geoname USING gist(the_geom);
Ok next we can connect to the database using QGIS and view our data! Note that you may want to filter the data or set up scale dependent visibility so that QGIS doesn't try to render all 7.5 million points when you zoom out.
I added a query filter like this to the layer properties -> general tab -> subset:
"population" >= 10000 AND fcode != 'PPLA' and fcode != 'PCLI' AND fcode != 'ADM1'
You should experiment and read the geonames documentation in order to define a filter that is useful for your purposes.
The Geonames dataset is a wonderful asset to the GIS community, particularly to those of us who value Free Software and Free Data.
Update 6 April 2011:
I encountered one issue with the above procedure: the SRID for the imported points is -1 when loaded instead of 4326. This causes problems - for example, in MapServer you get an error like this:
ERROR: Operation on two geometries with different SRIDs
To fix the issue I ran the following update query to assign all points a proper SRID:
update world.geoname set the_geom = ST_SETSRID(the_geom,4326);
Note that it takes quite a while to run. When it completes, you can verify that the points are all in the correct CRS by running this little query:
gis=# select distinct ST_SRID(the_geom) from world.geoname;
Which should produce something like this:
 st_srid
---------
    4326
(1 row)
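With the SRID fixed and the GiST index in place, spatial filters behave as you would expect. For example, a bounding box query like this should use the index (a sketch - the envelope is just an arbitrary window roughly over Cape Town, and ST_MakeEnvelope needs PostGIS 1.5 or newer):
-- ten most populous places inside an arbitrary bounding box
select name, population
from world.geoname
where the_geom && ST_MakeEnvelope(18.0, -34.5, 19.0, -33.5, 4326)
order by population desc
limit 10;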
For good measure, I also added the geoname table to my geometry columns table:
insert into geometry_columns (f_table_catalog, f_table_schema, f_table_name,
  f_geometry_column, coord_dimension, srid, type)
values ('', 'world', 'geoname', 'the_geom', 2, 4326, 'POINT');
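As an aside, if you were starting from scratch you could let PostGIS do this bookkeeping for you - the AddGeometryColumn function adds the column, applies the type and SRID constraints, and registers the entry in geometry_columns in one step. A sketch of what that would look like for this table:
-- creates world.geoname.the_geom and registers it in geometry_columns
select AddGeometryColumn('world', 'geoname', 'the_geom', 4326, 'POINT', 2);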
Lastly I also gave select permissions to my readonly user (which I use when publishing in a web environment):
grant usage on schema world to readonly;
grant select on world.geoname to readonly;
I have also fixed the formatting of some of the SQL snippets above for those who struggled to read it within the width of this page.
I spent most of last week in Dar es Salaam, Tanzania - a lovely tropical country in the heart of Africa. I was there as part of a project I am working on to create tools for Biodiversity Informatics practitioners. Of course the tools are based on Free Software: Quantum GIS and openModeller.
The attendees at the workshop were entertained by my talk about what FOSS is and why it is important, an introduction to QGIS slideshow (superbly presented by Marco Hugentobler), and ending with a tour of openModellerDesktop. We also did some live demonstrations of QGIS and openModeller, before going on to discuss details about how these tools can be used to support their Biodiversity Informatics workflows.
The meeting was funded by the Global Biodiversity Information Facility (GBIF), with Juan Bello as their representative, and hosted by the Tanzanian Commission for Science and Technology (COSTECH).
In case you are unfamiliar with the aims of GBIF, they are facilitating the digitisation (or digitization for our American readers) of the world's biodiversity records - herbarium records, museum collections and so on. COSTECH provides the local infrastructure and staff for the 'TanBif' node in Tanzania.
The meeting also included 'in-country' experts in the fields of GIS, Meteorology, Ecology, IT and so on. I think for all of the attendees, the concept of FOSS was a real eye-opener. African economies can't compare with those in Europe and the USA and the capital outlay for proprietary software that presents an irritation in the Western world is a major burden in the third world. So just knowing that they could dive in and use QGIS was a great revelation.
We finished our workshop a little early on the Friday, so Marco and I offered to go along to the COSTECH offices and geo-enable their PostgreSQL species occurrence database and install QGIS on their desktop PCs running Windows XP. In the space of a couple of hours we were done - the major part of which was spent showing the TanBif staff members how to bring up the PostGIS layer in QGIS, perform simple queries and make maps. Having spent days in the past trying to get proprietary software like Oracle and Arc*** configured, optimised, licensed and generally usable, I was struck by just how easy and quick it is to get someone up and running with a robust, enterprise-ready PostGIS geospatial datastore and a user-friendly Free Software desktop GIS like QGIS.
Thanks to the friendly Tanzanian folks for their hospitality - I look forward to my next visit! Here are some piccies from the trip...
Juan Bello telling us about the cool things you can do with a good Biodiversity Information repository.
The workshop attendees (Marco and Juan out of shot)
Marco showing Godfrey how to use QGIS to bring up their PostGIS Biodiversity dataset.
Godfrey proudly showing off his first map (made with QGIS)!
Marco killing a mosquito - he became something of an expert!
I've been having a little poke around with Mapnik today (awesome software!). One of the things on my todo list has been to sort out rendering issues with roads we have been having. Our last iteration described roads something like this:
A style...
<Style name="Freeway30th_style">
  <Rule>
    <LineSymbolizer>
      <CssParameter name="stroke">rgb(169,170,153)</CssParameter>
      <CssParameter name="stroke-width">12.26</CssParameter>
      <CssParameter name="stroke-linejoin">bevel</CssParameter>
      <CssParameter name="stroke-linecap">round</CssParameter>
      <CssParameter name="stroke-opacity">1</CssParameter>
    </LineSymbolizer>
    <LineSymbolizer>
      <CssParameter name="stroke">rgb(255,172,88)</CssParameter>
      <CssParameter name="stroke-width">12.16</CssParameter>
      <CssParameter name="stroke-linejoin">miter</CssParameter>
      <CssParameter name="stroke-linecap">round</CssParameter>
    </LineSymbolizer>
  </Rule>
</Style>
...and this layer definition...
<Layer name="Freeway30th" srs="+init=epsg:&srid;" maxzoom="39105.90277777778">
  <StyleName>Freeway30th_style</StyleName>
  <Datasource>
    <Parameter name="dbname">&dbname;</Parameter>
    <Parameter name="estimate_extent">0</Parameter>
    <Parameter name="extent">&extent;</Parameter>
    <Parameter name="geometry_field">&geometry_field;</Parameter>
    <Parameter name="host">&host;</Parameter>
    <Parameter name="password">&password;</Parameter>
    <Parameter name="port">&port;</Parameter>
    <Parameter name="srid">&srid;</Parameter>
    <Parameter name="table">(SELECT * FROM "l_roads" WHERE "type" =
      'Freeway' ORDER BY LENGTH(&geometry_field;) DESC) as "l_roads"</Parameter>
    <Parameter name="type">&datasourcetype;</Parameter>
    <Parameter name="user">&password;</Parameter>
  </Datasource>
</Layer>
The idea is to render freeways with a gray outline and an orange centre. Unfortunately, it doesn't produce good results:
The problem is those little line ends you can see making gray splodges at the end of each segment.
Michael Migurski's blog discusses this issue a little in this article but doesn't directly explain how to achieve the desired effect. So here is what you do:
First the styles are split into two...
<Style name="Freeway30th_style-bottom">
  <Rule>
    <LineSymbolizer>
      <CssParameter name="stroke">rgb(169,170,153)</CssParameter>
      <CssParameter name="stroke-width">12.26</CssParameter>
      <CssParameter name="stroke-linejoin">bevel</CssParameter>
      <CssParameter name="stroke-linecap">round</CssParameter>
      <CssParameter name="stroke-opacity">1</CssParameter>
    </LineSymbolizer>
  </Rule>
</Style>
<Style name="Freeway30th_style-top">
  <Rule>
    <LineSymbolizer>
      <CssParameter name="stroke">rgb(255,172,88)</CssParameter>
      <CssParameter name="stroke-width">12.16</CssParameter>
      <CssParameter name="stroke-linejoin">miter</CssParameter>
      <CssParameter name="stroke-linecap">round</CssParameter>
    </LineSymbolizer>
  </Rule>
</Style>
and then the layer is now rendered as two layers, the bottom layer first, then the top:
<Layer name="Freeway30th-bottom" srs="+init=epsg:&srid;" maxzoom="39105.90277777778">
  <StyleName>Freeway30th_style-bottom</StyleName>
  <Datasource>
    <Parameter name="dbname">&dbname;</Parameter>
    <Parameter name="estimate_extent">0</Parameter>
    <Parameter name="extent">&extent;</Parameter>
    <Parameter name="geometry_field">&geometry_field;</Parameter>
    <Parameter name="host">&host;</Parameter>
    <Parameter name="password">&password;</Parameter>
    <Parameter name="port">&port;</Parameter>
    <Parameter name="srid">&srid;</Parameter>
    <Parameter name="table">(SELECT * FROM "l_roads" WHERE "type" =
      'Freeway' ORDER BY LENGTH(&geometry_field;) DESC) as "l_roads"</Parameter>
    <Parameter name="type">&datasourcetype;</Parameter>
    <Parameter name="user">&password;</Parameter>
  </Datasource>
</Layer>
<Layer name="Freeway30th-top" srs="+init=epsg:&srid;" maxzoom="39105.90277777778">
  <StyleName>Freeway30th_style-top</StyleName>
  <Datasource>
    <Parameter name="dbname">&dbname;</Parameter>
    <Parameter name="estimate_extent">0</Parameter>
    <Parameter name="extent">&extent;</Parameter>
    <Parameter name="geometry_field">&geometry_field;</Parameter>
    <Parameter name="host">&host;</Parameter>
    <Parameter name="password">&password;</Parameter>
    <Parameter name="port">&port;</Parameter>
    <Parameter name="srid">&srid;</Parameter>
    <Parameter name="table">(SELECT * FROM "l_roads" WHERE "type" =
      'Freeway' ORDER BY LENGTH(&geometry_field;) DESC) as "l_roads"</Parameter>
    <Parameter name="type">&datasourcetype;</Parameter>
    <Parameter name="user">&password;</Parameter>
  </Datasource>
</Layer>
A much cleaner rendering!
This approach consumes more cpu time and hits your database harder than the 'messier' approach shown first.
Also, as you can see in the example above, I have adopted Michael's approach of rendering long lines first.
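For reference, the long-lines-first trick is entirely in the subquery passed to the table parameter; run directly against the database it amounts to something like this (a sketch that assumes the l_roads table above with a geometry column called the_geom):
-- draw the longest freeways first so short stubs end up underneath
SELECT * FROM "l_roads"
WHERE "type" = 'Freeway'
ORDER BY ST_Length(the_geom) DESC;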
Have fun with your mapnik maps!
Horst and I are spending the week up in Johannesburg at the Satellite Applications Center in Hartebeeshoek. We are doing yet another week long training course (I hope I'm not working the poor guy too hard :-P ). This time we are doing:
Tomorrow we start with the PostGIS component. Horst and I have been compiling some course notes for the PostGIS module, which we are making available to the world as per usual. The PDF still has some rendering issues - we are aware of that. The document tries to walk the reader through the basics of using SQL, and then some basic activities with PostGIS and working with geometries.
I hope some of you out there find it useful - let us know if you do! Also if you have any improvements to make, we'd love to hear from you.
Here is a quick pic or two from the course:
I have a cron job that backs up my PG databases and emails them to me every night. Today I wanted to upgrade my PostgreSQL 8.3 databases to PG 8.4, so I made a few modifications to my script so that I could dump out my PG data and then restore it under PG 8.4. In case you are wondering, I am doing this because Ubuntu Lucid now ships with PG 8.4 (yay!). I also made the script generate another script to restore the databases. So basically the procedure is to run the script below on your 8.3 cluster, shut that down, bring up your 8.4 cluster, and then restore your databases into that. Here follows my little script:
MYDATE=`date +%d-%B-%Y`
MYBACKUPDIR=/home/timlinux/sql_backups
MYRECIPIENT=tim@linfiniti.com
DBLIST=`psql -l \
  | awk '{print $1}' | grep -v "+" | grep -v "Name" | \
  grep -v "List" | grep -v "(" | grep -v "template" | \
  grep -v "postgres"`
for DB in ${DBLIST}
do
  echo "Backing up $DB"
  FILENAME=${MYBACKUPDIR}/PG_${DB}.${MYDATE}.sql.tar.gz
  pg_dump -f ${FILENAME} -x -O -F tar ${DB}
  #If you want to email the database uncomment
  #below (will cause issues if backups are large)
  #mpack -s "Daily $DB PG backup for ${MYDATE}" $FILENAME $MYRECIPIENT
done
echo "Procedure to restore one of your backups:"
echo "createdb <dbname>"
echo "pg_restore -F t <backupfile>.sql.tar.gz | psql <dbname>"
echo "Or to restore in a batch make a script like this:"
echo "for FILE in /home/timlinux/sql_backups/*; do DB=\$(echo \$FILE | \\"
echo "  sed 's/\/home\/timlinux\/sql_backups\/PG_//g' | sed 's/.${MYDATE}.sql.tar.gz//g'); "
echo "  echo \"Restoring: \$DB\"; createdb \$DB; pg_restore -F t \$FILE | psql \$DB; done"
Update 02 May 2010: Uncommented the pg_dump line which was inadvertently commented out in my original post.
Hi, I'm Sam. I've been learning a lot here at Linfiniti (thanks to the brilliant team!). I'd just like to add a quick note on one of the tasks I learnt this week.
I was working on a shapefile of the suburbs in Cape Town. A client required the suburbs to be grouped by region. After the tedious part of manually grouping the suburbs (using a created field, REGION), the unioning (dissolving) of the suburbs proved to be quick and painless in PostgreSQL, using this SQL command which uses the geomunion function:
create table ct_regions as
  select geomunion(the_geom), "REGION"
  from "ct_suburbs"
  group by "REGION";
“ct_suburbs” is the original shapefile that was loaded into the PostGIS database using the Quantum GIS 'SPIT' plugin. “REGION” is the class (attribute) that I wish to union by. And ct_regions will be the output table. See the result here:
Before dissolving (ct_suburbs shows suburbs)
After dissolving by suburbs (ct_regions shows collections of suburbs that have been merged)
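A side note: on more recent PostGIS releases geomunion() is deprecated in favour of the ST_Union() aggregate, so the equivalent statement there would look something like this (untested sketch):
-- dissolve suburbs into regions using the newer function name
create table ct_regions as
  select ST_Union(the_geom) as the_geom, "REGION"
  from "ct_suburbs"
  group by "REGION";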
Hopefully this will be of some use when it comes to your own mapping!
I've posted this before on my old blog so this is a repeat for those looking to get going with PostGIS in a hurry. This procedure should work on Ubuntu Jaunty or Ubuntu Karmic and possibly earlier versions.
sudo apt-get install postgresql-8.3-postgis
sudo su - postgres
createuser -s -d -r -l -P -E timlinux
createdb gis
createlang plpgsql gis
psql gis < /usr/share/postgresql-8.3-postgis/lwpostgis.sql
psql gis < /usr/share/postgresql-8.3-postgis/spatial_ref_sys.sql
Having done that, you should be able to connect to the database by creating a new PostGIS connection in QGIS. After that you can start to load data into your spatial database using SPIT (the QGIS plugin that imports shapefiles into PostGIS) or the shp2pgsql command line tool.
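A quick way to confirm that the PostGIS functions were installed is to ask the database itself; running this from psql should report the PostGIS version along with the GEOS and PROJ libraries it was built against:
select postgis_full_version();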
I usually upgrade to the next-gen ubuntu release a month or two before it comes out so I can test QGIS and other software on the new platform. Today I was looking for a visual query designer for PostgreSQL because I wanted to make some complex queries with multiple joins. I'm ok for dealing with a few simple joins when hand writing my SQL but when things get complex, having a visual query builder is a great time saver.
I tried a whole bunch of options today:
Finally I took another look at PGAdmin III. I have used it on and off over the years, but usually settle back to using psql from the command line. Imagine my surprise to see that version 1.10, which ships with Karmic, now includes a visual query builder.
This is a real notch in the belt for Postgres, as we can now offer novice users a simple way to build their own queries visually!
So you have moved your spatial data into PostGIS and everything is great... until you start wanting to clip it out again. Luckily, clipping data within the Postgres environment is trivial with a little help from SQL. However, I wanted something a little more fancy - a way to iterate through all my spatial tables and clip out an arbitrary polygon, saving the result to a shapefile. To achieve this I wrote a little bash script that does just that. The script will also optionally make SQL dumps of the data and clean up after itself when it is done.
Note that for large datasets, clipping with an arbitrary polygon can take a long time! Here is the script:
#!/bin/bash
SUFFIX="ecape"
CLIPTABLE="ecape"
CLIPFIELD="provname"
CLIPGEOMFIELD="the_geom"
CLIPFIELDVALUE="Eastern Cape"
DB="cdsm50k"
DELETETABLE=1
DELETEGEOM=1
SHPDUMP=1
SQLDUMP=0
pushd .
cd /tmp
for TABLE in `echo "\dt" | psql $DB \
  | awk '{print $3}' | grep "^[a-zA-Z]"`
do
  GEOMFIELD=`echo "\d $TABLE" | psql $DB | grep "geometry" \
    | grep -v "geometrytype" | awk '{print $1}'`
  if [ "$GEOMFIELD" == "" ]
  then
    echo "$TABLE has no geometry column, skipping"
  else
    echo "$TABLE -> $GEOMFIELD"
    echo "drop table \"${TABLE}_${SUFFIX}\";" | psql $DB
    # Note we use the && bounding box query first and
    # then the intersects query to avoid unneeded comparison of
    # complex geometries.
    SQL="CREATE TABLE \"${TABLE}_${SUFFIX}\" AS \
      SELECT \
        ST_Intersection(v.$GEOMFIELD, m.$CLIPGEOMFIELD) AS intersection_geom, \
        v.*, \
        m.$CLIPFIELD \
      FROM \
        \"$TABLE\" v, \
        $CLIPTABLE m \
      WHERE \
        ST_Intersects(v.$GEOMFIELD, m.$CLIPGEOMFIELD) AND \
        $CLIPFIELD='$CLIPFIELDVALUE';"
    echo $SQL | psql $DB
    if [ $DELETEGEOM -eq 1 ]
    then
      echo "alter table \"${TABLE}_${SUFFIX}\" drop column $GEOMFIELD;" | psql $DB
    fi
    if [ $SHPDUMP -eq 1 ]
    then
      pgsql2shp -f ${TABLE}.shp -g "intersection_geom" $DB ${TABLE}_${SUFFIX}
    fi
    if [ $SQLDUMP -eq 1 ]
    then
      pg_dump -D $DB -t ${TABLE}_${SUFFIX} > ${TABLE}_${SUFFIX}.sql
    fi
    if [ $DELETETABLE -eq 1 ]
    then
      echo "drop table ${TABLE}_${SUFFIX};" | psql $DB
    fi
  fi
done
echo "vacuum analyze;" | psql $DB
popd
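If you only need to clip a single table by hand, the heart of the script boils down to a query along these lines (a sketch with hypothetical table and column names - a roads layer clipped by a provinces polygon layer):
-- clip roads by the Eastern Cape province polygon
create table roads_ecape as
  select
    ST_Intersection(r.the_geom, p.the_geom) as intersection_geom,
    r.*
  from roads r, provinces p
  where ST_Intersects(r.the_geom, p.the_geom)
    and p.provname = 'Eastern Cape';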
Sometimes, just because a thing is possible doesn't make it a good idea. In the last week I have been helping a client whose web developer has disappeared. The developer implemented the database using a variation on the Entity Attribute Value (EAV) modelling paradigm. The database consists of only four tables: Entities, Attributes, Values and EntityTypes. So the tables in the database don't match any real-world entities - there is no products table for products, no categories table for categories, and so on. This generic modelling approach might sound good, but in practice it's a real pain. Look at this query diagram representing a simple query to find out what categories exist in the database:
The query consists of numerous joins and is completely opaque. The EAV Wikipedia article I referenced above contains a list of downsides to the EAV approach, with which I can only concur. For a new developer approaching an EAV implementation, it's an uphill battle to understand what is going on. Writing queries is a cumbersome task, and logic such as foreign key constraints that would normally reside in the database layer now needs to be implemented in the application codebase.
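To give a flavour of how convoluted even a trivial question becomes, here is a rough sketch of the kind of query needed just to list categories in a generic EAV schema (the table and column names are hypothetical, not the client's actual schema):
-- "what categories exist?" in an EAV model: every hop is another join
select v.value as category_name
from entities e
join entitytypes t on t.id = e.entitytype_id
join "values" v    on v.entity_id = e.id
join attributes a  on a.id = v.attribute_id
where t.name = 'Category'
  and a.name = 'name';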
For me EAV is something I will be avoiding in any new developments I embark on...
Someone asked on Twitter whether it is possible to dump all the tables in Postgres to individual shapefiles. Some time ago I wrote a script to dump all tables as SQL dumps. The question prompted me to tweak that script to drop out shapefiles instead.
My original script looked like the listing below. The dump files contain data only (see the comments in the bash script below) because I use this script to create fixtures for my django projects.
#!/bin/bash
# A script to create sql formatted fixtures (serialised models)
# used to initialise the application if you install it to another
# machine. You should run this any time you change your models
# or when you need to make a backup of all your data.
# Tim Sutton 2009
mkdir bees/sql
for TABLE in `echo "\d" | psql sabio | grep -v seq | awk '{print $3}'`
do
  echo $TABLE
  # -a data only
  # -t table
  # -D dump as sql inserts
  pg_dump -a -t $TABLE -D sabio > bees/sql/${TABLE}.sql
  #bzip2 bees/sql/${TABLE}.sql
done
To make the script drop out shapefiles I modified it a bit as shown in the next listing. Obviously as we are dumping shapefiles, we should only bother dumping tables with geometry in them so I went the route of using the geometry_columns table to decide which tables to dump...
#!/bin/bash
# A script to dump shapefiles of all tables listed in geometry_columns
# Tim Sutton 2009
mkdir bees/sql
for TABLE in `echo "select f_table_name from geometry_columns;" | psql sabio \
  | head -n -2 | egrep -v "\-\-\-\-\-\-\-\-\-" | egrep -v "f_table_name"`
do
  echo $TABLE
  pgsql2shp sabio $TABLE
done
Hope this is useful to someone out there :-)
Ok so I have a few production databases that I need to back up regularly. The trick is I want to run the backup from a remote machine so that the backup lives on a separate server to the actual database system. You can run backups manually like this (assuming your database is called 'postgis'):
pg_dump -h dbhost -f postgis_`date +%d%B%Y`.sql.tar.gz -x -O -F tar postgis
When you run the above command, you will be prompted for a password. After entering the password you will find a date-stamped backup. Very nice, but you may have noticed that pg_dump has no option for giving it the password on the command line - it expects you to do that interactively. So what do we do if we need to automate the backup using a cron job? The solution is to use either ~/.pgpass or the PGPASSWORD environment variable. So here is how I automated the backup by placing a script in /etc/cron.daily/:
export PGPASSWORD=secret
pg_dump -h dbhost -f postgis_`date +%d%B%Y`.sql.tar.gz -x -O -F tar postgis
Good news if you are a PostGIS fan - the PostGIS team just released version 1.4! See the release announcement for full details...