Geo tips & tricks: 2014

jeudi 18 décembre 2014

JPEG-in-TIFF with mixed qualities

Some time ago, I wrote about advanced JPEG-in-TIFF uses in GDAL, i.e. TIFF files compressed with the JPEG codec. Recently, I've got again to that topic when I realized that recent libtiff versions produced files a bit bigger than necessary by repeating the quantization tables in each tile or strip, instead of omitting them in favor of the global quantization tables that are stored in a global location, the JpegTables TIFF tag, shared by all tiles or strips. This issue is now solved.

A quantization table is a set of 64 integer coefficients that are used to divide values coming from the Discrete Cosine Transform, before passing them to the Huffman compression stage and finally reaching the JPEG binary stream. The higher the coefficients, the lower the quality. The decoder must have the quantization table used by the encoder to properly recompute the values needed to do the Inverse Discrete Cosine Transform. The TIFF specification supplement 2 allows the quantization table to be stored in the JpegTables TIFF tag for all tiles or strips. Up to now, I thought this was a requirement, but indeed, reading more closely the specification, I found this was only an option. So there are several legit possibilities :

centralized quantization tables stored in JpegTables TIFF tag, and none in each tile/strip. This is what GDAL generates with its internal version of libtiff (or if using older libtiff, such as 3.8.2, or the to-be-released libtiff 4.0.4) will now produce.
centralized quantization tables stored in JpegTables TIFF tag, and redefined in some tile/strip if needed.
no JpegTables TIFF tag, and quantization tables defined in each tile/strip.

Options 2 and 3 makes it possible to have TIFF files with strips/tiles of different qualities. Actually even option 1 could be used to have up to 2 different qualities. Why 2 ? The reason is that JPEG specification allows a maximum of 4 different quantization tables (also technically it could reference up to 16 tables, but the limit of 4 is actually enforced by libjpeg, so better not play with that!). For RGB imagery compressed in the YCbCr color space, libjpeg uses two quantization tables: one for the Y component and another one for the Cb and Cr components. Hence the capability of storing a set of table for 2 quality settings.

So, as mentionned in my previous article about GeoPackage, it would actually be possible to generate JPEG-in-TIFF with different qualities depending on area(s) of interest, for example medium quality (and high compression) on most areas, and high quality on specific spots. I would expect such variants to have good compatibility with existing readers of JPEG-in-TIFF.

If you are interested in exploring that, you can contact me.

lundi 15 décembre 2014

GDAL GeoPackage raster support

TLDR

One of the recent additions in GDAL development version (2.0dev) is the support for raster tiles in the GeoPackage format.

A bit of history

The GeoPackage format has been adopted this year as a OGC standard and covers storage of both raster and vector data within the same SQLite3 container. This is not a completely revolutionary approach as there were precedents (I may have forgotten some!) :

- storage of vector format in SQLite3 database was at least done by FDO (with storage of geometries as WKT, WKB or Autodesk own geometry binary format "FGF") and Spatialite (in its own Spatialite geometry binary format, somewhat derived from WKB, but incompatible, and with compressed variants as well). Spatialite introduced the use of SQlite virtual RTree tables to implement spatial index. Both formats have been supported for long by the OGR SQLite driver. GeoPackage vector adds yet another geometry binary format, GPB (GeoPackageBinary) that consists of a header followed by actual WKB, and borrowed the idea of RTree for spatial index from Spatialite (which was a candidate implementation in early draft versions)

- storage of rasters in SQLite3 database was at least done by MBTiles, with raster tiles being stored as PNG or JPEG tiles as BLOB records, and a multi zoom level tile indexing scheme. GeoPackage raster support clearly derives from that design choice, and use the same naming of columns in the tile tables, but with various improvements (that will be perceived as defects by proponents of MBTiles simplicity), such as support for arbitrary reference spatial systems (MBTiles is bound to Google Mercator projection) and custom tiling schemes... and also a subtle semantic difference that we will detail afterwards. GeoPackage can also support several raster tables within the same container. One current limitation of GeoPackage is that currently only images that have 8-bit depth per channel, limited to R,G,B,A color space are supported, which prevents from storing DEMs or multi-spectral imagery. So there is still place for other solutions, such as Rasterlite 2, the raster-side of Spatialite, which offers a variety of storage formats, supported bit depths, multiple regions with different resolutions within the same coverage, etc...

In addition to the above, GeoPackage introduces metadata (with a potential hierarchical organization), ways of expressing layer schema and constraints (beyond SQL capabilities provided by SQLite 3), and a formalism to define extensions to the specification, both for a few standardized official extensions and more "proprietary" extensions.

GDAL 1.11 already had support for most of GeoPackage vector specification. More recent developments have added support for spatial index, curve geometries, and aspatial tables.

Raster support was still missing. This is now the case in the latest development version.

GDAL GeoPackage raster support

GeoPackage is now one of the very few GDAL drivers to support both raster and vector with the same "Dataset" object, which is now possible since the unification of the GDAL and OGR driver models. This means that you only have to open it once to explore its content, and not once with the GDAL API, and another one with the OGR API. I should note that the way we handle multiple raster "layers" in the GDAL API through the subdataset mechanism, which requires multiple Open() calls, is probably not yet optimal compared to the vector layer API. Food for thought...

The GeoPackage raster driver has the following capabilities :

- reading, creation... and update of raster layers. That means that you can use GeoPackage as the direct output format of any GDAL utility or API, gdal_translate, gdalwarp, etc...

- on-the-fly conversion in both ways between R,G,B,A colorspace exposed by GDAL and the storage tile format :

    * grey level PNG or JPEG tiles,
    * grey level with alpha band PNG,
    * RGB PNG, JPEG or WebP tiles (storage as WebP tiles is one of the official extensions to the baseline specification),
    * RGBA PNG or WebP tiles
    * 8-bit quantized (256 color palette) PNG tiles

- on creation/update, it is possible to use a strategy where JPEG tiles are used when the tile content is fully opaque, and PNG when there is transparency involved. Note that fully transparent tiles are not stored at all, as allowed by the specification, to allow efficient sparse storage.

- use, creation and update of multiple zoom levels (known as overviews in GDAL, or pyramids) for fast zoom-in/zoom-out operations. Including support for overview levels whose resolution does not necessarily differ by power-of-two factors ( "gpkg_other_zoom" extension in GeoPackage terminology )

- reading, creation and update of metadata (in a "flat" way, i.e. ignoring the potential hierarchy of metadata records)

- reading and creation of several tiling schemes. By default, the driver will create rasters with a tiling scheme that perfectly fits the resolution and spatial registration of the input dataset being converted to GeoPackage, so that no loss of image quality (if using PNG storage) or georeferencing occurs. But for some uses, adoption of more "universal" tiling schemes with world coverage might be desirable to ease overlapping of several raster coverages, or extending the spatial extent of an existing raster. A few such tiling schemes are available, such as the popular GoogleMapsCompatible on (reused from WMTS specification). Note: although GDAL should have no problem with it, some tests have shown that another available tiling scheme, GoogleCRS84Quad, might be difficult to handle for other implementations of the specification, so better not use it until its relevance for GeoPackage has been clarified.

Full documentation of the driver page is available for those who want to explore its capabilities. All in all, pretty much everything in the raster part of the specification has been implemented.

Tiles and blocks

One of the toughest part of the implementation was the update of tiles when the origin of the GDAL area of interest does not match the corner of a tile. This is the general case when using a global tiling scheme, where the origin of a raster can be anywhere within a tile. GDAL internally uses its own tiling system, with raster "blocks" (in red in the below drawing). In the case of GeoPackage, of course we choose block dimensions that exactly matches GeoPackage tiles. But due to that possible shift of origins, a GDAL block can potentially overlap 4 GeoPackage tiles (in blue). And to add more fun, the dimensions of GDAL rasters (black rectangle) are not necessarily a multiple of the block size.

Filling GDAL blocks from GeoPackage tiles is relatively easy : you figure out which 4 tiles are needed, read them and composite the interesting pixels into the GDAL block. For more performance we cache the last read tiles so that if reading the file from left to right (typical way GDAL algorithms process a raster), we only need to load 2 new tiles, instead of 4, for each GDAL block.
Regarding writing of tiles from GDAL blocks, a naive implementation would reload, update and recompress each of those 4 tiles each time a block is updated, but besides the fact that this would be time consuming, this would introduce repeated steps of image quality loss when using lossy compression formats, such as JPEG or WebP (or 8-bit PNG), as the storage format. Instead, we use a temporary database of uncompressed and partially updated tiles, and wait for a tile to be completely updated (and that considering its 4 R,G,B,A components) before compressing it to its final storage.

What's next ?

Potential future enhancements, on the raster as well as vector side, could consist in :

- studying how de-duplication of tiles (e.g. avoiding storing multiple times a fully blue tile when mapping oceanic areas) could be done. This is likely possible through updatable views and is a feature of MBTiles.
- as tiles are completely independant from each other, it might possible to have different quality settings of JPEG/WebP compression per area(s) of interest.
- adding reading and writing of metadata at the vector layer level. This should now be possible with the GDAL-OGR unification. That would require an update in ogrinfo to display such metadata, and in ogr2ogr to define and propagate them.
- use of schema/column constraints in vector layers.

- more technical: creating the triggers that correspond to the gpkg_geometry_type_trigger and gpkg_srs_id_trigger extensions, that can be used to ensure consistency of geometries.
- and likely keeping up with future versions of the specification.

The fun part ! Two formats in one

Ah, and for those who remembered and reached that part of the article, I mentionned that there was a subtle difference between GeoPackage rasters and MBTiles. I can hear you : "which one??" Well, by using the GoogleMapsCompatible tiling scheme, mandatory for MBTiles, in a GeoPackage, could not we manage to have a same container that is at the same time both a valid GeoPackage and MBTiles ? The answer is yes, but that requires some trickery. A key difference is that MBTiles was based on the OSGeo TMS (Tile Map Service ) specification that decided to use a tile row number of 0 for the bottom most tile of the tiling, whereas the later OGC WMTS specification on which GeoPackage is based decided that it would be the top most tile! Fortunately with some SQL, we can do the renumbering. Demo :

$ gdal_translate test.tif test.gpkg -of GPKG -co TILE_FORMAT=PNG -co TILING_SCHEME=GoogleMapsCompatible
$ gdaladdo -r cubic test.gpkg 2 4
$ ogrinfo test.gpkg -sql "CREATE VIEW tiles AS SELECT test.zoom_level, tile_column, tm.matrix_height-1-tile_row AS tile_row, tile_data FROM test JOIN
gpkg_tile_matrix tm ON test.zoom_level = tm.zoom_level AND tm.table_name = 'test'"
$ ogrinfo test.gpkg -sql "CREATE TABLE metadata(name TEXT, value TEXT)"

$ ogrinfo test.gpkg -sql "INSERT INTO metadata VALUES('name', 'my_tileset')"
$ ogrinfo test.gpkg -sql "INSERT INTO metadata VALUES('type', 'overlay')" // or 'baselayer'
$ ogrinfo test.gpkg -sql "INSERT INTO metadata VALUES('version', '1.1')"
$ ogrinfo test.gpkg -sql "INSERT INTO metadata VALUES('description', 'description')"
$ ogrinfo test.gpkg -sql "INSERT INTO metadata VALUES('format', 'PNG')"

And the result is also now a valid MBTiles dataset (while still being a valid GeoPackage), that can for example be opened by the GDAL MBTiles driver (after renaming to .mbtiles, or creating a symbolic link, since the MBTiles driver will not try to open a .gpkg file).

Finally, I would like to thank Safe Software for having financially supported this work on GeoPackage raster support.

jeudi 6 novembre 2014

Hacking Python module constants

I've spent a whole afternoon figuring out how to add a deprecation warning when using a constant attached to a Python module, so here's how to do it in case you would need to do something similar.

The use case is the following : as GDAL RFC 49 introduces curve geometries, I wanted the user to be warned if he uses the now deprecated ogr.wkb25DBit constant, since the new geometry types no longer use the most significant bit of the value to indicate the Z dimension.

from osgeo import ogr

ogr.wkb25DBit
ogr.py:167: DeprecationWarning: deprecated: use ogr.GT_Flatten(), ogr.GT_HasZ() or ogr.GT_SetZ() instead
warnings.warn("deprecated: use ogr.GT_Flatten(), ogr.GT_HasZ() or ogr.GT_SetZ() instead", DeprecationWarning)
-2147483648

The issue is that ogr.wkb25DBit was a variable assigned to a module, and you cannot do anything to "attach" a function that would be evaluated at runtime when the constant is used.

After several non conclusive attempts, the solution finally came from this article. Basically, you can use the sys.modules dictionnary to substitute the original module by a class instance (called pseudo-module class instance in the rest of this writing). And on the class of this class instance, you can define a property that will call a function when it is read. To get all other global functions and classes of the original module, you can copy the global dictionnary of the module to the pseudo-module class instance, and here you are ! If this summary does not make sense, the above mentionned article explains that in greater details.

My personal touch is a further improvement. The above trick is nice, but when from the Python console, you call help(ogr), it displays the help of the pseudo-module class instance, and not the one of the original module. So you loose the help of all other constants, functions and classes. Almost everything in fact.

But even Python builtins functions like help() can be replaced by a custom version. The custom help() tests if the object passed is the pseudo-module class instance, in which case it substitutes it temporarily with the original module before calling the original help().

Apart from issuing deprecation warnings, such a technique can be usefull to make module constants really constants. Did you know that you can affect another value to math.pi... ?

The full code for all above tricks (works with all Python 2 and 3 versions starting with 2.4) :

# Original module constant
my_constant = 1

# Backup original dictionnary before doing anything else
_initial_dict = globals().copy()

@property
def my_constant(module):
    import warnings
    warnings.warn("my_constant is deprecated", DeprecationWarning)
    return module._initial_dict['constant']

# Inspired from http://www.dr-josiah.com/2013/12/properties-on-python-modules.html
class _Module(object):
    def __init__(self):
        self.__dict__ = globals()
        self._initial_dict = _initial_dict

        # Transfer properties from the object to the Class
        for k, v in list(self.__dict__.items()):
            if isinstance(v, property):
                setattr(self.__class__, k, v)

        # Replace original module by our object
        import sys
        self._original_module = sys.modules[self.__name__]
        sys.modules[self.__name__] = self

# Custom help() replacement to display the help of the original module
# instead of the one of our instance object
class _MyHelper(object):

    def __init__(self, module):
        self.module = module
        self.original_help = help

        # Replace builtin help by ours
        try:
            import __builtin__ as builtins # Python 2
        except ImportError:
            import builtins # Python 3
        builtins.help = self

    def __repr__(self):
        return self.original_help.__repr__()

    def __call__(self, *args, **kwds):

        if args == (self.module,):
            import sys

            # Restore original module before calling help() otherwise
            # we don't get methods or classes mentionned
            sys.modules[self.module.__name__] = self.module._original_module

            ret = self.original_help(self.module._original_module, **kwds)

            # Reinstall our module
            sys.modules[self.module.__name__] = self.module

            return ret
        elif args == (self,):
            return self.original_help(self.original_help, **kwds)
        else:
            return self.original_help(*args, **kwds)

_MyHelper(_Module())
del _MyHelper
del _Module

mercredi 22 octobre 2014

Blending metadata into vector formats

This post explores a few ideas, and the resulting experiments, I've had recently to put metadata (or arbitrary information) into vector GIS formats that have no provision for them. One typical such format is the good-old Shapefile format. A shapefile generally consists in 3 files, a .shp file that contains the geometries, a .shx that is an index from the shape number to the offset in the .shp file where the geometry is located (to allow fast retrieval by shape ID) and a .dbf file that contains the attributes of each shape.

Of course, the most simple way of adding metadata would be to but an additional file besides the 3 mentionned ones, but that would not be very challenging (plus the risk of losing it during copy).

Most implementations require at least those 3 files to be present. Some allow .dbf to be missing (e.g. GDAL/OGR). Some allow .shx to be missing, like OpenJUMP which doesn't read it even if it is available, which is both a feature and a drawback in situations when there are "holes" in the .shp due to editing.

A basic solution is to add our metadata at the end of one of those 3 files. I've done tests with GDAL/OGR (based on Shapelib), GeoTools 12.0, OpenJUMP 1.7.1 (whose shapefile reader is a forked version of the GeoTools one with changes), proprietary software code-named "GM" and proprietary software "AG"

.dbf : all 5 implementations are happy with extra content at the end of the file

.shp : all implementations happy, except OpenJUMP that opens the file, but throws a warning because it tries to interprete the additional bytes as shape.

.shx : all 5 implementations are happy

So we have at least 2 possibilities that are rather portable.

It should be checked how they react in editing use cases, like adding new features to the shapefile. Regarding GDAL/OGR, I can say that it would overwrite the extra content at the end of the .dbf and the .shx. It would let the extra content at the end of the .shp to write the new geometry afterwards.

What if we want to "link" the metadata per feature in a way where it is preserved if shapes are added ? And for the sake of exploring more possibilities, we will exclude using the data-at-end-of-file track. Interleaving data and metadata is not possible in .dbf since the records are placed consecutively. Same for .shx. In .shp, we can try reserving some space between all geometry records and make sure that the .shx index takes the holes into account. Due to the fact that size and offsets in shapefile are expressed in term of 16 bit words, that extra space must be a multiple of 16 bit too. That works fine for all implementations, except OpenJUMP for the same reason as above. Hum, and what if we incorporate the metadata, not between the encoded geometries, but inside them ? Each geometry record is indeed structured like this :

Shape Id: 4 bytes
Record length (number of 16 bit words after that field): 4 bytes
Record content: (2 * record length) bytes
Shape Type: 4 bytes
Variable payload according to shape type

We can try adding extra payload at the end of record content while still updating record length to take into account. We could have thought that implementations strictly checks that the declared record length is consistant with the shape type, but experimentations (and code inspection on the 3 Open Source implementations) show that, when they check, they check that the record length is at least greater or equal to the minimum expected record length. So this works for the 5 implementations ! At least on a layer with 2D polygons. That should also work for other 2D geometry type. 3D shapes consist in the 2D information, followed by the Z information, and optionaly by the M(eausre) information. M information is sometimes omitted when it is not present (this is the case of the OGR writer). So if we would want to add metadata for 3D shapes, we would have to write dummy M information (writting not-a-number double values is commonly done to indicate that M information is invalid).

To go back to .dbf file a bit, sometimes the width of fields of string type is larger than strictly needed. The values are left aligned in the field and remaining space is padded with space characters. I've tried to insert a nul character just at the end of the string, and put the extra information afterwards. This works fine for the 3 C/C++ based shapefile readers (GDAL/OGR, G.M., A.G) since nul character is conventionnaly used to terminate a string in C/C++. Unfortunately that does not work with the 2 Java based implementations that do not use that convention : the extra content is displayed after the field content.

As we have started exploring modifying the data itself, let's return to .shp file. One thing to consider is that coordinates in shapefiles are stored as double precision floating point numbers, stored on 64 bits using the IEEE-754 binary representation. Such numbers are decomposed like the following : 1 bit for the sign of the value, 11 bits for the exponent and its sign and 52 bits for the mantissa. The mantissa is where the significand precision of the number is stored. How big is that ? Let's go back to geography a bit. The Earth has rougly a circonference of 40 000 km. If we want to map features with a precision of 1 cm, we need 40 000 000 / 0.01 = 4 billion distinct numbers. 4 billion fits conveniently on a 32 bit integer (and OpenStreetMap .pbf optimized format store coordinates on 32 bit integers based on that observation). So 52 bits allow 2^(52-32)=2^20, roughly 1 million more numbers, i.e. a precision of 10^-8 meters = 10 nanometers ! We could almost map every molecule located on the Earth surface !

It is consequently reasonable to borrow the 16 least significant bits from the mantissa for other use. Said differently for every 2D point/vertex, we can get back 4 bytes without any noticeable loss of precision. Depending on the shape complexity, this might be not big enough to store per-feature metadata. But on a typical shapefile, if we spead the metadata over the features, we can certainly store useful content. And the really great news is that this metadata would be preserved naturally in most format conversions (at least with GDAL/OGR whose internal geometry representation also uses 64-bit floating point numbers, and probably most other geometry engines), and for formats like Spatialite or GeoPackage that also use 64-bit floating point numbers. However, one must be aware than any other operation like rescaling or reprojection would completely change the least significant bits and erase our metadata.

Admitedly this is not a new idea. People have explored similar ideas for digital watermarking and more generally steganography, typically used to embed copyright information or source tracking (i.e. you generate a slightly different dataset for each customer, hence if a copy is then available for download, you can identify the origin of the leak), generally in a not noticeable way. Using least significant bits is the very basic technique, that can be circumvented easily by just zeroing them or adding noise. More advanced technique operate in the spectral domain, like DCT (Discrete Cosine Transform), DFT (Discrete Fourier Transform) or DWT (Discrete Wavelet Tranform). Some techniques have been specifically designed for GIS data, using topological properties for example. The common target of those techniques is to have robustness against attempts of removing the watermark from the signal, at the expense of a reduced bandwith for the inserted information. But for regular metadata, we do not need such guarantee and the use of least-significant bits might be good enough and easily implemented.

Any other ideas ? Sure...

For polygons, the shapefile specification states that the vertices of the outer ring must be listed in clockwise order. But it does not specify which vertex of the outline must be the first one. Let's consider that the top-most vertex of the polygon is numbered 0 (if there are several vertices with the same y coordinate, let's take the one of them with the minimum x), the following vertex in clockwise order is 1, etc... If our polygon has 16 vertices, and we serialize it starting at vertex 11, we have coded the 11 number. Combined with information of following polygons, we can build a longer message. This idea could only work in practice for shapefiles of complex/dense enough polygons. If every polygon has 256 vertex, we can encode log2(256)=8 bits per polygon. More generally, for a polygon with N vertex, we could encode log2(N) bits (rounded to inferior integer). So we need also at least hundreds or thousands of polygons of that complexity to be able to encode something useful. The advantage of this technique is that it is robust to rescaling, and probably most reprojections (at least the one that globally preserve the appearance of shapes), provided that the shapes are rewritten in the same order as in the original data.

That technique could also be adapted for lines. Let's consider a line made of (V1,V2,....Vn). We can for example simply build a multi-polyline of 2 parts (V1,...Vi) and (Vi,....,VN) that will visually looks like the original line and will encode for the i value. The increase in binary encoding would be modest (4+8+8=20 extra bytes).

Another technique might be to use repeated vertices. Let's consider a line or a polygon: if while listing consecutive vertices, they are repeated, this would encode a 1 value. Otherwise 0. For example, if a line is made of the sequence of vertices (V1,V1,V2,V3,V4,V4,V5,V5,V6), it would be equivalent to binary number 100110. So we could encode as many bits as vertices in the geometry. If needed, we can also use more repetitions to encode more bits. For one bit per vertex, on average such a technique would increase shapefile size by 50% (because on average, half of bits in a message are 1). It would preserve metadata perfectly for all coordinate transformations (geometry engines generally operate on vertices separately). But not to operations that would remove duplicated vertices.

Finally, here's another idea, conceptually close to the one based on the starting vertex. Excluding implementations that don't rely on the .shx (I've no prejudice against such one ! Keep on good work folks !), we could use the order of shapes in the .shp to encode information. Traditionnaly, feature 1 appears first in the .shp, followed by feature 2, etc... But we could re-order the shapes as we wish, provided we make the .shx point to the right offset in the .shp. If we have N shapes, there are N! (factorial(N) = N*(N-1)*(N-2)*...2*1) ways of ordering them. So for N shapes, we can encode log2(N!) bits. In practice for 10 shapes, that is 21 bits. For 100 shapes, 524 bits. For 1000 shapes, 8529 bits. And for 10000, 118458. Advantages: works for all geometry types, no increase in file size. Inconvenients: possibly less performant sequential reading because of apparently random seeking within the .shp, doesn't resist to file conversion.

I've not mentionned it, but for nearly all mentionned techniques, especially the last ones, we would need to reserve a few bits to insert a CRC or any other integrity mechanism, so as to make sure that we think is metadata really is. And all them could be potentially combined !

jeudi 16 octobre 2014

Warping, overviews and... warped overviews

The development version of GDAL has lately received a few long awaited improvements in the area of warping and overview computation.

For those non familiar with GDAL, warping is mainly used for reprojecting datasets from one source coordinate system to a target one, or to create a "north-up" image from a rotated image or an image that has ground control points. Overviews in GDAL are also called pyramids in other GIS software and are sub-sampled (i.e. with coarser resolution) versions of full-resolution datasets, that are mainly used for fast display in zooming out operations. Depending on the utility (warper or overview computation), different resampling methods are available : bilinear, cubic, cubicspline, lanczos, average, etc..

Cubic resampling

Up to now, the bi-cubic resampling algorithm used when computing warped images and overviews was a 4x4 convolution kernel. This was appropriate for warping, when the dimensions of the target dataset are of the same order as the source dataset. However if the target dataset was downsized (which is the nominal case of overview computation), the result was sub-optimal, not to say plainly bad, because not enough source pixels were captured, leading to a result close to what nearest neighbour would give. Now, the convolution kernel dynamically uses the subsampling ratio to take into account all source pixels that have an influence on each target pixel, so e.g 8x8 pixels if subsampling by a factor of 2.

Of course, this involves more computation and could be slower. Fortunately, for 64 bit builds, Intel SSE2 intrinsics are at the rescue to compute convolutions in a very efficient way.

For example in GDAL 2.0dev, computing 5 overview levels on a 10474x4951 RGB raster with cubic resampling takes 2.4 seconds on a Core i5-750, to be compared with 3.8s with GDAL 1.11

$ gdaladdo -ro -r cubic world_4326.tif 2 4 8 16 32

To compare both results, we can select the 5th overview level with the fresh new open option OVERVIEW_LEVEL=4 (index are 0 based)

$ gdal_translate world_4326.tif out.tif -oo OVERVIEW_LEVEL=4

5th overview generated by GDAL 2.0dev

5th overview generated by GDAL 1.11.1

So yes, faster (a bit) and better (a lot) !

Similar result can also be obtained with :

$ gdalwarp -r cubic world_4326.tif out.tif -ts 328 155

The "-oo OVERVIEW_LEVEL=xxx" option can be used with gdalinfo, gdal_translate and gdalwarp, or with the new GDALOpenEx() API.

Related work could involve adding resampling method selection in the RasterIO() API that currently only does nearest neighbour sampling. If that might interest you, please contact me.

Overviews in warping

Related to the OVERVIEW_LEVEL open option, another long due improvement was the selection of the appropriate overview level when warping. A typical use case is to start with a WMS or tiled dataset, e.g the OpenStreetMap tiles, and wanting to reproject full or partial extent to an image with reasonably small dimensions. Up to now, GDAL would alway use the most precision dataset (typically zoom level 18 for OpenStreetMap), which would make the operation terribly slow and unpractical.

Now, the following will run in just a few seconds :

$ gdalwarp frmt_wms_openstreetmap_tms.xml out.tif -t_srs EPSG:4326 \
-r cubic -te -10 35 10 55 -overwrite -ts 1000 1000

With the -ovr flag, you can modify the overview selection strategy, and for example specify you want to use the overview if the level immediately before the one that would have been automatically selected (i.e. with bigger dimensions, more precise)

$ gdalwarp frmt_wms_openstreetmap_tms.xml out.tif -t_srs EPSG:4326 \
-r cubic -te -10 35 10 55 -overwrite -ts 1000 1000 -ovr AUTO-1

You can also specify a precise overview level to control the level of details, which is particuarly relevant in the case of OSM since the rendering depends on the scale :

$ gdalwarp frmt_wms_openstreetmap_tms.xml out.tif -t_srs EPSG:4326 \
-r cubic -te -10 35 10 55 -overwrite -ts 1000 1000 -ovr 9

(Note: -ovr 9 is equivalent to OSM zoom level 8, since GDAL_overview_level = OSM_max_zoom_level - 1 - OSM_level, 9 = 18 - 1 - 8. )

With -ovr 9 (zoom level 8)

With -ovr 10 (zoom level 7)

With -ovr 11 (zoom level 6) or without any -ovr parameter

With -ovr 12 (zoom level 5)

(All above images are © OpenStreetMap contributors)

Overviews in warped VRT

GDAL advanced users will perhaps know the Virtual Raster (.vrt) format. There are several flavors of VRT files, one of them is the so-called "warped VRT", which can be produced by "gdalwarp -of VRT". This is an XML file that captures the name of the source dataset being warped and the parameters of the warping: output resolution, extent, dimensions, transformer used, etc... This can be convenient to do on-the-fly reprojection without needing to store the result of the reprojection. Similarly to regular warping, warped VRT can now make use of overviews of the source dataset to expose "implicit" overviews in the warped VRT dataset. Which make it possible to use warped VRT in a GIS viewer ith decent performance when zooming out. Among others, this will be beneficial to QGIS that use the "auto-warped-VRT" mechanism when opening a raster that is not a "north-up" dataset.

Still playing with our OpenStreetMap dataset, let's create a warped VRT around western Europe :

$ gdalwarp frmt_wms_openstreetmap_tms.xml out.vrt -t_srs EPSG:4326 \
-r cubic -te -10 35 10 55 -overwrite -of VRT

We can see that the VRT now advertizes overviews :

$ gdalinfo out.vrt
[...]
Size is 4767192, 4767192
[...]
Band 1 Block=512x128 Type=Byte, ColorInterp=Red
Overviews: 2383596x2383596, 1191798x1191798, 595899x595899,
   297950x297950, 148975x148975, 74487x74487,
   37244x37244, 18622x18622, 9311x9311, 4655x4655,
             2328x2328, 1164x1164, 582x582, 291x291, 145x145,
             73x73, 36x36, 18x18

I'd like to thank Koordinates and Land Information New Zealand for funding those improvements.

mercredi 1 octobre 2014

GDAL/OGR 1.11.1 released

On behalf of the GDAL/OGR development team, I am pleased to
announce the release of the GDAL/OGR 1.11.1 bug fix release. This
release contains more than 80 bug fixes since the April 1.11.0 release.

The source is available at:

http://download.osgeo.org/gdal/1.11.1/gdal-1.11.1.tar.xz
http://download.osgeo.org/gdal/1.11.1/gdal-1.11.1.tar.gz
http://download.osgeo.org/gdal/1.11.1/gdal1111.zip

Details on the the fixes in this release, and a security announcement, are
available at:
http://trac.osgeo.org/gdal/wiki/Release/1.11.1-News

dimanche 21 septembre 2014

GeoTIFF tile de-duplication

Have you ever had the opportunity to work with a raster dataset, that has world coverage, including oceans, and a resolution of 38 meters ? With World Mercator projection, the width and height of such a raster is 1 million pixels (1 048 576 exactly). If we also add 15 overviews (to go to a 32x32 thumbnail), how big would be such a raster ? More than 5 terabytes ?! 1 million * 1 million * 4 (for RGBA) * 1.33 (space for overviews : 1/4 + 1/16 +... = 0.333..). BigTIFF to the rescue ?

You are wrong! Such a raster dataset can be as small as 1 392 764 bytes (1.3 MB) in standard GeoTIFF format (that can be further compressed to 78838 bytes once put in a .zip). And in that size, it can feature 1 431 655 765 (1.4 billion) GDAL logos. If you don't believe me, just download it now and check by yourself !

http://even.rouault.free.fr/gdal_everywhere2.tif

You should be able to display it at light speed in any reasonable desktop GIS. QGIS is one of them.

What is the recipe for such a file ? Simply (ab)using possibilities offered by the TIFF specification. Namely, for a tiled TIFF, for each resolution, there are 2 arrays : one that contains the location (offsets) of each tile data (TileOffsets tag), and another one the size of each tile data (TileByteCounts tag). Here we simply put the same value for the location of all tiles, and write just once a 2048x2048 tile (compressed with DEFLATE codec) that mosaics the 32x32 GDAL logo. Using just that leads to a file of size 2 971 732 bytes, much larger than needed. So we are going to abuse the TIFF specification even more. First by noticing that if the offset of the tile matches also its size, then we can use the same array for TileOffsets and TileByteCounts, thus saving (1048576/2048)^2*4 = 1048576 bytes. In that instance, the tile is 172094 bytes large, so we place it at offset 172094. And finally, we can also make the overviews point their TileOffsets and TileByteCounts tags to the single array of the full resolution. Actually, as the definition of the 16 TIFF directories ends at offset 3446, we have also 172094-3446 = 168 648 spare bytes !

Letting aside this challenge, it could be interesting to have that tile de-duplication capability directly incorporated into the GDAL GeoTIFF driver in a more user friendly way than the mix of GDAL and direct byte access that has been used to build that file. A typical use case is when creating raster with a lot of oceanic area where tiles are in solid blue. Such technique can be used when creating MBTiles, but the good old TIFF can also do it. If you are interested, contact me !

vendredi 25 avril 2014

GDAL/OGR 1.11.0 released

On behalf of the GDAL/OGR development team and community, I am pleased to
announce the release of GDAL/OGR 1.11.0. GDAL/OGR is a C++ geospatial
data access library for raster and vector file formats, databases and
web services. It includes bindings for several languages, and a variety
of command line tools.

The 1.11.0 release is a major new feature release with the following
highlights:

* New GDAL drivers:
    - KRO: read/write support for KRO KOLOR Raw format

* New OGR drivers:
    - CartoDB : read/write support
    - GME / Google Map Engine : read/write support
    - GPKG / GeoPackage : read-write support (vector part of the spec.)
    - OpenFileGDB: read-only support (no external dependency)
    - SXF: read-only support
    - WALK: read-only support
    - WasP .map : read-write support

* Significantly improved drivers: GML, LIBKML

* RFC 40: enhanced RAT support
* RFC 41: multiple geometry fields support
* RFC 42: OGR Layer laundered field lookup
* RFC 43: add GDALMajorObject::GetMetadataDomainList()
* RFC 45: GDAL datasets and raster bands as virtual memory mapping
* Upgrade to EPSG 8.2 database

More complete information on the new features and fixes in the 1.11.0
release can be found at:

http://trac.osgeo.org/gdal/wiki/Release/1.11.0-News

The new release can be downloaded from:
* http://download.osgeo.org/gdal/1.11.0/gdal1110.zip - source as a zip
* http://download.osgeo.org/gdal/1.11.0/gdal-1.11.0.tar.gz - source as
.tar.gz
* http://download.osgeo.org/gdal/1.11.0/gdalautotest-1.11.0.tar.gz - test
suite
* http://download.osgeo.org/gdal/1.11.0/gdal1110doc.zip - documentation /
website

mercredi 23 avril 2014

Advanced JPEG-in-TIFF uses in GDAL

This post is about advanced uses of JPEG compression in TIFF/GeoTIFF files. We will call such files "JPEG-in-TIFF" for the sake of shortness.

JPEG-in-TIFF is a popular variation of TIFF, described in TIFF specification supplement 2, well-suited for aerial/satellite imagery, that exhibits an interesting quality / (size * decompression_time) ratio, while remaining a format simple to encode/decode with Free and Open Source software.

Side note: while JPEG 2000 compression is a much more capable format, F.O.S.S. is still trying to catch up with proprietary implementations, although the OpenJPEG library (that can be used through the GDAL JP2OpenJPEG driver) has made recent advances that make it worth to be considered.

JPEG-in-TIFF creation options

To go back to JPEG-in-TIFF, quality/size can be controlled by selecting :

appropriate subsampling and colorspace. For RGB "natural" images, a good choice is YCbCr colorspace with subsampling of factor 2 on the chrominance difference componants (YCbCr 4:2:0). This is the PHOTOMETRIC=YCbCr creation option in the GDAL GTiff driver. Using it make the size of the image typically 2 to 3 times smaller than the default value for photometric interpretation (RGB)
the usual JPEG quality parameter that acts on the quantization coefficients. This is the JPEG_QUALITY creation option.

Generally, you will want to generate a tiled version of JPEG-in-TIFF (TILED=YES creation option), so as to be able to access efficiently and in a random way to parts of the image.

Implicit overviews

The very latest improvements added to the GDAL development version (trunk r27226 or later, already deprecating the soon to-be-released GDAL 1.11) make it possible to have faster downsampled versions of JPEG-in-TIFF than before. Despite this improvement, the recommandation remains to generate overviews, either external or internal, with the gdaladdo utility, in order to have very fast access to downsampled versions of a raster (at the expense of increased storage space)

But what can we do when such overviews are not (yet) generated ? Previously, the GTiff driver would decompress the queried part of the raster at its full resolution and compute a downsampled image from it. But this is more slow than needed.

Schematically (voluntary omitting quantization and Huffman compression steps), a JPEG compressed stream is made of a sequence of squares of size 8x8 (or 16x16 with YCbCr 4:2:0) pixels (the technical name for such as block is a MCU, Minimum Code Unit) that contain the coefficients resulting from the Discrete Cosine Transform of the original 8x8 (16x16) pixels. To decompress a MCU to its full resolution, you need to compute the inverse DCT on the whole set of 8x8 (16x16) coefficients, which has some cost. But an interesting property of MCU coefficients is that you only need to operate on the high order ones to compute a lower resolution of the uncompressed block, and libjpeg, the software library that does the low-level job of compressing and decompressing the JPEG codestream, is capable of that ! Actually, we had already used that capability in the GDAL JPEG driver of GDAL 1.10, to expose implicit overview levels (at x2, x4, x8 sub-sampling factors), but it was not yet plugged into the GTiff driver.

Now, JPEG-in-TIFF files, in all possible formulations (tiled / stripped / single-stripped, pixel-interleaved vs band-interleaved, single band vs YCbCr 4:2:0 vs RGB colorspace), will internally expose overview levels at x2, x4 and x8 sub-sampling factors for raster operations.

So computing the 1/16th reduction of a BMNG tile of size 21600x21600, with 256x256 tiling, JPEG RGB compression, now takes about 3.5s with the latest developmenet version about 21s in GDAL 1.11 :

GDAL trunk :
$ time gdal_translate world.topo.bathy.200406.3x21600x21600.B2.tif out.tif -outsize 6.25% 6.25%
real 0m3.441s

GDAL 1.11 :
$ time gdal_translate world.topo.bathy.200406.3x21600x21600.B2.tif out.tif -outsize 6.25% 6.25%
real 0m20.987s

Note that the whole JPEG codestream will still be read from the storage, so the new optimization will be especially worthwile when I/O speed is good w.r.t CPU speed (whereas with JPEG2000 compression, due to the way how wavelet coefficients are packed, you only need to read small portion of the file).

If you try gdalinfo on a JPEG-in-TIFF file, relax if you don't see the implicit overviews mentionned. They are hidden most of the time to avoid confusion : it would be difficult for users to distinguish between internal pre-computed overviews, which benefit from fast acces, and the new implicit overviews. The latter ones are only made visible to the internals of the GTiff driver when a raster operation takes place.

Lossless conversion of JPEG into JPEG-in-TIFF

This is a feature that appeared in GDAL 1.10 released last year, but which has probably been unnoticed in the NEWS. The conversion of a JPEG file to a JPEG-in-TIFF is done without decompression and recompression cycles, through the preservation of the MCU coefficients, making it effectively lossless (the initial JPEG compression was lossy, but the conversion into JPEG-in-TIFF is lossless).
This optimized conversion path is taken if all the following conditions are met :

the source dataset is a JPEG file (or a VRT with a JPEG as a single SimpleSource)
the target dataset is a JPEG-in-TIFF file
no explicity target JPEG quality is specified
no change in colorspace is specified
no sub-windowing is requested
etc...

But it is compatible with the generation of a tiled JPEG-in-TIFF from the original JPEG image. Explicit assigment of target SRS and bounds are also possible.

So, the following commands will use the lossless copy method :

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -co TILED=YES

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -a_srs EPSG:4326 -a_ullr -180 90 180 -90

whereas the following commands will NOT :

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -co QUALITY=60

$ gdal_translate in.jpg out.tif -srcwin 0 0 500 500 -co COMPRESS=JPEG

Lossless extraction of JPEG tiles from JPEG-in-TIFF

The fresh new jpeg_in_tiff_extract.py Python script (needs GDAL trunk) does (part of) the reverse operation. From a JPEG-in-TIFF, it can extract one particular tile/strip into a standalone JPEG file, and generate the companion .aux.xml file if the source JPEG-in-TIFF is georeferenced.

The following command will extract the tile at column 10 (count starts at 0), row 20 from a tiled JPEG-in-TIFF :

python jpeg_in_tiff_extract.py world.topo.bathy.200406.3x21600x21600.B2.tif out_10_20.jpg 10 20

Or to extract all the tiles (filenames will have the out_X_Y.jpg pattern) :

python jpeg_in_tiff_extract.py world.topo.bathy.200406.3x21600x21600.B2.tif out.jpg

This could be interesting for tiling servers that want to keep global mosaics as sources.

Note: this is not exactly the reverse operation from JPEG --> JPEG-in-TIFF conversion, since it will not merge several JPEG-in-TIFF strips/tiles into a single JPEG file.

Ideas for later...

Instead of the jpeg_in_tiff_extract.py script, we could imagine that the lossless extraction of JPEG from JPEG-in-TIFF could be done, in a natural way, with :

gdal_translate -srcwin X Y XSIZE YSIZE in.tif out.jpg -of JPEG

That would require detecting a sub-windowing pattern in the temporary VRT generated by gdal_translate, and then reassembling the right MCU coefficients. X, Y, XSIZE and YSIZE should be multiple of 8 or 16 to match MCU dimensions.

A more powerful, but even more complicated, idea would be to have first-class support in GDAL for the DCT coefficients, as raster bands ?, but it would require some thinking to find the right modelisation, and even more to implement it (with complications like YCbCr 4:2:0 subsampling).

In a similar vein, why not imagining:

gdal_translate mosaic_of_jpeg_images.vrt out.tif -co COMPRESS=JPEG

To make it easier, the VRT file should be made of JPEG tiles whose dimensions are a multiple of the MCU dimensions, and that are placed into the mosaic at offsets that are themselves multiple of the tile dimensions. An additional constraint is that all the JPEG tiles should share the same JPEG quantization and Huffman tables, since in JPEG-in-TIFF, those tables are common for all tiles/strips and placed in the JPEGTABLES TIFF tag.
Building a JPEG-in-TIFF from a mosaic in the GTiff driver might be tricky, but an ad-hoc Python script might be possible.

I will stop here with science-fiction. There is already enough to experiment !

dimanche 6 avril 2014

GML madness

I am convinced that most people wonder "how many ways are there to encode a polygon in GML ?" If you have never considered that before, you might be interested in reading the following lines.

To start gently, let us consider the following grey shape :

Mathematicians call it a square, which is a particularly case of a rectangle, which is itself a polygon. A simple way of describing a polygon is to list the coordinates of its corners :

Corner 0 coordinates are (0,0)
Corner 1 coordinates are (0,1)
Corner 2 coordinates are (1,1)
Corner 3 coordinates are (1,0)
And Corner 4 = Corner 0

One of the most compact way of describing that polygon in GML 3.2 is the use of the gml:Polygon element :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

We can forget the XML namespaces declaration and just concentrate on the fact that a Polygon is made of an exterior ring described by a list of positions. For those who wonder why we need to specify the "exterior", you must know that polygons may have holes in them, and those holes are called "interior rings", but we will not explore that level of complexity.

The documentation of the gml:LinearRing element shows that there are other ways of expressing the coordinates. We can isolate each corner in a separate gml:pos element :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                       http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:pos>0 0</gml:pos>
            <gml:pos>0 1</gml:pos>
            <gml:pos>1 1</gml:pos>
            <gml:pos>1 0</gml:pos>
            <gml:pos>0 0</gml:pos>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

Or we can use a gml:Point inside a gml:pointProperty :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                    http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:pointProperty>
                <gml:Point gml:id="ID2">
                    <gml:pos>0 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID3">
                    <gml:pos>0 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID4">
                    <gml:pos>1 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID5">
                    <gml:pos>1 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID6">
                    <gml:pos>0 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

Those who carefully look at the above snippet realize that the content of the last pointProperty is the same as the first one. So we can use xlink:href power to optimize that a bit :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:pointProperty>
                <gml:Point gml:id="ID2">
                    <gml:pos>0 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID3">
                    <gml:pos>0 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID4">
                    <gml:pos>1 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID5">
                    <gml:pos>1 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty xlink:href="#ID2"/>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

People nostalgic of the GML 2.1.2 era will probably want to use the now deprecated (but still valid) gml:coordinates element :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                    http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
        
            <gml:coordinates>0,0 0,1 1,1 1,0 0,0</gml:coordinates>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

We could play with cs (coordinate separator) and ts (tuple separator) attributes of gml:coordinates to generate alternate encoding for the coordinate list, but we will not do that. Enough with deprecated features ! Let us concentrate on modernity.

Our shape is a gml:Rectangle, isn'it ?

<?xml version="1.0"?>
<gml:Rectangle xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Rectangle>

Careful observers will notice that we have not simply substituted Polygon by Rectangle, but we have also removed the gml:id attribute. Why so ? Because a Polygon is a first citizen GML object deriving from gml:AbstractGMLType, whereas Rectangle just derives from gml:AbstractSurfacePatchType. Poor gml:Rectangle... We will come back to it later.

Until now, we have restricted the interior to be a LinearRing. But a LinearRing is a particular case of a gml:Ring :

<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                  http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:LineString gml:id="ID2">
                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                </gml:LineString>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

As before we can use a series of gml:pos instead of gml:posList :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:LineString gml:id="ID2">
                    <gml:pos>0 0</gml:pos>
                    <gml:pos>0 1</gml:pos>
                    <gml:pos>1 1</gml:pos>
                    <gml:pos>1 0</gml:pos>
                    <gml:pos>0 0</gml:pos>
                </gml:LineString>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But we could also use several gml:curveMember with a simple 2-point gml:LineString :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:LineString gml:id="ID2">
                    <gml:posList>0 0 0 1</gml:posList>
                </gml:LineString>
            </gml:curveMember>
            <gml:curveMember>
                <gml:LineString gml:id="ID3">
                    <gml:posList>0 1 1 1</gml:posList>
                </gml:LineString>
            </gml:curveMember>
            <gml:curveMember>
                <gml:LineString gml:id="ID4">
                    <gml:posList>1 1 1 0</gml:posList>
                </gml:LineString>
            </gml:curveMember>
            <gml:curveMember>
                <gml:LineString gml:id="ID5">
                    <gml:posList>1 0 0 0</gml:posList>
                </gml:LineString>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

Instead of a single gml:LineString, we could use a more powerful gml:Curve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                  http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:Curve gml:id="ID2">
                    <gml:segments>
                        <gml:LineStringSegment>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LineStringSegment>
                    </gml:segments>
                </gml:Curve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But it is a bit of a shame to use a single gml:LineStringSegment inside a gml:segments. Let us fix that :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:Curve gml:id="ID2">
                    <gml:segments>
                        <gml:LineStringSegment>
                            <gml:posList>0 0 0 1</gml:posList>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:posList>0 1 1 1</gml:posList>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:posList>1 1 1 0</gml:posList>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:posList>1 0 0 0</gml:posList>
                        </gml:LineStringSegment>
                    </gml:segments>
                </gml:Curve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

Of course we can still use gml:pointProperty to avoid repeating the same coordinate tuples :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:Curve gml:id="ID2">
                    <gml:segments>
                        <gml:LineStringSegment>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID3">
                                    <gml:pos>0 0</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID4">
                                    <gml:pos>0 1</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:pointProperty xlink:href="#ID4"/>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID5">
                                    <gml:pos>1 1</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:pointProperty xlink:href="#ID5"/>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID6">
                                    <gml:pos>1 0</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:pointProperty xlink:href="#ID5"/>
                            <gml:pointProperty xlink:href="#ID3"/>
                        </gml:LineStringSegment>
                    </gml:segments>
                </gml:Curve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

Another child element of gml:curveMember is a gml:CompositeCurve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:CompositeCurve gml:id="ID2">
                    <gml:curveMember>
                        <gml:LineString gml:id="ID3">
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LineString>
                    </gml:curveMember>
                </gml:CompositeCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But, you may have noticed that the child of a CompositeCurve is a curveMember, which is also the parent of the CompositeCurve. So we may put a CompositeCurve inside a CompositeCurve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:CompositeCurve gml:id="ID2">
                    <gml:curveMember>
                        <gml:CompositeCurve gml:id="ID3">
                            <gml:curveMember>
                                <gml:LineString gml:id="ID4">
                                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                                </gml:LineString>
                            </gml:curveMember>
                        </gml:CompositeCurve>
                    </gml:curveMember>
                </gml:CompositeCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

You have probably understood now that we could nest CompositeCurve as many times as wished. So we have now the answer to the initial question : there is an infinity of ways of expressing a polygon in GML 3.2 !

Another child element of gml:curveMember is a gml:OrientableCurve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:OrientableCurve gml:id="ID2">
                    <gml:baseCurve>
                        <gml:LineString gml:id="ID3">
                            <gml:pos>0 0</gml:pos>
                            <gml:pos>0 1</gml:pos>
                            <gml:pos>1 1</gml:pos>
                            <gml:pos>1 0</gml:pos>
                            <gml:pos>0 0</gml:pos>
                        </gml:LineString>
                    </gml:baseCurve>
                </gml:OrientableCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But the full power of OrientableCurve is to be able to express the orientation of the curve. So let us split the ring into 2 pieces, one with positive orientation and one with negative orientation :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:OrientableCurve gml:id="ID2">
                    <gml:baseCurve>
                        <gml:LineString gml:id="ID3">
                            <gml:pos>0 0</gml:pos>
                            <gml:pos>0 1</gml:pos>
                            <gml:pos>1 1</gml:pos>
                        </gml:LineString>
                    </gml:baseCurve>
                </gml:OrientableCurve>
            </gml:curveMember>
            <gml:curveMember>
                <gml:OrientableCurve gml:id="ID4" orientation="-">
                    <gml:baseCurve>
                        <gml:LineString gml:id="ID5">
                            <gml:pos>0 0</gml:pos>
                            <gml:pos>1 0</gml:pos>
                            <gml:pos>1 1</gml:pos>
                        </gml:LineString>
                    </gml:baseCurve>
                </gml:OrientableCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

Enough with polygons. A polygon is just a particular case of a gml:Surface :

<?xml version="1.0"?>
<gml:Surface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                     http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:patches>
        <gml:PolygonPatch>
            <gml:exterior>
                <gml:LinearRing>
                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                </gml:LinearRing>
            </gml:exterior>
        </gml:PolygonPatch>
    </gml:patches>
</gml:Surface>

Instead of a gml:PolygonPatch as a child of a gml:patches, we can use the gml:Rectangle we have used before :

<?xml version="1.0"?>
<gml:Surface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:patches>
        <gml:Rectangle>
            <gml:exterior>
                <gml:LinearRing>
                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                </gml:LinearRing>
            </gml:exterior>
        </gml:Rectangle>
    </gml:patches>
</gml:Surface>

A Surface seems to be too simple. Why not using a gml:CompositeSurface ?

<?xml version="1.0"?>
<gml:CompositeSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
</gml:CompositeSurface>

But it looks a bit dumb to use only one gml:surfaceMember. Let us divide our square into 2 triangles :

<?xml version="1.0"?>
<gml:CompositeSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                     http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
    <gml:surfaceMember>
        <gml:Surface gml:id="ID3">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
</gml:CompositeSurface>

Instead of a gml:CompositeSurface, why not using a gml:MultiSurface ?

<?xml version="1.0"?>
<gml:MultiSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                      http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
</gml:MultiSurface>

or maybe you prefer to use gml:surfaceMembers (with a final 's') instead of a gml:surfaceMember :

<?xml version="1.0"?>
<gml:MultiSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                       http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMembers>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMembers>
</gml:MultiSurface>

Similarly to gml:CompositeCurve, we can arbitrary nest as many gml:CompositeSurface as wished :

<?xml version="1.0"?>
<gml:CompositeSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:CompositeSurface gml:id="ID2">
            <gml:surfaceMember>
                <gml:Surface gml:id="ID3">
                    <gml:patches>
                        <gml:PolygonPatch>
                            <gml:exterior>
                                <gml:LinearRing>
                                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                                </gml:LinearRing>
                            </gml:exterior>
                        </gml:PolygonPatch>
                    </gml:patches>
                </gml:Surface>
            </gml:surfaceMember>
        </gml:CompositeSurface>
    </gml:surfaceMember>
</gml:CompositeSurface>

So we have now two different kind of infinities ! That we could combine together. But, do not hope to have discovered more ways of expressing polygons. The cardinality of the set of natural numbers times the set of natural numbers ( N x N ) is just the cardinality of the set of natural numbers...

To conclude, we should mention that the authors of the GML specification have admitted that encoding polygons was a bit too complicated. So they have invented a "compact encoding" in the extended schemas of GML 3.3 :

<?xml version="1.0"?>
<gmlce:SimplePolygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:gmlce="http://www.opengis.net/gml/3.3/ce"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                    http://schemas.opengis.net/gml/3.2.1/gml.xsd
                    http://www.opengis.net/gml/3.3/ce
                    http://schemas.opengis.net/gml/3.3/geometryCompact.xsd"
             gml:id="ID1">
    
    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
</gmlce:SimplePolygon>

But our SimplePolygon is indeed a SimpleRectangle. So let us use instead :

<?xml version="1.0"?>
<gmlce:SimpleRectangle xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:gmlce="http://www.opengis.net/gml/3.3/ce"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                    http://schemas.opengis.net/gml/3.2.1/gml.xsd
                    http://www.opengis.net/gml/3.3/ce
                    http://schemas.opengis.net/gml/3.3/geometryCompact.xsd"
             gml:id="ID1">
    
    <gml:posList>0 0 0 1 1 1 1 0</gml:posList>
</gmlce:SimpleRectangle>

You can found the above 25 snippets at the following URL : http://even.rouault.free.fr/gml/
They are all valid GML 3.2 snippets that validate the XML schemas and pass the GML 3.2 Conformance Test Suite (except gml4.xsd which uses the deprecated gml:coordinates element).

Oh, final fun, as GML is XML, we can also use XML substitutable entities ...

<?xml version="1.0"?>
<!DOCTYPE points [
<!ENTITY pt0 "0 0">
<!ENTITY pt1 "0 1">
<!ENTITY pt2 "1 1">
<!ENTITY pt3 "1 0">
]>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                     http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>&pt0; &pt1; &pt2; &pt3; &pt0;</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

Those who wonder why I decided to write this article might want to have a look at the following simplified GML sample of a real-world use case where a PolygonPatch has only an interior ring (a hole), but no exterior ring... Standalone holes : interesting concept, isn't it ?