Geo tips & tricks: April 2014

Friday 25 April 2014

GDAL/OGR 1.11.0 released

On behalf of the GDAL/OGR development team and community, I am pleased to
announce the release of GDAL/OGR 1.11.0. GDAL/OGR is a C++ geospatial
data access library for raster and vector file formats, databases and
web services. It includes bindings for several languages, and a variety
of command line tools.

The 1.11.0 release is a major new feature release with the following
highlights:

* New GDAL drivers:
    - KRO: read/write support for KRO KOLOR Raw format

* New OGR drivers:
    - CartoDB : read/write support
    - GME / Google Map Engine : read/write support
    - GPKG / GeoPackage : read-write support (vector part of the spec.)
    - OpenFileGDB: read-only support (no external dependency)
    - SXF: read-only support
    - WALK: read-only support
    - WasP .map : read-write support

* Significantly improved drivers: GML, LIBKML

* RFC 40: enhanced RAT support
* RFC 41: multiple geometry fields support
* RFC 42: OGR Layer laundered field lookup
* RFC 43: add GDALMajorObject::GetMetadataDomainList()
* RFC 45: GDAL datasets and raster bands as virtual memory mapping
* Upgrade to EPSG 8.2 database

More complete information on the new features and fixes in the 1.11.0
release can be found at:

http://trac.osgeo.org/gdal/wiki/Release/1.11.0-News

The new release can be downloaded from:
* http://download.osgeo.org/gdal/1.11.0/gdal1110.zip - source as a zip
* http://download.osgeo.org/gdal/1.11.0/gdal-1.11.0.tar.gz - source as .tar.gz
* http://download.osgeo.org/gdal/1.11.0/gdalautotest-1.11.0.tar.gz - test suite
* http://download.osgeo.org/gdal/1.11.0/gdal1110doc.zip - documentation / website

Wednesday 23 April 2014

Advanced JPEG-in-TIFF uses in GDAL

This post is about advanced uses of JPEG compression in TIFF/GeoTIFF files. For brevity, we will call such files "JPEG-in-TIFF".

JPEG-in-TIFF is a popular variation of TIFF, described in TIFF specification supplement 2. It is well suited to aerial/satellite imagery and exhibits an interesting quality / (size * decompression_time) ratio, while remaining a format that is simple to encode and decode with Free and Open Source software.

Side note: while JPEG 2000 is a much more capable compression format, F.O.S.S. is still trying to catch up with proprietary implementations, although the OpenJPEG library (which can be used through the GDAL JP2OpenJPEG driver) has made recent advances that make it worth considering.

JPEG-in-TIFF creation options

Coming back to JPEG-in-TIFF, quality/size can be controlled by selecting:

* an appropriate subsampling and colorspace. For RGB "natural" images, a good choice is the YCbCr colorspace with subsampling by a factor of 2 on the chrominance difference components (YCbCr 4:2:0). This is the PHOTOMETRIC=YCbCr creation option of the GDAL GTiff driver. Using it typically makes the image 2 to 3 times smaller than with the default photometric interpretation (RGB).
* the usual JPEG quality parameter, which acts on the quantization coefficients. This is the JPEG_QUALITY creation option.

Generally, you will want to generate a tiled JPEG-in-TIFF (TILED=YES creation option), so that arbitrary parts of the image can be accessed efficiently.
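
As a rough illustration, the creation options above can be assembled into a gdal_translate command line from Python. The helper name jpeg_in_tiff_command and the default quality of 75 are assumptions made for this sketch, not values taken from GDAL. The code only builds the argument list and does not run GDAL.

```python
import shlex


def jpeg_in_tiff_command(src, dst, quality=75, ycbcr=True, tiled=True):
    """Build a gdal_translate argument list for a JPEG-in-TIFF output.

    Hypothetical helper: it only assembles the creation options discussed
    above and does not invoke GDAL itself.
    """
    args = ["gdal_translate", src, dst, "-co", "COMPRESS=JPEG"]
    if ycbcr:
        # YCbCr colorspace with subsampled chrominance (meant for RGB input).
        args += ["-co", "PHOTOMETRIC=YCbCr"]
    # The JPEG quality acts on the quantization coefficients.
    args += ["-co", "JPEG_QUALITY=%d" % quality]
    if tiled:
        # Tiling allows efficient access to arbitrary parts of the image.
        args += ["-co", "TILED=YES"]
    return args


cmd = jpeg_in_tiff_command("in.tif", "out.tif", quality=85)
print(" ".join(shlex.quote(a) for a in cmd))
```

The list form can be handed to subprocess as-is. Joining it into a string is only done here for display.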

Implicit overviews

The very latest improvements added to the GDAL development version (trunk r27226 or later, already ahead of the soon-to-be-released GDAL 1.11) make it possible to get downsampled versions of JPEG-in-TIFF faster than before. Despite this improvement, the recommendation remains to generate overviews, either external or internal, with the gdaladdo utility, in order to have very fast access to downsampled versions of a raster (at the expense of increased storage space).

But what can we do when such overviews are not (yet) generated? Previously, the GTiff driver would decompress the queried part of the raster at its full resolution and compute a downsampled image from it. But this is slower than necessary.

Schematically (deliberately omitting the quantization and Huffman compression steps), a JPEG compressed stream is made of a sequence of squares of 8x8 (or 16x16 with YCbCr 4:2:0) pixels (the technical name for such a block is an MCU, Minimum Coded Unit) that contain the coefficients resulting from the Discrete Cosine Transform of the original 8x8 (16x16) pixels. To decompress an MCU to its full resolution, you need to compute the inverse DCT on the whole set of 8x8 (16x16) coefficients, which has some cost. But an interesting property of MCU coefficients is that you only need to operate on the low-frequency ones to compute a lower resolution version of the uncompressed block, and libjpeg, the software library that does the low-level job of compressing and decompressing the JPEG codestream, is capable of that! Actually, we had already used that capability in the JPEG driver of GDAL 1.10, to expose implicit overview levels (at x2, x4, x8 sub-sampling factors), but it was not yet plugged into the GTiff driver.
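
To make the coefficient property a bit more concrete, here is a small pure-Python sketch of the most extreme case, the x8 reduction: with an orthonormal 8x8 DCT, the lowest-frequency (DC) coefficient alone gives the mean of the block, which is the single pixel of the 1/8 scale image. This is only an illustration under simplifying assumptions (no level shift, no quantization, an arbitrary test block). It is not how libjpeg is implemented.

```python
import math

N = 8  # size of a DCT block


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


# Arbitrary test block (an assumption for the demo, not real image data).
block = [[(3 * x + 5 * y) % 256 for x in range(N)] for y in range(N)]
coeffs = dct_2d(block)

block_mean = sum(sum(row) for row in block) / float(N * N)
# With this normalization the DC coefficient equals N times the block mean,
# so the x8-reduced pixel needs no inverse transform at all.
pixel_at_one_eighth = coeffs[0][0] / N
print(block_mean, pixel_at_one_eighth)
```

The x2 and x4 reductions follow the same idea with a small inverse transform applied to the low-frequency corner of the coefficient block instead of the full 8x8 set.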

Now, JPEG-in-TIFF files, in all possible formulations (tiled / stripped / single-stripped, pixel-interleaved vs band-interleaved, single band vs YCbCr 4:2:0 vs RGB colorspace), will internally expose overview levels at x2, x4 and x8 sub-sampling factors for raster operations.
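
One way to picture how such implicit levels can serve a request is sketched below: pick the largest of the x2, x4 and x8 levels that does not exceed the requested reduction, and leave the rest to ordinary resampling. The function name and the selection rule are assumptions made for illustration. The actual logic in the GTiff driver may differ.

```python
IMPLICIT_FACTORS = (8, 4, 2)  # the x8, x4 and x2 levels mentioned above


def pick_implicit_overview(reduction):
    """Return (jpeg_factor, residual) for a requested reduction factor.

    jpeg_factor is the largest implicit level not exceeding the request
    (1 when none applies); residual is what is left for plain resampling.
    """
    for factor in IMPLICIT_FACTORS:
        if factor <= reduction:
            return factor, reduction / float(factor)
    return 1, float(reduction)


# A 1/16th reduction (6.25%) of a 21600x21600 raster.
factor, residual = pick_implicit_overview(16)
out_size = int(21600 * 0.0625)
print(factor, residual, out_size)
```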

So computing the 1/16th reduction of a BMNG tile of size 21600x21600, with 256x256 tiling and JPEG RGB compression, now takes about 3.5s with the latest development version, versus about 21s in GDAL 1.11:

GDAL trunk :
$ time gdal_translate world.topo.bathy.200406.3x21600x21600.B2.tif out.tif -outsize 6.25% 6.25%
real 0m3.441s

GDAL 1.11 :
$ time gdal_translate world.topo.bathy.200406.3x21600x21600.B2.tif out.tif -outsize 6.25% 6.25%
real 0m20.987s

Note that the whole JPEG codestream will still be read from storage, so the new optimization will be especially worthwhile when I/O speed is good relative to CPU speed (whereas with JPEG 2000 compression, due to the way wavelet coefficients are packed, you only need to read a small portion of the file).

If you try gdalinfo on a JPEG-in-TIFF file, relax if you don't see the implicit overviews mentioned. They are hidden most of the time to avoid confusion: it would be difficult for users to distinguish between internal pre-computed overviews, which benefit from fast access, and the new implicit overviews. The latter are only made visible to the internals of the GTiff driver when a raster operation takes place.

Lossless conversion of JPEG into JPEG-in-TIFF

This is a feature that appeared in GDAL 1.10, released last year, but which has probably gone unnoticed in the NEWS. The conversion of a JPEG file to a JPEG-in-TIFF is done without decompression and recompression cycles, through the preservation of the MCU coefficients, making it effectively lossless (the initial JPEG compression was lossy, but the conversion into JPEG-in-TIFF is lossless).
This optimized conversion path is taken if all the following conditions are met :

* the source dataset is a JPEG file (or a VRT with a JPEG as a single SimpleSource)
* the target dataset is a JPEG-in-TIFF file
* no explicit target JPEG quality is specified
* no change in colorspace is specified
* no sub-windowing is requested
* etc...

But it is compatible with the generation of a tiled JPEG-in-TIFF from the original JPEG image. Explicit assignment of a target SRS and bounds is also possible.

So, the following commands will use the lossless copy method :

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -co TILED=YES

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -a_srs EPSG:4326 -a_ullr -180 90 180 -90

whereas the following commands will NOT :

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -co JPEG_QUALITY=60

$ gdal_translate in.jpg out.tif -srcwin 0 0 500 500 -co COMPRESS=JPEG

Lossless extraction of JPEG tiles from JPEG-in-TIFF

The brand new jpeg_in_tiff_extract.py Python script (needs GDAL trunk) does (part of) the reverse operation. From a JPEG-in-TIFF, it can extract one particular tile/strip into a standalone JPEG file, and generate the companion .aux.xml file if the source JPEG-in-TIFF is georeferenced.

The following command will extract the tile at column 10 (count starts at 0), row 20 from a tiled JPEG-in-TIFF :

python jpeg_in_tiff_extract.py world.topo.bathy.200406.3x21600x21600.B2.tif out_10_20.jpg 10 20

Or to extract all the tiles (filenames will have the out_X_Y.jpg pattern) :

python jpeg_in_tiff_extract.py world.topo.bathy.200406.3x21600x21600.B2.tif out.jpg

This could be interesting for tiling servers that want to keep global mosaics as sources.

Note: this is not exactly the reverse operation from JPEG --> JPEG-in-TIFF conversion, since it will not merge several JPEG-in-TIFF strips/tiles into a single JPEG file.
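
For readers who want to relate a tile column/row to raster pixels, the arithmetic is simple enough to sketch. The function names below are made up for this example, and the 256x256 tile size is the one of the BMNG file used earlier. The script itself may organize things differently.

```python
import math


def tile_window(col, row, tile_w, tile_h, raster_w, raster_h):
    """Raster window (xoff, yoff, xsize, ysize) covered by tile (col, row).

    Columns and rows are counted from 0. Tiles on the right and bottom
    edges may cover fewer raster pixels than the nominal tile size.
    """
    xoff = col * tile_w
    yoff = row * tile_h
    if xoff >= raster_w or yoff >= raster_h:
        raise ValueError("tile (%d, %d) is outside the raster" % (col, row))
    xsize = min(tile_w, raster_w - xoff)
    ysize = min(tile_h, raster_h - yoff)
    return xoff, yoff, xsize, ysize


def tile_count(tile_w, tile_h, raster_w, raster_h):
    """Number of tile columns and rows needed to cover the raster."""
    return (int(math.ceil(raster_w / float(tile_w))),
            int(math.ceil(raster_h / float(tile_h))))


print(tile_window(10, 20, 256, 256, 21600, 21600))
print(tile_count(256, 256, 21600, 21600))
```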

Ideas for later...

Instead of the jpeg_in_tiff_extract.py script, we could imagine that the lossless extraction of JPEG from JPEG-in-TIFF could be done, in a natural way, with :

gdal_translate -srcwin X Y XSIZE YSIZE in.tif out.jpg -of JPEG

That would require detecting a sub-windowing pattern in the temporary VRT generated by gdal_translate, and then reassembling the right MCU coefficients. X, Y, XSIZE and YSIZE should be multiples of 8 or 16 to match MCU dimensions.
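
The alignment constraint on the window is easy to check or enforce before attempting such an extraction. Below is a small sketch with hypothetical helper names: one function tests the constraint, the other expands a window outward to the nearest MCU boundaries.

```python
def is_mcu_aligned(xoff, yoff, xsize, ysize, mcu=8):
    """True if the window offsets and sizes are all multiples of the MCU size."""
    return all(value % mcu == 0 for value in (xoff, yoff, xsize, ysize))


def snap_to_mcu(xoff, yoff, xsize, ysize, mcu=8):
    """Expand a window outward to the enclosing MCU-aligned window."""
    x0 = (xoff // mcu) * mcu
    y0 = (yoff // mcu) * mcu
    # Ceiling division, written with floor division on negated values.
    x1 = -(-(xoff + xsize) // mcu) * mcu
    y1 = -(-(yoff + ysize) // mcu) * mcu
    return x0, y0, x1 - x0, y1 - y0


print(is_mcu_aligned(0, 0, 500, 500))           # 8x8 MCU
print(snap_to_mcu(10, 20, 500, 500, mcu=16))    # 16x16 MCU (YCbCr 4:2:0)
```

The snapped window may extend past the raster edges, so a real implementation would also have to clamp it to the image size.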

A more powerful, but even more complicated, idea would be to have first-class support in GDAL for the DCT coefficients (as raster bands?), but it would require some thinking to find the right model, and even more to implement it (with complications like YCbCr 4:2:0 subsampling).

In a similar vein, why not imagine:

gdal_translate mosaic_of_jpeg_images.vrt out.tif -co COMPRESS=JPEG

To make it easier, the VRT file should be made of JPEG tiles whose dimensions are a multiple of the MCU dimensions, and that are placed into the mosaic at offsets that are themselves multiples of the tile dimensions. An additional constraint is that all the JPEG tiles should share the same JPEG quantization and Huffman tables, since in JPEG-in-TIFF, those tables are common to all tiles/strips and placed in the JPEGTABLES TIFF tag.
Building a JPEG-in-TIFF from a mosaic in the GTiff driver might be tricky, but an ad-hoc Python script might be possible.

I will stop here with science fiction. There is already enough to experiment with!

Sunday 6 April 2014

GML madness

I am convinced that most people wonder "how many ways are there to encode a polygon in GML ?" If you have never considered that before, you might be interested in reading the following lines.

To start gently, let us consider the following grey shape :

Mathematicians call it a square, which is a particular case of a rectangle, which is itself a polygon. A simple way of describing a polygon is to list the coordinates of its corners:

Corner 0 coordinates are (0,0)
Corner 1 coordinates are (0,1)
Corner 2 coordinates are (1,1)
Corner 3 coordinates are (1,0)
And Corner 4 = Corner 0

One of the most compact ways of describing that polygon in GML 3.2 is the use of the gml:Polygon element:

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

We can forget the XML namespaces declaration and just concentrate on the fact that a Polygon is made of an exterior ring described by a list of positions. For those who wonder why we need to specify the "exterior", you must know that polygons may have holes in them, and those holes are called "interior rings", but we will not explore that level of complexity.
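
For this simplest encoding, reading the ring back is straightforward. Here is a minimal sketch using Python's standard xml.etree.ElementTree. The function name exterior_ring is ours, the snippet is trimmed of its schema attributes, and only the LinearRing/posList form is handled, with 2D coordinates assumed.

```python
import xml.etree.ElementTree as ET

GML_NS = {"gml": "http://www.opengis.net/gml/3.2"}

GML_POLYGON = """<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>"""


def exterior_ring(polygon_xml, dimension=2):
    """Return the exterior ring of a gml:Polygon as a list of tuples.

    Only the gml:LinearRing / gml:posList encoding is understood here.
    """
    root = ET.fromstring(polygon_xml)
    pos_list = root.find("gml:exterior/gml:LinearRing/gml:posList", GML_NS)
    values = [float(v) for v in pos_list.text.split()]
    # Group the flat list of numbers into coordinate tuples.
    return [tuple(values[i:i + dimension])
            for i in range(0, len(values), dimension)]


print(exterior_ring(GML_POLYGON))
```

As the rest of this post shows, a reader limited to this one form would miss most of what GML allows.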

The documentation of the gml:LinearRing element shows that there are other ways of expressing the coordinates. We can isolate each corner in a separate gml:pos element :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                       http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:pos>0 0</gml:pos>
            <gml:pos>0 1</gml:pos>
            <gml:pos>1 1</gml:pos>
            <gml:pos>1 0</gml:pos>
            <gml:pos>0 0</gml:pos>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

Or we can use a gml:Point inside a gml:pointProperty :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                    http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:pointProperty>
                <gml:Point gml:id="ID2">
                    <gml:pos>0 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID3">
                    <gml:pos>0 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID4">
                    <gml:pos>1 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID5">
                    <gml:pos>1 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID6">
                    <gml:pos>0 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

Those who look carefully at the above snippet will realize that the content of the last pointProperty is the same as the first one. So we can use the power of xlink:href to optimize that a bit:

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:pointProperty>
                <gml:Point gml:id="ID2">
                    <gml:pos>0 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID3">
                    <gml:pos>0 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID4">
                    <gml:pos>1 1</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty>
                <gml:Point gml:id="ID5">
                    <gml:pos>1 0</gml:pos>
                </gml:Point>
            </gml:pointProperty>
            <gml:pointProperty xlink:href="#ID2"/>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>
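
A reader of such a ring has to follow the xlink:href reference back to the point it designates. Below is a sketch of how that could be done with xml.etree.ElementTree, limited to same-document references of the "#id" form. The function name ring_points is ours and the sample is reduced to the LinearRing itself.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml/3.2"
XLINK = "http://www.w3.org/1999/xlink"
NS = {"gml": GML}

RING_XML = """<gml:LinearRing xmlns:gml="http://www.opengis.net/gml/3.2"
                xmlns:xlink="http://www.w3.org/1999/xlink">
    <gml:pointProperty>
        <gml:Point gml:id="ID2"><gml:pos>0 0</gml:pos></gml:Point>
    </gml:pointProperty>
    <gml:pointProperty>
        <gml:Point gml:id="ID3"><gml:pos>0 1</gml:pos></gml:Point>
    </gml:pointProperty>
    <gml:pointProperty>
        <gml:Point gml:id="ID4"><gml:pos>1 1</gml:pos></gml:Point>
    </gml:pointProperty>
    <gml:pointProperty>
        <gml:Point gml:id="ID5"><gml:pos>1 0</gml:pos></gml:Point>
    </gml:pointProperty>
    <gml:pointProperty xlink:href="#ID2"/>
</gml:LinearRing>"""


def ring_points(ring):
    """Return the coordinate tuples of a ring made of gml:pointProperty."""
    # Index every inline gml:Point by its gml:id so references can be resolved.
    points_by_id = {}
    for point in ring.iter("{%s}Point" % GML):
        points_by_id[point.get("{%s}id" % GML)] = point

    coords = []
    for prop in ring.findall("gml:pointProperty", NS):
        href = prop.get("{%s}href" % XLINK)
        if href is None:
            point = prop.find("gml:Point", NS)
        elif href.startswith("#"):
            point = points_by_id[href[1:]]
        else:
            raise ValueError("only same-document references are handled")
        pos = point.find("gml:pos", NS)
        coords.append(tuple(float(v) for v in pos.text.split()))
    return coords


print(ring_points(ET.fromstring(RING_XML)))
```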

People nostalgic for the GML 2.1.2 era will probably want to use the now deprecated (but still valid) gml:coordinates element:

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                    http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
        
            <gml:coordinates>0,0 0,1 1,1 1,0 0,0</gml:coordinates>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>

We could play with cs (coordinate separator) and ts (tuple separator) attributes of gml:coordinates to generate alternate encoding for the coordinate list, but we will not do that. Enough with deprecated features ! Let us concentrate on modernity.
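
For the curious, here is roughly what honoring those separators involves on the reading side. This is a sketch with a function name of our own. The defaults follow the GML schema (cs is a comma, ts is a space, decimal is a dot), and the alternate separators in the second call are just an example.

```python
def parse_gml_coordinates(text, cs=",", ts=" ", decimal="."):
    """Split the content of a gml:coordinates element into tuples.

    cs separates the coordinates inside a tuple, ts separates the tuples,
    and decimal is the character used as decimal point.
    """
    # When tuples are separated by whitespace, accept any run of whitespace.
    chunks = text.split() if ts.isspace() else text.strip().split(ts)
    tuples = []
    for chunk in chunks:
        chunk = chunk.strip()
        if not chunk:
            continue
        parts = chunk.split(cs)
        tuples.append(tuple(float(p.replace(decimal, ".")) for p in parts))
    return tuples


# Default separators, as in the snippet above.
print(parse_gml_coordinates("0,0 0,1 1,1 1,0 0,0"))
# The same ring with cs=" " and ts=";".
print(parse_gml_coordinates("0 0;0 1;1 1;1 0;0 0", cs=" ", ts=";"))
```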

Our shape is a gml:Rectangle, isn't it?

<?xml version="1.0"?>
<gml:Rectangle xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Rectangle>

Careful observers will notice that we have not simply substituted Rectangle for Polygon, but we have also removed the gml:id attribute. Why so? Because a Polygon is a first-class GML object deriving from gml:AbstractGMLType, whereas Rectangle just derives from gml:AbstractSurfacePatchType. Poor gml:Rectangle... We will come back to it later.

Until now, we have restricted the exterior to be a LinearRing. But a LinearRing is a particular case of a gml:Ring:

<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                  http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:LineString gml:id="ID2">
                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                </gml:LineString>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

As before we can use a series of gml:pos instead of gml:posList :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:LineString gml:id="ID2">
                    <gml:pos>0 0</gml:pos>
                    <gml:pos>0 1</gml:pos>
                    <gml:pos>1 1</gml:pos>
                    <gml:pos>1 0</gml:pos>
                    <gml:pos>0 0</gml:pos>
                </gml:LineString>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But we could also use several gml:curveMember with a simple 2-point gml:LineString :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:LineString gml:id="ID2">
                    <gml:posList>0 0 0 1</gml:posList>
                </gml:LineString>
            </gml:curveMember>
            <gml:curveMember>
                <gml:LineString gml:id="ID3">
                    <gml:posList>0 1 1 1</gml:posList>
                </gml:LineString>
            </gml:curveMember>
            <gml:curveMember>
                <gml:LineString gml:id="ID4">
                    <gml:posList>1 1 1 0</gml:posList>
                </gml:LineString>
            </gml:curveMember>
            <gml:curveMember>
                <gml:LineString gml:id="ID5">
                    <gml:posList>1 0 0 0</gml:posList>
                </gml:LineString>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

Instead of a single gml:LineString, we could use a more powerful gml:Curve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                  http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:Curve gml:id="ID2">
                    <gml:segments>
                        <gml:LineStringSegment>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LineStringSegment>
                    </gml:segments>
                </gml:Curve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But it is a bit of a shame to use a single gml:LineStringSegment inside a gml:segments. Let us fix that :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:Curve gml:id="ID2">
                    <gml:segments>
                        <gml:LineStringSegment>
                            <gml:posList>0 0 0 1</gml:posList>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:posList>0 1 1 1</gml:posList>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:posList>1 1 1 0</gml:posList>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:posList>1 0 0 0</gml:posList>
                        </gml:LineStringSegment>
                    </gml:segments>
                </gml:Curve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

Of course we can still use gml:pointProperty to avoid repeating the same coordinate tuples :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:Curve gml:id="ID2">
                    <gml:segments>
                        <gml:LineStringSegment>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID3">
                                    <gml:pos>0 0</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID4">
                                    <gml:pos>0 1</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:pointProperty xlink:href="#ID4"/>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID5">
                                    <gml:pos>1 1</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:pointProperty xlink:href="#ID5"/>
                            <gml:pointProperty>
                                <gml:Point gml:id="ID6">
                                    <gml:pos>1 0</gml:pos>
                                </gml:Point>
                            </gml:pointProperty>
                        </gml:LineStringSegment>
                        <gml:LineStringSegment>
                            <gml:pointProperty xlink:href="#ID6"/>
                            <gml:pointProperty xlink:href="#ID3"/>
                        </gml:LineStringSegment>
                    </gml:segments>
                </gml:Curve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

Another child element of gml:curveMember is a gml:CompositeCurve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:CompositeCurve gml:id="ID2">
                    <gml:curveMember>
                        <gml:LineString gml:id="ID3">
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LineString>
                    </gml:curveMember>
                </gml:CompositeCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But, you may have noticed that the child of a CompositeCurve is a curveMember, which is also the parent of the CompositeCurve. So we may put a CompositeCurve inside a CompositeCurve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:CompositeCurve gml:id="ID2">
                    <gml:curveMember>
                        <gml:CompositeCurve gml:id="ID3">
                            <gml:curveMember>
                                <gml:LineString gml:id="ID4">
                                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                                </gml:LineString>
                            </gml:curveMember>
                        </gml:CompositeCurve>
                    </gml:curveMember>
                </gml:CompositeCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

You have probably understood by now that we could nest CompositeCurve as many times as we wish. So we now have the answer to the initial question: there is an infinite number of ways of expressing a polygon in GML 3.2!
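
The nesting argument can be played with in a few lines of Python: one function wraps a LineString in as many CompositeCurve levels as requested, and a recursive one digs the coordinates back out whatever the depth. Both function names are ours, and only the LineString and CompositeCurve cases are handled in this sketch.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml/3.2"


def q(name):
    """Qualify a local name with the GML 3.2 namespace."""
    return "{%s}%s" % (GML, name)


def nested_composite_curve(depth, pos_list="0 0 0 1 1 1 1 0 0 0"):
    """Wrap a single LineString in `depth` levels of CompositeCurve."""
    curve = ET.Element(q("LineString"))
    ET.SubElement(curve, q("posList")).text = pos_list
    for _ in range(depth):
        composite = ET.Element(q("CompositeCurve"))
        ET.SubElement(composite, q("curveMember")).append(curve)
        curve = composite
    # Number the objects from the outside in: ID2, ID3, ...
    objects = [e for e in curve.iter()
               if e.tag in (q("CompositeCurve"), q("LineString"))]
    for number, element in enumerate(objects, start=2):
        element.set(q("id"), "ID%d" % number)
    return curve


def flatten(curve):
    """Recursively collect the posList texts of arbitrarily nested curves."""
    if curve.tag == q("LineString"):
        return [curve.find(q("posList")).text]
    found = []
    for member in curve.findall(q("curveMember")):
        for child in member:
            found.extend(flatten(child))
    return found


for depth in (0, 1, 5):
    print(depth, flatten(nested_composite_curve(depth)))
```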

Another child element of gml:curveMember is a gml:OrientableCurve :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:OrientableCurve gml:id="ID2">
                    <gml:baseCurve>
                        <gml:LineString gml:id="ID3">
                            <gml:pos>0 0</gml:pos>
                            <gml:pos>0 1</gml:pos>
                            <gml:pos>1 1</gml:pos>
                            <gml:pos>1 0</gml:pos>
                            <gml:pos>0 0</gml:pos>
                        </gml:LineString>
                    </gml:baseCurve>
                </gml:OrientableCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>

But the full power of OrientableCurve is to be able to express the orientation of the curve. So let us split the ring into 2 pieces, one with positive orientation and one with negative orientation :

<?xml version="1.0"?>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:Ring>
            <gml:curveMember>
                <gml:OrientableCurve gml:id="ID2">
                    <gml:baseCurve>
                        <gml:LineString gml:id="ID3">
                            <gml:pos>0 0</gml:pos>
                            <gml:pos>0 1</gml:pos>
                            <gml:pos>1 1</gml:pos>
                        </gml:LineString>
                    </gml:baseCurve>
                </gml:OrientableCurve>
            </gml:curveMember>
            <gml:curveMember>
                <gml:OrientableCurve gml:id="ID4" orientation="-">
                    <gml:baseCurve>
                        <gml:LineString gml:id="ID5">
                            <gml:pos>0 0</gml:pos>
                            <gml:pos>1 0</gml:pos>
                            <gml:pos>1 1</gml:pos>
                        </gml:LineString>
                    </gml:baseCurve>
                </gml:OrientableCurve>
            </gml:curveMember>
        </gml:Ring>
    </gml:exterior>
</gml:Polygon>
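
To convince ourselves that the two pieces above do form the same square, here is a small sketch that assembles a ring from oriented pieces: a piece with orientation "-" is traversed backwards before being chained to the previous one. The function name and the plain-tuple representation are choices made for this example.

```python
def assemble_ring(pieces):
    """Chain oriented curve pieces into a closed ring.

    pieces is a list of (points, orientation) pairs, where orientation is
    "+" (use the points as given) or "-" (use them in reverse order).
    """
    ring = []
    for points, orientation in pieces:
        pts = list(points) if orientation == "+" else list(reversed(points))
        if ring:
            if ring[-1] != pts[0]:
                raise ValueError("pieces are not contiguous")
            pts = pts[1:]  # do not repeat the junction point
        ring.extend(pts)
    if ring[0] != ring[-1]:
        raise ValueError("ring is not closed")
    return ring


pieces = [
    ([(0, 0), (0, 1), (1, 1)], "+"),   # base curve ID3, default orientation
    ([(0, 0), (1, 0), (1, 1)], "-"),   # base curve ID5, reversed
]
print(assemble_ring(pieces))
```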

Enough with polygons. A polygon is just a particular case of a gml:Surface :

<?xml version="1.0"?>
<gml:Surface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                     http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:patches>
        <gml:PolygonPatch>
            <gml:exterior>
                <gml:LinearRing>
                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                </gml:LinearRing>
            </gml:exterior>
        </gml:PolygonPatch>
    </gml:patches>
</gml:Surface>

Instead of a gml:PolygonPatch as a child of a gml:patches, we can use the gml:Rectangle we have used before :

<?xml version="1.0"?>
<gml:Surface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:patches>
        <gml:Rectangle>
            <gml:exterior>
                <gml:LinearRing>
                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                </gml:LinearRing>
            </gml:exterior>
        </gml:Rectangle>
    </gml:patches>
</gml:Surface>

A Surface seems to be too simple. Why not use a gml:CompositeSurface?

<?xml version="1.0"?>
<gml:CompositeSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                   http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
</gml:CompositeSurface>

But it looks a bit dumb to use only one gml:surfaceMember. Let us divide our square into 2 triangles :

<?xml version="1.0"?>
<gml:CompositeSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                     http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
    <gml:surfaceMember>
        <gml:Surface gml:id="ID3">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
</gml:CompositeSurface>
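
As a quick sanity check (not part of the original snippets), the following Python sketch parses the three posList strings used above and compares areas with the shoelace formula. The helper names `parse_pos_list` and `ring_area` are made up for this illustration, and equal areas only suggest, rather than prove, that the two triangles cover the same square.

```python
def parse_pos_list(text):
    """Turn a 2D gml:posList string into a list of (x, y) tuples."""
    values = [float(v) for v in text.split()]
    return list(zip(values[0::2], values[1::2]))


def ring_area(points):
    """Unsigned area of a closed ring, computed with the shoelace formula."""
    twice_area = 0.0
    # Walk consecutive vertex pairs; the ring is assumed to be closed already.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


square = parse_pos_list("0 0 0 1 1 1 1 0 0 0")
triangle_a = parse_pos_list("0 0 0 1 1 1 0 0")
triangle_b = parse_pos_list("0 0 1 1 1 0 0 0")

# The two triangles together should have the same area as the square.
triangles_area = ring_area(triangle_a) + ring_area(triangle_b)
print(triangles_area == ring_area(square))
```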

Instead of a gml:CompositeSurface, why not use a gml:MultiSurface?

<?xml version="1.0"?>
<gml:MultiSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                      http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMember>
</gml:MultiSurface>

Or maybe you prefer to use gml:surfaceMembers (with a final 's') instead of gml:surfaceMember:

<?xml version="1.0"?>
<gml:MultiSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                       http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMembers>
        <gml:Surface gml:id="ID2">
            <gml:patches>
                <gml:PolygonPatch>
                    <gml:exterior>
                        <gml:LinearRing>
                            <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                        </gml:LinearRing>
                    </gml:exterior>
                </gml:PolygonPatch>
            </gml:patches>
        </gml:Surface>
    </gml:surfaceMembers>
</gml:MultiSurface>
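
From a reader's point of view, both spellings wrap the same gml:Surface. A minimal Python sketch, assuming we only care about the coordinates and not about the wrapper hierarchy, can simply walk down to every gml:posList descendant. The function `extract_rings` and the `TEMPLATE` string are hypothetical names for this illustration; the sketch ignores the exterior/interior distinction and assumes 2D coordinates.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"

# Same MultiSurface as above, with the property element name left as a placeholder.
TEMPLATE = """<gml:MultiSurface xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="ID1">
  <gml:{prop}>
    <gml:Surface gml:id="ID2">
      <gml:patches>
        <gml:PolygonPatch>
          <gml:exterior>
            <gml:LinearRing>
              <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
            </gml:LinearRing>
          </gml:exterior>
        </gml:PolygonPatch>
      </gml:patches>
    </gml:Surface>
  </gml:{prop}>
</gml:MultiSurface>"""


def extract_rings(gml_text):
    """Collect every gml:posList in the document as a list of (x, y) tuples."""
    root = ET.fromstring(gml_text)
    rings = []
    # iter() visits descendants at any depth, so the wrappers do not matter.
    for pos_list in root.iter("{%s}posList" % GML_NS):
        values = [float(v) for v in pos_list.text.split()]
        rings.append(list(zip(values[0::2], values[1::2])))
    return rings


with_member = extract_rings(TEMPLATE.format(prop="surfaceMember"))
with_members = extract_rings(TEMPLATE.format(prop="surfaceMembers"))
print(with_member == with_members)
```

The same descendant walk would also reach the ring inside nested gml:CompositeSurface elements, whatever the nesting depth.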

As with gml:CompositeCurve, we can arbitrarily nest as many gml:CompositeSurface elements as we wish:

<?xml version="1.0"?>
<gml:CompositeSurface xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                 http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:surfaceMember>
        <gml:CompositeSurface gml:id="ID2">
            <gml:surfaceMember>
                <gml:Surface gml:id="ID3">
                    <gml:patches>
                        <gml:PolygonPatch>
                            <gml:exterior>
                                <gml:LinearRing>
                                    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
                                </gml:LinearRing>
                            </gml:exterior>
                        </gml:PolygonPatch>
                    </gml:patches>
                </gml:Surface>
            </gml:surfaceMember>
        </gml:CompositeSurface>
    </gml:surfaceMember>
</gml:CompositeSurface>

So we now have two different kinds of infinity, which we could combine. But do not hope to have discovered more ways of expressing polygons: the cardinality of the set of pairs of natural numbers (N x N) is just the cardinality of the set of natural numbers...
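
For the curious, the classic argument behind that cardinality claim is the Cantor pairing function, which numbers every pair of natural numbers with a single natural number. The Python sketch below is only an illustration added here: reading the pair as (nesting depth of gml:CompositeCurve, nesting depth of gml:CompositeSurface) is an assumption made for the example, and the function names are hypothetical.

```python
def cantor_pair(a, b):
    """Map a pair of natural numbers to one natural number (Cantor pairing)."""
    return (a + b) * (a + b + 1) // 2 + b


def cantor_unpair(n):
    """Inverse of cantor_pair: recover (a, b) from n."""
    # Find the diagonal w = a + b such that w*(w+1)/2 <= n < (w+1)*(w+2)/2.
    w = 0
    while (w + 1) * (w + 2) // 2 <= n:
        w += 1
    b = n - w * (w + 1) // 2
    a = w - b
    return a, b


# Every (curve depth, surface depth) combination gets its own single index,
# and every index maps back to exactly one combination.
for curve_depth in range(3):
    for surface_depth in range(3):
        index = cantor_pair(curve_depth, surface_depth)
        print(curve_depth, surface_depth, index, cantor_unpair(index))
```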

To conclude, we should mention that the authors of the GML specification have admitted that encoding polygons was a bit too complicated. So they invented a "compact encoding" in the extended schemas of GML 3.3:

<?xml version="1.0"?>
<gmlce:SimplePolygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:gmlce="http://www.opengis.net/gml/3.3/ce"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                    http://schemas.opengis.net/gml/3.2.1/gml.xsd
                    http://www.opengis.net/gml/3.3/ce
                    http://schemas.opengis.net/gml/3.3/geometryCompact.xsd"
             gml:id="ID1">
    
    <gml:posList>0 0 0 1 1 1 1 0 0 0</gml:posList>
</gmlce:SimplePolygon>

But our SimplePolygon is in fact a SimpleRectangle, so let us use that instead:

<?xml version="1.0"?>
<gmlce:SimpleRectangle xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:gmlce="http://www.opengis.net/gml/3.3/ce"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                    http://schemas.opengis.net/gml/3.2.1/gml.xsd
                    http://www.opengis.net/gml/3.3/ce
                    http://schemas.opengis.net/gml/3.3/geometryCompact.xsd"
             gml:id="ID1">
    
    <gml:posList>0 0 0 1 1 1 1 0</gml:posList>
</gmlce:SimpleRectangle>
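
Judging only from the two snippets above, the gmlce:SimpleRectangle lists its four corners without repeating the first one, whereas the other encodings in this post close the ring explicitly. A consumer that wants a closed ring in both cases could normalize the coordinates as in this Python sketch; `parse_pos_list` and `close_ring` are hypothetical helpers, not part of any GML library.

```python
def parse_pos_list(text):
    """Turn a 2D gml:posList string into a list of (x, y) tuples."""
    values = [float(v) for v in text.split()]
    return list(zip(values[0::2], values[1::2]))


def close_ring(points):
    """Return a copy of the ring with the first point repeated at the end if needed."""
    if points and points[0] != points[-1]:
        return points + [points[0]]
    return list(points)


polygon_ring = close_ring(parse_pos_list("0 0 0 1 1 1 1 0 0 0"))
rectangle_ring = close_ring(parse_pos_list("0 0 0 1 1 1 1 0"))

# After normalization, both compact encodings describe the same closed ring.
print(polygon_ring == rectangle_ring)
```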

You can find the above 25 snippets at the following URL: http://even.rouault.free.fr/gml/
They are all valid GML 3.2 snippets that validate against the XML schemas and pass the GML 3.2 Conformance Test Suite (except gml4.xsd which uses the deprecated gml:coordinates element).

Oh, and one last bit of fun: since GML is XML, we can also use XML entity substitution...

<?xml version="1.0"?>
<!DOCTYPE points [
<!ENTITY pt0 "0 0">
<!ENTITY pt1 "0 1">
<!ENTITY pt2 "1 1">
<!ENTITY pt3 "1 0">
]>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.opengis.net/gml/3.2
                                     http://schemas.opengis.net/gml/3.2.1/gml.xsd"
             gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>&pt0; &pt1; &pt2; &pt3; &pt0;</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>
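
To see what a consumer actually receives, here is a small Python sketch using the standard library's xml.etree.ElementTree, whose underlying expat parser expands internal entities declared in the DOCTYPE. This illustrates one parser's behaviour only; other parsers, or hardened configurations that disable entity expansion for security reasons, may reject such a document instead.

```python
import xml.etree.ElementTree as ET

GML_WITH_ENTITIES = """<?xml version="1.0"?>
<!DOCTYPE points [
<!ENTITY pt0 "0 0">
<!ENTITY pt1 "0 1">
<!ENTITY pt2 "1 1">
<!ENTITY pt3 "1 0">
]>
<gml:Polygon xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="ID1">
    <gml:exterior>
        <gml:LinearRing>
            <gml:posList>&pt0; &pt1; &pt2; &pt3; &pt0;</gml:posList>
        </gml:LinearRing>
    </gml:exterior>
</gml:Polygon>"""

root = ET.fromstring(GML_WITH_ENTITIES)
pos_list = root.find(".//{http://www.opengis.net/gml/3.2}posList")

# The entity references are replaced by their declared values during parsing,
# so the application sees an ordinary coordinate string.
expanded = pos_list.text
print(expanded)
```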

Those who wonder why I decided to write this article might want to have a look at the following simplified GML sample of a real-world use case where a PolygonPatch has only an interior ring (a hole), but no exterior ring... Standalone holes: interesting concept, isn't it?