Wednesday, 23 April 2014

Advanced JPEG-in-TIFF uses in GDAL

This post is about advanced uses of JPEG compression in TIFF/GeoTIFF files. We will call such files "JPEG-in-TIFF" for brevity.

JPEG-in-TIFF is a popular variation of TIFF, described in TIFF specification supplement 2. It is well suited to aerial/satellite imagery and offers an interesting quality / (size * decompression_time) ratio, while remaining a format that is simple to encode and decode with Free and Open Source software.

Side note: while JPEG 2000 is a much more capable format, F.O.S.S. is still trying to catch up with proprietary implementations, although the OpenJPEG library (which can be used through the GDAL JP2OpenJPEG driver) has made recent advances that make it worth considering.

JPEG-in-TIFF creation options

To go back to JPEG-in-TIFF, quality/size can be controlled by selecting:
  • an appropriate subsampling and colorspace. For RGB "natural" images, a good choice is the YCbCr colorspace with a subsampling factor of 2 on the chrominance components (YCbCr 4:2:0). This is the PHOTOMETRIC=YCbCr creation option in the GDAL GTiff driver. Using it typically makes the image 2 to 3 times smaller than with the default photometric interpretation (RGB).
  • the usual JPEG quality parameter that acts on the quantization coefficients. This is the JPEG_QUALITY creation option.
Generally, you will want to generate a tiled version of JPEG-in-TIFF (TILED=YES creation option), so that parts of the image can be accessed efficiently in random order.
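
As a small illustration of how the three creation options above combine, here is a minimal Python sketch that only builds a gdal_translate command line. The helper name jpeg_in_tiff_args and the file names are hypothetical, and the sketch assumes a 3-band RGB source when YCbCr is requested.

```python
import shlex


def jpeg_in_tiff_args(src, dst, quality=None, ycbcr=True, tiled=True):
    """Build a gdal_translate argument list for a JPEG-in-TIFF output.

    Hypothetical helper: it only assembles the creation options named in
    the text (COMPRESS, PHOTOMETRIC, JPEG_QUALITY, TILED).
    """
    args = ["gdal_translate", src, dst, "-co", "COMPRESS=JPEG"]
    if ycbcr:
        # YCbCr with subsampled chrominance; meant for RGB "natural" images.
        args += ["-co", "PHOTOMETRIC=YCbCr"]
    if quality is not None:
        if not 1 <= quality <= 100:
            raise ValueError("JPEG quality must be between 1 and 100")
        args += ["-co", f"JPEG_QUALITY={quality}"]
    if tiled:
        args += ["-co", "TILED=YES"]
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so GDAL is not required here.
    print(shlex.join(jpeg_in_tiff_args("in.tif", "out.tif", quality=85)))
```

The resulting list can be handed to subprocess.run() on a machine where gdal_translate is installed.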

Implicit overviews

The very latest improvements added to the GDAL development version (trunk r27226 or later, already ahead of the soon-to-be-released GDAL 1.11) make it possible to get downsampled versions of JPEG-in-TIFF faster than before. Despite this improvement, the recommendation remains to generate overviews, either external or internal, with the gdaladdo utility, in order to have very fast access to downsampled versions of a raster (at the expense of increased storage space).

But what can we do when such overviews are not (yet) generated? Previously, the GTiff driver would decompress the queried part of the raster at its full resolution and compute a downsampled image from it. But this is slower than necessary.

Schematically (deliberately omitting the quantization and Huffman compression steps), a JPEG compressed stream is made of a sequence of squares of 8x8 (or 16x16 with YCbCr 4:2:0) pixels (the technical name for such a block is an MCU, Minimum Coded Unit) that contain the coefficients resulting from the Discrete Cosine Transform of the original 8x8 (16x16) pixels. To decompress an MCU to its full resolution, you need to compute the inverse DCT on the whole set of 8x8 (16x16) coefficients, which has some cost. But an interesting property of the MCU coefficients is that you only need to operate on the low-frequency ones to compute a lower-resolution version of the uncompressed block, and libjpeg, the software library that does the low-level job of compressing and decompressing the JPEG codestream, is capable of that! Actually, we had already used that capability in the JPEG driver of GDAL 1.10 to expose implicit overview levels (at x2, x4 and x8 sub-sampling factors), but it was not yet plugged into the GTiff driver.
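
To make the property concrete, here is a minimal pure-Python sketch of the extreme case, the x8 reduction: the single lowest-frequency (DC) coefficient of an 8x8 block is enough to recover the block average. This is an illustration only, not how libjpeg is implemented; it ignores the quantization, level shift and Huffman steps, and the sample block is made up.

```python
import math


def dct2_8x8(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (naive O(n^4) version)."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


def one_eighth_pixel(coeffs):
    """x8 reduction of a block: only the DC coefficient is used."""
    return coeffs[0][0] / 8.0


# Made-up 8x8 block of sample values.
block = [[(3 * x + 5 * y) % 256 for y in range(8)] for x in range(8)]
coeffs = dct2_8x8(block)
mean = sum(map(sum, block)) / 64.0

# The pixel rebuilt from one coefficient matches the average of 64 pixels.
print(one_eighth_pixel(coeffs), mean)
```

The x2 and x4 reductions follow the same idea with a few more low-frequency coefficients and a smaller inverse transform.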

Now, JPEG-in-TIFF files, in all possible formulations (tiled / stripped / single-stripped, pixel-interleaved vs band-interleaved, single band vs YCbCr 4:2:0 vs RGB colorspace), will internally expose overview levels at x2, x4 and x8 sub-sampling factors for raster operations.

So computing the 1/16th reduction of a BMNG tile of size 21600x21600, with 256x256 tiling and JPEG RGB compression, now takes about 3.5s with the latest development version, versus about 21s in GDAL 1.11:

GDAL trunk:
$ time gdal_translate world.topo.bathy.200406.3x21600x21600.B2.tif out.tif -outsize 6.25% 6.25%
real    0m3.441s

GDAL 1.11:
$ time gdal_translate world.topo.bathy.200406.3x21600x21600.B2.tif out.tif -outsize 6.25% 6.25%
real    0m20.987s
Note that the whole JPEG codestream will still be read from storage, so the new optimization will be especially worthwhile when I/O speed is good relative to CPU speed (whereas with JPEG2000 compression, because of the way wavelet coefficients are packed, you only need to read a small portion of the file).
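
For the 6.25% request above, the arithmetic can be sketched as follows. The function names are hypothetical, and both the rounding-up of overview dimensions and the rule "take the largest implicit factor that does not exceed the requested reduction" are assumptions made for illustration, not a description of the GTiff driver internals.

```python
def implicit_overview_sizes(width, height, factors=(2, 4, 8)):
    """Dimensions of the x2, x4 and x8 levels (assumed rounded up)."""
    return [(-(-width // f), -(-height // f)) for f in factors]


def best_implicit_factor(requested, factors=(2, 4, 8)):
    """Largest implicit factor not exceeding the requested reduction."""
    usable = [f for f in factors if f <= requested]
    return max(usable) if usable else 1


width = height = 21600
requested = 100 / 6.25                      # -outsize 6.25% is a x16 reduction
factor = best_implicit_factor(requested)    # best level available: x8
residual = requested / factor               # remaining reduction to resample

print(implicit_overview_sizes(width, height))
print(factor, residual)
```

Under these assumptions, the x8 level (2700x2700) is decoded cheaply and only a further x2 reduction is left to reach 1350x1350.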

If you try gdalinfo on a JPEG-in-TIFF file, don't worry if you don't see the implicit overviews mentioned. They are hidden most of the time to avoid confusion: it would be difficult for users to distinguish between internal pre-computed overviews, which benefit from fast access, and the new implicit overviews. The latter are only made visible to the internals of the GTiff driver when a raster operation takes place.

Lossless conversion of JPEG into JPEG-in-TIFF

This is a feature that appeared in GDAL 1.10, released last year, but which has probably gone unnoticed in the NEWS. The conversion of a JPEG file to a JPEG-in-TIFF is done without any decompression and recompression cycle, through the preservation of the MCU coefficients, making it effectively lossless (the initial JPEG compression was lossy, but the conversion into JPEG-in-TIFF is lossless).
This optimized conversion path is taken if all the following conditions are met:
  • the source dataset is a JPEG file (or a VRT with a JPEG as a single SimpleSource)
  • the target dataset is a JPEG-in-TIFF file
  • no explicit target JPEG quality is specified
  • no change in colorspace is specified
  • no sub-windowing is requested
  • etc...
But it is compatible with the generation of a tiled JPEG-in-TIFF from the original JPEG image. Explicit assignment of the target SRS and bounds is also possible.

So, the following commands will use the lossless copy method:
$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -co TILED=YES

$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -a_srs EPSG:4326 -a_ullr -180 90 180 -90
whereas the following commands will NOT :
$ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -co JPEG_QUALITY=60

$ gdal_translate in.jpg out.tif -srcwin 0 0 500 500 -co COMPRESS=JPEG
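
The conditions listed above can be turned into a rough pre-flight check. The sketch below is a hypothetical helper, not part of GDAL: it only inspects a gdal_translate argument list for the criteria named in this post (JPEG source, JPEG-compressed TIFF target, no explicit quality, no photometric change, no sub-window), does not handle the VRT case, and conservatively answers False for any option it does not know. The driver remains the only authority on which path is actually taken.

```python
# Number of values consumed by each gdal_translate option handled here.
ARITY = {"-co": 1, "-of": 1, "-a_srs": 1, "-a_ullr": 4, "-srcwin": 4}


def may_use_lossless_copy(args):
    """Guess whether gdal_translate args (without the program name)
    satisfy the conditions listed in the text for the lossless path."""
    creation_options, seen, positional = {}, set(), []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ARITY:
            values = args[i + 1:i + 1 + ARITY[arg]]
            if arg == "-co":
                key, _, value = values[0].partition("=")
                creation_options[key.upper()] = value.upper()
            elif arg == "-of" and values[0].upper() != "GTIFF":
                return False
            seen.add(arg)
            i += 1 + ARITY[arg]
        elif arg.startswith("-"):
            return False          # unknown option: stay conservative
        else:
            positional.append(arg)
            i += 1
    if len(positional) != 2:
        return False
    source = positional[0].lower()
    return (source.endswith((".jpg", ".jpeg"))
            and creation_options.get("COMPRESS") == "JPEG"
            and "JPEG_QUALITY" not in creation_options
            and "PHOTOMETRIC" not in creation_options
            and "-srcwin" not in seen)


print(may_use_lossless_copy(["in.jpg", "out.tif", "-co", "COMPRESS=JPEG"]))
```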

Lossless extraction of JPEG tiles from JPEG-in-TIFF

A brand-new Python script (it needs GDAL trunk) does (part of) the reverse operation. From a JPEG-in-TIFF, it can extract one particular tile/strip into a standalone JPEG file, and generate the companion .aux.xml file if the source JPEG-in-TIFF is georeferenced.

The following command will extract the tile at column 10 (count starts at 0), row 20 from a tiled JPEG-in-TIFF:
python world.topo.bathy.200406.3x21600x21600.B2.tif out_10_20.jpg 10 20
Or to extract all the tiles (filenames will have the out_X_Y.jpg pattern):
python world.topo.bathy.200406.3x21600x21600.B2.tif out.jpg
This could be interesting for tiling servers that want to keep global mosaics as sources.
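
To see which pixels a given tile index covers, here is a minimal arithmetic sketch. It assumes the 21600x21600 raster with 256x256 tiling used earlier in this post; the function names are hypothetical, and the window returned is the part of the tile that holds valid raster pixels (edge tiles are clipped to the raster extent here).

```python
def tile_count(raster_size, tile_size):
    """Number of tiles along one axis, counting a partial last tile."""
    return -(-raster_size // tile_size)


def tile_window(col, row, raster_w, raster_h, tile_w=256, tile_h=256):
    """Pixel window (xoff, yoff, xsize, ysize) of tile (col, row),
    with col and row starting at 0."""
    if not (0 <= col < tile_count(raster_w, tile_w)
            and 0 <= row < tile_count(raster_h, tile_h)):
        raise IndexError("tile index outside the raster")
    xoff, yoff = col * tile_w, row * tile_h
    return (xoff, yoff,
            min(tile_w, raster_w - xoff),
            min(tile_h, raster_h - yoff))


print(tile_window(10, 20, 21600, 21600))   # tile at column 10, row 20
print(tile_count(21600, 256) ** 2)         # tiles written when extracting all
```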

Note: this is not exactly the reverse operation from JPEG --> JPEG-in-TIFF conversion, since it will not merge several JPEG-in-TIFF strips/tiles into a single JPEG file.

Ideas for later...

Instead of the script, we could imagine that the lossless extraction of JPEG from JPEG-in-TIFF could be done, in a natural way, with :
gdal_translate -srcwin X Y XSIZE YSIZE in.tif out.jpg -of JPEG
That would require detecting a sub-windowing pattern in the temporary VRT generated by gdal_translate, and then reassembling the right MCU coefficients. X, Y, XSIZE and YSIZE should be multiples of 8 or 16 to match the MCU dimensions.
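
The alignment constraint on the window can be sketched in a few lines of Python. The helper names are hypothetical; the MCU size is 8, or 16 with YCbCr 4:2:0 as described earlier, and the snapped window is not clipped to the raster extent.

```python
def is_mcu_aligned(xoff, yoff, xsize, ysize, mcu=8):
    """True if the window offsets and sizes are all multiples of the MCU size."""
    return all(v % mcu == 0 for v in (xoff, yoff, xsize, ysize))


def snap_to_mcu(xoff, yoff, xsize, ysize, mcu=8):
    """Smallest MCU-aligned window that contains the requested one."""
    x0 = (xoff // mcu) * mcu
    y0 = (yoff // mcu) * mcu
    x1 = -(-(xoff + xsize) // mcu) * mcu   # round the far edge up
    y1 = -(-(yoff + ysize) // mcu) * mcu
    return x0, y0, x1 - x0, y1 - y0


# A 500x500 window is not aligned; the enclosing 16-aligned window is larger.
print(is_mcu_aligned(0, 0, 500, 500, mcu=8))
print(snap_to_mcu(10, 20, 500, 500, mcu=16))
```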

A more powerful, but even more complicated, idea would be to have first-class support in GDAL for the DCT coefficients (as raster bands?), but it would require some thinking to find the right model, and even more to implement it (with complications like YCbCr 4:2:0 subsampling).

In a similar vein, why not imagine:

gdal_translate mosaic_of_jpeg_images.vrt out.tif -co COMPRESS=JPEG
To make it easier, the VRT file should be made of JPEG tiles whose dimensions are a multiple of the MCU dimensions, and that are placed into the mosaic at offsets that are themselves multiples of the tile dimensions. An additional constraint is that all the JPEG tiles should share the same JPEG quantization and Huffman tables, since in JPEG-in-TIFF, those tables are common to all tiles/strips and placed in the JPEGTABLES TIFF tag.
Building a JPEG-in-TIFF from a mosaic in the GTiff driver might be tricky, but an ad-hoc Python script might be possible.
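
The geometric part of these constraints could be checked by such an ad-hoc script along the following lines. This is only a sketch: the Tile class and the function name are hypothetical, it additionally assumes that all tiles have the same size, and it does not look at the quantization or Huffman tables at all.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    """Placement of one JPEG tile in the mosaic, in pixels (hypothetical)."""
    xoff: int
    yoff: int
    width: int
    height: int


def mosaic_is_mcu_friendly(tiles, mcu=8):
    """Check the geometric constraints: tile dimensions are multiples of
    the MCU size, and offsets are multiples of the tile dimensions."""
    if not tiles:
        return False
    w, h = tiles[0].width, tiles[0].height
    if w % mcu or h % mcu:
        return False
    return all(t.width == w and t.height == h
               and t.xoff % w == 0 and t.yoff % h == 0
               for t in tiles)


grid = [Tile(x * 256, y * 256, 256, 256) for y in range(2) for x in range(2)]
print(mosaic_is_mcu_friendly(grid, mcu=16))
```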

I will stop here with the science fiction. There is already enough to experiment with!

1 comment:

  1. Excellent post, very informative, as usual with you.

    About :
    $ gdal_translate in.jpg out.tif -co COMPRESS=JPEG -co TILED=YES
    Does this mean that it is possible to extract tiles from a jpeg file without decompression ?