Monday, 24 March 2014

Draft GDAL/OGR class hierarchy for GDAL 2.0

After the first day of the OSGeo Code Sprint 2014 in Vienna, I want to share where my thoughts have led on a possible reorganisation of the GDAL/OGR class hierarchy, with the goal of achieving the mythical "Grand Unification". This is very much a work in progress, and I'm not even sure I will stick with it tomorrow morning... But here we go...

I have identified two principal aims:
  • adding support for metadata to OGR drivers, datasources and layers (validation of creation options, etc.). That one is easy: derive the three base classes from the GDALMajorObject class.
  • more difficult and ambitious: making it possible to have a Dataset that contains both raster and vector data. You open the data container once and can get both raster and vector data from it. Possible use cases: GeoPackage, PCIDSK, Spatialite/Rasterlite, PostGIS/PostGIS Raster, ...
And one major constraint: avoid rewriting each of the existing 211 drivers, which represent 1.2 million lines of C/C++ code, plus 140 000 lines of Python autotests.

The class hierarchy of the GDAL/OGR 1.X versions is quite simple:


So definitely two separate worlds.

To achieve the first aim of getting metadata into OGR, you just have to do the following:
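
As a rough illustration of that first aim (not the actual GDAL code), the sketch below uses simplified stand-in classes. `MajorObjectSketch` mimics only a key/value metadata store, and `DriverSketch`, `DataSourceSketch` and `LayerSketch` are hypothetical placeholders for the three OGR base classes. The metadata keys and values in the demo are made up as well.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-in for GDALMajorObject: just a key/value metadata store.
class MajorObjectSketch {
public:
    virtual ~MajorObjectSketch() = default;

    void SetMetadataItem(const std::string& key, const std::string& value) {
        items_[key] = value;
    }

    // Returns the stored value, or an empty string when the key is absent.
    std::string GetMetadataItem(const std::string& key) const {
        auto it = items_.find(key);
        return it == items_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> items_;
};

// Hypothetical placeholders for the three OGR base classes. Deriving from
// the common base is all it takes for them to gain the metadata methods.
class DriverSketch : public MajorObjectSketch {};
class DataSourceSketch : public MajorObjectSketch {};
class LayerSketch : public MajorObjectSketch {};

// Usage: every object kind now answers metadata queries the same way.
void DemoMetadata() {
    DriverSketch driver;
    driver.SetMetadataItem("CREATION_OPTIONS_SKETCH", "ENCODING");  // made-up key
    assert(driver.GetMetadataItem("CREATION_OPTIONS_SKETCH") == "ENCODING");

    LayerSketch layer;
    layer.SetMetadataItem("DESCRIPTION", "roads");
    assert(layer.GetMetadataItem("DESCRIPTION") == "roads");
    assert(layer.GetMetadataItem("UNKNOWN").empty());
}
```

A driver could then advertise its supported creation options through such metadata, which is what would make generic validation possible.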



And now let's consider where the second aim could lead us:






Another way of presenting it, in more detail, is the following pseudo-code:
/* Interface for major object */
class GDALIMajorObject
{
    public:
        virtual char      **GetMetadataDomainList() = 0;
        virtual char      **GetMetadata( const char * pszDomain = "" ) = 0;
        [other methods go here]
};

/* Implementation of major object, and base class for dataset, bands, layers, etc.. */
class GDALMajorObject: public GDALIMajorObject
{
     /* existing code of GDALMajorObject */
};

/* Interface for raster functions */
class GDALIRasterDataset: public GDALIMajorObject
{
    public:
        virtual int         HandleRasterData() = 0;

        virtual int         GetRasterXSize( void ) = 0;
        virtual int         GetRasterYSize( void ) = 0;
        virtual int         GetRasterCount( void ) = 0;
        virtual GDALRasterBand *GetRasterBand( int ) = 0;
        [other methods go here]
};

/* Interface for vector functions */
class GDALIVectorDataset: public GDALIMajorObject
{
    public:
        virtual int         HandleVectorData() = 0;
       
        virtual int         GetLayerCount() = 0;
        virtual OGRLayer    *GetLayer(int) = 0;
        [other methods go here]
};

/* Convenience interface for both raster and vector functions */
class GDALIDataset: public GDALIRasterDataset, public GDALIVectorDataset
{
    public:
        /* That's all ! */
};

/* Partial implementation of GDALIRasterDataset */
class GDALAbstractRasterDataset : public GDALIRasterDataset, public GDALMajorObject
{
    /* Current code of GDALDataset GDAL v1 goes here */
   
    public:
        virtual int         HandleRasterData() { return TRUE; }
};

/* Convenience class used by vector only drivers */
class GDALEmptyRasterDataset : public GDALAbstractRasterDataset
{
    public:
        virtual int         HandleRasterData() { return FALSE; }
};

/* Partial implementation of GDALIVectorDataset */
class GDALAbstractVectorDataset : public GDALIVectorDataset, public GDALMajorObject
{
    /* Current code of OGRDataSource GDAL v1 goes here */
   
    public:
        virtual int         HandleVectorData() { return TRUE; }
};

/* Convenience class used by raster only drivers */
class GDALEmptyVectorDataset : public GDALIVectorDataset
{
    public:
        virtual int         HandleVectorData() { return FALSE; }
       
        virtual int         GetLayerCount() { return 0; }
        virtual OGRLayer    *GetLayer(int) { return NULL; }
        [other methods go here]
};

/* Equivalent of GDALDataset GDAL v1 (plus dummy vector interface). Existing GDAL drivers would derive from it. */
class GDALRasterDataset : public GDALAbstractRasterDataset, public virtual GDALEmptyVectorDataset, public virtual GDALIDataset
{
};

/* Equivalent of OGRDataSource GDAL v1 (plus dummy raster interface). Existing OGR drivers would derive from it. */
class GDALVectorDataset : public GDALAbstractVectorDataset, public virtual GDALEmptyRasterDataset, public virtual GDALIDataset
{
};

/* GDAL v2 base class for new drivers that need both raster and vector data */
class GDALDataset : public GDALAbstractRasterDataset, public virtual GDALAbstractVectorDataset, public virtual GDALIDataset
{
};
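
To check that a layout of this kind can compile at all, here is a much reduced, self-contained sketch. It is not the GDAL code: the names are shortened stand-ins, only two methods per interface are kept, and (an assumption of this sketch, which differs from where `virtual` is placed in the pseudo-code above) every interface is inherited virtually, so that each dataset holds a single raster and a single vector interface subobject. The three `*Like` driver classes are hypothetical.

```cpp
#include <cassert>

// Reduced raster interface (stand-in for GDALIRasterDataset).
struct IRaster {
    virtual ~IRaster() = default;
    virtual bool HandleRasterData() const = 0;
    virtual int GetRasterCount() const = 0;
};

// Reduced vector interface (stand-in for GDALIVectorDataset).
struct IVector {
    virtual ~IVector() = default;
    virtual bool HandleVectorData() const = 0;
    virtual int GetLayerCount() const = 0;
};

// Combined interface (stand-in for GDALIDataset).
struct IDataset : virtual IRaster, virtual IVector {};

// Partial implementations: this side is really supported by the driver.
struct AbstractRaster : virtual IRaster {
    bool HandleRasterData() const override { return true; }
};
struct AbstractVector : virtual IVector {
    bool HandleVectorData() const override { return true; }
};

// Dummy implementations: this side is absent.
struct EmptyRaster : virtual IRaster {
    bool HandleRasterData() const override { return false; }
    int GetRasterCount() const override { return 0; }
};
struct EmptyVector : virtual IVector {
    bool HandleVectorData() const override { return false; }
    int GetLayerCount() const override { return 0; }
};

// Base classes for raster-only, vector-only and mixed drivers.
struct RasterDataset : AbstractRaster, EmptyVector, IDataset {};
struct VectorDataset : AbstractVector, EmptyRaster, IDataset {};
struct MixedDataset : AbstractRaster, AbstractVector, IDataset {};

// Hypothetical drivers: each one only fills in the side(s) it supports.
struct TiffLike : RasterDataset {
    int GetRasterCount() const override { return 3; }
};
struct ShapeLike : VectorDataset {
    int GetLayerCount() const override { return 1; }
};
struct GeoPackageLike : MixedDataset {
    int GetRasterCount() const override { return 1; }
    int GetLayerCount() const override { return 2; }
};

// Generic code only needs the combined interface.
int CountLayers(const IDataset& ds) {
    return ds.HandleVectorData() ? ds.GetLayerCount() : 0;
}

void DemoHierarchy() {
    TiffLike tiff;
    ShapeLike shp;
    GeoPackageLike gpkg;
    assert(CountLayers(tiff) == 0);   // raster-only: dummy vector side
    assert(CountLayers(shp) == 1);    // vector-only
    assert(CountLayers(gpkg) == 2);   // both sides available
}
```

The design relies on virtual base classes so that the implementation inherited from one branch (for example `EmptyVector`) satisfies the pure virtual methods reachable through `IDataset`. Some compilers emit an "inherits via dominance" warning for this pattern.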

The impact on the existing code base is:
  • Current GDAL drivers must replace mentions of GDALDataset with GDALRasterDataset (automatic conversion)
  • Current OGR drivers must replace mentions of OGRDataSource with GDALVectorDataset (automatic conversion)
  • OGRDataSourceH becomes an alias of GDALDatasetH
  • C functions cast the opaque dataset pointer GDALDatasetH to a GDALIDataset pointer before invoking the C++ methods.
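
The last point hides a C++ subtlety: with multiple inheritance, a `void*` handle can only be cast back safely to the combined interface if it was produced from a pointer of exactly that type. A minimal sketch of the idea, with made-up names (`DatasetHSketch`, `DataSourceHSketch`, `ToHandle`, `FromHandle` and the `Sketch...` functions) standing in for the real handle types and C entry points:

```cpp
#include <cassert>

// Reduced interfaces, as stand-ins for the raster, vector and combined ones.
struct IRasterSketch {
    virtual ~IRasterSketch() = default;
    virtual int GetRasterCount() const = 0;
};
struct IVectorSketch {
    virtual ~IVectorSketch() = default;
    virtual int GetLayerCount() const = 0;
};
struct IDatasetSketch : virtual IRasterSketch, virtual IVectorSketch {};

// Opaque handles: the vector handle is simply an alias of the dataset one.
typedef void* DatasetHSketch;
typedef DatasetHSketch DataSourceHSketch;

// Always go through IDatasetSketch* when creating a handle, so that the
// reverse cast below recovers the very same pointer type.
inline DatasetHSketch ToHandle(IDatasetSketch* ds) {
    return static_cast<void*>(ds);
}
inline IDatasetSketch* FromHandle(DatasetHSketch h) {
    return static_cast<IDatasetSketch*>(h);
}

// C-style entry points: cast the handle, then call the C++ method.
int SketchGetRasterCount(DatasetHSketch h) {
    return FromHandle(h)->GetRasterCount();
}
int SketchGetLayerCount(DataSourceHSketch h) {
    return FromHandle(h)->GetLayerCount();
}

// Hypothetical dataset exposing both kinds of data.
struct MixedSketch : IDatasetSketch {
    int GetRasterCount() const override { return 2; }
    int GetLayerCount() const override { return 5; }
};

void DemoHandles() {
    MixedSketch mixed;
    DatasetHSketch h = ToHandle(&mixed);  // converted to IDatasetSketch* first
    assert(SketchGetRasterCount(h) == 2);
    assert(SketchGetLayerCount(h) == 5);  // same handle, vector-side call
}
```

Storing a derived-class pointer directly in the `void*` and casting it back to the interface type would not be guaranteed to work, because the interface subobject need not sit at the same address as the full object.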
Of course, all of the above is just "nice" theory (rather complicated, admittedly), and I should really move on to the practical part to test whether it can actually work...