But when you deal with more than 200 file formats, it is difficult (not to say impossible) to ensure that no defect exists in a code base of nearly 1 million lines (C and C++ files, empty and comment lines included), not to mention the sources of the many libraries, open source or sometimes closed source, that GDAL/OGR might depend on. This is especially true of defects that are normally not triggered by correct datasets.
If you have to process data that can come from untrusted sources, you could find yourself in a situation where a hostile party submits a specially crafted dataset aimed at triggering a defect, with unfortunate consequences (e.g. arbitrary code execution, theft of data, ...). A page discussing security issues has recently been added on the Trac wiki to collect knowledge on that topic and provide a few recommendations (contributions from people who have deployed GDAL and wish to share the security measures they have taken are welcome).
I have recently discovered an interesting and very elegant security mechanism provided by the Linux kernel: seccomp. The principle of that mechanism is very simple to understand: once an executable (more exactly a thread) has turned seccomp on, it can only run 4 (yes, four) system calls: read(), write(), exit() and sigreturn(). System calls are the interface between a user program (e.g. a GDAL/OGR utility) and the Linux kernel. From the name, you probably figured that read() and write() are used to... read and write files (regular files, but also pipes and network sockets). exit() is called at process termination, and sigreturn() is too obscure to be worth an explanation.
Reducing the number of system calls available to a binary considerably restricts what the user program can do, for better... or for worse. In particular, once in seccomp mode ("strict seccomp", since there is also a relaxed and more customizable form of seccomp in newer Linux kernels), a program can no longer open files, create threads, initiate network connections, or even jump to an arbitrary position in a file (lseek() operation), etc.
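To make the restriction concrete, here is a minimal sketch in C of what strict seccomp looks like from the inside. It is not taken from seccomp_launcher: the function name, the enum values and the /etc/hostname path are only assumptions chosen for illustration. A child process turns strict mode on with prctl(), writes to a pipe that was opened beforehand (still allowed), and then optionally tries to open a file (not allowed, so the kernel kills it with SIGKILL).

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Outcomes reported by run_in_strict_seccomp() (hypothetical names). */
enum outcome { CHILD_EXITED, CHILD_KILLED, SANDBOX_UNAVAILABLE };

/* Fork a child, switch it to strict seccomp, let it write() to a pipe that
 * was opened beforehand, then optionally make it attempt a forbidden open(). */
enum outcome run_in_strict_seccomp(int try_open)
{
    int fds[2];
    if (pipe(fds) != 0)
        return SANDBOX_UNAVAILABLE;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return SANDBOX_UNAVAILABLE;
    }
    if (pid == 0) {
        /* Child. Strict mode may be refused (old kernel, or a seccomp filter
         * already installed); _exit() still works at this point. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(42);
        /* write() on an already open descriptor is still permitted. */
        if (write(fds[1], "ok", 2) != 2)
            syscall(SYS_exit, 1);
        if (try_open) {
            /* Any attempt to open a file makes the kernel send SIGKILL. */
            int fd = open("/etc/hostname", O_RDONLY);
            (void)fd;
        }
        /* The C library's _exit() uses exit_group(), which strict mode
         * forbids, so the plain exit system call is invoked directly. */
        syscall(SYS_exit, 0);
        /* not reached */
    }

    /* Parent: collect what the child managed to write, then its fate. */
    close(fds[1]);
    char buf[2] = {0, 0};
    ssize_t n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return SANDBOX_UNAVAILABLE;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
        return CHILD_KILLED;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* A clean exit implies that the write() went through. */
        assert(n == 2 && memcmp(buf, "ok", 2) == 0);
        return CHILD_EXITED;
    }
    return SANDBOX_UNAVAILABLE;
}
```

On a kernel where strict seccomp is available, run_in_strict_seccomp(0) should report CHILD_EXITED and run_in_strict_seccomp(1) should report CHILD_KILLED. In an environment that already applies its own seccomp filter (some containers do), prctl() may refuse strict mode, in which case both calls report SANDBOX_UNAVAILABLE.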
I have recently started an experiment, seccomp_launcher, to use that mechanism in order to provide a sandbox for the benefit of GDAL/OGR utilities.
Using seccomp_launcher is very simple. It is just a matter of writing "seccomp_launcher", with an optional access mode, in front of the command. See the examples below:
$ ./seccomp_launcher gdalinfo some.tif -stats
$ ./seccomp_launcher -rw gdal_translate some.tif target.tif
$ ./seccomp_launcher -rw ogr2ogr -f filegdb out.gdb poly.gdb -progress
$ ./seccomp_launcher python swig/python/samples/gdalinfo.py some.tif
And now, a situation where it can prevent a confidential file (my private SSH key) from being accessed:
$ ./seccomp_launcher gdal_translate hostile.vrt out.tif
INFO: in PR_SET_SECCOMP mode
AccCtrl: open(/home/even/.ssh/id_dsa,2,00) rejected. Not in white list
AccCtrl: open(/home/even/.ssh/id_dsa,0,00) rejected. Not in white list
ERROR 4: Unable to open /home/even/.ssh/id_dsa.
GDALOpen failed - 4
Unable to open /home/even/.ssh/id_dsa.
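The "Not in white list" messages above suggest an access-control check performed before any open() is carried out. The following C sketch shows the general shape such a check could take. It is an illustration only, not the actual policy of seccomp_launcher: the struct, the function name and the exact-string comparison are assumptions, and a real check would also have to canonicalize paths (symbolic links, "..", relative paths) before comparing them.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>

/* One white-list entry (hypothetical layout): a path, and whether it may be
 * opened for writing (think of the -rw access mode). */
struct allowed_path {
    const char *path;
    int writable;
};

/* Return 1 if open(path, flags) may proceed, 0 if it must be rejected. */
int is_open_allowed(const struct allowed_path *list, size_t count,
                    const char *path, int flags)
{
    assert(path != NULL);
    /* Anything other than a plain read-only open counts as a write. */
    int wants_write = (flags & O_ACCMODE) != O_RDONLY
                      || (flags & (O_CREAT | O_TRUNC | O_APPEND)) != 0;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(list[i].path, path) == 0)
            return !wants_write || list[i].writable;
    }
    return 0; /* not in the white list */
}
```

With a list containing only some.tif (read-only) and out.tif (writable), a request for /home/even/.ssh/id_dsa is rejected both with flags 2 (O_RDWR) and with flags 0 (O_RDONLY), which matches the two rejections shown in the output above.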
The software is made of two main parts:
- the seccomp_launcher binary (source: seccomp_launcher.c), which, as implied by its name, launches the user binary and can run privileged system calls (opening a file, etc.) on its behalf, after having checked that they are authorized.
- the libseccomp_preload.so dynamic library (source: seccomp_preload.c) that is "injected" into the user binary (e.g. gdalinfo) before it starts, to force it to run in seccomp mode and to forward privileged system calls to seccomp_launcher, by overriding some interesting entry points of the GNU libc library.
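The forwarding between the two parts can be pictured with a small C sketch. Everything in it is hypothetical: it assumes the sandboxed process and the launcher talk over a pipe, and the open_request layout and function names are invented for illustration (see seccomp_preload.c and seccomp_launcher.c for what the code really does). The point it shows is that the sandboxed side needs nothing more than write() and read() to ask for a file to be opened on its behalf.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

enum { REQ_PATH_MAX = 1024 }; /* arbitrary limit for this sketch */

/* Hypothetical fixed-size request describing an open() call. */
struct open_request {
    int32_t flags;
    uint32_t mode;
    char path[REQ_PATH_MAX];
};

/* Write exactly len bytes, using only write(). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes, using only read(). */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sandboxed side: what an overridden open() could send instead of
 * performing the system call itself. */
int send_open_request(int fd, const char *path, int flags, unsigned mode)
{
    assert(path != NULL);
    struct open_request req;
    if (strlen(path) >= sizeof req.path)
        return -1;
    memset(&req, 0, sizeof req);
    req.flags = flags;
    req.mode = mode;
    strcpy(req.path, path);
    return write_all(fd, &req, sizeof req);
}

/* Launcher side: receive the request before checking and serving it. */
int receive_open_request(int fd, struct open_request *req)
{
    if (read_all(fd, req, sizeof *req) != 0)
        return -1;
    req->path[sizeof req->path - 1] = '\0'; /* never trust the sender */
    return 0;
}
```

In such a design the launcher would pass the received path and flags through its authorization check, and only then perform the real open() and reply to the sandboxed process.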
How to build it? (provided that you have a C compiler, gcc or clang, and make already installed)
- git clone https://github.com/rouault/seccomp_launcher.git (or download and unzip https://github.com/rouault/seccomp_launcher/archive/master.zip )
- cd seccomp_launcher
- make
- (optionally): sudo make install
What else to mention?
- This is still alpha / experimental code made available under the "release early, release often" motto. In particular, no independent audit has yet been made of the seccomp_launcher.c code, which is the critical place where the security checks and delegated system calls are done. So if you intend to use it in production, please take some time to review it. Please also take time to read the README carefully, in particular the intended scope of the software (in short: do not use it to protect against hostile binaries, but only against hostile input data).
- It is available under the same X/MIT licence as the GDAL/OGR sources.
- Contributions (testing, bug reports, code contributions) are of course welcome!