Image Orientation in Different Disciplines
Disciplines
Computer Sciences (20%); Mathematics (20%); Environmental Engineering, Applied Geosciences (60%)
Keywords
photogrammetry, computer vision, image orientation, fundamental matrix, trifocal tensor, quadrifocal tensor
In photogrammetry one works with central perspective images of spatial objects. The conventional representation of the central projection in terms of the elements of interior and exterior orientation is non-linear. For some tasks no reference to a global coordinate system is needed; in such cases one works with the so-called relative orientation, i.e. the alignment of at least two images such that homologous projection rays intersect in one point in space. The relative orientation, expressed in terms of interior and exterior orientation, is likewise a non-linear problem. The observed image coordinates contain accidental errors; to reduce their disturbing influence on the results, the number of observations is chosen much higher than the number of unknowns, and the computation is carried out as a least-squares adjustment. Such an adjustment, however, requires linear equations. Since the equations of the central projection are non-linear, they must be linearized, which in turn requires approximate values of the orientation elements; in many cases the determination of these approximate values is quite tedious. The comparatively young discipline of computer vision, which also works with central perspective images, therefore aims at a linear representation of the central projection. This linear representation is achieved by means of projective geometry, at the price of using more parameters, which have no easily comprehensible geometric meaning (compared with the interior and exterior orientation mentioned above). The linear representation of the central projection is the so-called projection matrix. Furthermore, computer vision uses certain indexed systems of numbers (so-called tensors) that describe the relative orientation of 2, 3 and 4 images in a linear manner: the fundamental matrix, the trifocal tensor and the quadrifocal tensor.
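The contrast between the two representations can be sketched numerically (a hypothetical example using NumPy; all numbers and names are chosen for illustration only, they are not from the project): the photogrammetric collinearity equations are a ratio of linear terms, while in homogeneous coordinates the same central projection becomes a single linear map, the projection matrix.

```python
import numpy as np

# Illustrative interior and exterior orientation (invented values).
f = 50.0                         # focal length (interior orientation)
x0, y0 = 0.2, -0.1               # principal point (interior orientation)
R = np.eye(3)                    # rotation, identity for simplicity (exterior)
C = np.array([0.0, 0.0, -10.0])  # projection centre (exterior orientation)

def collinearity(X):
    """Non-linear photogrammetric form: a ratio of linear terms."""
    d = R @ (X - C)
    return np.array([x0 - f * d[0] / d[2], y0 - f * d[1] / d[2]])

# The equivalent linear form in homogeneous coordinates: x ~ P X with
# P = K [R | -R C], a single 3x4 matrix (the projection matrix).
K = np.array([[-f, 0, x0], [0, -f, y0], [0, 0, 1.0]])
P = K @ np.hstack([R, (-R @ C)[:, None]])

X = np.array([1.0, 2.0, 5.0])
xh = P @ np.append(X, 1.0)       # homogeneous image point, linear in X
print(collinearity(X), xh[:2] / xh[2])  # both give the same image point
```

The non-linearity of the first form lies entirely in the division by the third component; the projective form postpones this division, which is what makes the linear least-squares treatment possible.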
The question then arises in which way these alternative methods provided by computer vision can be of interest for photogrammetry. To answer it, the following topics have to be investigated:
a) the form of the constraints induced by the over-parameterization;
b) the loss in accuracy due to the over-parameterization and the non-inclusion of the constraints of a);
c) the shape of the dangerous surfaces, i.e. those configurations which do not allow a unique solution;
d) the inclusion of radial distortion into the formulae;
e) gross-error detection using RANSAC or an evolutionary algorithm;
f) the optimization of the point-and-camera arrangement using an evolutionary algorithm for the special task of camera calibration.
Parts of these topics are already known; solving the remaining questions is the aim of this project.
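Topic e) can be illustrated with a minimal RANSAC sketch (an assumed setup with NumPy and synthetic, error-free correspondences; this is not code from the project): the fundamental matrix is estimated linearly from random 8-point samples, and the gross error is detected as the point left outside the largest consensus set.

```python
import numpy as np
rng = np.random.default_rng(0)

def eight_point(x1, x2):
    """Linear estimate of the fundamental matrix from homogeneous points."""
    A = np.array([np.outer(p2, p1).ravel() for p1, p2 in zip(x1, x2)])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)   # null vector of the design matrix
    U, s, Vt = np.linalg.svd(F)                 # enforce the rank-2 constraint
    F = U @ np.diag([s[0], s[1], 0]) @ Vt
    return F / np.linalg.norm(F)

# Synthetic correspondences from two invented cameras, plus one gross error.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.array([[1.0, 0.1, 0.0, 0.5],
               [0.0, 1.0, 0.2, 0.1],
               [0.1, 0.0, 1.0, 1.0]])
X = np.vstack([rng.uniform(-1, 1, (3, 20)), np.ones((1, 20))])
x1, x2 = (P1 @ X).T, (P2 @ X).T
x2[0] += np.array([5.0, 5.0, 0.0])              # gross error in the first point

best = np.array([], dtype=int)
for _ in range(200):                            # RANSAC over random 8-point samples
    idx = rng.choice(20, 8, replace=False)
    F = eight_point(x1[idx], x2[idx])
    r = np.abs(np.einsum('ij,jk,ik->i', x2, F, x1))  # algebraic epipolar residuals
    inliers = np.flatnonzero(r < 1e-8)
    if len(inliers) > len(best):
        best = inliers

print(sorted(set(range(20)) - set(best)))       # the gross error is flagged
```

Because the data are noise-free, an absolute residual threshold suffices here; with real measurements the threshold would have to reflect the measurement accuracy, and the minimized quantity is the algebraic rather than the reprojection error.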
The project "Image orientation in different disciplines" investigated how the orientation of images is represented mathematically in the fields of photogrammetry and computer vision. The orientation of an image describes the relation between a spatial object and its two-dimensional image. It is determined by identifying and measuring corresponding features on the object and in the image; the orientation is then computed by an adjustment which minimizes the accidental errors (the so-called reprojection error) of these measurements. In photogrammetry the orientation of an image is represented by so-called physical parameters. These are divided into two groups: the exterior orientation, telling where the camera is placed and in which direction it looks; and the interior orientation, telling where the lens of the camera is placed with respect to the image plane. Although this physical representation has the advantage of easy comprehensibility, due to its Euclidean geometric grounding, it has the main disadvantage of being mathematically non-linear. In many applications this non-linearity constitutes a severe drawback, because the underlying mathematics becomes cumbersome and difficult to handle. To overcome this problem, computer vision aims at a linear representation of image orientation, which is achieved by using projective geometry. However, the linearity of this representation comes at the price of using more parameters than actually needed (so-called over-parameterization) and of minimizing not the errors of the original feature measurements but other quantities, the so-called algebraic error. In this project one of these linear representations, the so-called trifocal tensor (TFT), which describes the relative orientation of three images, was investigated thoroughly.
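As a hypothetical illustration of how the TFT encodes the relative orientation of three images, the following NumPy sketch builds the tensor from three camera matrices via the standard closed form T_i = a_i b_4^T - a_4 b_i^T (valid for the canonical choice P1 = [I | 0], with a_i and b_i the columns of P2 and P3) and verifies its linear line-transfer property; the camera matrices are invented for the example.

```python
import numpy as np

# Three cameras in canonical form (invented values), P1 = [I | 0].
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.array([[1.0, 0.1, 0.0, 0.5],
               [0.0, 1.0, 0.2, 0.1],
               [0.1, 0.0, 1.0, 1.0]])
P3 = np.array([[1.0, 0.0, 0.3, -0.4],
               [0.2, 1.0, 0.0, 0.7],
               [0.0, 0.1, 1.0, 1.2]])

# Trifocal tensor: T[i] = a_i b4^T - a4 b_i^T (columns of P2 and P3).
T = np.stack([np.outer(P2[:, i], P3[:, 3]) - np.outer(P2[:, 3], P3[:, i])
              for i in range(3)])

# Line transfer: a line observed in images 2 and 3 is mapped linearly
# into image 1 by the tensor.  Build a 3-D line from two points:
X = np.array([0.3, -0.2, 2.0, 1.0])
Y = np.array([-0.5, 0.4, 3.0, 1.0])
x1, y1 = P1 @ X, P1 @ Y                       # projected points in image 1
l2 = np.cross(P2 @ X, P2 @ Y)                 # image line in image 2
l3 = np.cross(P3 @ X, P3 @ Y)                 # image line in image 3
l1 = np.array([l2 @ T[i] @ l3 for i in range(3)])  # transferred line in image 1

# The transferred line passes through both projected points in image 1.
print(l1 @ x1, l1 @ y1)                       # both ~0 up to rounding
```

Note that every step is a product of the given quantities, i.e. linear in the tensor elements: this is exactly the property that is lost when the orientation is expressed through the physical parameters.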
Relative orientation means that no information about the spatial object is required; the orientation is determined solely from the content of the images. Due to the over-parameterization, the elements of the tensor have to satisfy certain constraints in order to represent a valid TFT. In this project new forms of these constraints, together with a simple geometric interpretation, were derived, as well as an alternative expression for the tensor itself. The pros and cons of the new and the existing constraints and expressions were investigated. It turned out that the simplest way to determine a valid TFT is a parameterization proposed by R. Hartley as early as 1994. Due to the projective grounding of these linear representations, they cannot be determined if all features, especially points, lie in one and the same object plane. It is therefore important to know the minimum deviation of the object points from a common plane that still allows a correct solution for the TFT. This was investigated by means of extensive simulation runs in a C++ environment. The results are quite promising: it turned out that a deviation from a common plane of about 1% of the viewing distance already ensures a correct solution for the TFT. These findings suggest that the TFT can also be used in applications where nearly planar objects are typically encountered, such as images taken of a building facade.
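The planar dangerous surface can be made visible in a small simulation (a sketch with NumPy and invented cameras, shown here for the fundamental matrix, whose linear system degenerates for coplanar points in an analogous way): for points in a common plane, every matrix of the form [a]_x H (with H the plane-induced homography) satisfies the linear system, so the design matrix loses rank and the solution is no longer unique.

```python
import numpy as np
rng = np.random.default_rng(1)

# Two invented cameras, P1 canonical.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.array([[1.0, 0.1, 0.0, 0.5],
               [0.0, 1.0, 0.2, 0.1],
               [0.1, 0.0, 1.0, 1.0]])

def design_matrix(offsets):
    """Linear system for F from points on the plane Z = 2, displaced
    out of the plane by the given offsets."""
    XY = rng.uniform(-1, 1, (2, 12))
    X = np.vstack([XY, 2.0 + offsets, np.ones(12)])
    x1, x2 = (P1 @ X).T, (P2 @ X).T
    return np.array([np.outer(p2, p1).ravel() for p1, p2 in zip(x1, x2)])

# Singular values of the 12x9 design matrix: exactly coplanar points
# leave a 3-dimensional null space, generic points only the expected 1.
s_flat = np.linalg.svd(design_matrix(np.zeros(12)), compute_uv=False)
s_bent = np.linalg.svd(design_matrix(rng.uniform(-0.2, 0.2, 12)), compute_uv=False)
print(np.sum(s_flat < 1e-10), np.sum(s_bent < 1e-10))  # 3 vs. 1 null directions
```

In a noisy, nearly planar configuration the extra null directions do not vanish but become small singular values, which is why the project's question about the minimum out-of-plane deviation (found to be about 1% of the viewing distance) is a question about numerical conditioning.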
- Günther R. Raidl, Technische Universität Wien, associated research partner