Svorad Štolc, Reinhold Huber-Mörk and Dorothea Heiss
The Austrian Institute of Technology (AIT) is working on novel in-line methods to infer object and material properties in automated visual inspection. Here we describe a real-time method for concurrent extraction of light-field and photometric stereo data using a multi-line-scan acquisition and processing framework.
Reasoning about objects, their characteristics and behaviour from 2D images is a challenge, despite the many successful applications of 2D computer and machine vision. The light space of an observed scene comprises all rays hitting objects from every possible direction and, after interacting with the objects, leaving them in every possible direction. Static illumination and a single 2D sensor sample this ray space only sparsely. Placing a multitude of sensors partly overcomes this limitation. Light-field cameras sample this space more densely with a fine angular resolution, e.g., plenoptic cameras typically observe a scene from up to 100 different directions [1]. Illumination variation as used in photometric stereo, on the other hand, provides a variety of directions for the rays hitting the objects under consideration [2].
Light-field imaging and photometric stereo can be seen as complementary methods in computational imaging. The first method samples the space of light rays emerging from the object, while the second samples the subspace of illuminating rays. In 3D reconstruction, the main advantages of photometric stereo based methods are their sensitivity to fine details and their independence from surface structure, while their main disadvantage is a lack of metric and global accuracy. Light-field processing for 3D, on the other hand, provides globally and metrically correct depth estimates, but fine details are prone to be lost. The two methods therefore compensate for each other's shortcomings.
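The trade-off described above can be made concrete with the classical Lambertian formulation of photometric stereo that goes back to Woodham. The notation below is ours and is only a sketch: it assumes a Lambertian surface, K >= 3 distant light sources with known, non-coplanar unit directions l_k, and no shadows or specular highlights, none of which is claimed for the system described in this article.

```latex
% Per-pixel image formation under light k (rho: albedo, n: unit surface normal)
I_k = \rho \, \mathbf{l}_k^{\top} \mathbf{n}, \qquad k = 1, \dots, K

% Stacking all K observations of one pixel, with L = [\mathbf{l}_1, \dots, \mathbf{l}_K]^{\top}
\mathbf{I} = L \, \mathbf{g}, \qquad \mathbf{g} = \rho \, \mathbf{n}

% Least-squares solution, then separation into albedo and normal
\hat{\mathbf{g}} = (L^{\top} L)^{-1} L^{\top} \mathbf{I}, \qquad
\hat{\rho} = \lVert \hat{\mathbf{g}} \rVert, \qquad
\hat{\mathbf{n}} = \hat{\mathbf{g}} / \hat{\rho}
```

Because the normal is estimated independently at every pixel, fine surface detail is preserved. Depth, however, only follows from integrating the normal field, so small per-pixel errors accumulate into low-frequency drift, which is one way to understand the missing global accuracy.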
Established setups for photometric stereo based computational imaging involve images taken sequentially under a set of differing illumination directions. The displacement between these directions is obtained either by switching between displaced light sources or by mechanically moving a light source to different positions. Similarly, early light-field acquisition systems used either camera arrays or gantries transporting the camera between sequential acquisitions. Both setups, and even more so their combination, would not be feasible for typical industrial applications, e.g., quality inspection, where a faster, more compact and tightly integrated solution is desirable.
We designed an in-line machine vision system deploying concurrent light-field and photometric stereo [3] for industrial use cases where there is a relative movement between the inspected objects and the sensing/illumination system, e.g., via a conveyor belt. The basic idea is to construct the light-field data structure over time using a multi-line scan sensor. The light field is constructed by moving the multi-line scan sensor together with an illumination source relative to the observed object (or vice versa) at constant spatial and time increments. Furthermore, a photometric variation due to different illumination angles is observed for the different views onto the object. In particular, specular reflections, which typically cause problems in 3D reconstruction, can be handled by inferring the local surface orientation directly from the observed specular behaviour.
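The construction of the light field over time can be illustrated with a minimal NumPy sketch. It is not the AIT implementation: the function names, the array layout (time, sensor line, column) and the simulated flat object with a constant per-line shift are all assumptions made for illustration.

```python
import numpy as np

def simulate_frames(texture, num_lines, shift):
    """Hypothetical flat object: at time t, sensor line k sees object row t - k*shift."""
    n_rows, width = texture.shape
    n_steps = n_rows + (num_lines - 1) * shift
    frames = np.zeros((n_steps, num_lines, width), dtype=texture.dtype)
    for k in range(num_lines):
        # Line k starts seeing the object k*shift transport steps later than line 0.
        frames[k * shift : k * shift + n_rows, k, :] = texture
    return frames

def assemble_views(frames):
    """Stack each sensor line over time: (T, K, W) frames -> (K, T, W) views."""
    frames = np.asarray(frames)
    if frames.ndim != 3:
        raise ValueError("frames must have shape (time, lines, width)")
    return frames.transpose(1, 0, 2)

def epipolar_plane_image(views, column):
    """Slice of the light field for one sensor column: view index vs. transport time."""
    return views[:, :, column]

# Small usage example with a 6-row, 2-column object and 3 sensor lines.
texture = np.arange(1, 13).reshape(6, 2)
frames = simulate_frames(texture, num_lines=3, shift=2)
views = assemble_views(frames)
epi = epipolar_plane_image(views, column=0)
```

In this simulation every object point traces a straight line through the epipolar-plane image, and the per-view shift (2 steps here) sets its slope. For a real object that shift varies with distance to the sensor, which is what makes depth estimation from such slices possible.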
Figure 1: Multi-line scan image acquisition principle and photograph of the AIT prototype.
Figure 2: 3D reconstruction of a coin using conventional stereo (left), multi-view stereo obtained with our multi-line scan camera (middle), and combination of multi-view and photometric stereo also obtained with the multi-line scan camera (right).
Figure 3: Coin texture (left) and fine surface detail contributed by photometric stereo (middle and right) as recorded by our approach.
We built a 3D reconstruction system based on fusing the globally more trustworthy depth derived from light-field processing with the locally more precise surface properties derived from photometric stereo. The key idea in obtaining an improved depth by fusion is to balance globally precise against locally precise estimates. We investigated a variety of methods, ranging from a combination of low- and high-pass filtered depth maps, through graphical models, to energy minimisation approaches. We propose the combined approach for application in in-line automated visual inspection. The analysis of fine surface disruptions, even for specular objects, is of wide interest in quality assurance. Furthermore, the inference of material properties and classes, which are described by the ‘bidirectional reflectance distribution function’ (BRDF) comprising the variation of incoming and reflected light, becomes possible because our approach captures a slice of the BRDF.
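The simplest of the fusion strategies mentioned above, combining low- and high-pass filtered depth maps, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method used in the AIT system: it assumes the photometric stereo result has already been integrated into a depth map, uses a plain box filter as the low-pass, and the names `box_lowpass`, `fuse_depth` and `radius` are hypothetical.

```python
import numpy as np

def box_lowpass(img, radius):
    """Separable moving-average filter with edge replication at the borders."""
    out = np.asarray(img, dtype=float)
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    for axis in (0, 1):
        pad = [(0, 0), (0, 0)]
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out

def fuse_depth(depth_lf, depth_ps, radius=3):
    """Low frequencies from the light-field depth, high frequencies from photometric stereo."""
    low = box_lowpass(depth_lf, radius)
    high = np.asarray(depth_ps, dtype=float) - box_lowpass(depth_ps, radius)
    return low + high

# Synthetic example: a tilted plane with a fine ripple (period 7 pixels).
y, x = np.mgrid[0:32, 0:35].astype(float)
plane = 0.1 * x + 0.05 * y
ripple = 0.02 * np.sin(2 * np.pi * x / 7)
true_depth = plane + ripple
depth_lf = plane                          # metrically correct, fine detail lost
depth_ps = true_depth + 5.0 + 0.03 * x    # fine detail correct, offset and tilt wrong
fused = fuse_depth(depth_lf, depth_ps, radius=3)
```

In this idealised example the fused map recovers the true depth away from the image borders, because the filter width matches the ripple period. With real data the cut-off between "global" and "local" has to be chosen, which is where the graphical-model and energy-minimisation formulations offer more control.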
Collaborative research with companies in the manufacturing industry is currently under way. Further work will include very high resolution imaging as well as novel computational methods. The latter is being pursued in collaboration with the Institute of Computer Graphics and Vision at Graz University of Technology.
[1] R. C. Bolles, H. H. Baker, D. H. Marimont: “Epipolar-plane image analysis: an approach to determining structure from motion”, Int. J. Computer Vision, 1(1):7–55, 1987.
[2] R. J. Woodham: “Photometric method for determining surface orientation from multiple images”, Optical Engineering, 19(1):139–144, 1980.
[3] S. Štolc, D. Soukup, B. Holländer, R. Huber-Mörk: “Depth and all-in-focus imaging by a multi-line-scan light-field camera”, J. of Electronic Imaging, 23(5):053020, 2014.
AIT Austrian Institute of Technology, Austria