by László Havasi and Tamás Szirányi (MTA SZTAKI)
The Distributed Events Analysis Research Laboratory (DEVA) has more than 10 years of research experience in security and surveillance, including multi-view systems of optical, thermal, infra-red and time-of-flight cameras, as well as LIDAR sensors. The laboratory’s research and development work has been addressing critical issues of surveillance systems regarding the protection of critical infrastructures against incursions and terrorist attacks.
The DEVA laboratory has been involved in several security-related projects funded by the European Commission and the European Defence Agency, contributing significant improvements to multi-view computer vision and target tracking. During a recently finished project (PROACTIVE, EU FP-7 [L1, L2]) that included several European partners in defence and security, a holistic IoT framework was developed to enhance situational awareness in urban environments in order to pre-empt and effectively respond to terrorist attacks. The framework integrates many novel technologies for information collection, filtering, analysis and fusion from multiple, geographically dispersed devices. It also integrates advanced reasoning techniques to intelligently process a multitude of sensor streams and derive high-level, terrorist-oriented semantics from them.
The DEVA Laboratory is responsible for processing and understanding multimodal visual information from cameras and 3D sensors that are sampled at different time instants and situated in different locations. Special emphasis is placed on the fusion of different sources, such as satellite or airborne image data for remote sensing, potentially supplemented with terrestrial and UAV-based imaging.
In our surveillance projects, an important issue is the tracking of objects/targets, and the detection and recognition of events using multi-view camera networks, including infra-red sensors. In these applications calibration is always a problem, since security scenarios usually require quick installation and continuous troubleshooting. Another challenge is the co-registration of optical cameras and infra-red sensors for 3D tracking, since features of different modalities are usually hard to associate and compare.
Within the PROACTIVE project (EU FP-7), our main task was the visual tracking and analysis of human and vehicle behaviour and crowd events.
The project addressed some specific emergency situations involving man-made or natural disasters and terrorism, i.e., frequent threats within our society. Avoiding an incident and mitigating its potential consequences requires the development and deployment of new solutions that exploit the recent advances in terms of technological platforms and problem solving strategies.
PROACTIVE produced an end-user driven solution. PROACTIVE prototypes include the following parts:
- Terrorist Reasoning Kernel: the reasoning layer provides the needed intelligence in order to infer additional information regarding the incoming suspicious event stream. This layer aids law enforcement officers by reasoning about threat levels of each incoming event and potentially inferring its association with a possible terrorist attack.
- Context Awareness Kernel: these processing modules provide a semantic description of the environment and of the static and moving (e.g., foreground) objects, together with sufficient information about suspicious events and actions.
- C2 platform: the command and control platform is a multi-touch, multi-user web application that provides a graphical user interface and enables the user to view maps (2D/3D), devices/sensors and alerts from the system.
The capabilities of the Context Awareness Kernel were demonstrated in different scenarios, which also showcased the advantages of a multi-spectral sensor network:
Monitoring activities in crowded scenes
In this scenario the dynamic parameters of motion trajectories are analysed and a signal is sent when ‘running’ movements are detected [L3]. Running pedestrians can also find a place to hide and observe the area. Crowd density estimation is used to define a top-view mask image marking the regions where crowd density might cause errors during tracking. Alarms are generated when the average detected speed is higher than 2 m/s, and the detected object and its trajectory are highlighted in red (Figures 1 and 2).
Figure 1: Feature level fusion of multispectral (EO, IR) views which highlights mismatches and ‘invisible’ human shapes.
Figure 2: In this example, the general sensor configuration contained four cameras with overlapping fields of view.
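The speed-based alarm in the crowded-scene scenario can be illustrated with a minimal sketch. It assumes that each trajectory is already available as time-stamped positions on the ground plane in metres; the function names and the data layout are hypothetical and are not taken from the PROACTIVE software. Only the 2 m/s threshold comes from the text.

```python
from math import hypot

# Threshold stated in the article; everything else here is an illustrative assumption.
SPEED_ALARM_THRESHOLD = 2.0  # metres per second


def average_speed(trajectory):
    """Return the average speed (m/s) along a trajectory.

    trajectory: list of (t_seconds, x_metres, y_metres) samples sorted by time,
    assumed to be expressed in ground-plane coordinates.
    """
    if len(trajectory) < 2:
        return 0.0
    # Sum the lengths of consecutive segments to get the travelled path length.
    path_length = sum(
        hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(trajectory, trajectory[1:])
    )
    elapsed = trajectory[-1][0] - trajectory[0][0]
    return path_length / elapsed if elapsed > 0 else 0.0


def is_running(trajectory, threshold=SPEED_ALARM_THRESHOLD):
    """Raise the 'running' flag when the average speed exceeds the threshold."""
    return average_speed(trajectory) > threshold
```

In practice a sliding time window over the most recent samples would probably be preferable to the whole trajectory, so that a short sprint is not averaged away by a long walk.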
Monitoring the parking area
In this scenario the objective was to validate the waiting/parking durations in different areas (including areas where parking is prohibited). Object interactions were also investigated: the connections between vehicles and drivers were continuously checked, and alarms were raised when possibly suspicious loitering movements were detected. To measure the parking duration, the behaviour analyser module followed the state changes/transitions and their associated timestamps for as long as the tracking method remained stable. In Figure 3 the car parked in the restricted area and the loitering person are marked in red.
Figure 3: The sensor configuration comprised three cameras.
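The parking-duration check can be sketched as a small state tracker that records when a tracked object becomes stationary in a zone and compares the elapsed time with a per-zone limit. This is only an illustration under stated assumptions: the class name, the zone labels and the limits are hypothetical, and the actual behaviour analyser module may work differently.

```python
class DwellTracker:
    """Tracks how long each object has been stationary in a zone (hypothetical sketch)."""

    def __init__(self, max_dwell_s):
        # max_dwell_s: zone name -> allowed stationary time in seconds
        # (0 for zones where parking is prohibited). Values are assumptions.
        self.max_dwell_s = max_dwell_s
        self._since = {}  # track id -> (zone, timestamp when the object stopped)

    def update(self, track_id, timestamp, zone, moving):
        """Feed one observation of a tracked object.

        Returns the stationary duration in seconds when it exceeds the limit
        of the current zone, otherwise None.
        """
        if moving or zone is None:
            # State transition back to 'moving': forget the stop timestamp.
            self._since.pop(track_id, None)
            return None
        start = self._since.get(track_id)
        if start is None or start[0] != zone:
            # State transition to 'stopped' (or a change of zone): start timing.
            self._since[track_id] = (zone, timestamp)
            return None
        duration = timestamp - start[1]
        limit = self.max_dwell_s.get(zone)
        if limit is not None and duration > limit:
            return duration
        return None
```

A call such as `DwellTracker({"restricted": 0.0}).update("car1", t, "restricted", False)` on successive frames would report a duration from the second stationary observation onwards; the same structure could be applied to loitering pedestrians by using a different zone limit.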
As a continuation of the project, we are working on augmenting terrestrial camera networks with airborne (UAV) and satellite (Sentinel-2) information in order to obtain up-to-date scans that fully cover the surveyed areas of critical infrastructures.
 D. Varga et al.: “A multi-view pedestrian tracking method in an uncalibrated camera network”, IEEE International Conference on Computer Vision Workshops, Santiago de Chile, 2015.
MTA SZTAKI, Hungary