Moving object detection under dynamic backgrounds is important in many everyday applications, such as autonomous driving and smart homes.
Much attention has been paid to the speed and accuracy of detection in these applications. Beyond the usual trade-off between speed and accuracy, varying viewing angles, camera motion, and real-time requirements make it even harder to detect moving targets from an unmanned aerial vehicle (UAV).
A research team led by Prof. ZHOU Yimin from the Shenzhen Institutes of Advanced Technology (SIAT) of the Chinese Academy of Sciences has proposed a fast human detection system for UAVs based on optical flow and deep convolutional networks.
This study was published in IET Intelligent Transport Systems on April 30.
The researchers conducted a series of preprocessing steps to extract an effective region of interest (ROI), including image graying and image blurring. The blurring stage filtered the original frame with a high-pass filter in the time domain and a multi-scale low-pass filter in the spatial domain.
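The graying and spatial-blurring steps can be sketched in a few lines of NumPy. This is a minimal illustration only: the release does not publish the team's exact filter bank (the temporal high-pass and multi-scale spatial low-pass filters), so a single box blur stands in for it here, and the function name and kernel size are assumptions.

```python
import numpy as np

def preprocess(frame_rgb, k=5):
    """Grayscale conversion followed by a k x k mean blur.

    Illustrative sketch of the preprocessing stage; the paper's
    actual temporal high-pass / multi-scale low-pass filtering is
    replaced here by one spatial box blur.
    """
    # Luminance-weighted grayscale conversion (ITU-R BT.601 weights).
    gray = frame_rgb @ np.array([0.299, 0.587, 0.114])
    # k x k mean (box) blur via an edge-padded sliding-window average.
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")
    blurred = np.zeros_like(gray)
    for dy in range(k):
        for dx in range(k):
            blurred += padded[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
    return blurred / (k * k)
```

The blur suppresses pixel-level noise so that the later flow computation responds to object motion rather than sensor grain.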
To quickly locate candidate targets, optical flow representing motion information was calculated between every two successive frames.
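As a concrete illustration of flow between two successive frames, the sketch below solves the classic Lucas-Kanade least-squares system for a single window. The release does not name the flow algorithm the team actually used, so this is a hedged stand-in, not the paper's method.

```python
import numpy as np

def lucas_kanade_window(f1, f2):
    """Estimate one (u, v) flow vector over a whole window.

    Solves the Lucas-Kanade normal equations
        [sum Ix^2   sum IxIy] [u]    [sum IxIt]
        [sum IxIy   sum Iy^2] [v] = -[sum IyIt]
    using every pixel of the two grayscale frames f1, f2.
    """
    Iy, Ix = np.gradient(f1)   # spatial gradients (axis 0 = y, axis 1 = x)
    It = f2 - f1               # temporal gradient between the frames
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)   # (u, v) in pixels per frame
```

Applied to a smooth blob shifted one pixel to the right between frames, the recovered vector points in that direction with roughly unit magnitude.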
Afterwards, a series of processing operations, including spatial average filtering, morphological expansion, and outer contour extraction, was performed to extract the salient regions of the moving targets, i.e., running and walking people.
To avoid extraction failures, the researchers further proposed a morphology-based leak-filling algorithm.
Mean filtering was applied to blur the motion information, and the result was processed by morphological expansion to obtain accurate bounding-box predictions.
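The chain described above, mean-blur the motion map, threshold it, expand (dilate) the mask to fill small leaks, and take the bounding box, can be sketched as follows. The kernel sizes, the fixed threshold, and the function names are illustrative assumptions; the release does not give the paper's parameters.

```python
import numpy as np

def dilate3x3(mask):
    """Binary morphological expansion (dilation) with a 3x3
    structuring element, written with pure-NumPy shifts."""
    p = np.pad(mask, 1)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def motion_bbox(motion, thresh, k=3, n_dilate=2):
    """Mean-blur a motion-magnitude map, threshold it, dilate the
    mask to close small leaks, and return the bounding box as
    (top, left, bottom, right), or None if nothing moves."""
    pad = k // 2
    p = np.pad(motion, pad, mode="edge")
    blurred = sum(p[dy:dy + motion.shape[0], dx:dx + motion.shape[1]]
                  for dy in range(k) for dx in range(k)) / (k * k)
    mask = blurred > thresh
    for _ in range(n_dilate):
        mask = dilate3x3(mask)
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return ys.min(), xs.min(), ys.max(), xs.max()
```

The dilation step is what makes the box tolerant of small holes ("leaks") in the thresholded motion mask.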
In addition, an adaptive threshold on the optical flow was adopted as the segmentation criterion to suppress the ill effects of UAV motion and jitter.
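The release does not state the exact threshold rule, so the sketch below assumes one common choice: subtract the median flow (taken as the dominant camera/ego motion) and keep pixels whose residual magnitude exceeds mean plus k standard deviations of the residuals. Both the median-subtraction step and the parameter k are assumptions, not the paper's formula.

```python
import numpy as np

def segment_moving(flow_u, flow_v, k=1.0):
    """Adaptive, flow-based segmentation sketch.

    Subtracts the median flow as an estimate of global UAV motion,
    then thresholds the residual magnitude at mean + k * std.
    """
    # Remove the global (camera) motion component.
    ru = flow_u - np.median(flow_u)
    rv = flow_v - np.median(flow_v)
    mag = np.hypot(ru, rv)
    # Adaptive threshold computed from the residual statistics.
    thresh = mag.mean() + k * mag.std()
    return mag > thresh
```

Because the threshold is recomputed per frame from the flow statistics, the same rule holds up whether the UAV hovers steadily or jitters.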
Field experiments and benchmark testing demonstrated that the proposed system can detect a running person at 15 frames per second (fps) with 81.1% accuracy in complex environments captured from UAV viewpoints.
In the future, the team will investigate combining additional modules, such as a Kalman filter for tracking moving targets, to achieve high-level semantic segmentation or video-based action classification for abnormal event detection.