Contrastive-regularized U-Net for video anomaly detection


Bibliographic Details
Main Authors: Gan, Kian Yu, Cheng, Yu Tong, Tan, Hung-Khoon, Ng, Hui-Fuang, Leung, Maylor Karhang, Chuah, Joon Huang
Format: Article
Published: Institute of Electrical and Electronics Engineers 2023
Online Access:http://eprints.um.edu.my/39002/
Description
Summary: Video anomaly detection aims to identify anomalous segments in a video. It is typically trained with weakly supervised video-level labels. This paper focuses on two crucial factors affecting the performance of video anomaly detection models. First, we explore how to capture local and global temporal dependencies more effectively. Previous architectures are effective at capturing either local or global information, but not both. We propose to employ a U-Net-like structure to model both types of dependencies in a unified structure, where the encoder learns global dependencies hierarchically on top of local ones; the decoder then propagates this global information back to the segment level for classification. Second, overfitting is a non-trivial issue for video anomaly detection due to limited training data. We propose weakly supervised contrastive regularization, which adopts a feature-based approach to regularize the network. Contrastive regularization learns more generalizable features by enforcing inter-class separability and intra-class compactness. Extensive experiments on the UCF-Crime dataset show that our approach outperforms several state-of-the-art methods.
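
The summary describes two components but gives no implementation details; the sketches below are illustrative PyTorch approximations under stated assumptions, not the authors' code. The first sketch assumes the U-Net-like structure operates on per-segment features: 1D convolutions capture local temporal dependencies, pooling in the encoder builds global context at a coarser scale, and the decoder upsamples and fuses that context back to segment-level anomaly scores through a skip connection. The class name, layer sizes, and feature dimensions are hypothetical.

    import torch
    import torch.nn as nn

    class TemporalUNet(nn.Module):
        """Illustrative U-Net-like temporal model over video segment features."""
        def __init__(self, in_dim=2048, hidden=512):
            super().__init__()
            self.enc1 = nn.Conv1d(in_dim, hidden, kernel_size=3, padding=1)      # local dependencies
            self.down = nn.MaxPool1d(2)                                          # coarser temporal scale
            self.enc2 = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)      # global context
            self.up = nn.Upsample(scale_factor=2, mode="nearest")
            self.dec1 = nn.Conv1d(hidden * 2, hidden, kernel_size=3, padding=1)  # fuse skip connection
            self.head = nn.Conv1d(hidden, 1, kernel_size=1)                      # per-segment score

        def forward(self, x):                  # x: (batch, segments, feat_dim); segments assumed even
            x = x.transpose(1, 2)              # -> (batch, feat_dim, segments)
            e1 = torch.relu(self.enc1(x))
            e2 = torch.relu(self.enc2(self.down(e1)))
            d1 = self.up(e2)
            d1 = torch.relu(self.dec1(torch.cat([d1, e1], dim=1)))
            return torch.sigmoid(self.head(d1)).squeeze(1)   # (batch, segments) anomaly scores

The second sketch assumes one simple way to realize the stated goals of the contrastive regularizer, intra-class compactness and inter-class separability: features from normal and anomalous videos are pulled toward their own class centroid while the two centroids are pushed apart by a margin. The function name and margin value are hypothetical, and the paper's actual formulation may differ.

    import torch
    import torch.nn.functional as F

    def contrastive_regularization(normal_feats, anomalous_feats, margin=1.0):
        """normal_feats, anomalous_feats: (N, D) features pooled from
        normal and anomalous videos under video-level (weak) labels."""
        normal_center = normal_feats.mean(dim=0, keepdim=True)
        anomalous_center = anomalous_feats.mean(dim=0, keepdim=True)
        # Intra-class compactness: squared distance to the class centroid.
        compactness = ((normal_feats - normal_center).pow(2).sum(dim=1).mean()
                       + (anomalous_feats - anomalous_center).pow(2).sum(dim=1).mean())
        # Inter-class separability: hinge on the distance between centroids.
        separation = F.relu(margin - (normal_center - anomalous_center).norm(p=2))
        return compactness + separation

Such a regularizer would be added to the weakly supervised classification loss, consistent with the summary's claim that it acts on features to improve generalization.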