Event detection for smart conference room using multi-stream convolutional neural network

Bibliographic Details
Main Author: Khoo, Belinda Pai Lin
Format: Final Year Project / Dissertation / Thesis
Published: 2020
Online Access: http://eprints.utar.edu.my/3843/1/16ACB02442_FYP.pdf
http://eprints.utar.edu.my/3843/
Description
Summary: Conferences and meetings are unavoidable in almost every workplace because they play a prominent role in determining the future of business operations. Most existing systems are built on occupancy analysis, which only detects the presence of occupants in a meeting room rather than the ongoing activities. Event detection in meeting rooms is important because a conference room may be misused by occupying it for irrelevant purposes. To ensure that company resources are used properly, this project delivers a web-based Smart Conference Room System that classifies the events taking place in meeting rooms. To achieve this, human action recognition techniques are applied to capture and understand the motion information of the occupants. The 34-layer variant of the R(2+1)D architecture (Tran et al. 2018) is proposed as the network architecture, and it is built within a two-stream framework to capture the spatiotemporal features of a video. The model is pretrained on the Kinetics Human Action dataset before being fine-tuned on the Conference Dataset collected from meeting rooms in Company X. The raw footage collected from Company X is preprocessed through in-depth annotation and labelling based on the ongoing activities in the different meeting rooms. Once the pretrained model has been obtained, its learned features and weights are transferred to fine-tune a new model on the preprocessed Conference Dataset. The new model is then integrated into a web-based system to handle event detection in a meeting room. In addition, an object detection approach, You Only Look Once (YOLO), is incorporated into the system as an object counter to provide further information. Additional analytics are also provided so that companies can gain insight into the usage of their meeting rooms.
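
As background on the R(2+1)D architecture named in the summary: Tran et al. (2018) replace each 3D convolution with a 2D spatial convolution followed by a 1D temporal convolution, choosing the number of intermediate channels so that the weight count stays close to that of the original 3D layer. The following is a minimal sketch of that bookkeeping only. The function names are illustrative assumptions, biases are ignored, and nothing here is taken from the thesis implementation.

```python
def conv3d_params(n_in, n_out, t, d):
    """Weights in a full 3D convolution with a t x d x d kernel (biases ignored)."""
    return n_out * n_in * t * d * d


def mid_channels(n_in, n_out, t, d):
    """Intermediate channel count M from Tran et al. (2018).

    M = floor(t * d^2 * n_in * n_out / (d^2 * n_in + t * n_out)),
    chosen so the factorized block roughly matches the 3D weight count.
    """
    return (t * d * d * n_in * n_out) // (d * d * n_in + t * n_out)


def conv2plus1d_params(n_in, n_out, t, d):
    """Weights in the factorized block: 1 x d x d spatial, then t x 1 x 1 temporal."""
    m = mid_channels(n_in, n_out, t, d)
    spatial = m * n_in * d * d   # M filters, each n_in x 1 x d x d
    temporal = n_out * m * t     # n_out filters, each M x t x 1 x 1
    return spatial + temporal


if __name__ == "__main__":
    # Hypothetical layer sizes chosen for illustration: 64 -> 64 channels, 3 x 3 x 3 kernel.
    print(mid_channels(64, 64, 3, 3))
    print(conv3d_params(64, 64, 3, 3), conv2plus1d_params(64, 64, 3, 3))
```

For the illustrative 64-to-64-channel layer with a 3 x 3 x 3 kernel, the two weight counts coincide. The factorized form differs by placing an extra nonlinearity between the spatial and temporal steps.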