A scene invariant convolutional neural network for visual crowd counting using fast-lane and sample selective methods
Convolutional neural network (CNN) based crowd counting aims to estimate the number of pedestrians in an image. Existing research usually follows a training-testing protocol within a single dataset, and accuracy drops under cross-dataset evaluation. Density map prediction methodolog...
Main Author: | Teoh, Shen Khang |
---|---|
Format: | Final Year Project / Dissertation / Thesis |
Published: | 2023 |
Subjects: | T Technology (General); TR Photography |
Online Access: | http://eprints.utar.edu.my/5945/1/1606100_Teoh_Shen_Khang.pdf http://eprints.utar.edu.my/5945/ |
id |
my-utar-eprints.5945 |
---|---|
record_format |
eprints |
spelling |
my-utar-eprints.5945 2024-01-01T13:06:15Z A scene invariant convolutional neural network for visual crowd counting using fast-lane and sample selective methods Teoh, Shen Khang T Technology (General) TR Photography 2023-05 Final Year Project / Dissertation / Thesis NonPeerReviewed application/pdf http://eprints.utar.edu.my/5945/1/1606100_Teoh_Shen_Khang.pdf Teoh, Shen Khang (2023) A scene invariant convolutional neural network for visual crowd counting using fast-lane and sample selective methods. PhD thesis, UTAR. http://eprints.utar.edu.my/5945/ |
institution |
Universiti Tunku Abdul Rahman |
building |
UTAR Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Tunku Abdul Rahman |
content_source |
UTAR Institutional Repository |
url_provider |
http://eprints.utar.edu.my |
topic |
T Technology (General) TR Photography |
description |
Convolutional neural network (CNN) based crowd counting aims to estimate the number of pedestrians in an image. Existing research usually follows a training-testing protocol within a single dataset, and accuracy drops under cross-dataset evaluation. Density map prediction is widely used, but it has drawbacks in ground truth generation, and its use of Euclidean distance results in low-quality density maps. Additionally, CNN models face the challenges of vanishing gradients and zero weights, leading to low prediction accuracy. This study uses a global regression methodology and a whole-image-based training pattern to directly estimate the final count from an image. The proposed model has a single-column architecture using a single filter size and a single max pooling size. Fast-lane connection and sample selective algorithms were designed specifically to tackle the vanishing gradient problem and enhance the quality of the model. The performance of the proposed scene-invariant model was assessed on the ShanghaiTech, UCSD, and Mall datasets. It achieved an average MAE of 2.75 and an MSE of 3.65. With the proposed method, the model performs well overall and exhibits improved generalisability to unseen scenes. |
format |
Final Year Project / Dissertation / Thesis |
author |
Teoh, Shen Khang |
title |
A scene invariant convolutional neural network for visual crowd counting using fast-lane and sample selective methods |
publishDate |
2023 |
url |
http://eprints.utar.edu.my/5945/1/1606100_Teoh_Shen_Khang.pdf http://eprints.utar.edu.my/5945/ |
_version_ |
1787140942538473472 |
score |
13.214268 |
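
The description above reports results as MAE and MSE over per-image counts. As a minimal sketch of how such figures are typically computed for a global-regression counter, the snippet below evaluates a list of predicted counts against ground-truth counts. The function name `count_errors` and the sample numbers are hypothetical and not taken from the thesis. Some crowd-counting papers report the root of the mean squared error under the name "MSE"; the record does not say which convention the thesis uses, so the sketch returns both.

```python
import math


def count_errors(predicted, actual):
    """Return (MAE, MSE, RMSE) for per-image crowd counts.

    Hypothetical helper for illustration; not the thesis's evaluation code.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    # Signed error per image: predicted count minus ground-truth count.
    diffs = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse, math.sqrt(mse)


# Made-up counts for illustration only, not results from the thesis.
predicted = [23.0, 31.5, 18.0, 40.0]
actual = [25, 30, 18, 44]

mae, mse, rmse = count_errors(predicted, actual)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f}")
```

Averaging across datasets, as the description does, would amount to calling the function once per dataset and taking the mean of the resulting values, assuming each dataset is weighted equally.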