Online network traffic classification with incremental learning

Conventional network traffic detection methods based on data mining cannot efficiently handle high-throughput traffic with concept drift. Data stream mining techniques can classify evolving data streams, although most require completely labeled data. This paper proposes an impro...

Bibliographic Details
Main Authors: Loo, H. R., Marsono, M. N.
Format: Article
Published: Springer Verlag 2016
Subjects: TK Electrical engineering. Electronics. Nuclear engineering
Online Access: http://eprints.utm.my/id/eprint/72428/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84970016970&doi=10.1007%2fs12530-016-9152-x&partnerID=40&md5=a6993ee33d55a22bd76faa6e25a23b9c
id my.utm.72428
record_format eprints
spelling my.utm.72428 2017-11-21T08:17:10Z http://eprints.utm.my/id/eprint/72428/ Loo, H. R. and Marsono, M. N. (2016) Online network traffic classification with incremental learning. Evolving Systems, 7 (2). pp. 129-143. ISSN 1868-6478. Springer Verlag. Article, PeerReviewed.
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic TK Electrical engineering. Electronics. Nuclear engineering
description Conventional network traffic detection methods based on data mining cannot efficiently handle high-throughput traffic with concept drift. Data stream mining techniques can classify evolving data streams, although most require completely labeled data. This paper proposes an improved data stream mining algorithm for online network traffic classification that learns incrementally from both labeled and unlabeled flows. The algorithm combines incremental k-means with a self-training semi-supervised method to continuously update the classification model as new flow instances arrive. Experimental results show that the proposed algorithm classifies 325 thousand flow instances per second and achieves up to 91–94 % average accuracy, even when only 10 % of the input flows are labeled. It also maintains high accuracy in the presence of concept drift: although the Drift Detection Method detects drifts in the evaluated datasets, the proposed method with incremental learning achieves up to 91–94 % accuracy, compared to 60–69 % without incremental learning.
format Article
author Loo, H. R.
Marsono, M. N.
title Online network traffic classification with incremental learning
publisher Springer Verlag
publishDate 2016
url http://eprints.utm.my/id/eprint/72428/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84970016970&doi=10.1007%2fs12530-016-9152-x&partnerID=40&md5=a6993ee33d55a22bd76faa6e25a23b9c
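
The description above names two ingredients, incremental k-means and self-training, without giving the update rules. The following is a minimal Python sketch of how the two ideas can fit together, not the authors' algorithm: it keeps one running-mean centroid per class (a simplification of k-means), and the class name `IncrementalCentroidClassifier`, the `confidence_margin` threshold, the two-feature flow vectors, and the labels `"web"` and `"p2p"` are all assumptions made for illustration.

```python
import math


class IncrementalCentroidClassifier:
    """Nearest-centroid model updated one flow at a time (illustrative only)."""

    def __init__(self, confidence_margin=0.5):
        # label -> (mean feature vector, number of flows absorbed so far)
        self.centroids = {}
        # Hypothetical threshold: how clearly the nearest centroid must win
        # before an unlabeled flow is trusted as a pseudo-labeled example.
        self.confidence_margin = confidence_margin

    def _update(self, label, x):
        # Incremental mean update: mean += (x - mean) / n
        mean, n = self.centroids.get(label, ([0.0] * len(x), 0))
        n += 1
        mean = [m + (xi - m) / n for m, xi in zip(mean, x)]
        self.centroids[label] = (mean, n)

    def predict(self, x):
        """Return (label, margin); margin is the relative gap between the
        distances to the nearest and second-nearest centroids."""
        if not self.centroids:
            return None, 0.0
        dists = sorted(
            (math.dist(x, mean), label)
            for label, (mean, _) in self.centroids.items()
        )
        best_d, best_label = dists[0]
        if len(dists) == 1:
            # Only one class known: no basis for a confident pseudo-label.
            return best_label, 0.0
        second_d = dists[1][0]
        margin = (second_d - best_d) / second_d if second_d > 0 else 0.0
        return best_label, margin

    def learn(self, x, label=None):
        """Update the model with one flow; label=None means unlabeled."""
        if label is not None:
            self._update(label, x)
            return label
        predicted, margin = self.predict(x)
        # Self-training step: absorb the flow only if the prediction is confident.
        if predicted is not None and margin >= self.confidence_margin:
            self._update(predicted, x)
        return predicted


model = IncrementalCentroidClassifier()
model.learn([0.0, 0.0], "web")   # labeled flows seed the centroids
model.learn([10.0, 10.0], "p2p")
print(model.learn([1.0, 1.0]))   # unlabeled flow close to the "web" centroid
```

In this sketch an unlabeled flow moves a centroid only when it is clearly closer to one class than to any other, which limits the risk of reinforcing wrong pseudo-labels. The paper's actual confidence criterion and cluster structure may differ.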