Summarizing Text Articles with Dirichlet Distribution

The Latent Dirichlet Allocation (LDA) is based on the hypothesis that a person writing a document has topics in mind. To write about a topic then means to pick a word with a certain probability from the pool of words of that topic. A document can then be represented as a mixture of various topics...
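The generative story described above can be sketched in a few lines of Python. This is only an illustration and not the system built in the project: the vocabulary, the number of topics, and the Dirichlet parameters `alpha` and `beta` are all hypothetical values chosen for the example, and the helper names `sample_dirichlet` and `generate_document` are made up here.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, vocab, alpha, n_words, rng):
    """Generate one bag-of-words document following the LDA generative story."""
    # The writer's "topics in mind": a per-document mixture over topics.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        # Pick a topic for this word according to the document's mixture.
        z = rng.choices(range(len(theta)), weights=theta)[0]
        # Pick a word, with topic-specific probability, from that topic's pool.
        words.append(rng.choices(vocab, weights=topic_word[z])[0])
    return theta, words


# Hypothetical toy setup (assumed values, not taken from the project).
VOCAB = ["goal", "match", "team", "vote", "party", "law"]
N_TOPICS = 2
alpha = [1.0] * N_TOPICS      # assumed symmetric prior over topics
beta = [0.5] * len(VOCAB)     # assumed symmetric prior over words

rng = random.Random(0)
# Each topic is itself a probability distribution over the vocabulary.
TOPIC_WORD = [sample_dirichlet(beta, rng) for _ in range(N_TOPICS)]
theta, doc = generate_document(TOPIC_WORD, VOCAB, alpha, 12, rng)

print("topic mixture:", theta)
print("document:", doc)
```

Word order carries no information in this sketch, which matches the bag-of-words view: shuffling `doc` leaves its probability under the model unchanged.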

Bibliographic Details
Main Author: Mohamed, Noor Zalifah
Format: Final Year Project
Language: English
Published: Universiti Teknologi Petronas 2011
Subjects: T Technology (General)
Online Access: http://utpedia.utp.edu.my/8730/1/2011%20-%20Summarizing%20text%20articles%20with%20dirichlet%20distribution.pdf
http://utpedia.utp.edu.my/8730/
id my-utp-utpedia.8730
record_format eprints
spelling my-utp-utpedia.8730 2017-01-25T09:41:42Z 2011-09 Final Year Project NonPeerReviewed application/pdf en Mohamed, Noor Zalifah (2011) Summarizing Text Articles with Dirichlet Distribution. Universiti Teknologi Petronas. (Unpublished)
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Electronic and Digitized Intellectual Asset
url_provider http://utpedia.utp.edu.my/
language English
topic T Technology (General)
description Latent Dirichlet Allocation (LDA) is based on the hypothesis that a person writing a document has topics in mind. Writing about a topic then means picking words, each with a certain probability, from that topic's pool of words. A document can then be represented as a mixture of various topics. LDA is a generative probabilistic model for a corpus of discrete data, such as the words in a set of documents. LDA models the words in the documents under the "bag-of-words" assumption, which ignores the order of the words in the documents. Under this "exchangeability", the words are independent and identically distributed conditional on some parameters. This conditional independence allows us to build a hierarchical Bayesian model for a corpus of documents and words. The objective is to develop a text summarization system based on the Latent Dirichlet Allocation (LDA) method. The system would be used to determine the accuracy level of the method. This is done by comparing the result produced by the text summarization system with an existing summary produced by a human.
format Final Year Project
author Mohamed, Noor Zalifah
title Summarizing Text Articles with Dirichlet Distribution
publisher Universiti Teknologi Petronas
publishDate 2011