Bernoulli Mixture Models for Markov Blanket Filtering and Classification

Mehreen Saeed
Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008, PMLR 3:77-91, 2008.

Abstract

This paper presents the use of Bernoulli mixture models for Markov blanket filtering and classification of binary data. Bernoulli mixture models can be seen as a tool for partitioning an n-dimensional hypercube, identifying regions of high data density on the corners of the hypercube. Once Bernoulli mixture models are computed from a training dataset, we use them to determine the Markov blanket of the target variable. Koller and Sahami (1996) proposed an algorithm for Markov blanket filtering, a greedy search method for feature subset selection that outputs an approximation to the optimal feature selection criterion. However, their algorithm uses all of the training instances to compute the conditioning sets and has to limit the size of these sets for computational efficiency and to avoid data fragmentation. We have adapted their algorithm to use Bernoulli mixture models instead, overcoming these shortcomings and increasing its efficiency considerably. Once a feature subset is identified, we perform classification using these mixture models. We have applied this algorithm to the causality challenge datasets. Our prediction scores were ranked fourth on SIDO, and our feature scores were ranked the best for test sets 1 and 2 of the same dataset.
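
To make the idea of a Bernoulli mixture locating dense corners of the hypercube more concrete, the following is a minimal sketch of fitting such a mixture with the standard EM algorithm. It is not the paper's implementation, which this page does not show. The function name `fit_bernoulli_mixture`, its parameters, and the synthetic two-corner toy data are all assumptions made for illustration.

```python
import numpy as np


def fit_bernoulli_mixture(X, n_components, n_iter=50, seed=0, eps=1e-9):
    """Fit a Bernoulli mixture to binary data with plain EM (illustrative sketch).

    X is an (n_samples, n_features) array of 0/1 values.
    Returns mixing weights, per-component Bernoulli parameters,
    and the responsibilities of each component for each sample.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape

    # Start from uniform mixing weights and random Bernoulli parameters.
    weights = np.full(n_components, 1.0 / n_components)
    probs = rng.uniform(0.25, 0.75, size=(n_components, n_features))

    for _ in range(n_iter):
        # E-step: log of weight_k * prod_j p_kj^x_j * (1 - p_kj)^(1 - x_j).
        log_joint = (X @ np.log(probs).T
                     + (1.0 - X) @ np.log(1.0 - probs).T
                     + np.log(weights))
        # Subtract the row maximum before exponentiating for stability.
        log_joint -= log_joint.max(axis=1, keepdims=True)
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights and Bernoulli parameters.
        nk = resp.sum(axis=0) + eps  # eps guards against empty components
        weights = nk / nk.sum()
        probs = np.clip((resp.T @ X) / nk[:, None], eps, 1.0 - eps)

    return weights, probs, resp


# Synthetic toy data (an assumption): two clusters near opposite corners
# of the 6-dimensional hypercube, (1,...,1) and (0,...,0).
data_rng = np.random.default_rng(1)
corner_a = (data_rng.random((200, 6)) < 0.9).astype(float)
corner_b = (data_rng.random((200, 6)) < 0.1).astype(float)
X = np.vstack([corner_a, corner_b])

weights, probs, resp = fit_bernoulli_mixture(X, n_components=2)
print("mixing weights:", np.round(weights, 2))
print("Bernoulli parameters:\n", np.round(probs, 2))
```

Each row of `probs` is one component's vector of per-feature probabilities, so a row with values close to 0 or 1 points at a corner of the hypercube where the data are dense. How the paper uses the fitted components inside the Koller and Sahami filtering procedure is not covered by this sketch.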

Cite this Paper


BibTeX
@InProceedings{pmlr-v3-saeed08a,
  title = {Bernoulli Mixture Models for Markov Blanket Filtering and Classification},
  author = {Saeed, Mehreen},
  booktitle = {Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008},
  pages = {77--91},
  year = {2008},
  editor = {Guyon, Isabelle and Aliferis, Constantin and Cooper, Greg and Elisseeff, André and Pellet, Jean-Philippe and Spirtes, Peter and Statnikov, Alexander},
  volume = {3},
  series = {Proceedings of Machine Learning Research},
  address = {Hong Kong},
  month = {03--04 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v3/saeed08a/saeed08a.pdf},
  url = {http://proceedings.mlr.press/v3/saeed08a.html},
  abstract = {This paper presents the use of Bernoulli mixture models for Markov blanket filtering and classification of binary data. Bernoulli mixture models can be seen as a tool for partitioning an n-dimensional hypercube, identifying regions of high data density on the corners of the hypercube. Once Bernoulli mixture models are computed from a training dataset we use them for determining the Markov blanket of the target variable. An algorithm for Markov blanket filtering was proposed by Koller and Sahami (1996), which is a greedy search method for feature subset selection and it outputs an approximation to the optimal feature selection criterion. However, they use the entire training instances for computing the conditioning sets and have to limit the size of these sets for computational efficiency and avoiding data fragmentation. We have adapted their algorithm to use Bernoulli mixture models instead, hence, overcoming the short comings of their algorithm and increasing the efficiency of this algorithm considerably. Once a feature subset is identified we perform classification using these mixture models. We have applied this algorithm to the causality challenge datasets. Our prediction scores were ranked fourth on SIDO and our feature scores were ranked the best for test sets 1 and 2 of the same dataset.}
}
Endnote
%0 Conference Paper
%T Bernoulli Mixture Models for Markov Blanket Filtering and Classification
%A Mehreen Saeed
%B Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008
%C Proceedings of Machine Learning Research
%D 2008
%E Isabelle Guyon
%E Constantin Aliferis
%E Greg Cooper
%E André Elisseeff
%E Jean-Philippe Pellet
%E Peter Spirtes
%E Alexander Statnikov
%F pmlr-v3-saeed08a
%I PMLR
%P 77--91
%U http://proceedings.mlr.press/v3/saeed08a.html
%V 3
%X This paper presents the use of Bernoulli mixture models for Markov blanket filtering and classification of binary data. Bernoulli mixture models can be seen as a tool for partitioning an n-dimensional hypercube, identifying regions of high data density on the corners of the hypercube. Once Bernoulli mixture models are computed from a training dataset we use them for determining the Markov blanket of the target variable. An algorithm for Markov blanket filtering was proposed by Koller and Sahami (1996), which is a greedy search method for feature subset selection and it outputs an approximation to the optimal feature selection criterion. However, they use the entire training instances for computing the conditioning sets and have to limit the size of these sets for computational efficiency and avoiding data fragmentation. We have adapted their algorithm to use Bernoulli mixture models instead, hence, overcoming the short comings of their algorithm and increasing the efficiency of this algorithm considerably. Once a feature subset is identified we perform classification using these mixture models. We have applied this algorithm to the causality challenge datasets. Our prediction scores were ranked fourth on SIDO and our feature scores were ranked the best for test sets 1 and 2 of the same dataset.
RIS
TY  - CPAPER
TI  - Bernoulli Mixture Models for Markov Blanket Filtering and Classification
AU  - Mehreen Saeed
BT  - Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008
DA  - 2008/12/31
ED  - Isabelle Guyon
ED  - Constantin Aliferis
ED  - Greg Cooper
ED  - André Elisseeff
ED  - Jean-Philippe Pellet
ED  - Peter Spirtes
ED  - Alexander Statnikov
ID  - pmlr-v3-saeed08a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 3
SP  - 77
EP  - 91
L1  - http://proceedings.mlr.press/v3/saeed08a/saeed08a.pdf
UR  - http://proceedings.mlr.press/v3/saeed08a.html
AB  - This paper presents the use of Bernoulli mixture models for Markov blanket filtering and classification of binary data. Bernoulli mixture models can be seen as a tool for partitioning an n-dimensional hypercube, identifying regions of high data density on the corners of the hypercube. Once Bernoulli mixture models are computed from a training dataset we use them for determining the Markov blanket of the target variable. An algorithm for Markov blanket filtering was proposed by Koller and Sahami (1996), which is a greedy search method for feature subset selection and it outputs an approximation to the optimal feature selection criterion. However, they use the entire training instances for computing the conditioning sets and have to limit the size of these sets for computational efficiency and avoiding data fragmentation. We have adapted their algorithm to use Bernoulli mixture models instead, hence, overcoming the short comings of their algorithm and increasing the efficiency of this algorithm considerably. Once a feature subset is identified we perform classification using these mixture models. We have applied this algorithm to the causality challenge datasets. Our prediction scores were ranked fourth on SIDO and our feature scores were ranked the best for test sets 1 and 2 of the same dataset.
ER  -
APA
Saeed, M. (2008). Bernoulli Mixture Models for Markov Blanket Filtering and Classification. Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008, in Proceedings of Machine Learning Research 3:77-91. Available from http://proceedings.mlr.press/v3/saeed08a.html.

Related Material