Inferring (k,l)-context-sensitive probabilistic context-free grammars using hierarchical Pitman-Yor processes

Chihiro Shibata
The 12th International Conference on Grammatical Inference, PMLR 34:153-166, 2014.

Abstract

Motivated by the idea of applying nonparametric Bayesian models to dual approaches for distributional learning, we define (k,l)-context-sensitive probabilistic context-free grammars (PCFGs) using hierarchical Pitman-Yor processes (PYPs). The data sparseness problem that occurs when inferring context-sensitive probabilities for rules is handled by the smoothing effect of hierarchical PYPs. Many possible definitions or constructions of PYP hierarchies can be used to represent the context sensitivity of derivations of CFGs in Chomsky normal form. In this study, we use a definition that is considered to be the most natural as an extension of infinite PCFGs defined in previous studies. A Markov Chain Monte Carlo method called blocked Metropolis-Hastings (MH) sampling is known to be effective for inferring PCFGs from unsupervised sentences. Blocked MH sampling is applicable to (k,l)-context-sensitive PCFGs by modifying their so-called inside probabilities. We show that the computational cost of blocked MH sampling for (k,l)-context-sensitive PCFGs is O(|V|^{l+3}|s|^3) for each sentence s, where V is a set of nonterminals. This cost is too high to iterate sufficient sampling times, especially when l ≠ 0, thus we propose an alternative sampling method that separates the sampling procedure into pointwise sampling for nonterminals and blocked sampling for rules. The computational cost of this sampling method is O(\min\{|s|^l, |V|^l\} (|V||s|^2 + |s|^3)).
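The abstract refers to inside probabilities and a per-sentence cost of O(|V|^{l+3}|s|^3) for blocked sampling. As background, here is a minimal sketch of the standard inside (CYK-style) dynamic program for a PCFG in Chomsky normal form, i.e. the context-free l = 0 case that the paper extends; the toy grammar is hypothetical and not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form.
# Binary rules A -> B C with probability p; lexical rules A -> w with probability p.
binary = {
    "S": [("NP", "VP", 1.0)],
    "NP": [("Det", "N", 1.0)],
    "VP": [("V", "NP", 1.0)],
}
lexical = {
    "Det": {"the": 1.0},
    "N": {"dog": 0.5, "cat": 0.5},
    "V": {"sees": 1.0},
}

def inside(sentence):
    """chart[(i, j, A)] = P(A derives sentence[i:j]).

    For a plain PCFG this costs O(|V|^3 |s|^3) per sentence; the paper's
    (k,l)-context-sensitive variant pays an extra factor in l on top of this.
    """
    n = len(sentence)
    chart = defaultdict(float)
    # Base case: lexical rules over single words.
    for i, w in enumerate(sentence):
        for a, words in lexical.items():
            if w in words:
                chart[(i, i + 1, a)] = words[w]
    # Recursive case: sum over binary rules and split points.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for a, rules in binary.items():
                total = 0.0
                for b, c, p in rules:
                    for m in range(i + 1, j):
                        total += p * chart[(i, m, b)] * chart[(m, j, c)]
                if total > 0.0:
                    chart[(i, j, a)] = total
    return chart

chart = inside(["the", "dog", "sees", "the", "cat"])
print(chart[(0, 5, "S")])  # inside probability of the whole sentence; 0.25 here
```

Blocked MH sampling uses such a chart to propose a whole parse tree per sentence at once, which is why the chart-construction cost dominates each sampling iteration.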

Cite this Paper


BibTeX
@InProceedings{pmlr-v34-shibata14a,
  title     = {Inferring (k,l)-context-sensitive probabilistic context-free grammars using hierarchical Pitman-Yor processes},
  author    = {Shibata, Chihiro},
  booktitle = {The 12th International Conference on Grammatical Inference},
  pages     = {153--166},
  year      = {2014},
  editor    = {Clark, Alexander and Kanazawa, Makoto and Yoshinaka, Ryo},
  volume    = {34},
  series    = {Proceedings of Machine Learning Research},
  address   = {Kyoto, Japan},
  month     = {17--19 Sep},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v34/shibata14a.pdf},
  url       = {https://proceedings.mlr.press/v34/shibata14a.html},
  abstract  = {Motivated by the idea of applying nonparametric Bayesian models to dual approaches for distributional learning, we define (k,l)-context-sensitive probabilistic context-free grammars (PCFGs) using hierarchical Pitman-Yor processes (PYPs). The data sparseness problem that occurs when inferring context-sensitive probabilities for rules is handled by the smoothing effect of hierarchical PYPs. Many possible definitions or constructions of PYP hierarchies can be used to represent the context sensitivity of derivations of CFGs in Chomsky normal form. In this study, we use a definition that is considered to be the most natural as an extension of infinite PCFGs defined in previous studies. A Markov Chain Monte Carlo method called blocked Metropolis-Hastings (MH) sampling is known to be effective for inferring PCFGs from unsupervised sentences. Blocked MH sampling is applicable to (k,l)-context-sensitive PCFGs by modifying their so-called inside probabilities. We show that the computational cost of blocked MH sampling for (k,l)-context-sensitive PCFGs is O(|V|^{l+3}|s|^3) for each sentence s, where V is a set of nonterminals. This cost is too high to iterate sufficient sampling times, especially when l ≠ 0, thus we propose an alternative sampling method that separates the sampling procedure into pointwise sampling for nonterminals and blocked sampling for rules. The computational cost of this sampling method is O(\min\{|s|^l, |V|^l\} (|V||s|^2 + |s|^3)).}
}
Endnote
%0 Conference Paper
%T Inferring (k,l)-context-sensitive probabilistic context-free grammars using hierarchical Pitman-Yor processes
%A Chihiro Shibata
%B The 12th International Conference on Grammatical Inference
%C Proceedings of Machine Learning Research
%D 2014
%E Alexander Clark
%E Makoto Kanazawa
%E Ryo Yoshinaka
%F pmlr-v34-shibata14a
%I PMLR
%P 153--166
%U https://proceedings.mlr.press/v34/shibata14a.html
%V 34
%X Motivated by the idea of applying nonparametric Bayesian models to dual approaches for distributional learning, we define (k,l)-context-sensitive probabilistic context-free grammars (PCFGs) using hierarchical Pitman-Yor processes (PYPs). The data sparseness problem that occurs when inferring context-sensitive probabilities for rules is handled by the smoothing effect of hierarchical PYPs. Many possible definitions or constructions of PYP hierarchies can be used to represent the context sensitivity of derivations of CFGs in Chomsky normal form. In this study, we use a definition that is considered to be the most natural as an extension of infinite PCFGs defined in previous studies. A Markov Chain Monte Carlo method called blocked Metropolis-Hastings (MH) sampling is known to be effective for inferring PCFGs from unsupervised sentences. Blocked MH sampling is applicable to (k,l)-context-sensitive PCFGs by modifying their so-called inside probabilities. We show that the computational cost of blocked MH sampling for (k,l)-context-sensitive PCFGs is O(|V|^{l+3}|s|^3) for each sentence s, where V is a set of nonterminals. This cost is too high to iterate sufficient sampling times, especially when l ≠ 0, thus we propose an alternative sampling method that separates the sampling procedure into pointwise sampling for nonterminals and blocked sampling for rules. The computational cost of this sampling method is O(\min\{|s|^l, |V|^l\} (|V||s|^2 + |s|^3)).
RIS
TY  - CPAPER
TI  - Inferring (k,l)-context-sensitive probabilistic context-free grammars using hierarchical Pitman-Yor processes
AU  - Chihiro Shibata
BT  - The 12th International Conference on Grammatical Inference
DA  - 2014/08/30
ED  - Alexander Clark
ED  - Makoto Kanazawa
ED  - Ryo Yoshinaka
ID  - pmlr-v34-shibata14a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 34
SP  - 153
EP  - 166
L1  - http://proceedings.mlr.press/v34/shibata14a.pdf
UR  - https://proceedings.mlr.press/v34/shibata14a.html
AB  - Motivated by the idea of applying nonparametric Bayesian models to dual approaches for distributional learning, we define (k,l)-context-sensitive probabilistic context-free grammars (PCFGs) using hierarchical Pitman-Yor processes (PYPs). The data sparseness problem that occurs when inferring context-sensitive probabilities for rules is handled by the smoothing effect of hierarchical PYPs. Many possible definitions or constructions of PYP hierarchies can be used to represent the context sensitivity of derivations of CFGs in Chomsky normal form. In this study, we use a definition that is considered to be the most natural as an extension of infinite PCFGs defined in previous studies. A Markov Chain Monte Carlo method called blocked Metropolis-Hastings (MH) sampling is known to be effective for inferring PCFGs from unsupervised sentences. Blocked MH sampling is applicable to (k,l)-context-sensitive PCFGs by modifying their so-called inside probabilities. We show that the computational cost of blocked MH sampling for (k,l)-context-sensitive PCFGs is O(|V|^{l+3}|s|^3) for each sentence s, where V is a set of nonterminals. This cost is too high to iterate sufficient sampling times, especially when l ≠ 0, thus we propose an alternative sampling method that separates the sampling procedure into pointwise sampling for nonterminals and blocked sampling for rules. The computational cost of this sampling method is O(\min\{|s|^l, |V|^l\} (|V||s|^2 + |s|^3)).
ER  -
APA
Shibata, C. (2014). Inferring (k,l)-context-sensitive probabilistic context-free grammars using hierarchical Pitman-Yor processes. The 12th International Conference on Grammatical Inference, in Proceedings of Machine Learning Research 34:153-166. Available from https://proceedings.mlr.press/v34/shibata14a.html.