Next: Acknowledgements Up: Introduction to Special Issue Previous: Shallow Parsing using Noisy

Conclusions

In summary, a few points can be made:

Feature selection, as in machine learning in general, is an important consideration for machine learning of shallow parsers. Some learning approaches only work well when the features have been carefully selected and weighted, whilst others can cope with large numbers of irrelevant features. The Winnow and MBL papers both clearly illustrated these considerations.
A recent trend in the literature is for performance to be potentially improved by training several classifiers on the task and combining their results to produce a final result. This can be done in various ways such as using various (weighted) voting methods and using stacked classifiers. This however is not guaranteed to produce the best results as Osborne's paper above illustrates.
The majority of the systems are probabilistic, with the obvious exception of MBL. Few shallow parsers reported in the literature are, for example based upon Inductive Logic Programming or neural networks. It seems that the reason for this is the need for scalability.
All parsers assumed labelled input. Clearly this limits performance, as only a small amount of labelled training material exists. [Zhang et al.(2002)Zhang, Damereau, and Johnson] did use other knowledge sources, in addition to the training set.
Shallow parsers are noise-tolerant, and only massive quantities of noise will significantly undermine performance.
Not all shallow parsers used generative models (as might be expected from the nature of the task). Discriminative models (those which attempt to maximise the difference between alternative labels, but not necessarily model the distribution of annotated sentences) are also employed. However, the exact link between these two classes of models has yet to be demonstrated.

Research in shallow parsing is clearly ongoing. We hope that more machine learning researchers will take-up the gauntlet and include shallow parsing as an additional, real-world domain with which to evaluate machine learning systems.

Next: Acknowledgements Up: Introduction to Special Issue Previous: Shallow Parsing using Noisy

Hammerton J. 2002-03-12