
Shallow Parsing using Noisy and Non-Stationary Training Material

[Osborne(2002)] considered an issue that has gone largely unaddressed in the shallow parsing literature: what happens when the training set is noisy, or else drawn from a different distribution from the testing material.

This paper took a range of shallow parsers (both single-model parsers and ensemble parsers) and trained them on various types of artificially noisy material. A second set of experiments investigated whether naturally occurring disfluencies harm performance more than a change in the distribution of the training material; the change in distribution turned out to matter more.
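As a concrete illustration of the kind of artificial corruption such experiments rely on, the sketch below randomly flips IOB chunk tags in a training corpus at a given noise rate. This is a minimal sketch of one plausible noise type (uniform tag flips); the tag set, function names, and flipping scheme are illustrative assumptions, not the paper's exact procedure.

import random

# Illustrative IOB tag set for noun-phrase chunking (an assumption,
# not the tag set used in the paper).
TAGS = ["B-NP", "I-NP", "O"]

def corrupt_tags(sentences, noise_rate, seed=0):
    """Return a copy of the corpus in which each chunk tag is,
    with probability `noise_rate`, replaced by a different tag
    drawn uniformly at random (one simple form of label noise)."""
    rng = random.Random(seed)
    noisy = []
    for sentence in sentences:
        noisy_sentence = []
        for token, tag in sentence:
            if rng.random() < noise_rate:
                tag = rng.choice([t for t in TAGS if t != tag])
            noisy_sentence.append((token, tag))
        noisy.append(noisy_sentence)
    return noisy

# Example: corrupt roughly 10% of the tags in a toy corpus.
corpus = [[("The", "B-NP"), ("cat", "I-NP"), ("sat", "O")]]
print(corrupt_tags(corpus, noise_rate=0.1))

Training each parser on the output of such a corruption step, at several noise rates, then gives the performance-versus-noise curves the paper compares across parsers.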

The author drew several conclusions from this work. Shallow parsers are robust: only large quantities of noise significantly impair performance. Should one wish to improve performance, simple parser-specific extensions can help. No single technique worked best for all types of noise, with different kinds of noise favouring different parsers. Regarding the results on changes in the distribution of the training data, the clear lesson is that to improve the performance of shallow parsers on a particular task, it is better to annotate more examples from the target distribution than to add training material from other distributions.

One surprise in this paper is that the parsers employing system combination, although generally the best performers in the literature, were not always the best at dealing with noise. Clearly, ensemble learning is not always a sure-fire strategy.
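To make "system combination" concrete, here is a minimal sketch of combining per-token chunk tags from several parsers by majority vote, the simplest such scheme. It assumes each component parser emits one tag per token over the same tokenisation; the paper's ensembles may combine outputs differently.

from collections import Counter

def majority_vote(predictions):
    """Combine per-token tag sequences from several parsers by
    majority vote. With Counter, ties keep first-seen order, so
    an earlier parser's tag wins a tie."""
    combined = []
    for tags in zip(*predictions):
        best, _ = Counter(tags).most_common(1)[0]
        combined.append(best)
    return combined

# Three hypothetical parsers disagree on the second token.
parser_outputs = [
    ["B-NP", "I-NP", "O"],
    ["B-NP", "O",    "O"],
    ["B-NP", "I-NP", "O"],
]
print(majority_vote(parser_outputs))  # ['B-NP', 'I-NP', 'O']

The fragility noted above follows directly from this mechanism: when noise corrupts the component parsers in correlated ways, the vote can entrench a shared error rather than cancel it out.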

