

Introduction

This paper is concerned with the task of predicting whether a sentence contains a grammatical error. An accurate method for making automatic grammaticality judgements has uses in computer-assisted language learning and grammar checking. Comparative evaluation of existing error detection approaches has been hampered by a lack of large and commonly used evaluation error corpora. We attempt to overcome this by automatically creating a large error corpus containing four different types of frequently occurring grammatical errors. We use this corpus to evaluate the performance of two approaches to the task of automatic error detection. One approach uses low-level detection techniques based on POS n-grams. The other approach is a novel parser-based method which employs deep linguistic processing to discriminate grammatical from ungrammatical input. For both approaches, we implement a basic solution, and then attempt to improve upon this solution using a decision tree classifier. We show that combining the two methods improves upon each individual method.

N-gram-based approaches to the problem of error detection have been proposed and implemented in various forms by Atwell 1987, Bigert and Knutsson 2002, and Chodorow and Leacock 2000 amongst others. Existing approaches are hard to compare since they are evaluated on different test sets which vary in size and error density. Furthermore, most of these approaches concentrate on one type of grammatical error only, namely, context-sensitive or real-word spelling errors. We implement a vanilla n-gram-based approach which is tested on a very large test set containing four different types of error.
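To make the general idea of POS n-gram error detection concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes that sentences have already been POS-tagged, that a sentence is flagged when any of its POS n-grams occurs fewer than `threshold` times in a grammatical reference corpus, and that trigrams are used; the function names, the padding symbols and the default values are all hypothetical.

```python
from collections import Counter


def ngrams(tags, n):
    """Return the POS n-grams of one sentence, padded at both ends."""
    padded = ["<s>"] * (n - 1) + list(tags) + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def train_counts(tagged_sentences, n=3):
    """Count POS n-grams over a reference corpus of grammatical sentences."""
    counts = Counter()
    for tags in tagged_sentences:
        counts.update(ngrams(tags, n))
    return counts


def flag_ungrammatical(tags, counts, n=3, threshold=1):
    """Flag a sentence if any of its POS n-grams is rarer than the threshold.

    The threshold is an illustrative assumption; in practice it would be
    tuned on held-out data.
    """
    return any(counts[gram] < threshold for gram in ngrams(tags, n))


# Toy reference corpus of POS tag sequences (hypothetical data).
reference = [
    ["DT", "NN", "VBZ"],
    ["DT", "JJ", "NN", "VBZ"],
]
counts = train_counts(reference)

seen = flag_ungrammatical(["DT", "NN", "VBZ"], counts)
unseen = flag_ungrammatical(["DT", "VBZ", "NN"], counts)
```

In this toy run, the first test sentence consists only of trigrams seen in the reference corpus, whereas the second contains unseen trigrams and is therefore flagged.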

The idea behind the parser-based approach to error detection is to use a broad-coverage hand-crafted precision grammar to detect ungrammatical sentences. This approach exploits the fact that a precision grammar is designed, in the traditional generative grammar sense [Chomsky 1957], to distinguish grammatical sentences from ungrammatical sentences. This is in contrast to treebank-based grammars, which tend to massively overgenerate and do not generally aim to discriminate between the two. For our approach to work, the coverage of the precision grammar must be broad enough to parse a large corpus of grammatical sentences. For this reason, we choose XLE [Maxwell and Kaplan 1996], an efficient and robust parsing system for Lexical Functional Grammar (LFG) [Kaplan and Bresnan 1982], together with the ParGram English grammar [Butt 2002], for our experiments. This system employs robustness techniques, some borrowed from Optimality Theory (OT) [Prince and Smolensky 1993], to parse extra-grammatical input [Frank 1998], but crucially still distinguishes between optimal and suboptimal solutions.

The evaluation corpus is a subset of an ungrammatical version of the British National Corpus (BNC), a 100 million word balanced corpus of British English [Burnard 2000]. This corpus is obtained by automatically inserting grammatical errors into the original BNC sentences based on an analysis of a manually compiled "real" error corpus.
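The general idea of automatic error insertion can be sketched as follows. This is an illustration only: the two operations shown (deleting a word and duplicating a word) are hypothetical stand-ins chosen for simplicity and are not the error types or the procedure used to build the corpus described here. The function names and the fixed random seed are likewise assumptions.

```python
import random


def delete_word(tokens, rng):
    """Remove one randomly chosen token (hypothetical operation)."""
    if len(tokens) < 2:
        return list(tokens)  # leave very short sentences unchanged
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def duplicate_word(tokens, rng):
    """Repeat one randomly chosen token in place (hypothetical operation)."""
    if not tokens:
        return []
    i = rng.randrange(len(tokens))
    return tokens[:i + 1] + tokens[i:]


def make_error_corpus(sentences, seed=0):
    """Pair each original sentence with one corrupted variant per operation.

    Returns (tokens, label) pairs, where the label records how the
    sentence was produced.
    """
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    corpus = []
    for tokens in sentences:
        corpus.append((list(tokens), "grammatical"))
        corpus.append((delete_word(tokens, rng), "deleted_word"))
        corpus.append((duplicate_word(tokens, rng), "duplicated_word"))
    return corpus


original = ["the", "cat", "sat", "on", "the", "mat"]
corpus = make_error_corpus([original])
```

Each original sentence yields one unchanged copy and one variant per operation, so a labelled test set of grammatical and ungrammatical sentences can be produced from any grammatical corpus.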

This paper makes the following contributions to the task of automatic error detection:

  1. A novel deep processing XLE-based approach
  2. An effective and novel application of decision tree machine learning to both shallow and deep approaches
  3. A novel combination of deep and shallow processing
  4. An evaluation of an n-gram-based approach on a wider variety of errors than has previously been carried out
  5. A large evaluation error corpus

The paper is organised as follows: in Section 2, we describe previous approaches to the problem of error detection; in Section 3, we describe the error corpus used in our evaluation experiments; and in Section 4, we present, evaluate, combine and compare the two approaches to error detection. Section 5 provides a summary and suggestions for future work.


jwagner@computing.dcu.ie