Software Demos
This online tool allows the user to type in plain text, which is automatically tagged. The tagged string is sent to a probabilistic parser that uses a grammar annotated with F-Structure equations. The grammar has also been compacted using a simple thresholding method. The output of the parser is sent to a constraint solver, which outputs an LFG F-Structure.
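The description above does not say which thresholding criterion was used to compact the grammar. As a minimal sketch, assuming a count-based cutoff followed by relative-frequency re-estimation, compaction could look like the following. It is written in Python for brevity; the function name, the toy counts and the min_count parameter are all hypothetical.

```python
from collections import defaultdict


def threshold_grammar(rule_counts, min_count):
    """Drop rules seen fewer than min_count times, then re-estimate probabilities.

    rule_counts maps (lhs, rhs_tuple) to an observed frequency.
    Assumption: the threshold is applied to raw counts; the original tool
    may have used a different criterion.
    """
    kept = {rule: n for rule, n in rule_counts.items() if n >= min_count}
    lhs_totals = defaultdict(int)
    for (lhs, _rhs), n in kept.items():
        lhs_totals[lhs] += n
    # Relative frequency per left-hand side, so each category sums to 1 again.
    return {rule: n / lhs_totals[rule[0]] for rule, n in kept.items()}


# Toy counts, invented for illustration only.
counts = {
    ("NP", ("DT", "NN")): 6,
    ("NP", ("NN",)): 2,
    ("NP", ("DT", "JJ", "JJ", "NN")): 1,
    ("VP", ("VBD", "NP")): 4,
}
compact = threshold_grammar(counts, min_count=2)
```

Re-estimating after pruning keeps the rule probabilities for each category summing to one, which a probabilistic parser generally expects.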
This is an online version of my final year project, which also incorporates a stochastic part-of-speech tagger.
This is a treebank tool suite (TTS) for, and derived from, the Penn-II treebank resource (Marcus et al., 1993). The tools include treebank inspection and viewing options that support searching for CF-PSG rule tokens extracted from the treebank, graphical display of complete trees containing the rule instance, display of subtrees rooted by the rule instance, and display of the yield of the subtree (with or without context). The search can be further restricted by constraining the yield to contain particular strings. Rules can be ordered by frequency, and the user can set frequency thresholds. To process new text, the tool suite provides a PCFG chart parser (based on the CYK algorithm) operating on CFG grammars extracted from the treebank following the method of Charniak (1996), as well as an HMM bi-/trigram tagger trained on the tagged version of the treebank resource. The system is implemented in Java and Perl. We employ the InterArbora module based on the Thistle display engine (LTG, 2001) as our tree grapher.
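To illustrate rule extraction and frequency ordering, here is a minimal sketch in Python (the tool suite itself is written in Java and Perl, and its code is not shown here). It assumes trees are available as simple bracketed strings; the helper names parse_tree and rules and the two example trees are hypothetical, and details of real Penn-II annotation such as traces and function tags are ignored.

```python
from collections import Counter


def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word at the leaf
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read()


def rules(tree):
    """Yield (lhs, rhs) for each phrasal node; tag-to-word rules are skipped."""
    label, children = tree
    if all(isinstance(child, str) for child in children):
        return                        # preterminal: a POS tag over a word
    yield (label, tuple(child[0] for child in children))
    for child in children:
        yield from rules(child)


# Two invented trees standing in for treebank entries.
treebank = [
    "(S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))",
    "(S (NP (NN rain)) (VP (VBD fell)))",
]
counts = Counter(rule for text in treebank for rule in rules(parse_tree(text)))

# Order rules by frequency and apply a user-chosen threshold.
min_count = 2
frequent = [(rule, n) for rule, n in counts.most_common() if n >= min_count]
```

Dividing each count by the total for its left-hand side would turn these counts into relative-frequency rule probabilities for a PCFG.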
This project involved extracting a probabilistic context-free grammar from the Penn-II treebank. Using this grammar, I then developed a chart parser based on the CYK algorithm.
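The CYK idea can be sketched in a few lines of Python. This is not the project's parser: it assumes a grammar already in Chomsky normal form (a treebank grammar would first need binarising), it returns only the probability of the best parse rather than the tree, and the function name and toy grammar are hypothetical.

```python
def cyk_best_probability(words, lexical, binary, start="S"):
    """Return the probability of the most likely parse, or 0.0 if there is none.

    lexical maps word -> {tag: probability}.
    binary maps (lhs, left, right) -> probability.
    """
    n = len(words)
    # best[i][j] maps a category to its best probability over words[i:j].
    best = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag, p in lexical.get(word, {}).items():
            best[i][i + 1][tag] = p
    for width in range(2, n + 1):            # span length
        for i in range(n - width + 1):       # span start
            j = i + width
            for k in range(i + 1, j):        # split point
                for (lhs, left, right), p in binary.items():
                    if left in best[i][k] and right in best[k][j]:
                        score = p * best[i][k][left] * best[k][j][right]
                        if score > best[i][j].get(lhs, 0.0):
                            best[i][j][lhs] = score
    return best[0][n].get(start, 0.0)


# Toy grammar, invented for illustration only.
lexical = {
    "the": {"DT": 1.0},
    "dog": {"NN": 0.5},
    "cat": {"NN": 0.5},
    "chased": {"VBD": 1.0},
}
binary = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "DT", "NN"): 1.0,
    ("VP", "VBD", "NP"): 1.0,
}
probability = cyk_best_probability("the dog chased the cat".split(), lexical, binary)
no_parse = cyk_best_probability("dog the chased".split(), lexical, binary)
```

Storing a back-pointer (split point and child categories) next to each score would allow the best tree to be recovered as well.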
As part of a language module in my final year, we developed a webpage describing the Applied Computational Linguistics course at DCU. It is written mostly in French and German.