User Manual



Getting Started

To run the PCFG Parser, type:
java Parser

This will bring up the Graphical User Interface.

Graphical User Interface
The main screen of the interface has four buttons:

The file selection button allows you to select a particular file to parse.
The parse button will try to parse any text in the large text area in the center of the screen.
The clear button will clear any text that is in the text area.
The exit button will exit the program.

The About Menu



Clicking on an item in the About menu will give you some information about the project.


The Progress Monitor will give you information about which sentence the parser is parsing at any particular time, as well as report results found and any errors that occur.

Inputting Text for Parsing
There are two possible methods of inputting text for parsing.
  1. Manual Entry of Text

    You can type a tagged sentence into the large text area in the center of the screen. A sample tagged sentence might look like:
    The/DT man/NN died/VBD ./.
    
    The tags must be part of the Penn II Tagset or the parser will be unable to find a parse for the sentence. If more than one sentence is entered, the parser will only try to find a parse for one sentence. If you want to parse more than one sentence at a time, you should save all the sentences in a file, and then parse the file following the instructions below.

    If you make a mistake while typing your sentence and want to clear the text area, simply click on the clear button.
    Once you are satisfied with the tagged sentence you have typed in, click on the parse button and the program will parse the sentence.

  2. Using a Tagged File as Input

    Clicking on the file selection button will open a dialogue asking you to choose a file to parse. You will be presented with all the files on your local network and can simply select one. It is not possible to select more than one file at a time.

    If you are choosing a file that is not part of the Penn Treebank, make sure that it only contains tags from the Penn II Tagset. If sentence boundaries are unclear, you should mark them with a row of = signs. The following example clearly shows that the ./. is not the end of the sentence.
    [ stocks/NNS ]
    and/CC offset/VBP 
    [ the/DT trade/NN ]
    in/IN 
    [ futures/NNS ]
    to/TO lock/VB in/RP 
    [ a/DT price/NN difference/NN ]
    ./. 
    
     )/) 
    ========================================
    
    To be certain that all sentences are interpreted as you intended, you should separate all sentences with a row of = signs.

    Once you have selected your file, click on the parse button and the parser will try to parse all tagged sentences in that file.
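The input format described above can be illustrated with a short Java sketch. It is not the parser's own code: the class name TaggedInputSketch and its methods are hypothetical, and it assumes that sentences are separated only by rows of = signs, that [ and ] are chunk brackets to be skipped, and that the tag is whatever follows the last / in a token. It does not check tags against the Penn II Tagset.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits tagged text into sentences of word/TAG pairs. */
class TaggedInputSketch {

    /** One word with its tag, e.g. word "man" and tag "NN". */
    static final class TaggedWord {
        final String word;
        final String tag;
        TaggedWord(String word, String tag) { this.word = word; this.tag = tag; }
    }

    /** Assumption: a line made up only of '=' characters marks a sentence boundary. */
    static boolean isBoundary(String line) {
        String t = line.trim();
        return !t.isEmpty() && t.chars().allMatch(c -> c == '=');
    }

    /** Splits the text on boundary rows and reads every word/TAG token in between. */
    static List<List<TaggedWord>> splitSentences(String text) {
        List<List<TaggedWord>> sentences = new ArrayList<>();
        List<TaggedWord> current = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (isBoundary(line)) {
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
                continue;
            }
            for (String token : line.trim().split("\\s+")) {
                // Skip blank lines and the [ ] chunk brackets.
                if (token.isEmpty() || token.equals("[") || token.equals("]")) {
                    continue;
                }
                // The tag follows the last slash, so "./." has word "." and tag ".".
                int slash = token.lastIndexOf('/');
                if (slash <= 0 || slash == token.length() - 1) {
                    throw new IllegalArgumentException("Not a word/TAG token: " + token);
                }
                current.add(new TaggedWord(token.substring(0, slash),
                                           token.substring(slash + 1)));
            }
        }
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "[ the/DT trade/NN ]\nin/IN\n[ futures/NNS ]\n./.\n=====\n";
        for (List<TaggedWord> sentence : splitSentences(text)) {
            System.out.println(sentence.size() + " tagged words");
        }
    }
}
```

Running a check like this over a file before parsing may help catch tokens that are missing a tag or a separator row.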

Viewing the Results
The results are displayed in graphical tree form using the Treescape program downloaded from http://www.cis.upenn.edu/~josephr/Treefig/ .
Here is what a sample output might look like:



If you parsed a file, you can click on the previous and next buttons to navigate through all the successful parses found in the file. You can also enter the number of a particular sentence to view its parse tree.
Caution:
If you type in the number of a sentence, be aware that the number may not correspond to the required tree. If any sentence previous to the required sentence could not be parsed, it will not be represented in the sequence of parse trees. This means that the numbers will not correspond to the original sequence of sentences.
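The renumbering described in the caution can be worked out with a small Java sketch. The class TreeNumberingSketch is hypothetical and is not part of the parser. It assumes that both sentences and trees are numbered from 1, and that you already know which sentences failed, for example from the reasons written to the results file.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps between sentence numbers and displayed tree numbers. */
class TreeNumberingSketch {

    /**
     * parsed[i] is true when sentence i + 1 of the input was parsed.
     * Returns the original sentence number behind each displayed tree, in order.
     */
    static List<Integer> originalNumbers(boolean[] parsed) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < parsed.length; i++) {
            if (parsed[i]) {
                result.add(i + 1);
            }
        }
        return result;
    }

    /** Tree number under which a sentence appears, or -1 if it has no tree. */
    static int treeNumberFor(boolean[] parsed, int sentenceNumber) {
        if (sentenceNumber < 1 || sentenceNumber > parsed.length
                || !parsed[sentenceNumber - 1]) {
            return -1;
        }
        int count = 0;
        // Count the successful parses up to and including this sentence.
        for (int i = 0; i < sentenceNumber; i++) {
            if (parsed[i]) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        boolean[] parsed = {true, false, true, true}; // sentence 2 could not be parsed
        System.out.println(originalNumbers(parsed));
        System.out.println(treeNumberFor(parsed, 3));
    }
}
```

In this example the second sentence has no parse, so the third sentence of the input is displayed as the second tree.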
Saving the Results
The results of your parse will be saved in bracketed form in a file. The probability of each parse will be stored in a file called probs.txt. This file is overwritten after each file or sentence is parsed, so if you want to keep it, you should rename it. You will need this file if you want to view the results at a later stage without having to run the parser again.

If you typed in a sentence manually, the results will be stored in a file called output_parse.prd. Remember that this file is overwritten every time you click on the parse button, so if you want to keep it, you should rename it.

If you chose a file to parse, for example my_file.pos, the results will be stored in a file called new_my_file.prd.

You can open these results files in any text editor. If any sentence could not be parsed, a reason why will be written to this file instead of a bracketed parse.
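The naming and renaming steps in this section can be sketched in Java. The class ResultFilesSketch and its methods are hypothetical. The naming rule is generalized from the single my_file.pos example, so its behavior for other extensions or for names without an extension is an assumption. The rename uses the standard java.nio.file.Files.move call, and the prefix label is a made-up convention.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for locating and keeping the parser's output files. */
class ResultFilesSketch {

    /** Follows the manual's example: my_file.pos becomes new_my_file.prd. */
    static String resultFileName(String inputFileName) {
        int dot = inputFileName.lastIndexOf('.');
        // Assumption: a name without an extension is used whole.
        String base = dot > 0 ? inputFileName.substring(0, dot) : inputFileName;
        return "new_" + base + ".prd";
    }

    /** Renames a file in place, e.g. probs.txt to run1_probs.txt, so it is not overwritten. */
    static Path keep(Path file, String label) throws IOException {
        Path target = file.resolveSibling(label + "_" + file.getFileName());
        return Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(resultFileName("my_file.pos"));
        Path probs = Paths.get("probs.txt");
        if (Files.exists(probs)) {
            System.out.println("Kept as " + keep(probs, "run1"));
        }
    }
}
```

Renaming probs.txt and the matching .prd file with the same label keeps the pair together for viewing later.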
Viewing the Results at a Later Stage
If you have saved the file containing the output parse(s) and the file containing the corresponding probabilities (see above), you can view them at a later stage without having to go through the parsing stage again.

To do this, simply type the command:
java View parse_file.prd prob_file.txt

where parse_file.prd is the file in which you have saved the bracketed output and prob_file.txt is the file in which you have saved the corresponding probabilities.

This will launch the Treescape program described earlier and you can view the results as parse trees.