list-rewrites - reads Penn treebanks and prints every rewrite found
list-rewrites [options] [file ...]
Options:
--help brief help message
--man full documentation
--verbose report more detail on STDERR
--directinput allow a TTY on STDIN
--format FORMAT use a different output format
--terminal include terminal expansions (default)
--noterminal exclude terminal expansions
$ echo "(S (NP (DET the) (NN dog)) (VP ran))" | ./list-rewrites
S => NP VP
NP => DET NN
DET => the
NN => dog
VP => ran
- --help
- -?
- Show this help message.
- --man
- Show the manual page for this script.
- --directinput
- By default, if STDIN is an interactive TTY, this script prints a usage
message and exits, so that running "list-rewrites" on its own shows the
usage message. To type trees by hand on STDIN, add the
--directinput flag.
- --verbose
- Repeatable option. Report more of what we're doing.
- --format FORMAT
- Provide an alternative output format. The default is
"%s => %s\n",
which produces output like the sample output above.
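As a rough illustration of how a printf-style FORMAT shapes the output, the Python sketch below applies the default format and a tab-separated alternative to a few rewrites. It assumes that FORMAT receives exactly two string arguments, the left-hand side followed by the right-hand side, which is inferred from the default and not confirmed by this page; the "render" helper and the tab-separated format are hypothetical.

```python
# A few (lhs, rhs) pairs taken from the sample output.
pairs = [("S", "NP VP"), ("NP", "DET NN"), ("DET", "the")]


def render(pairs, fmt="%s => %s\n"):
    """Apply a printf-style format to each (lhs, rhs) pair.

    Assumption: the format takes two %s arguments, lhs first.
    """
    return "".join(fmt % (lhs, rhs) for lhs, rhs in pairs)


default_output = render(pairs)                 # default format
tsv_output = render(pairs, fmt="%s\t%s\n")     # hypothetical alternative

print(default_output, end="")
print(tsv_output, end="")
```

A tab-separated format like the second one may be easier to feed into tools that split on a single delimiter.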
This program lists every rewrite found in the trees it reads, whether from
the named files or from STDIN.
The trees must be in Penn treebank format.
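To make the idea of a rewrite concrete, here is a minimal Python sketch (not the script's own code) that extracts rewrites from a bracketed tree in pre-order, matching the order of the sample output. It assumes every node carries a label, and it assumes that excluding terminal expansions means skipping rewrites whose right-hand side consists only of words; the function names "tokenize", "parse", and "rewrites" are hypothetical.

```python
import re


def tokenize(text):
    # Split into "(", ")", and runs of non-space, non-paren characters.
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse(tokens, pos=0):
    """Parse one labeled node starting at tokens[pos] == "(".

    Returns ((label, children), next_pos); each child is either a
    nested (label, children) node or a plain word string.
    """
    assert tokens[pos] == "("
    label = tokens[pos + 1]  # assumption: every node has a label
    pos += 2
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
        else:
            child, pos = tokens[pos], pos + 1
        children.append(child)
    return (label, children), pos + 1


def rewrites(node, terminal=True):
    """Yield (lhs, rhs) pairs in pre-order.

    With terminal=False, rewrites whose children are all words are
    skipped (an assumed reading of --noterminal).
    """
    label, children = node
    is_terminal = all(isinstance(c, str) for c in children)
    if terminal or not is_terminal:
        rhs = " ".join(c if isinstance(c, str) else c[0] for c in children)
        yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rewrites(child, terminal)


tree, _ = parse(tokenize("(S (NP (DET the) (NN dog)) (VP ran))"))
for lhs, rhs in rewrites(tree):
    print("%s => %s" % (lhs, rhs))
```

Treebank files that wrap each sentence in an extra unlabeled pair of parentheses would need additional handling that this sketch omits.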
The rewrites are not necessarily unique; if you want unique rewrites,
pipe the output of this program into (e.g.)
"sort | uniq". This is deliberate: it lets
you get counts (e.g. with "sort | uniq -c") from the output of this program
as well as a survey of the rewrites in a corpus.
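Because duplicates are preserved, counting is a simple post-processing step. The Python sketch below shows one way to tally rewrite lines, roughly what "sort | uniq -c" would report; the input lines are made-up sample data, not output from a real corpus.

```python
from collections import Counter

# Made-up output lines, as they might arrive on a pipe from list-rewrites.
lines = [
    "NP => DET NN",
    "DET => the",
    "NP => DET NN",
    "NN => dog",
    "DET => the",
    "NP => DET NN",
]

counts = Counter(lines)

# Most frequent first; ties broken alphabetically for stable output.
for rewrite, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{n:4d} {rewrite}")
```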
Jeremy G. Kahn <jgk@ssli.ee.washington.edu>