When searching the lexicon, one provides a list of conditions (patterns),
each of which must match individually for a lexical unit to be included in
the result (they are and-ed together).
Each pattern is shown in the search form as a single "chip" (e.g.
id=blu-n-) and consists of two parts: the match key (id in the example) and
a regular expression (blu-n- in the example). The match key determines the
attribute of the lexical unit the regular expression will be matched against.
The system also allows negative conditions (i.e. listing only units which do
not match a pattern). These are entered into the search form in the same way,
except that one uses != instead of =, and the condition is shown with a red
background (e.g. id!=blu-n-).
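As an illustration of the two parts of a condition, the following minimal
sketch splits the text of a chip into match key, negation flag and regular
expression. The names Condition and parse_condition are hypothetical and
only serve this example; the sketch assumes the text is split at the first
= sign and is not the system's actual parser.

```python
import re
from typing import NamedTuple


class Condition(NamedTuple):
    """Hypothetical container for one search condition."""
    key: str       # match key, e.g. "id" or "frame.functor"
    negated: bool  # True when the condition was entered with "!="
    pattern: str   # source of the regular expression


def parse_condition(text: str) -> Condition:
    """Split 'key=regex' or 'key!=regex' at the first '=' sign (assumed rule)."""
    key, sep, pattern = text.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in condition: {text!r}")
    negated = key.endswith("!")
    if negated:
        key = key[:-1]
    re.compile(pattern)  # raises re.error if the pattern is not valid
    return Condition(key.strip(), negated, pattern)


positive = parse_condition("id=blu-n-")
negative = parse_condition("id!=blu-n-")
```

Splitting at the first = sign means that the regular expression itself may
contain further = characters.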
The match key
Each match key determines a list of one or more strings, at least one of
which must match the regular expression for the condition to be considered
a match. The following types of match keys are available:
Keys for unstructured attributes
Each attribute of a lexical unit has a corresponding match key of the same name.
Most attributes are unstructured (at least from the point of view of the system)
and the string determined by the key is just the textual source of the attribute
as present in the input data.
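The evaluation of conditions can be sketched as follows. This is a minimal
illustration, not the system's implementation: a lexical unit is modelled as
a plain dictionary from match keys to the strings they determine, the note
attribute and all values are made up, and it is an assumption that a
negative condition holds when none of the strings match.

```python
import re


def condition_holds(strings, pattern, negated=False):
    """A condition holds if at least one string matches the pattern.

    For a negated condition the result is inverted (assumed semantics).
    """
    matched = any(re.search(pattern, s) for s in strings)
    return not matched if negated else matched


def unit_matches(unit, conditions):
    """All conditions must hold individually (they are and-ed together)."""
    return all(
        condition_holds(unit.get(key, []), pattern, negated)
        for key, negated, pattern in conditions
    )


# Made-up units: each match key maps to the list of strings it determines.
units = [
    {"id": ["blu-n-one"], "note": ["first made-up unit"]},
    {"id": ["blu-n-two"], "note": ["second made-up unit"]},
    {"id": ["blu-v-three"], "note": ["third made-up unit"]},
]

# Corresponds to the chips id=blu-n- and note!=second.
conditions = [("id", False, "blu-n-"), ("note", True, "second")]
result = [u["id"][0] for u in units if unit_matches(u, conditions)]
```

In this sketch a unit that lacks the attribute altogether satisfies a
negative condition, since it has no string that could match.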
Keys for structured attributes
Some attributes (e.g. frame or lvc) have further structure recognized by the
system. For these attributes the key with the same name determines strings
on a case-by-case basis. For the frame attribute, for example, the key
determines a list of all the slots. In general, appending .src to the key
name will match the attribute's textual source, if available. Additionally,
these attributes have subkeys matching particular parts of their value. These
keys typically consist of the attribute name followed by a dot (.) and the
name of the relevant part. For example, the frame attribute has a subkey
frame.functor, which corresponds to the list of functor names, and a subkey
frame.PAT.forms, which corresponds to the forms of the slot with the
functor PAT.
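To make the subkey naming concrete, the sketch below derives the strings for
frame.functor and for the per-functor forms keys from a frame. The
representation of a frame as a list of dictionaries, the function name
frame_match_strings and the sample forms are assumptions made for this
example only; the system's internal representation may differ.

```python
def frame_match_strings(slots):
    """Map frame subkeys to the strings they determine (illustrative only)."""
    # frame.functor: the list of functor names of all slots.
    keys = {"frame.functor": [slot["functor"] for slot in slots]}
    # frame.<FUNCTOR>.forms: the forms of the slot with that functor.
    for slot in slots:
        key = f"frame.{slot['functor']}.forms"
        keys.setdefault(key, []).extend(slot["forms"])
    return keys


# A made-up frame with two slots.
frame = [
    {"functor": "ACT", "forms": ["1"]},
    {"functor": "PAT", "forms": ["4", "that"]},
]
strings = frame_match_strings(frame)
```

With this data, a condition such as frame.PAT.forms=^4$ would be matched
against the strings 4 and that, and would hold because the first one matches.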
The list of suggestions shown when entering queries into the search form will
typically include a short description of each structured match key, indicating
which strings the key determines.
One can also click on the key icon shown with each lexical unit to bring up
a dialog showing the different match keys available and the corresponding
strings determined by them (the different strings are separated by a
bullet: •).
The regular expressions
The patterns which are matched against the strings generated by the match keys
are regular expressions as interpreted by the Python regular expression
library. In particular, this means that the Python extensions to regular
expressions and the Python re syntax are available. It is also important to
note that a string can match the expression anywhere, not just at the start
(i.e. the search function is used as opposed to the match function).
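The difference between the two functions can be seen in a short example.
The strings used here are made up; the calls are those of Python's standard
re module.

```python
import re

text = "xx-blu-n-01"  # made-up string, as a match key might determine it

# re.search finds the pattern anywhere in the string.
found_anywhere = re.search("blu-n-", text) is not None

# re.match only succeeds if the pattern matches at the start of the string.
found_at_start = re.match("blu-n-", text) is not None

# With search semantics, anchors restrict where the pattern may match.
anchored_start = re.search("^blu-n-", text) is not None
whole_string = re.search(r"^blu-n-\d+$", "blu-n-42") is not None
```

Here found_anywhere and whole_string are True, while found_at_start and
anchored_start are False. To require a match at the start or over the whole
string, one can add the anchors ^ and $ to the regular expression.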