2. Microconcord
This, as the name suggests, is a concordancing
programme. It searches specified files for a given keyword
without indexing them. It is surprisingly fast (it munched
through over 2 MBytes, or 245,000 words, within 10 seconds). It
uses DOS wildcards and lets you define one context word. It
shows the keyword in a context of about 8 words, and from there
you can get directly to the whole text. You can sort the output
by any one of the six context words on either side, and then
print it or save it to disk. You can also display the most
frequent collocations. It counts the words searched and gives
the relative frequency of the keyword.
It is also possible to categorise the output in up to 9
categories and use them in further sorting. Its output is
limited to about 1600 entries depending on the available
memory. It seems to be the only one that allows for verbatim
searches (good for differentiating majuscules and minuscules).
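
For anyone who has not used a concordancer, the keyword-in-context
idea described above can be sketched in a few lines of Python. This
is only an illustration of the technique, not how Microconcord
itself works; the function names (tokenize, kwic, sort_by_right) and
the window of 4 words on each side are my own assumptions.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    # Split into word tokens, keeping the original case so that
    # verbatim (case-sensitive) searches remain possible.
    return re.findall(r"[A-Za-z']+", text)


def kwic(tokens, pattern, width=4, case_sensitive=False):
    """Return (left, keyword, right) for each token matching a
    DOS-style wildcard pattern (* and ?). Hypothetical helper."""
    hits = []
    for i, tok in enumerate(tokens):
        if case_sensitive:
            matched = fnmatchcase(tok, pattern)
        else:
            matched = fnmatchcase(tok.lower(), pattern.lower())
        if matched:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((left, tok, right))
    return hits


def sort_by_right(hits, n=1):
    # Sort the hits by the n-th context word to the right of the keyword.
    def key(hit):
        right = hit[2]
        return right[n - 1].lower() if len(right) >= n else ""
    return sorted(hits, key=key)


if __name__ == "__main__":
    text = "The cat sat. The Cats ran away. A dog saw the cat nap."
    tokens = tokenize(text)
    hits = kwic(tokens, "cat*")
    for left, word, right in sort_by_right(hits):
        print(" ".join(left).rjust(25), f"[{word}]", " ".join(right))
    # Relative frequency of the keyword among all words searched.
    print(len(hits) / len(tokens))
```

Passing case_sensitive=True gives the verbatim behaviour, so that
"Cat*" finds "Cats" but not "cat".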
3. Tact
It would be no use trying to hide that this one is my
favourite. Tact is also a group of subprograms: one each for
indexing, viewing, merging databases, and collocating. Let's
have a look at the indexer first. It creates an index file
about twice as large as the original text, but you no longer
need the original after that. It allows you to specify which
characters are to be used in searches and which are to be used
for marking text for further reference. There are four
different types of reference items
(one of them with numeric values). These references can be used
in the viewer, not unlike in Wordcruncher, but on a far more
sophisticated level. Indexing takes about as long as with
Wordcruncher. You cannot index multiple files at once, but you
can merge the Tact databases later with a special programme.
The second programme from the Tact bundle is the viewer. It
lets you view the result of your searches in several windows
which you can display all at once, resize and synchronize.
There are five types of display within Tact, each of them
displaying interesting information about your text. They are
Index (word in a line), KWIC (Key Word In Context), Text (with
the keywords highlighted), Distribution (with respect to
several criteria in both graphical and numeric formats), and
Collocate (compiles a list of collocations sorted according to
their Z-scores, making available intermediate results such as
Collocate Freq., Type Frq., Type Prob., Expected Observ.,
Standard Deviation). It is possible to change settings for all
of the displays.
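
To show roughly what lies behind the columns of the Collocate
display, here is a small Python sketch that ranks collocates by
Z-score. I do not know the exact formula Tact uses, so this assumes
one common binomial-style formulation: the collocate's probability
is its frequency divided by the number of non-node words, the
expected count is that probability times the number of window
slots, and the standard deviation is sqrt(expected * (1 - prob)).
The function name and the tuple layout are hypothetical.

```python
import math
from collections import Counter


def collocate_z_scores(tokens, node, span=4):
    """Rank the collocates of `node` found within +/- `span` words.

    Returns tuples of (word, collocate freq, type freq, type prob,
    expected, standard deviation, z), highest z first. The formula
    is an assumed formulation, not taken from Tact.
    """
    total = len(tokens)
    type_freq = Counter(tokens)
    node_freq = type_freq[node]

    # Count how often each word type falls inside a window around the node.
    colloc_freq = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            colloc_freq.update(w for w in window if w != node)

    rows = []
    for word, observed in colloc_freq.items():
        prob = type_freq[word] / (total - node_freq)
        expected = prob * node_freq * 2 * span
        sd = math.sqrt(expected * (1 - prob))
        z = (observed - expected) / sd if sd > 0 else 0.0
        rows.append((word, observed, type_freq[word], prob, expected, sd, z))
    return sorted(rows, key=lambda row: row[-1], reverse=True)


if __name__ == "__main__":
    sample = ("strong tea and strong coffee but weak tea "
              "and cold milk with strong tea").split()
    for row in collocate_z_scores(sample, "tea", span=1):
        print(row)
```

On a toy text like this the scores mean very little; the point is
only to show how the intermediate figures feed into the final
ranking.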
What is most interesting about Tact, however, are its
searching capabilities. It lets you specify the desired output
with next to no limitations. The criteria you can set range
from simple regular expressions to searches based on percentage
similarity, frequency, or reference. The result of your search
can be a word or a phrase (I can send a more detailed
description of the specification language upon request). You
can then define categories based on your search
for further use within Tact.
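
As a rough idea of what searching by similarity or by frequency
means in practice, here is a Python sketch. It does not use Tact's
specification language, and Tact's own similarity measure is not
described here, so the sketch substitutes the ratio from Python's
standard difflib module as a stand-in. The function names and the
80 per cent threshold are assumptions for illustration only.

```python
from collections import Counter
from difflib import SequenceMatcher


def similar_words(tokens, target, min_percent=80):
    """Return (word, percent) for each distinct word whose similarity
    to `target` is at least `min_percent`. The similarity measure is
    difflib's ratio, used here as a stand-in."""
    result = []
    for word in sorted(set(tokens)):
        ratio = SequenceMatcher(None, word.lower(), target.lower()).ratio()
        percent = 100 * ratio
        if percent >= min_percent:
            result.append((word, round(percent)))
    return result


def words_by_frequency(tokens, min_count=1, max_count=None):
    # Select the word types whose frequency lies within the given bounds.
    counts = Counter(tokens)
    return sorted(
        word for word, count in counts.items()
        if count >= min_count and (max_count is None or count <= max_count)
    )


if __name__ == "__main__":
    words = ["colour", "color", "colours", "collar", "dog", "dog"]
    print(similar_words(words, "colour"))
    print(words_by_frequency(words, min_count=2))
```

The resulting word lists could then serve as categories for further
searches, in the spirit of what Tact lets you do.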
The last programme is used for creating lists of collocations
in a specified database. This list can be used within Tact for
selection of lists.
On the whole, Tact is a very advanced tool for the analysis of
large structured textual corpora.
That's it folks, and again, any input welcome.
Your sincere 'n' faithful,
Dominik Lukes