From: Grady Ward
Subject: language databases
Date: 
Message-ID: <27452@well.sf.ca.us>
Computational linguists:

To answer the great number of questions we've received about
the Moby lexical database suite, here are the basics about
these publicly-available English language databases:

Moby Words -- 560,000 English language entries
(applications: spelling checkers, password screening,
word recreations, compression modeling)

Moby Hyphenator -- 155,000 fully hyphenated entries
(applications: correctly formatted textual output,
music lyric syllabification)

Moby Part-of-Speech -- 214,000 entries, each marked with up to
seventeen English parts of speech, in order of use.
(applications: increase accuracy of input parsing, principle- or
rule-based grammars, automatic generation of
syntactically-correct English)

Moby Pronunciator -- 167,000 entries with full International
Phonetic Alphabet encoding, including syllabification and
primary, secondary, and tertiary stress marks.
(applications: text-to-speech drivers for multimedia,
speech recognition models, rhyming dictionaries)

Moby Thesaurus -- 1.2 million synonyms and related ideas
(applications: concept-driven database searches, free-form
English language input parsing [such as that required for
Loebner Prize contestants], on-line thesauruses,
universal parsing machines, generative semantics)
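
As a rough illustration of the password-screening application
listed for Moby Words, here is a minimal sketch.  It assumes a
plain ASCII word list with one entry per line; that layout, the
function names, and the three sample words are assumptions made
for the example, not a description of the actual file format.

```python
import io


def load_wordlist(lines):
    """Build a lowercase set from an iterable of text lines.

    Assumes one entry per line (an assumed layout, not a documented one).
    Blank lines are skipped.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def is_weak_password(candidate, words):
    """Return True if the candidate is a listed word, ignoring case,
    or a listed word followed only by digits (e.g. "banana42")."""
    lowered = candidate.lower()
    core = lowered.rstrip("0123456789")
    return lowered in words or core in words


# Tiny in-memory stand-in for a real word file (hypothetical contents).
# A real list would be read with open(path, encoding="ascii") instead.
sample = io.StringIO("apple\nbanana\nmonterey\n")
words = load_wordlist(sample)
```

With the sample list above, is_weak_password("Apple", words) and
is_weak_password("banana42", words) both report a weak password,
while a string that is not in the list passes the screen.  Holding
the entries in a set keeps each lookup fast even for a list of
several hundred thousand words.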

Taken together, these databases provide a cluster of
projections into the English language and are intended
to free the scientist and researcher from the tedium of
attempting to collect similar sets of data.  We hope that
publishing this material coevally will stimulate a number
of interrelational studies such as the investigations
of Professor Robert C. Berwick at MIT.

All databases are supplied in pure ASCII, royalty-free, in
both Macintosh and MS-DOS disk formats (and also as compressed
.Z files).  Both commercial (to resell derived structures as
part of commercial applications) and educational/research
licenses are available.  During October, all licensees receive
the complete works of William Shakespeare in plain ASCII,
free.  (These works of Shakespeare are in the public domain
and may be freely redistributed to your colleagues or students.)

For a free brochure with sample entries and details on
getting your own copy of this material, write or telephone
with your postal address:

Illumind Unabridged
Grady Ward
571 Belden St., Ste. A.
Monterey, CA  93940
USA
(408) 373-1491
Applelink: D2783