Hello there 'bbrendon'!
I had *exactly* the same problem as you a little while ago, when I
needed to find lists of words based on quite complex criteria for the
work I do in psycholinguistics. The solution I would recommend is the
Medical Research Council's psycholinguistics dictionary, version 2,
which contains 150837 English words along with up to 26
psycholinguistic measures for each. You can find a free web interface
to this at a number of locations, including:
http://www.psy.uwa.edu.au/mrcdatabase/uwa_mrc.htm
A formal citation for the database is:
Wilson, M.D. (1988) The MRC Psycholinguistic Database: Machine
Readable Dictionary, Version 2. Behavioural Research Methods,
Instruments and Computers, 20(1), 6-11.
Using this interface, I filled in the following fields to answer the
example that you give:
Display: Word
Number of syllables: MIN=1 MAX=1
Common part of speech: incl N
Simple letter match: *ey
... and I received the following list of one syllable nouns ending in
-ey:
DEY
KEY
LEY
PREY
TREY
WEY
WHEY
One area where the MRC database is exceedingly cunning is that I can
also restrict the output on the basis of how the word *sounds*. So, for
example, if I wanted a list of one syllable nouns ending in the sound
'ee' instead, I could enter the same criteria as above but, instead of
using a simple letter match, specify a precise phonetic
transcription of: *i . This gives the following output:
BEE
BREE
D
DE
E
FEE
FLEA
G
GEE
GHEE
GLEE
KEY
KNEE
LE
LEA
LEE
LI
ME
MI
P
PEA
PLEA
QUAY
RE
SCREE
SE
SEA
SEE
SI
SKI
SPREE
T
TE
TEA
TEE
THREE
TREE
V
... which seems appropriate.
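If you would rather script this kind of query than use the web form, the
two searches above can be sketched roughly as follows. This is only a
minimal illustration in Python, not how the MRC interface itself works:
the function name search, the record layout and the sample
transcriptions are all my own assumptions, and the handful of records is
made up for the example rather than copied from the database.

```python
from fnmatch import fnmatchcase

# Hypothetical sample records: (word, syllables, part of speech, transcription).
# The transcriptions are illustrative only, not taken from the MRC database.
RECORDS = [
    ("KEY", 1, "N", "ki"),
    ("PREY", 1, "N", "preI"),
    ("WHEY", 1, "N", "weI"),
    ("TREE", 1, "N", "tri"),
    ("FLEE", 1, "V", "fli"),
    ("MONEY", 2, "N", "mVnI"),
]


def search(records, syllables, pos, letters="*", phonetics="*"):
    """Return words matching the syllable count, part of speech and
    both wildcard patterns ('*' matches any run of characters)."""
    return [
        word
        for word, nsyl, wpos, phon in records
        if nsyl == syllables
        and wpos == pos
        and fnmatchcase(word, letters)      # case-sensitive spelling match
        and fnmatchcase(phon, phonetics)    # case-sensitive sound match
    ]


# One syllable nouns spelled with a final -ey
print(search(RECORDS, 1, "N", letters="*EY"))
# One syllable nouns ending in the sound 'ee'
print(search(RECORDS, 1, "N", phonetics="*i"))
```

The matching is deliberately case-sensitive, because in this made-up
transcription scheme 'i' and 'I' stand for different sounds.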
The dictionary web pages give quite good instructions next to each
possible criterion, as you'll see.
You can also download the dictionary for use on your own computer -
this is particularly useful if you use a UNIX system. The link for
this is at:
http://ota.ahds.ac.uk/texts/1054.html
... though note that the dictionary file itself is about 12MB in
size!
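Once you have a local copy, a short script can run the same sort of
filter without loading the whole file into memory. The sketch below
assumes you have first converted the dictionary into a tab-separated
file with a header row and columns called word, nsyl and pos. Those
column names and that layout are my own assumption for illustration, so
please check the documentation that comes with the download for the real
file format before relying on it.

```python
import csv
import io


def one_syllable_nouns_ending(lines, suffix):
    """Yield one syllable nouns ending in `suffix` from a tab-separated
    stream with (assumed) 'word', 'nsyl' and 'pos' columns."""
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:  # rows are read one at a time
        if (
            row["nsyl"] == "1"
            and "N" in row["pos"]
            and row["word"].endswith(suffix)
        ):
            yield row["word"]


# Stand-in for a real file; in practice you would pass an open file
# object for your own (hypothetical) converted export instead.
sample = io.StringIO(
    "word\tnsyl\tpos\n"
    "KEY\t1\tN\n"
    "MONEY\t2\tN\n"
    "OBEY\t2\tV\n"
    "PREY\t1\tN\n"
)
print(list(one_syllable_nouns_ending(sample, "EY")))
```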
I hope you find this answer of some use - I've come to rely on the MRC
psycholinguistics database rather a lot, as it's a very useful
dataset.
Good luck with your work!
stuartwoozle.
(Note that no search strategy was used for this answer, as I had the
relevant websites bookmarked in my browser.)