Hi,
I would like to create a service like Systran or Babelfish - a
multilingual translator. I have read some information on how to go
about this. Some of the issues involved are at:
http://en.wikipedia.org/wiki/Machine_translation. I have compiled
lexical data but now I need to add the "morphologic, syntactic, and
semantic information, and large sets of rules."
My question is: assuming that a lot of work has already been done in
this field, where can I obtain this data (i.e., the "morphologic,
syntactic, and semantic information, and large sets of rules")?
Essentially, what I need is:
a series of algorithms, one for each language translation direction,
which specify how to translate for a given language pairing. (In
essence, this would be a grammatical mapping function for each
language pair.)
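To make the "one algorithm per direction" idea concrete, here is a minimal sketch of how such per-pair mapping functions could be organized as a registry keyed by (source, target). Everything in it - the pair keys, the decorator, the toy lexicon - is an illustrative assumption, not an existing library:

```python
# Sketch: a registry of per-direction translation functions.
# Each (source, target) pair maps to a callable that transforms a
# tokenized source sentence into target-language tokens.
# The toy lexicon and the "en"/"fr" keys are hypothetical examples.

TRANSLATORS = {}

def register(src, tgt):
    """Decorator registering a translation function for one direction."""
    def wrap(fn):
        TRANSLATORS[(src, tgt)] = fn
        return fn
    return wrap

@register("en", "fr")
def en_to_fr(tokens):
    # Placeholder: a real implementation would apply the grammatical
    # mapping rules (morphology, syntax, semantics) discussed above.
    lexicon = {"the": "le", "cat": "chat"}  # toy bilingual lexicon
    return [lexicon.get(t, t) for t in tokens]

def translate(tokens, src, tgt):
    """Dispatch to the registered direction, if one exists."""
    try:
        return TRANSLATORS[(src, tgt)](tokens)
    except KeyError:
        raise ValueError(f"no translator registered for {src}->{tgt}")
```

Adding a new direction (say English to German) would then just mean registering another function, without touching the dispatch logic.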
To take English-French as an example, I have adapted some of the work
done by Sleator et al. on the link parser to parse (English)
sentences into individual linkages. What I now need is an
(English-French) translation mapping algorithm to apply to the
linkages in order to translate the sentence into French. Is such an
algorithm available? If so, please provide details. If not, please
suggest how I might go about developing a suitable algorithm myself.
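One classic form such a mapping takes is a structural transfer step: rewrite the parsed source structure with hand-written rules, then substitute words from a bilingual lexicon. The sketch below models a linkage as simple (word, part-of-speech) pairs rather than real Link Grammar output, and uses one well-known English-to-French rule (prenominal adjectives become postnominal) with a toy lexicon - all of it assumed for illustration:

```python
# Sketch of a transfer step over a parsed sentence. The "linkage" is
# modelled as (word, part_of_speech) pairs, not actual Link Grammar
# output; the reordering rule and the lexicon are toy examples.

LEXICON = {"a": "une", "red": "rouge", "car": "voiture"}

def transfer_en_fr(tagged):
    """Apply one structural rule (ADJ NOUN -> NOUN ADJ), then
    substitute each word via the bilingual lexicon."""
    out = []
    i = 0
    while i < len(tagged):
        # English prenominal adjective becomes postnominal in French.
        if (i + 1 < len(tagged)
                and tagged[i][1] == "ADJ"
                and tagged[i + 1][1] == "NOUN"):
            out.append(tagged[i + 1])  # noun first
            out.append(tagged[i])      # adjective after
            i += 2
        else:
            out.append(tagged[i])
            i += 1
    return " ".join(LEXICON.get(word, word) for word, _ in out)

# transfer_en_fr([("a", "DET"), ("red", "ADJ"), ("car", "NOUN")])
# -> "une voiture rouge"
```

A full system would of course need many such rules (agreement, verb morphology, clause order) driven by the actual link types the parser emits, but the shape - structural rewrite plus lexical substitution - is the same.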
I would like to gather this data and to apply it to my lexical
wordlists. Naturally, I would prefer not to pay for this data but
would be willing to do so if no open (free) sources are available.
I would like to implement this system for the major European
languages, starting with English <> French or English <> German. I
speak French and German and I have extensive computing expertise.
Best regards,
Greener.