Welcome back, apteryx!
This TinyURL resolves to the actual URL
Open Source computer code is software that has been freely made
available to any developer who wishes to experiment with, tweak,
and/or enhance that software.
BerliOS is an Open Source Mediator for computer program developers:
"The goal of BerliOS is to provide support for different interest
groups in the area of Open Source Software (OSS). Our aim is to fulfil
a neutral mediator function. The target groups of BerliOS are on one
hand the developers and users of Open Source Software and on the other
hand commercial manufacturers of OSS operating systems and
applications as well as support companies."
CVS is a software file Version Control System that helps prevent a
developer from uploading a file on top of the file uploaded by another
developer working on the same thing simultaneously, thus destroying
the other's changes. It also allows you to compare two different
versions of a file to see only the differences between them, which can
be very helpful in debugging software.
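To give a feel for the "see only the differences" idea, here is a minimal sketch. It does not use CVS itself; Python's standard difflib module stands in for the comparison step, and the two file versions and the changed_lines helper are made up for illustration.

```python
import difflib

# Two hypothetical versions of the same file, as lists of lines.
old_version = [
    "def greet(name):",
    "    print('Hello ' + name)",
    "",
    "greet('world')",
]
new_version = [
    "def greet(name):",
    "    print('Hello, ' + name + '!')",
    "",
    "greet('world')",
]


def changed_lines(old, new):
    """Return only the removed ('-') and added ('+') lines between two versions."""
    diff = difflib.unified_diff(old, new, fromfile="old", tofile="new", lineterm="")
    return [
        line
        for line in diff
        # Keep real changes, skip the '---'/'+++' file headers and context lines.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in changed_lines(old_version, new_version):
    print(line)
```

Only the one line that differs is reported, once as removed and once as added; the unchanged lines are left out, which is what makes this kind of comparison useful when hunting for the change that introduced a bug.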
This file is one of several contained in the "Templates" folder for
GHS, which stands for "Gento-Hilfe-System", which is German for "Gento
Help System".
The files in this folder are used to "parse", or process, Search terms
in the Gento-Hilfe-System. This particular file, "Affix",
is used to determine common prefixes and suffixes for English Words,
such as re-, in-, un-, -ed, -ly, -est, etc.
The file about which you are asking is a file of English words
appearing in the GHS. When used in conjunction with the Affix file
listed above, different forms of these words which appear in the Help
system can be created and matched to search terms entered by a user,
thus enabling the Help system to find more results by using the root
words of the search terms the user enters.
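As a rough sketch of how a word file and an Affix file could work together: the rule format, the SUFFIX_RULES table, and the expand and build_index helpers below are my own assumptions for illustration, not the actual format or code used by GHS.

```python
# Hypothetical affix rules:
# (ending the root word must have, characters to strip, suffix to add)
SUFFIX_RULES = [
    ("e", "e", "ing"),  # tergiversate -> tergiversating
    ("e", "", "d"),     # tergiversate -> tergiversated
    ("", "", "s"),      # tergiversator -> tergiversators
]

# A tiny stand-in for the word file.
WORDS = ["tergiversate", "tergiversator", "stipitate"]


def expand(word, rules=SUFFIX_RULES):
    """Return the root word plus every form the rules can create from it."""
    forms = {word}
    for ending, strip, suffix in rules:
        if word.endswith(ending):
            stem = word[: len(word) - len(strip)]
            forms.add(stem + suffix)
    return forms


def build_index(words):
    """Map every generated form back to the root word(s) it came from."""
    index = {}
    for root in words:
        for form in expand(root):
            index.setdefault(form, set()).add(root)
    return index


index = build_index(WORDS)
print(sorted(expand("tergiversate")))
print(index["tergiversating"])
```

With an index like this, a search term only has to be looked up once: any form that the rules can produce points straight back to its root word.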
I used various forms of the URL you provided to get a feel for what
the site contained:
http://www.berlios.de (click on "English", then on "About Us")
http://cvs.berlios.de/cgi-bin/viewcvs.cgi (click on "CVS Help")
Open Source computer code
Google Language Translation
Translate text: Gento Hilfe System from German to English
Before Rating my Answer, if you have any questions about the
information which I have presented, please post a Request for Answer Clarification.
I hope that this Answer provides exactly the information which you were seeking!
Request for Answer Clarification by
03 Jan 2004 16:06 PST
So far, so good, aceresearcher. I haven't been away, though--I've
posted more questions in the past week than I usually do in a month!
Guess I must be running a high level of QSH, the curiosity hormone.
Before posting my question, I did poke around the BerliOS site and
look at the description and mission statement and so on, but I wasn't
able to piece it all together well enough to figure out what this list
is for. (I made the tiny URL from the expanded one.) I was also
looking for any evidence that the site might be a cover for or aid to
some kind of marketing operation, maybe serving spammers, but I am not
experienced enough to know what signs to look for.
I'm still not sure I really understand your explanation. I know a
little bit about search algorithms and information retrieval, and I
understand parsing and affixes and other concepts having to do with
language. What I would still like to know is *how* the list is
used--meaning what kinds of operations would be performed on or with
it--and how the result could aid in a search. Can you give an example
that doesn't involve too many leaps or guesses? Maybe you could
illustrate with some terms from the list: how about 'Agamemnon',
'tergiversator', 'stipitate', and 'collywobbles'? (Bonus points if
you can use them all in one sentence.)
One other thing, and a guess would be just fine here: can you imagine
someone's being able to use this or a similar list to generate spam
messages, and if so, how would that work? No need to research
this--just your top-of-the-head reaction is all I'm after.
Clarification of Answer by
03 Jan 2004 22:08 PST
I'm sorry if you felt that my explanation was not very clear. Let me
try it this way:
I'm searching a dictionary (or glossary, or help system, or website,
etc) for the word "tergiversating". The search software checks the
built-in list of available keywords in the database against the search
term(s) I entered. Well, that file contains "tergiversator" and
"tergiversate", which are close, but not quite a cigar. **However**,
because the programmer was so clever, instead of just telling me that
it can't find the word in its files and that sorry, I'm just out of
luck, the software goes on to check, using the Affix file, if
"tergiversating" might be a variant of the word "tergiversate".
Since "-ing" is listed in the Affix file as a legitimate variant of
words ending in "-e", lo and behold, the software, instead of
disappointing me and traumatizing me forever, can return a list of
search results to me which include the word "tergiversate", a word
that *is* contained in the help database.
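The lookup just described can be sketched roughly as follows. This is only an illustration under my own assumptions: the KEYWORDS set, the STRIP_RULES table, and the find_keyword function are hypothetical and are not taken from the GHS code.

```python
# A tiny stand-in for the keyword list built into the help database.
KEYWORDS = {"tergiversate", "tergiversator", "collywobbles"}

# Hypothetical rules: (suffix found on the search term, ending to restore on the root)
STRIP_RULES = [
    ("ing", "e"),  # tergiversating -> tergiversate
    ("ing", ""),   # e.g. a root that did not end in "e"
    ("ed", "e"),   # tergiversated -> tergiversate
    ("s", ""),     # tergiversators -> tergiversator
]


def find_keyword(term, keywords=KEYWORDS, rules=STRIP_RULES):
    """Return the keyword matching the term, directly or via an affix rule."""
    term = term.lower()
    if term in keywords:
        return term  # exact match, no affix work needed
    for suffix, restore in rules:
        if term.endswith(suffix):
            candidate = term[: len(term) - len(suffix)] + restore
            if candidate in keywords:
                return candidate
    return None  # nothing matched, even after trying the rules


print(find_keyword("tergiversating"))
print(find_keyword("Agamemnon"))
```

Here "tergiversating" is not in the keyword list, but stripping "-ing" and restoring the "-e" gives "tergiversate", which is. A term that no rule can tie back to a keyword still comes back empty.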
Your hunch that this kind of "dictionary" list can be used to generate
random "Subject" lines and "From" addresses for spam mailings is a
correct one. With software that randomly selects words from a
dictionary file, then randomly creates variants of those words using
an Affix file to append prefixes or suffixes to those words, a pool of
potential words *much* larger than the original dictionary can be generated.
If you are interested in a detailed, step-by-step technical
explanation of how such a program would work -- or in obtaining
program code that actually performs this function -- there are a
number of Researchers who are programming gurus and who might be able
to do this for you; however, you would likely need to post that
Question with a considerably higher fee attached to justify the work
that it would involve.
As for your last additional Question:
Although he was an accomplished tergiversator, Agamemnon got a severe
case of the collywobbles every time he told the outright lie that
palms were trees, rather than stipitate cycads.
Clarification of Answer by
01 Jan 2007 05:41 PST
The process of using root words, prefixes, and suffixes in computer
algorithms is known as stemming or conflation. You can read more about
it in Wikipedia: