We want to use a regular expression to determine whether a list
contains a specific item. The list format we are using is the same
format that is used in the TCL programming language: the list items
are separated by a space character, and if a list item contains a
space character, then that item is surrounded by curly braces.
For example:
"A B C" is a list containing the items "A", "B" and "C".
"A B {C D}" is a list containing the items "A", "B" and "C D" ("C D"
is one item, not two).
We are seeking a regular expression that will determine whether or not
a list contains a specific item. It should only match when the list
contains exactly the item in question and not a substring or
superstring of that item.
To help you test your regular expression, you can use it to search for
item "Bb" over the following lists.
These lists contain the item "Bb", so it should find a match for these:
List 1: Aa Bb {Aa Bb} Cc
List 2: Aa Bb Cc
List 3: {Aa Cc} Bb {Cc Aa}
List 4: Bb {Ac Bc}
These lists do NOT contain the item "Bb", so they should not match:
List 5: {Aa BbCc D}
List 6: {Aa Bb Cc}
List 7: {Aa Bb} Cc {Bb Aa}
List 8: A Bbc Bc
List 9: Bbc Aa CBb
You should also try your regular expression on these lists searching for "12".
Will Match:
List 10: 12 6
List 11: {3 4} 12 4
Will Not Match:
List 12: {1 12} 15
List 13: 312 {1 12 612} 123
Thanks for your help.
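One possible approach to the requirement above, shown as a minimal sketch in Python's `re` syntax rather than any particular database dialect: consume whole list items from the start of the string (either a braced item or a run of non-space, non-brace characters), then require the target item followed by a space or the end of the string. The function name `list_contains` is hypothetical, and the sketch assumes, as stated in the question, that braces never nest and that items containing a space are always written in braces.

```python
import re

def list_contains(tcl_list: str, item: str) -> bool:
    """Return True if tcl_list contains exactly `item` as one list element."""
    # One whole list element: a braced item, or a bare item without spaces/braces.
    token = r"(?:\{[^{}]*\}|[^ {}]+)"
    # Assumption from the question: items with a space are written in braces.
    target = "{" + item + "}" if " " in item else item
    # Skip any number of whole elements, then match the target as a whole element.
    pattern = rf"^(?:{token} )*{re.escape(target)}(?: |$)"
    return re.search(pattern, tcl_list) is not None

if __name__ == "__main__":
    for text in ["Aa Bb {Aa Bb} Cc", "{Aa Bb} Cc {Bb Aa}", "A Bbc Bc"]:
        print(repr(text), list_contains(text, "Bb"))
```

Because the pattern is anchored at the start and only ever steps over complete elements, a "Bb" inside braces or inside a longer word is never reached on an element boundary, so it cannot match.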
Request for Question Clarification by studboy-ga on 30 Mar 2004 17:08 PST
So I assume you can have the case:
A {Bb} C
and it should match, right?
Please let me know which programming language you are using (Perl, Java, .NET).
I believe it can be done in two steps:
1) Tokenize the elements
2) Match on boundaries
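The two steps above could be sketched roughly as follows in Python. This is only an illustration under the list format described in the question (one level of braces, items separated by single spaces); the helper names `tokenize` and `contains_item` are hypothetical.

```python
import re

def tokenize(tcl_list: str) -> list[str]:
    """Step 1: split the list into items, stripping braces from braced items."""
    items = []
    # Each match is a (braced_content, bare_item) pair; one of the two is empty.
    for braced, bare in re.findall(r"\{([^{}]*)\}|([^ {}]+)", tcl_list):
        items.append(braced or bare)
    return items

def contains_item(tcl_list: str, item: str) -> bool:
    """Step 2: compare whole items, so substrings and superstrings never match."""
    return item in tokenize(tcl_list)

if __name__ == "__main__":
    print(tokenize("A B {C D}"))
    print(contains_item("{Aa Bb} Cc {Bb Aa}", "Bb"))
```

Comparing whole tokens avoids having to encode the item boundaries in a single pattern, at the cost of needing a host language rather than one regular expression.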
Request for Question Clarification by efn-ga on 30 Mar 2004 19:18 PST
There is no universal standard notation for writing regular
expressions. What regular expression language should be used in the
answer to your question?
Clarification of Question by gunzler-ga on 31 Mar 2004 07:54 PST
rerdavies-ga brings up something I should have clarified. There are
NO sublists. When I wrote {A B} that is a single element ("A B") with
a space in it, not a sub-list.
studboy-ga, there will never be a list item {Bb}: because the item
contains no spaces, it will not be surrounded by curly braces. So
your list would be written as: A Bb C
efn-ga, I'll be doing this regular expression in PostgreSQL, if that helps.
maltezefalkon-ga, I'll try your solution a little later when I get some time.
Clarification of Question by gunzler-ga on 23 Apr 2004 07:40 PDT
dewmeht-ga, sorry it took so long to answer; I just got an email from Google.
No, a bracket will never be within a set of brackets. Or, in other
words, there is only one level of bracketing.