Q: Regular expression for finding items in a list (No Answer, 5 Comments)
Question  
Subject: Regular expression for finding items in a list
Category: Computers > Programming
Asked by: gunzler-ga
List Price: $30.00
Posted: 30 Mar 2004 14:50 PST
Expires: 29 Apr 2004 15:50 PDT
Question ID: 322619
We want to use a regular expression to determine whether a list
contains a specific item.  The list format we are using is the same
format that is used in the TCL programming language:  the list items
are separated by a space character, and if a list item contains a
space character, then that item is surrounded by curly braces.

For example:

"A B C" is a list containing the items "A", "B" and "C".
"A B {C D}" is a list containing the items "A", "B" and "C D" ("C D"
is one item, not two).

We are seeking a regular expression that will determine whether or not
a list contains a specific item.  It should only match when the list
contains exactly the item in question and not a substring or
superstring of that item.

To help you test your regular expression, you can use it to search for
item "Bb" over the following lists.

These lists contain the item "Bb", so it should find a match for these:
List 1: Aa Bb {Aa Bb} Cc
List 2: Aa Bb Cc
List 3: {Aa Cc} Bb {Cc Aa}
List 4: Bb {Ac Bc}

These lists do NOT contain the item "Bb", so they should not match:
List 5: {Aa BbCc D}
List 6: {Aa Bb Cc}
List 7: {Aa Bb} Cc {Bb Aa}
List 8: A Bbc Bc
List 9: Bbc Aa CBb

You should also try your regular expression on these lists searching for "12".

Will Match:
List 10: 12 6
List 11: {3 4} 12 4

Will Not Match:
List 12: {1 12} 15
List 13: 312 {1 12 612} 123

Thanks for your help.

Request for Question Clarification by studboy-ga on 30 Mar 2004 17:08 PST
So I assume you can have the case:

A {Bb} C

and it should match, right?

Please let me know which programming language you are using (Perl, Java, .NET).

I believe it can be done in two steps:

1) Tokenize the elements
2) Match on boundaries
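
The two steps above can be sketched in Python as follows. This is only an illustration, not something posted in the thread: the names ITEM, split_items and list_contains are made up for the example, and it assumes balanced braces with a single level of bracing.

```python
import re

# One list item: either a braced item (one level of braces only) or a run
# of non-space characters. ITEM is a hypothetical name for this sketch.
ITEM = re.compile(r"\{([^{}]*)\}|(\S+)")

def split_items(text):
    """Step 1: tokenize the list into its items, dropping the braces."""
    items = []
    for m in ITEM.finditer(text):
        braced, bare = m.group(1), m.group(2)
        items.append(braced if braced is not None else bare)
    return items

def list_contains(text, item):
    """Step 2: compare whole items, so a substring of an item never matches."""
    return item in split_items(text)

print(split_items("A B {C D}"))                    # ['A', 'B', 'C D']
print(list_contains("{Aa Cc} Bb {Cc Aa}", "Bb"))   # True
print(list_contains("{Aa Bb} Cc {Bb Aa}", "Bb"))   # False
```

Because the comparison happens on whole items after tokenizing, the search term needs no escaping and may itself contain a space (for example "C D").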

Request for Question Clarification by efn-ga on 30 Mar 2004 19:18 PST
There is no universal standard notation for writing regular
expressions.  What regular expression language should be used in the
answer to your question?

Clarification of Question by gunzler-ga on 31 Mar 2004 07:54 PST
rerdavies-ga brings up something I should have clarified.  There are
NO sublists.  When I wrote {A B} that is a single element ("A B") with
a space in it, not a sub-list.

studboy-ga, there will never be a list item {Bb}: because there are no
spaces in the item, it will not be surrounded by curly braces. So
your list would be written as: A Bb C

efn-ga, I'll be doing this regular expression in PostgreSQL, if that helps.

maltezefalkon-ga, I'll try your solution a little later when I get some time.

Clarification of Question by gunzler-ga on 23 Apr 2004 07:40 PDT
dewmeht-ga, sorry it took so long to answer, I just got an email from Google.

No, a bracket will never be within a set of brackets.  Or, in other
words, there is only one level of bracketing.
Answer  
There is no answer at this time.

Comments  
Subject: Re: Regular expression for finding items in a list
From: rerdavies-ga on 30 Mar 2004 20:21 PST
 
Regular expressions are not powerful enough to perform this kind of
match. The specific problem is that you need to balance braces in
order to determine where a sub-list ends (e.g. "Aa {{{{Bb}}}} CC",
where each of the closing braces must be matched with its
corresponding opening brace). There's no way to define recursive
constructs in regular expressions, but you would need one in order to
match sub-lists of sub-lists of sub-lists of...

If you have only one level of sub-lists, and you will never match one
of the sub-lists, it is possible. However, the exact mechanism depends
highly on the syntax of the regular expression language you are using,
and the extensions supported by your flavor of regular expression
language.

The easiest way to do it with relatively standard re syntax: set up an
anchored match, followed by a pattern that matches zero or more tokens
or sublists, followed by a tagged expression for what you are
searching for.

Something like this:

${\{[~}]+\}|[A-Za-z]* *}+(TargetToken)

And then check for tagged expression zero "\0" in the dialect I'm used to.

The key here is the expression "\{[~}]+\}" which skips over sub-lists
as long as they don't, themselves, contain sub-lists.
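
For readers who do not use that dialect, the same idea (anchor at the start, skip whole items or braced groups, then capture the target) might look like this in Python's re syntax. It is a rough translation rather than the pattern above: anchored_pattern and contains_item are hypothetical names, and it assumes balanced braces with one level of bracing.

```python
import re

def anchored_pattern(item):
    # Items that contain a space appear in braces in the list text.
    target = "{" + item + "}" if " " in item else item
    # Zero or more whole items (braced or bare), each followed by whitespace.
    skip = r"(?:(?:\{[^{}]*\}|[^\s{}]+)\s+)*"
    # The target is captured as group 1 and must end at whitespace or the end.
    return re.compile(r"\s*" + skip + "(" + re.escape(target) + r")(?=\s|$)")

def contains_item(text, item):
    # re.match anchors at the start of the string, so the scan can only
    # advance item by item and never begins in the middle of a braced group.
    return anchored_pattern(item).match(text) is not None

print(contains_item("{3 4} 12 4", "12"))          # True
print(contains_item("312 {1 12 612} 123", "12"))  # False
```

The anchoring is what makes this work: since every skipped unit is a complete item, the target can only be tried at the start of an item.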
Subject: Re: Regular expression for finding items in a list
From: maltezefalkon-ga on 30 Mar 2004 23:26 PST
 
(?<!\{[^\}]*)(?<=^|\s)TOKEN(?=$|\s)(?!.[^\{]?\})|(?<=\{)TOKEN(?=\})

Replace "TOKEN" with the item being searched for.

In answering I assumed (1) that, since no specific regular expression
implementation was mentioned, I could use any set of conventions that
I chose (this has been tested successfully using Microsoft's .NET
framework) and (2) that there is only one level of nesting.

I will leave it to the questioner to clarify if I am incorrect.
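
The pattern above cannot be pasted into every engine. Python's re module, for example, only accepts fixed-width lookbehind, which rules out the variable-length lookbehinds used here. Below is a simplified lookaround pattern in the same spirit, written for Python and checked against the lists from the question. It is a different pattern, not a port: lookaround_pattern and CASES are names invented for this sketch, and it assumes balanced braces, one level of bracing, and a search token without spaces.

```python
import re

def lookaround_pattern(token):
    # (?<!\S)        no non-space character directly before the token
    # (?!\S)         no non-space character directly after it
    # (?![^{}]*\})   the next brace to the right is not a closing one,
    #                i.e. the token is not inside a braced item
    return re.compile(r"(?<!\S)" + re.escape(token) + r"(?!\S)(?![^{}]*\})")

# (token, list, should it match?) taken from the question's test lists.
CASES = [
    ("Bb", "Aa Bb {Aa Bb} Cc", True),
    ("Bb", "Aa Bb Cc", True),
    ("Bb", "{Aa Cc} Bb {Cc Aa}", True),
    ("Bb", "Bb {Ac Bc}", True),
    ("Bb", "{Aa BbCc D}", False),
    ("Bb", "{Aa Bb Cc}", False),
    ("Bb", "{Aa Bb} Cc {Bb Aa}", False),
    ("Bb", "A Bbc Bc", False),
    ("Bb", "Bbc Aa CBb", False),
    ("12", "12 6", True),
    ("12", "{3 4} 12 4", True),
    ("12", "{1 12} 15", False),
    ("12", "312 {1 12 612} 123", False),
]

failures = [
    (token, text)
    for token, text, expected in CASES
    if (lookaround_pattern(token).search(text) is not None) != expected
]
print(failures)  # []
```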
Subject: Re: Regular expression for finding items in a list
From: dewmeht-ga on 01 Apr 2004 05:37 PST
 
Can the bracketed items contain another bracket?

For example "A B {C D {}" where the items are
"A"
"B"
"C D {"
Subject: Re: Regular expression for finding items in a list
From: gunzler-ga on 23 Apr 2004 07:40 PDT
 
dewmeht-ga, sorry it took so long to answer, I just got an email from Google.

No, a bracket will never be within a set of brackets.  Or, in other
words, there is only one level of bracketing.
Subject: Re: Regular expression for finding items in a list
From: efn-ga on 24 Apr 2004 15:57 PDT
 
Did either of the suggestions from rerdavies-ga and maltezefalkon-ga
work?  If not, can you say something about how they failed?
