Q: Visual Basic.net array based string search (Answered, 5 out of 5 stars, 3 Comments)
Question  
Subject: Visual Basic.net array based string search
Category: Computers > Programming
Asked by: pbcdhs-ga
List Price: $10.00
Posted: 05 Apr 2003 12:46 PST
Expires: 05 May 2003 13:46 PDT
Question ID: 186515
In several text files I have the entries origin, name, company, mass
and description. I use a streamreader to read the file into the array
line by line. Each of these keys has a word or number value on the
same line, separated from it by a space, except for the description,
whose value may span several lines but is always enclosed in double
quotes. The values differ from file to file. What I need to do is
search for an occurrence of a key, for example origin, and then place
its value (in this case the country name USA) into a textbox.

From the following text the first words are the search items and the
second items are to be placed in a text box with the exception of the
description which requires all words between the quotes.
origin USA
name Collins
company hewlett
mass 20000
description "the quick brown fox
jumps over the lazy dog"

Could anybody supply the vb.net code to do this?
 
Thanks

Request for Question Clarification by mathtalk-ga on 06 Apr 2003 06:19 PDT
Hi, pbcdhs-ga:

Something of this sort used to be common practice with parsing *.ini
files in Visual Basic.

I see that you "use a streamreader to read the file into the array
line by line".  Do you want code that works from such an array (of
strings?)?  Or would another approach, say one that uses another data
structure or reads directly from the file, be acceptable?

If you want to work with your existing code, you should post at least
a snippet so we get the idea of your naming, datatypes, etc.  And
speaking of datatypes, do you have in mind to treat all the values as
strings, or did you intend "mass" (for example) to have a numeric
datatype?  If so, what type?

Your request talks about finding different key:value pairs from
different files.  Are the five keys (origin, name, company, mass,
description) you discuss an exhaustive list or are these merely
illustrative?

Finally the price you've quoted is a bit low for working code.  If you
simply wanted a sketch of how to do the programming yourself, it would
be fine.  Please judge for yourself, but Google Answers has pricing
guidelines here:

http://answers.google.com/answers/pricing.html

regards, mathtalk-ga

Clarification of Question by pbcdhs-ga on 06 Apr 2003 13:33 PDT
Thanks for your interest mathtalk. It really doesn't matter which
method is used as long as the values end up in a text box. Please note
that some items will not always have a value.
Here is the code that I use to create the array 
 Dim Open2() As String

    Private Sub mnuOpen_Click(ByVal sender As System.Object, ByVal e _
As System.EventArgs) Handles mnuOpen.Click
         Dim objSR1 As StreamReader, oldSize1 As Integer, str6 As String
        objSR1 = New StreamReader("myfile.txt")
        Do While objSR1.Peek <> -1
            If Open2 Is Nothing Then
                oldSize1 = -1
            Else
                oldSize1 = UBound(Open2)
            End If
            ReDim Preserve Open2(oldSize1 + 1)
            str6 = objSR1.ReadLine
        Loop
        objSR1.Close() 'Close file
        objSR1 = Nothing 'destroy the object

I think if the streamreader is used then items such as mass will be
treated as a string. It really doesn't matter, as all that will
happen is that the user loads a file, edits it and then writes it
back to the file. There will be no calculations based on any entries
loaded.

The five keys are illustrative. They are actually in the files but in
some files there may be up to fifty different key:value pairs.

As for the price I set it at $10 because I thought this would be a
very easy task for a VB.Net guru and also because I can't afford much
more. I can't go above $20  for an answer.

Thanks again

Request for Question Clarification by mathtalk-ga on 08 Apr 2003 07:44 PDT
Hi, pbcdhs-ga:

In reviewing your code to populate the array of strings Open2(), it
appears to me that while you re-dimension this array within the loop,
you never assign the values held by the local variable str6 into the
corresponding array elements.

I recommend that you check the results of this routine, perhaps simply
by using the debugger to inspect the contents of Open2() at the
conclusion of the loop.
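
For illustration only, here is a minimal sketch of a read loop that
does store each line in the array.  The module name ReadSketch and the
function name ReadLines are hypothetical, not part of your project,
and I am assuming you want the file closed even if a read fails:

```vbnet
Imports System
Imports System.IO

Module ReadSketch

    ' Hypothetical helper: returns every line of the file as a String array.
    Public Function ReadLines(ByVal path As String) As String()
        Dim lines() As String = New String() {}
        Dim reader As New StreamReader(path)
        Try
            Do While reader.Peek() <> -1
                ' Grow the array by one element, keeping what is already there...
                ReDim Preserve lines(lines.Length)
                ' ...and store the line just read in the new last slot.
                lines(lines.Length - 1) = reader.ReadLine()
            Loop
        Finally
            reader.Close()
        End Try
        Return lines
    End Function

End Module
```

With something like this in place, Open2 = ReadSketch.ReadLines("myfile.txt")
would replace the body of the loop in your click handler.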

regards, mathtalk

Clarification of Question by pbcdhs-ga on 08 Apr 2003 14:32 PDT
Mathtalk-ga
I have no control over the way in which the text file is set out. All
keys and values are separated by a space and not an = as required by
ini. The arrangement of the keys in the text file also changes; for
example, in one file company may be the first line and in another file
mass may be the first.

By adding these lines to my code I get the entire contents of the text
file added to the listbox line by line.

For Each str6 In Open2
     ListBox1.Items.Add(str6)
Next

Thanks again for your help

pbcdhs

Request for Question Clarification by mathtalk-ga on 09 Apr 2003 08:31 PDT
Thanks, pbcdhs-ga.  I can appreciate that you have no control over the
input file format; I just wanted to point out the "typical" sorts of
things that are done to manage these small datafiles of key:value
pairs.

Let me ask if there is any provision in the given file format for
'comments' that would (obviously) need to be disregarded in supplying
the list of keys for editing (and subsequently the accepted values).

For example, .INI files precede comments with a semi-colon.

regards, mathtalk

Clarification of Question by pbcdhs-ga on 09 Apr 2003 12:26 PDT
Thanks mathtalk
No there are no comments allowed in the file structure.

regards, pbcdhs
Answer  
Subject: Re: Visual Basic.net array based string search
Answered By: mathtalk-ga on 11 Apr 2003 00:07 PDT
Rated: 5 out of 5 stars
 
Hi, pbcdhs-ga:

The functionality which your program needs, in between reading the
text file into the array Open2() and ultimately writing it back out
again, seems to have three parts:

1) Parse the key names from the array and place them in a listbox.

2) Given a selected key name from listbox, search for the
corresponding value in Open2() and assign this string to a textbox
control.

3) Allow the user to edit the value in the textbox and then save the
revised value back into Open2().

The question you asked focuses on the middle step of this
functionality, namely searching for values once the key names are
given.

However the first and second steps are really quite closely related,
so I will present code for both of those parts.  The third part is
much less related, and to design it correctly would require additional
clarifications.  Therefore I will not attempt to address the third
part in my answer.  If you would like help with that functionality,
I'll ask that you post a separate question about that.

In my VB.Net project I have the default form Form1, to which I've
added a textbox for the name of an input/output file (TextBox1), a
listbox for the key names (ListBox1), and a second textbox for the
corresponding string values (TextBox2).  I tweaked the attributes of
these controls slightly, e.g. the presence of scrollbars, to make
their behavior and appearance appropriate for the application in mind.

To the code "behind" Form1, I've added a declaration of the global
string array:

Dim Open2() As String

just as you apparently did.  I've inserted code there in the same
module for a subroutine that performs the first functionality
described above, parsing out the key names from Open2() after that
array is initialized according to your choice of input files:

Public Sub FindKeys()

    REM find keys and insert them into listbox

    Dim i As Integer
    Dim keyDesc As Boolean
    Dim keyStr As String

    keyDesc = False

    For i = 0 To Open2.Length - 1

        If Not (keyDesc) Then

            keyStr = Open2(i).Substring(0, Open2(i).IndexOf(" ") - 1)
            ListBox1.Items.Add(keyStr)

            If keyStr.Equals("description") Then
                keyDesc = (Open2(i).IndexOf("""") > -1) And _
                  (Open2(i).IndexOf("""")=Open2(i).LastIndexOf(""""))
            End If

        Else

            keyDesc = (Open2(i).IndexOf("""") < 0)

        End If

    Next i

End Sub

Let's review the ideas of this code before proceeding to the code for
the second part of the functionality, the actual search for values
procedure.

This first code is more complex than the tiny snippet you used to
insert all the contents of Open2() into ListBox1.  This is due to our
need to distinguish the key name from the following value (they are
assumed to be separated by a space), and also to the machinery needed
to keep track of a value associated with the special "description" key
that uses quotation marks and _may_ extend over more than one line.

To the latter end we declare a local boolean variable keyDesc that
will help us by tracking whether we are in the midst of one of these
multiline descriptions.

With keyDesc initialized to False at the outset, we are prepared to
loop through the strings Open2() in the same order (presumably) as the
corresponding lines that appeared in the input file.

We anticipate finding a key name at the beginning of a line only when
keyDesc is False, i.e. only when we are _not_ skipping through the
contents of a multiline "description".  Hence this loop is structured
by the if...then...else logic shown.

Since you have asked for help with doing the searching in a VB.Net
compatible manner, I have consistently chosen the String class methods
over functions available in VB.Net which are more or less intended for
backward compatibility.

Thus I've used the String.Substring and String.IndexOf methods to
isolate the key name at the beginning of such lines in Open2(), rather
than Mid() and Instr() functions that might be familiar to you from
VB6 or before.  Recall that a key name will be terminated by a space.

Once a key name is found, we immediately insert it into the listbox. 
Then we check to see if perhaps the key name is "description".  If so,
we make some further checks to see whether the quoted value that we
expect to follow extends beyond the current line.  If it does extend
to the next lines, then we wind up setting keyDesc to True.

The additional tests referred to ask whether that line actually
contains a quotation mark (strangely enough four consecutive quotation
marks are needed in order to syntactically define a single quotation
mark in a string argument). But if so, that line may contain both the
opening and closing quotes (in which case the description value is
confined to just one line and we assign keyDesc to be False).  If
there is only one quote mark (the first is the last), then we set
keyDesc to be True.
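
As a small sketch of that test in isolation (the module and function
names here are hypothetical, chosen only for illustration), the "one
quote mark only" check could be pulled out into its own function:

```vbnet
Imports System

Module QuoteSketch

    ' True when the line contains a quote mark and its first occurrence
    ' is also its last, i.e. a quoted value is opened but not closed here.
    Public Function OpensUnclosedQuote(ByVal line As String) As Boolean
        Dim firstQuote As Integer = line.IndexOf("""")
        Return firstQuote > -1 AndAlso firstQuote = line.LastIndexOf("""")
    End Function

End Module
```

For the sample file, the line beginning description "the quick brown
fox gives True, while a line such as mass 20000 gives False.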

Finally for this loop, the "else" part of the logic simply needs to be
on the lookout for the closing quote mark, namely the next line which
contains a quote mark.  This arrangement (unlike the VB syntax parser)
does not allow for nesting of quotation marks inside description
values.  The way we formulate this is:

keyDesc = (Open2(i).IndexOf("""") < 0)

The IndexOf method returns -1 only when the search fails.  In this
case that means when the line contains no quotation mark, keyDesc will
(continue to) be True.

In laying out this code I've tried to honor all the essential logical
tests but to keep the code simple enough to be understood.  Some
tweaks that might have merit for the sake of robustness would check
for errors.  For example, a quote mark is presumably not valid as part
of a key name, but I make no such checks.  That's not a big deal here,
one way or the other, because I've only implemented the quotation mark
machinery for the single key name "description".  If you had in mind
to allow other key names to assume multiline text values using the
same quotation marks to unify the lines, then the issue of checking
for the locations of quote marks within key names might be more
germane.

A second "robustness" tweak would be to test the value of keyDesc
after exiting the loop.  The value of keyDesc should be False at that
point, unless a multiline quoted description was opened but never
closed in the input file.  An obvious error, but arguably one that is
better handled by exposing it to the user rather than by concealing
it.
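
A rough sketch of such a check follows.  It repeats the same state
tracking as the loop in FindKeys but only reports the final state; the
names UnclosedCheckSketch and HasUnclosedDescription are hypothetical,
and what to do with a True result (message box, exception, etc.) is
left to you:

```vbnet
Imports System

Module UnclosedCheckSketch

    ' Returns True if a multiline "description" value was opened with a
    ' quote mark but no later line supplied the closing quote mark.
    Public Function HasUnclosedDescription(ByVal lines() As String) As Boolean
        Dim inDesc As Boolean = False
        For Each line As String In lines
            Dim firstQuote As Integer = line.IndexOf("""")
            If Not inDesc Then
                If line.StartsWith("description ") Then
                    ' Opened but not closed on the same line?
                    inDesc = firstQuote > -1 AndAlso _
                             firstQuote = line.LastIndexOf("""")
                End If
            Else
                ' Still inside the description until a quote mark appears.
                inDesc = (firstQuote < 0)
            End If
        Next
        Return inDesc
    End Function

End Module
```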

I'm going to take a "break" in the discussion at this point before
continuing with the "search" code.  The code is similar enough that
I'd rather put it into a separate section ("clarification") to avoid
any confusion.

regards, mathtalk

Request for Answer Clarification by pbcdhs-ga on 12 Apr 2003 00:27 PDT
Thanks mathtalk-ga
The code worked perfectly except for the line
keyStr = Open2(i).Substring(0, Open2(i).IndexOf(" ") - 1) 
once I changed it to 
keyStr = Open2(i).Substring(0, Open2(i).IndexOf(" ")) 
it worked. I now have all the keys listed in listbox1. It was cutting
the last character off the key and it had problems with lines without
any text.

regards 

pbcdhs

Clarification of Answer by mathtalk-ga on 12 Apr 2003 09:58 PDT
Hi, pbcdhs-ga:

Good catch on the shortened key names!

Here's the second patch of code; I was more parsimonious with
indentations to avoid problems with the code "wrapping around". 
Discussion follows:

Public Function GetValue(ByVal keyGet As String) As String

REM given key name keyGet, return corresponding value string

Dim i As Integer
Dim keyFound As Boolean
Dim keyDesc As Boolean
Dim keyStr As String

keyFound = False
keyDesc = False

For i = 0 To Open2.Length - 1

    If Not (keyDesc) Then

        keyStr = Open2(i).Substring(0, Open2(i).IndexOf(" "))

        If keyStr.Equals(keyGet) Then keyFound = True

        If keyStr.Equals("description") Then
            keyDesc = (Open2(i).IndexOf("""") > -1) And _
              (Open2(i).IndexOf("""") = Open2(i).LastIndexOf(""""))
        End If

        If keyFound Then
            GetValue = Open2(i).Substring(Open2(i).IndexOf(" ") + 1)
        End If

    Else

        keyDesc = (Open2(i).IndexOf("""") < 0)

        If keyFound Then
            GetValue &= Chr(13) & Chr(10) & Open2(i)
        End If

    End If

    If keyFound And Not (keyDesc) Then Exit For

Next i

If Not (keyFound) Then GetValue = String.Empty

End Function

The basic loop structure is the same, but our logic is complicated by
needing to extract information from more than one line in the
"description".  To emphasize the similarity of structure, I put all
the logic within a single loop, as before.  However I did add logic at
the loop's end to return an empty value string if the requested key is
not found.

I think the code is written in such a way that if the "description"
value is on a single line, the correct value is extracted regardless
of whether or not it is set off by quotation marks.  This is a small
detail of the spec that was not clear to me, so I tried to make it
work either way.

Please let me know if anything needs clarification, esp. bugs or .Net
framework calls that should be explained in greater detail.

regards, mathtalk-ga

Request for Answer Clarification by pbcdhs-ga on 12 Apr 2003 14:06 PDT
Thanks mathtalk
I have two questions.
The first is how do I remove the quotation marks from around the
returned description value?

The second question takes the code beyond what was originally asked
for and it is up to you if you answer it.
The question is how would I split a returned value into two pieces.
For example this is one of the values "<:87214:9999>" I need to split
the value and place the first number into textbox1 and the second
number into textbox2 without the "<" and ":".

Regards

pbcdhs

Request for Answer Clarification by pbcdhs-ga on 12 Apr 2003 17:55 PDT
mathtalk-ga
Please ignore the second question. I found how to split the string
myself.

There is however another small problem and that is if a line contains
a single character and no space. This causes a "length cannot be less
than zero error"

regards

pbcdhs

Clarification of Answer by mathtalk-ga on 12 Apr 2003 18:57 PDT
Hi, pbcdhs-ga:

To remove those two quotation marks from around a description string,
you could use the String.Substring method to drop the first and last
characters, by specifying:

myString.Substring(1,myString.Length - 2)

[.Net Substring Method]
http://msdn.microsoft.com/library/en-us/cpref/html/
frlrfsystemstringclasssubstringtopic.asp

Alternatively you could use the String.Replace method, which replaces
all occurrences of the first string argument with the second.  I have
in mind:

myString.Replace("""","")

[.Net Replace Method]
http://msdn.microsoft.com/library/en-us/cpref/html/
frlrfsystemtextregularexpressionsregexclassreplacetopic.asp

which cryptically says to replace all the quotation marks with
nothing, i.e. to remove them.
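
Note that Substring(1, Length - 2) assumes the string really does
begin and end with a quotation mark, and Replace removes every
quotation mark, including any in the middle.  A more defensive sketch,
under the assumption that unquoted values should be returned
unchanged (the names are hypothetical):

```vbnet
Imports System

Module StripQuotesSketch

    ' Drops one leading and one trailing quote mark, but only when both
    ' are present; any other string is returned unchanged.
    Public Function StripOuterQuotes(ByVal s As String) As String
        If s.Length >= 2 AndAlso s.StartsWith("""") AndAlso s.EndsWith("""") Then
            Return s.Substring(1, s.Length - 2)
        End If
        Return s
    End Function

End Module
```

You would then assign something like
TextBox2.Text = StripQuotesSketch.StripOuterQuotes(GetValue("description")).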

As far as how to "parse" a string like the given example:

"<:87214:9999>"

into the two separate numbers, you will probably want to make use of
the method String.Split which "splits" a string instance into an array
of strings delimited by one or more of the characters passed in a
parameter array.

Specifically if myString = "<:87214:9999>", then:

Dim myStrArray As String()
myStrArray = myString.Split(":"c)

should produce:

myStrArray(0) = "<"
myStrArray(1) = "87214"
myStrArray(2) = "9999>"

If instead you used:

myStrArray = myString.Split("<"c, ":"c, ">"c)

you would produce:

myStrArray(0) = ""
myStrArray(1) = ""
myStrArray(2) = "87214"
myStrArray(3) = "9999"
myStrArray(4) = ""

This is a powerful tool for taking strings apart and would repay some
experimentation on your part to better understand how it can be
applied to your goals.
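
If the empty strings are a nuisance, later versions of the framework
(.Net 2.0 and up, which is an assumption about what you have
installed) offer an overload of Split that discards them.  A sketch,
with hypothetical module and function names:

```vbnet
Imports System

Module SplitSketch

    ' Splits a value such as <:87214:9999> on the characters < : >
    ' and drops the empty entries the delimiters would otherwise leave.
    Public Function SplitPair(ByVal s As String) As String()
        Dim separators() As Char = {"<"c, ":"c, ">"c}
        Return s.Split(separators, StringSplitOptions.RemoveEmptyEntries)
    End Function

End Module
```

For the example value this leaves just two elements, "87214" and
"9999", ready to be assigned to the two textboxes.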

[.Net Split Method]
http://msdn.microsoft.com/library/en-us/cpref/html/
frlrfsystemstringclasssplittopic.asp

regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 12 Apr 2003 19:02 PDT
The hyperlinks posted in my last clarification are "broken"
(literally).  This usually doesn't happen, so let's try again:

[.Net Substring Method] 
http://msdn.microsoft.com/library/en-us/cpref/html/frlrfsystemstringclasssubstringtopic.asp

[.Net Replace Method] 
http://msdn.microsoft.com/library/en-us/cpref/html/frlrfsystemtextregularexpressionsregexclassreplacetopic.asp
 
[.Net Split Method] 
http://msdn.microsoft.com/library/en-us/cpref/html/frlrfsystemstringclasssplittopic.asp
 
If that fails to fix them, I'll have to let you apply a little text
editing to patch them up again!

best wishes, mathtalk

Clarification of Answer by mathtalk-ga on 12 Apr 2003 19:17 PDT
Here's a slightly more cogent/consistent Microsoft reference on the
Replace Method:

[.Net String.Replace Method]
http://msdn.microsoft.com/library/en-us/cpref/html/frlrfsystemstringclassreplacetopic.asp

-- mathtalk

Request for Answer Clarification by pbcdhs-ga on 13 Apr 2003 17:16 PDT
Thanks for the links mathtalk
Would there be any chance of you showing how to fix the problem when
the line contains a single character and no space? This causes a
"length cannot be less than zero" error.

Thanks again

pbcdhs

Clarification of Answer by mathtalk-ga on 13 Apr 2003 20:24 PDT
Hi, pbcdhs-ga:

In addition to remarking upon the text of the error message, it would
be helpful if you were to indicate what line of code the error occurs
on.  I'd give the name of the subroutine or function and the code
text, assuming that the context will be clear from that much
information and our prior discussion.
 
I'm not clear what you mean by a line that contains a single character
and no space.  According to the "specs" outlined in your original
question, the only circumstance I can imagine this occurring with
would be when a multiline description ends on a line that contains
only a closing quotation mark.  Is that the circumstance you have in
mind?

If an input line contains a key name but no following space, then I'd
expect exactly the cited error to occur because IndexOf(" ") will
return -1 if no space is present, regardless of how many characters
the key name has.  How to "correct" this situation depends mostly on
what "should" happen in such a case.
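
Purely as one possibility, and assuming that a line with no space
should be treated as a key with an empty value (and a blank line as an
empty key), the two Substring calls could be guarded along these
lines.  The names KeySketch, ExtractKey and ExtractValue are
hypothetical:

```vbnet
Imports System

Module KeySketch

    ' Key name = everything before the first space.
    ' Assumption: with no space present, the whole line is the key.
    Public Function ExtractKey(ByVal line As String) As String
        Dim spaceAt As Integer = line.IndexOf(" ")
        If spaceAt < 0 Then Return line
        Return line.Substring(0, spaceAt)
    End Function

    ' Value = everything after the first space.
    ' Assumption: with no space present, the value is empty.
    Public Function ExtractValue(ByVal line As String) As String
        Dim spaceAt As Integer = line.IndexOf(" ")
        If spaceAt < 0 Then Return String.Empty
        Return line.Substring(spaceAt + 1)
    End Function

End Module
```

In FindKeys and GetValue you would then call these in place of the
direct Substring expressions, and perhaps skip lines whose key comes
back empty.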

regards, mathtalk-ga
pbcdhs-ga rated this answer: 5 out of 5 stars and gave an additional tip of $15.00
Excellent answer. Clear and concise. I really learnt a lot from
mathtalk. The answers given saved me a lot of time.

Comments  
Subject: Re: Visual Basic.net array based string search
From: mathtalk-ga on 06 Apr 2003 16:57 PDT
 
Hi, pbcdhs-ga:

Thanks for the clarification.  It sounds like the goal is a program
that allows the user to edit particular values associated with given
keys (and perhaps to create new such key:value pairs).

I think you have settled on an overall design for doing this:  read
the input file line by line into an array of Strings, Open2(); perform
user-directed editing; write the file back out line by line.

In framing the question you have then asked a VB.Net expert to address
only the middle phase of this effort.  One could certainly do this. 
Given a "key" token, the array Open2() can be linearly searched for
the existing entry (if any) and its corresponding value assigned to
the text property of a textbox.

However I think a real VB.Net guru (not me) would want to use the
StringDictionary Class of the .Net framework to manage your key:value
pairs.  Please take a look at the documentation for this and let me
know if it would be worthwhile to you to change your design to
incorporate that .Net feature.
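
To give a feel for it, here is a rough sketch of loading the lines
into a StringDictionary.  The names DictionarySketch and LoadPairs are
hypothetical, and I am assuming (beyond what you have told me) that
any key may carry a quoted multiline value and that a line without a
space is a key with an empty value.  Note that StringDictionary treats
key names as case-insensitive:

```vbnet
Imports System
Imports System.Collections.Specialized
Imports Microsoft.VisualBasic

Module DictionarySketch

    Public Function LoadPairs(ByVal lines() As String) As StringDictionary
        Dim pairs As New StringDictionary()
        Dim openKey As String = Nothing   ' key whose quoted value is still open

        For Each line As String In lines
            If Not (openKey Is Nothing) Then
                ' Continuation of a multiline quoted value.
                pairs(openKey) = pairs(openKey) & vbCrLf & line
                If line.IndexOf("""") >= 0 Then openKey = Nothing
            ElseIf line.Length > 0 Then
                Dim spaceAt As Integer = line.IndexOf(" ")
                If spaceAt < 0 Then
                    pairs(line) = String.Empty
                Else
                    Dim keyName As String = line.Substring(0, spaceAt)
                    Dim value As String = line.Substring(spaceAt + 1)
                    pairs(keyName) = value
                    ' A single quote mark means the value continues below.
                    Dim q As Integer = value.IndexOf("""")
                    If q >= 0 AndAlso q = value.LastIndexOf("""") Then
                        openKey = keyName
                    End If
                End If
            End If
        Next

        Return pairs
    End Function

End Module
```

Looking up a value is then just pairs("origin"), which returns Nothing
when the key is absent.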

regards, mathtalk-ga

Subject: Re: Visual Basic.net array based string search
From: pbcdhs-ga on 06 Apr 2003 17:45 PDT
 
Thanks mathtalk
This is exactly what I want to achieve:

Quote "One could certainly do this. Given a "key" token, the array
Open2() can be linearly searched for the existing entry (if any) and
its corresponding value assigned to the text property of a textbox."

I cannot work out how to search an array and, if a key is found, place
its value in a textbox. I have only been using VB.Net for about a
month and I am not yet fluent in what is required to search an array
for a substring and then reference the line the substring was found
on so that it can be placed into a text box.

Thanks again

Subject: Re: Visual Basic.net array based string search
From: mathtalk-ga on 08 Apr 2003 09:27 PDT
 
Hi, pbcdhs-ga:

Given my doubts about the correctness of your code for reading in the
text file line by line to Open2(), I'd like to focus my answer to your
question on the narrow issue of how to parse the data once it's there.
This seems to be in keeping with your previous comment, so no need
for you to respond to this unless you disagree with that "scope".

For future reference here is a link to an article by Russell Jones
that discusses an approach to this sort of thing using XML with
VB.Net, a design that is more in keeping with the .Net philosophy and
architecture:

[Upgrade your INI files to XML with .Net]
http://www.devx.com/dotnet/Article/7008

regards, mathtalk-ga
