Subject: VB - Faster keyword comparison
Category: Computers > Programming
Asked by: xymox-ga
List Price: $50.00
Posted: 21 Jun 2002 06:27 PDT
Expires: 21 Jun 2002 08:58 PDT
Question ID: 31134
I have a function listed below (dims and error handling taken out for brevity) that takes a list of keywords and compares them to a body of text, looking for keyword matches in the text. In the example below the function will find java, vb and xml as matches.

The function works fine as it is: I'm using the VB6 Split function to load the comma-delimited keyword list into an array, and then I loop through the array elements one by one, doing a compare (InStr) on the body text. I then create another comma-delimited list of the matches.

The problem is that it will not scale very well. The body text could be very large, and the keyword list could eventually hold several thousand entries. That would take forever with this method. What I need is a much faster way to do this process in VB6. Please include code if you have it.

    Keywords = "programming, java, html, xml, vb"
    BodyText = "How long will it take me to learn Java? I already know VB and XML."
    Call KeyCompare(Keywords, BodyText)

    Function KeyCompare(Keywords, BodyText)
        KeyArray = Split(Keywords, ",")
        For i = LBound(KeyArray) To UBound(KeyArray)
            ParseResults = InStr(1, BodyText, KeyArray(ii), vbTextCompare)
            ParseResults = ParseResults - 1
            ii = ii + 1
            If ParseResults = -1 Then
            Else
                xKey = xKey & Trim(KeyArray(i)) & "," & " "
            End If
        Next
        If xKey = "" Then
        Else
            intLen = Len(xKey) - 2
            Keystring = Left(Trim(xKey), intLen)
            KeyCompare = Keystring
        End If
    End Function
There is no answer at this time.
Subject: Re: VB - Faster keyword comparison
From: j_philipp-ga on 21 Jun 2002 06:52 PDT
Xymox,

For one thing, explicitly use Left$() and Trim$() instead of Left() and Trim(). To quote Steven R. Hamby at VB Helper - Performance Tuning (http://www.vb-helper.com/perform.htm):

"If you need to do a lot of string/file processing, use mid$ (and trim$ etc.) rather than mid as the latter treats the data type as a variant as opposed to a string, which can be up to 3 times slower"

That resource is a good read on optimizing VB for speed.
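
As a small illustration of the advice above, here is the trailing-separator cleanup from the question rewritten with the String-returning functions. This is only a sketch: the Sub name TrimDemo and the sample value of xKey are made up for the example, and any speed gain would need to be measured on real data.

```vb
Option Explicit

' Hypothetical demo routine; paste into a standard module.
Public Sub TrimDemo()
    Dim xKey As String
    Dim Keystring As String

    ' Sample value in the shape the question's loop builds: "key, key, "
    xKey = "java, vb, xml, "

    ' Left$ and Trim$ return a String directly. Left and Trim return a
    ' Variant, which then has to be converted back to a String.
    Keystring = Left$(Trim$(xKey), Len(xKey) - 2)

    Debug.Print Keystring
End Sub
```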
Subject: Re: VB - Faster keyword comparison
From: chuckbo-ga on 21 Jun 2002 07:37 PDT
Okay, here's some stuff to consider.

First. To help the scaling, realize that you're probably looping through the wrong list. Let's assume that your list of keywords will become very large (thousands) and that the list of words in the sentence is relatively small. What you want to do is extract each word from the sentence and compare it to the list of keywords. That way, you're performing fewer search operations.

Second. You're going to say, "That won't help. I still have to loop through the large array once for each word to test." But here's where I say to use a collection instead of an array. That way, you're taking advantage of VB's internal keyed lookup, which I hope is better optimized. (Note that these are suggestions, not guarantees.) For instance, if you use the code

    Dim x As New Collection
    x.Add 1, "Java"
    x.Add 1, "VB"
    x.Add 1, "REXX"
    x.Add 1, "language"
    x.Add 1, "Spanish"

you can then look up x.Item(strTestword). Be careful here: a VB6 Collection does not return Null for a missing key, so an IsNull() test will not work. Item raises a run-time error instead. Put On Error Resume Next in front of the lookup and check Err.Number afterwards: zero means the word being tested is in the keyword list, and non-zero means it is not.

Third. Now the difficulty and bottleneck becomes parsing the sentence and extracting each word to test. This'll take some experimentation. I'd make a string of word separators

    strSeparators = " ,.?!();:"

and then do one of the following:

a) Search through the string, character by character, converting each separator to a space, and then loop through a second time using InStr to find the spaces and pick out each word to test.

b) Search through the string, character by character, building the next testword until you hit a separator.

c) Another idea, to keep from having to do so much searching for separators: go through the string character by character, and for each character, if its ASCII code is between 65 and 90, 97 and 122, or 48 and 57 (maybe test for 39, the apostrophe, as well, but we don't want this to get out of hand), append it to the next testword string. When you run into a character outside these ranges, you know you've hit some type of separator, so it's time to look up the testword you've been building to see if it's in the collection.
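
Putting the suggestions above together, a rough and untested sketch of option (c) with a Collection might look like the following. The names KeyExists, KeyCompareFast and Demo are made up for this example. Two facts about the VB6 Collection matter here: Item raises a run-time error for a missing key (it does not return Null), so the lookup traps the error, and keys are compared without regard to case, which stands in for the vbTextCompare in the original. Whether this is actually faster on a given keyword list is something to measure rather than assume.

```vb
Option Explicit

' Hypothetical helper: True if sKey is a key in col.
' Item raises a run-time error for a missing key, so trap it.
Private Function KeyExists(ByVal col As Collection, ByVal sKey As String) As Boolean
    Dim v As Variant
    On Error Resume Next
    Err.Clear
    v = col.Item(sKey)
    KeyExists = (Err.Number = 0)
    On Error GoTo 0
End Function

' Hypothetical replacement for KeyCompare: scans BodyText once and
' looks each word up in a Collection keyed by keyword.
Public Function KeyCompareFast(ByVal Keywords As String, ByVal BodyText As String) As String
    Dim colKeys As Collection, colFound As Collection
    Dim sParts() As String
    Dim sWord As String, sKey As String, sResult As String
    Dim i As Long, nStart As Long

    Set colKeys = New Collection
    Set colFound = New Collection

    ' Load the keywords once; item and key are both the trimmed keyword.
    sParts = Split(Keywords, ",")
    For i = LBound(sParts) To UBound(sParts)
        sKey = Trim$(sParts(i))
        If Len(sKey) > 0 Then
            If Not KeyExists(colKeys, sKey) Then colKeys.Add sKey, sKey
        End If
    Next i

    ' A trailing space makes sure the last word gets tested.
    BodyText = BodyText & " "
    For i = 1 To Len(BodyText)
        Select Case Asc(Mid$(BodyText, i, 1))
            Case 48 To 57, 65 To 90, 97 To 122
                ' Digit or letter: remember where the word starts.
                If nStart = 0 Then nStart = i
            Case Else
                ' Separator: test the word that just ended, if any.
                If nStart > 0 Then
                    sWord = Mid$(BodyText, nStart, i - nStart)
                    nStart = 0
                    If KeyExists(colKeys, sWord) Then
                        ' colFound keeps each match from being listed twice.
                        If Not KeyExists(colFound, sWord) Then
                            colFound.Add sWord, sWord
                            sResult = sResult & colKeys.Item(sWord) & ", "
                        End If
                    End If
                End If
        End Select
    Next i

    ' Drop the trailing ", " separator.
    If Len(sResult) > 0 Then sResult = Left$(sResult, Len(sResult) - 2)
    KeyCompareFast = sResult
End Function

Public Sub Demo()
    Debug.Print KeyCompareFast("programming, java, html, xml, vb", _
        "How long will it take me to learn Java? I already know VB and XML.")
End Sub
```

Note that this behaves differently from the original in a few ways. It matches whole words only, so "java" would no longer match inside "javascript". Keywords containing spaces or punctuation (for example "visual basic" or "c++") will never match, because the scanner splits on those characters. Matches are listed in the order they appear in the text rather than in keyword-list order.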