We have several Excel Spreadsheets that, through legacy, our team has
used as flat file databases. We often need to find information
(variables) on one sheet (source sheet) and match it/combine it/add it
to another sheet(destination sheet). The common value (key) is the
The caveat is people are not consistently naming the projects, e.g.
"Big Project", "Big Project 2003" and "Big Important Project 2003"
would be the project names on all three sheets and all referring to
the same project. Several projects repeat year after year, so
depending on the sheet, we might have "Big Project 2003" AND "Big
Project 2004" in the same column. A plus is that almost every source
sheet has a YEAR variable/column that we can either sort by or define
an array by, e.g., 2003 is is rows 1-799, 2004 is in rows 800-1100
(rows numbers can be entered and are not consistent through source
sheets). So on the source sheet, the Project Name column might have
"Big Project" and right below it "Big Project" again, but you would
need to look in the YEAR column to realize they are different years.
The REQUEST: create a macro that can search based on Project Name in
Column A on the destination sheet, can go to other sheets (source
sheets) and search for Project Name or a variation (Within a range of
rows - 1-799, 800-1100 that can be given or found via code and the
YEAR column), then return a variable value that is offset a given
number of columns. We need to be able to change the source variable
column as needed, as each time we need different things.
Ideally, I would put all flat files in database and link them, but as
I am the only one who understands databases in the team and new SOURCE
sheets are pulled and created daily by other teams, I need a solution
that can work on excel and kind of match up the sheets based on
Sounds like VLOOKUP right, unfortunately no, VLOOKUP won't return an
accurate match, even using TRUE. I'm thinking the ANSWER will
be a macro that loops through ALL values in COLUMN 1 and using some
combination of VLOOKUP, MATCH, INDEX, FIND, etc., etc. returns a
corresponding variable/column. It needs to search for variations "Big
Complex Project 2003", "Big Complex Project", and "Big Project", but
only within a given range of rows (the dates) on the source sheet
(which should be found via code or asked for via code), and return the
needed variable. It should probably also return in a new column
whether it was an exact match, a 3 out of 4 words match, a 2 out of 4
words match, etc. on the Project Name so we might be able to manually
go in an do further investigation for the ones where the answer was
only a 1 out of 4 match, etc.
It should probably launch from a simple button I can easily move from
sheet to sheet and have some way to for me to specify the source and
destination sheets. I can definitely edit VBA code, etc and a VBA
form with the ability to choose files for source and desintation would
Other solutions would be welcome. What do you think?
Clarification of Question by
27 Sep 2006 11:02 PDT
Also key is the ORDER of a project name may change as well as the
length, "Big Important Project 2003" in the destination spreadsheet
could just as easily be "2003 Important Big Project" or "2003
Important Big" in the source spreadsheets.
I need the tool to basically parse "Project_Name" column in the
destination sheet, determine how many words are in each Project Name
and then search for various combinations of all the words that project
name in source spreadsheets and return information from other same
row, different column once the match is found.
The search should, of course, value a match with EQUAL number of EQUAL
words highest, regardless of order. For example, the match of "2003
Big Important Project" should best be "Big Project Important 2003"
OVER "2003 Big Important". Length is more important than order.
Also when searching the source sheets, it should should disregard '-',
'(', ')', '\', '/' and " " (spaces) in both the source and desintation
sheets as these extraneous characters are my key problem in accurate
The WORDS should match between the sheets. The NUMBER of WORDS that
match is of most importance. Disregard extraneous characters. Order
is valued second after number of words.
I hope this helps clarify.