I am currently working on a project in which I am trying to create a data
model to accommodate data that comes from a spreadsheet. There
are two variants of this problem:
1) There is a collection of row and column headers, and every cell
holds the same kind of value. For example, the rows are for
products while the columns are for customers, and the cells contain
values for percent market share. My take on this problem is that
products and customers could be viewed as analogous to dimension tables,
which would be parents to a fact table with a record containing the
percent market share. Am I right?
2) The second case is more complicated. There are column headers and
row headers, but the kind of value in each cell changes depending
on the column. For example, the rows could be corporate divisions,
whereas the columns could include "headcount," measured in number of
employees, and "Time to Fill," measured in days. In this case,
a dimensional model such as I hypothesize in 1) above would fail, since
the measure, or fact, would vary depending on the value of the column
dimension.
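
To make the two cases concrete, here is a minimal sketch in Python using
the standard sqlite3 module. Everything in it is an assumption made up
for illustration: the sample values, the table names (product, customer,
market_share), the column names, and the unpivot helper. It loads the
case 1 grid into two dimension tables and one fact table, then applies
the same unpivoting step to a case 2 grid to show where the single-measure
layout stops fitting.

```python
import sqlite3

# Hypothetical case 1 grid: rows are products, columns are customers,
# and every cell is a percent market share.
share_grid = {
    "Widget": {"Acme": 40.0, "Globex": 25.0},
    "Gadget": {"Acme": 10.0, "Globex": 55.0},
}

# Hypothetical case 2 grid: rows are divisions, and each column is a
# different measure with its own unit (employees vs. days).
division_grid = {
    "Sales": {"headcount": 120, "time_to_fill": 45},
    "Engineering": {"headcount": 300, "time_to_fill": 60},
}


def unpivot(grid):
    """Turn a {row: {column: value}} grid into (row, column, value) tuples."""
    return [(r, c, v) for r, cols in grid.items() for c, v in cols.items()]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE market_share (
        product_id  INTEGER REFERENCES product(product_id),
        customer_id INTEGER REFERENCES customer(customer_id),
        pct_share   REAL,
        PRIMARY KEY (product_id, customer_id)
    );
""")

# Case 1: each cell becomes one fact row keyed by the two dimensions.
for product, customer, pct in unpivot(share_grid):
    conn.execute("INSERT OR IGNORE INTO product (name) VALUES (?)", (product,))
    conn.execute("INSERT OR IGNORE INTO customer (name) VALUES (?)", (customer,))
    conn.execute(
        """INSERT INTO market_share (product_id, customer_id, pct_share)
           SELECT p.product_id, c.customer_id, ?
           FROM product p, customer c
           WHERE p.name = ? AND c.name = ?""",
        (pct, product, customer),
    )

fact_count = conn.execute("SELECT COUNT(*) FROM market_share").fetchone()[0]

# Case 2: the same unpivot runs, but the value column now mixes a count
# of employees with a number of days, so it is no longer a single measure.
division_rows = unpivot(division_grid)
```

In the case 1 half, each spreadsheet cell maps to exactly one row of
market_share. In the case 2 half, division_rows has the same shape, but
its third element means something different depending on the second,
which is the difficulty described in 2) above.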
I would think that converting Excel spreadsheets into relational
tables is a common problem, but I am stumped by this one,
especially as outlined in case 2.