Google Answers Logo
View Question
 
Q: MS Access - Creating the 'flatest' dataset possible ( No Answer,   4 Comments )
Question  
Subject: MS Access - Creating the 'flatest' dataset possible
Category: Computers > Software
Asked by: epi_girl-ga
List Price: $20.00
Posted: 10 Feb 2006 11:08 PST
Expires: 12 Mar 2006 11:08 PST
Question ID: 444225
I am doing research using a slice (1 year of data) of a large access
relational database with over 100 look-up tables. My unique identifier
is an ID number. I need to make my dataset as 'flat' as possible so
that I can import it into SPSS for analysis. My problem is: I have a
few variables (such as a diagnosis variable that has several hundred
selections) in a look-up table that relates to my main dataset by ID
number, but some IDs have several diagnoses attached to them instead
of just one. Now, the database was not set up so that they come up as
diagnosis #1, diagnosis #2 etc. so instead I end up with records with
duplicate IDs but different diagnosis (multiple rows instead of
columns) Is there a way for me to create a dataset with multiple
diagnosis columns other than doing it manually?
Using a simplified illustration, this is what my dataset looks like:

ID     Diagnosis
1      Cardiac
1      Respiratory
2      Pediatric
3      Obstetrics
3      Respiratory
4      Cardiac
4      Medical
4      Respiratory

What I would like it to look like is:

ID    Diagnosis 1   Diagnosis 2   Diagnosis 3
1     Cardiac       Respiratory
2     Pediatric
3     Obstetric     Respiratory
4     Cardiac       Medical       Respiratory

Are there any Access tricks to make this happen? I have this problem
for a couple different variables - it was all in the way it was set
up. Unfortunately, the dataset was not created with research in mind.
I know you may need some clarification with this question - please let
me know how I can better explain it to you.
Thank you for your help - I am pulling my hair out!

Request for Question Clarification by answerguru-ga on 10 Feb 2006 15:22 PST
Hi epi_girl-ga,

I've certainly seen this before and understand what a nuisance such a
design can be. I've used a technique before where I used a query to
manipulate the existing table into the format you are looking for.

However, I'm not certain that this form of dataset is actually the
best solution. If you don't mind, I'd like to know what you would do
with your 'dataset' if it were manipulated into this format?

answerguru-ga

Clarification of Question by epi_girl-ga on 10 Feb 2006 17:37 PST
Hi Answer_guru,

Thank you for taking a look at my question. First, I should say that I
have tried several queries with no luck.

As for what I want to do with the dataset, I want to analze it using
SPSS with patient ID as my dependent variable and other variables,
such as diagnosis, sex, age, facility...as independent variables.
Using SPSS, I will be performing basic regressions and modeling. For
SPSS, each variable must be discrete which makes it difficult when
they haven't been created in access in the first place.

I hope this clarifies things. Please let me know if you need anymore information.

Thanks - I hope you can help me!

Request for Question Clarification by answerguru-ga on 11 Feb 2006 09:11 PST
Hi again,

I just saw the comment from applehead77-ga and although I haven't
tried it out, it may resolve your problem. However, if it goes over
your head, I could provide you with a simpler approach. Try that one
out first and let me know how it goes.

answerguru-ga
Answer  
There is no answer at this time.

Comments  
Subject: Re: MS Access - Creating the 'flatest' dataset possible
From: applehead77-ga on 10 Feb 2006 20:47 PST
 
Hi epi_girl-ga,

In order to accomplish what you require, you need to make use of VBA
scripting to create a temporary table and load values into it.

1. The first step is to create a temporary table with enough fields to
hold your 'flat' data. For instance, for the example dataset given by
you, a table with 4 fields - ID, Field1, Field2, and Field3.

Name/Number the fields sequentially so as to make it amenable to
coding. You can do a simple select max on the ID count to arrive at
the maximum number of fields you would require (Each field of datatype
'Text').

2. Lets call the table created in step 1 as "Temp_Table"

You now have to iterate through every row in your 'Source_Table' (with
fields 'ID' and 'Desc') and populate Temp_Table. The idea is to
perform either an 'Insert' query OR an 'Update' query depending on
whether there is a change in the ID field. DAO code that accomplishes
this is presented below...Note that the source_table will be sorted on
the ID field as it is the primary key of the table.

-----------------------------------
Dim rs As DAO.Recordset
'counters to keep track of columns
Dim fieldCt As Integer
dim tempF as Integer
Dim sql As String
'A boolean variable to determine whether an insert or an update is
required. True indicates Insert, False indicates Update
Dim insertFlag As Boolean
 
Set rs = CurrentDb.OpenRecordset("Source_Table")
'initialize to the variables
fieldCt = 1
insertFlag = True
'Iterate through each row in the source table
While (rs.EOF = False)

    'store the ID field at the start of loop to check for changes at
the end.. like a control break process
    tempF = rs!ID

    'Build the sql statements.. note how Field1, Field2 etc is
dynamically constructed depending on the fieldCt variable

    If (insertFlag = True) Then
        sql = "Insert into Temp_Table(ID,Field" & fieldCt & ")
Values(" & rs!ID & "," & "'" & rs!Desc & "');"
    Else
        sql = "Update Temp_Table set Field" & fieldCt & "=" & "'" &
rs!Desc & "' Where ID=" & rs!ID
    End If
    'Execute the SQL statement
    DoCmd.RunSQL (sql)

    insertFlag = False
    fieldCt = fieldCt + 1
    'over to the next row in our source
    rs.MoveNext

    'check and see if there is a change in ID.. if there is reset the
counter.. and change the flag..
    If (rs.EOF = False) Then
	    If (tempF <> rs!ID) Then
        	fieldCt = 1 'reinitialize
	        insertFlag = True
	    End If
    End If
Wend
'voila...we're done !
rs.Close
---------------------------------------

 
I hope this helps.. Good luck!
Subject: Re: MS Access - Creating the 'flatest' dataset possible
From: roxrox-ga on 12 Feb 2006 18:27 PST
 
If you would rather not use the VBA Script this can be done in Access
using the regualr Access Query Grid.
It is a bit clumsy but not all that bad, it gets you what you want and
you don't have to write any VBA script.

Let's say you have 3 tables
Table One = Patients Table and it has a Patient ID Field
Table Two = Diagnosis Table and it has either a Diagnosis ID or jsut
the Plain Diagnosis or both a Diagnosis ID and a Diagnosis.
Table Three = You have a Transactions Table, Which is what you described above.
This table has one or many transactions (Diagnosises) per patient.


Overview of steps-
#1- determine the maximum number of diagnosies any one patient had on one visit-
Run a Cross tab Query Described below.
#2- Copy the Transactions Table (the table that has the patient visits
with their diagnostics)to a new table.
#3 - Make Your Export table
#4 - Write another Crosstab Query to find the greatest diagnostic,
(doesn't matter if your Diagnosic field is text or a number, this will
still work) for A patient by Date by using Min as a Value in the
Crosstab
#5 Write an Append Query that uses as its source the CrossTab Query
from #4 above. The Append Query will copy a record from the Table that
is a Copy of your Transactions table, into your Export table
#6 Write a Delete Query- this Query will be exactly the same as #5,
with one small change instead of, appending the record to the
ExportTable it will delete the record form the CopyOfTransactions
table.

In our example we said that one patients had an amazing 9 diagnoses in
one visit. You will run # 4 & #5 9 times and your export table will be
done, jsut how you want it.

Let's Do it
---------------------------------

In Access Create a Query in design View.

Once you have the Query Grid Screen, go up to the Menu Bar up on top
(File, Edit, View, Insert, Query) Slect Query and form the drop down
list select Cross Tab Query. Use the Transactions Table (I don't know
what it is called in your database but it is the table that you
described above)as the table for your query.

In the First Column in the Query Desin Grid
Field = The date of the Patient Visit
Table= This should be filled in for you already I'll call it the
transactions table.
Total = Group By
Crosstab = Row Heading
Sort = Ascending

In the Second Colum in the Query Design Grid:
Field = Patient ID
Table= This should be filled in for you already I'll call it the
transactions table.
Total = Group By
Crosstab = Row Heading
Sort = Ascending

In the Third Column of the Query Design Grid
Field = Diagnosis or else Diagnosis ID
Table = This should already be filled in
Total = Group By
CrossTab = Column Heading (You can select this form the drop down
list, or type it in)

In the Fourth Colum of the Query Design Grid- Little Tricky read this carefully
Field = The same field as the third colum, Diagnosis or Diagnosis ID
Table = This should be filled in for you
Total = Max (Use upper and lower case just like I have it)
Crosstab = Value
After you have done the above steps for the Fourth Column, go back to
the Field in the Fouth Column. Carefully insert the word MyCount right
before the word "Diagnosis" and put a : after the newly inserted My
Count. End result for Field is now MyCount: Diagnosis

In the Fifth Column
Field = TotalDiagnosises: [MyCount]
Table = Blank Nothing in there
Total = Sum
Crosstab = Row Heading

Run your query by clicking on the red Exclamination Point on the ToolBar.

The output will show the following columns and data
Date   |  PatientID   TotalDiagnosises |Cardiac   |Medical   |Obstetrics
1/1/2006    1,          2                  1                      1
1/1/2006    2,          1                            1
2/23/2006   1,          1                            1

Once you ahve your query run and are looking at the output data, use
the sort button and sort from Z to A on the TotalDiagnosises Field.
This will show you the highest number of diagnoses a single patient
had in one visit. Write that number down.

Now make a copy of your Transactions Table
Name this Table "CopyTransactonsTable"

Now make a new Table, call the new Table "ExportTable"
Put the following Fields in this table
Field One = Date
Field two = Patient ID
That nubmer I asked you to write down above, the maximum number of
diagnosists  any one patient had on one visit, let's say it is 9. You
will now create 9 fields in the ExportTable, Table. Call them
Diagnostic1,Diagnostic2, etc. all the way to Diagnostic9

I am starting to get tired... If I dont' finish this tonight I'll come
back and finish tomorrow.

Write the Crosstab Query to get the Minimum Diagnosis for one patient on one visit.
I'm getting really tired, so I will shorten this up.
This Query, amke it a crosstab, 
Column 1
Field = Date
Table = CopyOfTransactionsTable
Total = Group By
Crosstab = RowHeading

Colum2
Field = PatientID
Table = CopyOfTransactionsTable
Total = Group By
Crosstab = Column Heading

Column 3
Field = Diagnosis
Table = CopyOfTransactionsTable
Total = Count
Crosstab = Value

Clumn 4
Field = Diagnosis
Table = CopyOfTransactionsTable
Total = Min
Crosstab = RowHeading

Column 5
Field = Patient ID
Table = CopyOfTransactionsTable
Total = Min
Crosstab = RowHeading

Next you would Write an Append Query, as the Source for this Query it
is theLast CrossTab Query we jsut built. You would append the fields
Date, MinOfDiagnosis,MinOfpatientNumber to your export table.
MinOfDiagnosis in Query2 goes into Diagnosis1 in the Export Table.

Then you have to write a Delete Query to Delet the same records you
jsut appended to your Export Table, so that you don't append them
again. Once those records have been deleted from the
CopyofTransactionsTable, you will just repeat the Append and Delete
Queries, in our example 9 times. However you do have to change the
Append Query, the second time you run it you will need to append the
MinOfDiagnosis Field in the Query into Diagnosis2. The Third time you
run it, change the query to insert MinOfDiagnosis into Diagnosis 3
etc.

I have a perfectly valid Delete query written and I keep getting an
error that I don't have permissions, I rarely use access anymore, so I
will ahve to check out how the permissions are set up and why it is
saying I don't have permission.

I hope that the very first thing you did was make a copy fo the who
database so that you are not working on the live database. Good Luck
to you, this really wasn't all that hard to do usg the basic tools in
MS Access.
Subject: Re: MS Access - Creating the 'flatest' dataset possible
From: q_juno-ga on 13 Feb 2006 11:07 PST
 
I analyze inpatient data all the time (as a hospital Decision Support
Coordinator/RN) using TrendStar, and have some experience with Access,
but none with SPSS.  Lots of inpatients have multiple diagnoses per
encounter, I'm afraid, and I frequently have to choose between a
totally flat database and one with the diagnosis and/or procedure
detail I need.
I'd like to add some questions and comments.

How many individual accounts are you dealing with?  If the total is
only a few thousand, Excel might be able to help you.  Export or copy
your table with Acct # and diagnoses, add a column to assign Dx1, Dx2,
Dx3, etc., and pivot.

Do you have access to the diagnosis rank?  As in Principal, 2nd, 3rd,
etc?  If so, you have a built-in way to pivot your data.  The rank
also gives your analysis much more validity, I think.

How important is it to keep all the diagnoses for each account #?  Do
you want to be able to say "show me readmission rate for all females
over 50 with a prior history of Type 2 Diabetes" for example?

Are you interested in just the broad categories you mention?  An
if/then statement with a case count might work for you.  "Heart
Cases:=iif(Dx>=##.## And Dx<##.##,1,0)" or, even better, a mapping
table, then summarize.

I really enjoy my job!  and wish you good luck with yours.
Subject: Re: MS Access - Creating the 'flatest' dataset possible
From: epi_girl-ga on 14 Feb 2006 13:43 PST
 
Hi Everyone - Thank you so much for all of your comments (I actually
thought I would be sent an email saying there had been activity on my
question and I was getting worried...until I checked it out today!)

I will be trying some of the suggestions given to me using Access,
though some of the suggestions may be a bit over my head. I know
Access fairly well, but maybe not well enough ;-)

To just respond to some of the other questions: I am working with
350,000 records, of which about 115, 000 have diagnoses attached to
them (not all unfortunately). And the diagnosis codes are not
prioritized, I'm afraid. As well, my numbers preclude me importing to
excel and using pivot tables as someone kindly suggested. I am working
with patient records, but not from a hospital - they are
inter-facility patient tranfer records. Some of my analysis might look
like this "identifying cardiac transfers from a particular
facility/region maybe even by age". I would eventually like to create
a model that could predict some patterns in patient movement
throughout our health care system.

If anyone else has any more suggestions for me I would really
appreciate it. In the meantime, I will attempt to try the suggestions
already posted.

Cheers, epi_girl

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy