Google Answers: Oracle SQL*Loader program needed

View Question

Q: Oracle SQL*Loader program needed ( No Answer, 1 Comment )

Question

Subject: Oracle SQL*Loader program needed
Category: Computers > Programming
Asked by: westonian-ga
List Price: $40.00

Posted: 15 Nov 2004 10:32 PST
Expires: 15 Dec 2004 10:32 PST
Question ID: 429276

I like to have an efficient Oracle SQL*Loader program to load a datafile into an relational table in Oracle 10g (10.1.0.3) database. I know SQL loader pretty well. Here there is a small twist. The data file is in a slightly different format. Please see the example. This is just a sample file that I have created to explain the problem. Data file price.txt =================== Product ID,Product Name,Black List Price,Black Sale Price,White List Price,White Sale Price,Green List Price,Green Sale Price, 10011,Shoe,89.95,49.99,79.95,39.99,69.95,49.99 10032,Pant,99.99,69.99,89.99,59.99,99.99,69.99 Relational table price ========================= ------------------------------------------------------------------ PRODUCT_ID PRODUCT_NAME COLOR LIST_PRICE SALE_PRICE ------------------------------------------------------------------ 10011 Shoe Black 89.95 49.99 10032 Pant Black 99.99 69.99 10011 Shoe White 79.95 39.99 10032 Pant White 89.99 59.99 10011 Shoe Green 69.95 49.99 10032 Pant Green 99.99 69.99 I am looking for the price.ctl file containing the SQL logic to convert the data to have a "color" column. The above example is just a sample. In the real life problem, I have more than 500,000 records to load in the databse on daily basis. I also have files with different headers. Instead of "color" it may be "size" that varies. The solution must be scalable to high number of records. I want to avoid running a sql script after loading the data in the original format. But I will do if it can't be avoided. Also, I like the solution should be flexible so that I can just change few things in the ctl file to use the same for other data loads that requires the same solution. Thanks Westonian
Clarification of Question by westonian-ga on 15 Nov 2004 10:35 PST The header of the test file is =================================================== Product ID,Product Name,Black List Price,Black Sale Price,White List Price,White Sale Price,Green List Price,Green Sale Price =================================================== It got wrapped when send it. Thanks
Request for Question Clarification by mathtalk-ga on 15 Nov 2004 16:56 PST Hi, westonian-ga: So the basic issue is how to get SQL*Loader to insert multiple rows per single line of text in the input file, right? Are there potential issues with "missing colors"? For example, shoes and pants come in all three colors in your example, but perhaps gloves are only available in black and white (color green prices are missing). regards, mathtalk-ga
Clarification of Question by westonian-ga on 15 Nov 2004 17:50 PST Dear Mathtalk, Thanks for picking this up. In my data set, there will be always zeros if there is a missing color. So technically, it is not a problem for our data loading program. But it is the shortcoming of the system that generated the original data. It would be great to optionally NOT to insert a row in the table if both LIST_PRICE and SALE_PRICE are zero. We can have two modes. 1. Insert all rows. 2. Insert only the rows that at least have one non-zero value. You can tell what part of the code I should change when I switch between modes? As I said, the above example is just a sample. In my real data, there are more than two (some times up to 15) repeating attributes like LIST_PRICE and SALE_PRICE. Thanks Westonian
Clarification of Question by westonian-ga on 17 Nov 2004 18:16 PST Veiledlady, Thanks for the help. You are right about the difficulty in doing it in SQLLoader. I am not a SQL expert. But I have done some simple data loading with SQLLoader. If it is difficult to do it with SQLLoader or it affects the scalability, then I am ready to use a SQL script to change the data after it has been loaded. BTW, somebody told me that the SQLLoader internally uses SQL commands to load the data. If that so, will there any performance difference between SQLLoader and SQL scripts? In fact I found SQLLoader is much faster than SQL insert statements. I could be wrong. I use Oracle 10g. I think external tables will work in 10g also. But I don't how to use external tables to solve this problem. In other words, I need the SQL script to flip the data. Westonian Thanks Anto.
Request for Question Clarification by mathtalk-ga on 19 Nov 2004 21:05 PST Hi, Anto.: I suspect we can trick SQL*Loader into loading multiple records per line of input, along the lines you want, by varying the concept of loading multiple tables in one pass described here: [Can one load data into multiple tables at once? -- orafaq.com] http://www.orafaq.com/faqloadr.htm#MULTITABLE It'll take me a while to polish an illustration of this (assuming it works), and if you are already proficient with control files, you'll be able to proceed independently in the meantime. Oracle's own FAQ (the above is a third party) describes the difference in load methods: [sql_loader_overview] http://www.oracle.com/technology/products/database/utilities/htdocs/sql_loader_overview.html A "convention path" load does indeed use underlying INSERT statements to import the data, on a record by record basis. A "direct path" load is an optimization for speed in which memory blocks are built and then saved "directly into the extents allocated for the table being loaded." regards, mathtalk-ga
Clarification of Question by westonian-ga on 23 Nov 2004 06:04 PST Dear Mathtalk, Thanks for the quick reply and sorry for the delay from my side. The core part of my question is "how to pivot the table". It can be done at load-time or after the load (using SQL). I assume I can avoid intermediate steps and tables if I can some how do it at the load time. There is another option. Use multiple tables to load as you described in your email. Then create a "view" joining the two tables and finally producing my output format. If we go by this path, I still need to have the option of ignoring or including the records with zero values. May be this can be done by defining the view different ways (with and without zero values). In either case, I like to get the SQL to pivot (or merge in the case of ?view?) the data. If I write the SQL my own, I doubt it may not be the most efficient one. I am looking for the solution that works efficiently in Oracle environment. Thanks Westonian
Clarification of Question by westonian-ga on 24 Nov 2004 11:07 PST mathtalk-ga, Please let me know whether you are working on this question. BTW, I see that this may involve more of your time and I have increased the List Price to $50. Thanks Westonian.
Request for Question Clarification by mathtalk-ga on 24 Nov 2004 12:47 PST Thanks, westonian-ga. I will be happy to share some SQL for pivoting the table, as well as a stab at trying to make SQL*Loader do the trick. I don't think there is any special SQL optimization involved here, though. You will be taking all the rows from one "scratch" table and inserting them into another table (probably after truncating the records from the latter "target" table so that you start afresh). Apart from dropping all or most indexes that you might like to have in production for this target table, and rebuilding them after the load, I dont' think that there is any particular trick to doing this quickly. But I'll be happy to kick around with you the experimental results & see if some optimizations occur to us. regards, mathtalk-ga
Clarification of Question by westonian-ga on 24 Nov 2004 13:40 PST mathtalk-ga, Thanks for committing to this. I am already too late in getting this work done. I hope you will find one of the following options. Option 1. SQL*Loader does the magic with an option to ignor zero value rows. Option 2. Load the data into multiple tables (normalized - with an ID to join the tables). Then create two materialized views for mode 1 and mode 2. (Mode 1 = ignore records with all zero prices. Mode 2= keep all rows). The view definition has the pivoting SQL. Option 1 Advantage: It is simple. It is a single step process. This is what I asked for in my original question. Option 2 Advantage: This option has an additional step (view creation). But this option stores for much lesser amount of data due to normalization. I guess the performance will be better in this option because we select on small number of records. For example, if we have 10 colors, Option 1 may have about 10 times more number of records than the Option 2. Materialized views (instead of ordinary views) cache the data and make it really work like a table. The other advantage is that we don't have to store the zero-value records for both modes. The view can populate the zero values based on a SQL statement. This will further reduce storage. I want to know what you think about these two options. Let me know which way you are proceeding. Thanks Westonian
Clarification of Question by westonian-ga on 02 Dec 2004 22:10 PST mathtalk-ga, Are you there? I need at least "one" solution by tommorrow (EOB 12/3/04). Even if it is not the most efficient, it is OK. I am Ok to use SQL after loading it. Please help ASAP. Thanks westonian
Request for Question Clarification by mathtalk-ga on 03 Dec 2004 06:02 PST Hi, westonian-ga: Let's try this. Naturally I'm guessing at the datatypes based on your example. CREATE table DataFile ( PRODUCT_ID Number(9), PRODUCT_NAME Varchar2(30), BLACK_LIST_P Number(9.2), BLACK_SALE_P Number(9.2), WHITE_LIST_P Number(9.2), WHITE_SALE_P Number(9.2), GREEN_LIST_P Number(9.2), GREEN_SALE_P Number(9.2) ); Then load the comma-delimited data file price.txt straight into table DATA_FILE using SQL*Loader, skipping if it's present the header row (or just allowing it to kick out into the error log since "Product ID" won't convert to a number). Now assuming that you have the column names/datatypes fixed up in the above to match your actual target table, you can do this: INSERT into Product_Price ( PRODUCT_ID, PRODUCT_NAME, COLOR, LIST_PRICE, SALE_PRICE ) SELECT PRODUCT_ID, PRODUCT_NAME, 'Black', BLACK_LIST_P, BLACK_SALE_P from DataFile where BLACK_LIST_P > 0 and BLACK_SALE_P > 0; COMMIT; INSERT into Product_Price ( PRODUCT_ID, PRODUCT_NAME, COLOR, LIST_PRICE, SALE_PRICE ) SELECT PRODUCT_ID, PRODUCT_NAME, 'White', WHITE_LIST_P, WHITE_SALE_P from DataFile where WHITE_LIST_P > 0 and WHITE_SALE_P > 0; COMMIT; INSERT into Product_Price ( PRODUCT_ID, PRODUCT_NAME, COLOR, LIST_PRICE, SALE_PRICE ) SELECT PRODUCT_ID, PRODUCT_NAME, 'Green', GREEN_LIST_P, GREEN_SALE_P from DataFile where GREEN_LIST_P > 0 and GREEN_SALE_P > 0; COMMIT; The speed at which this works for large datasets may be enhanced by first dropping all the indexes on the target table Product_Price and rebuilding them at the end. If you have a "development" database to try this on, I'd certainly give it a try that way. Rebuilding the indexes can take a long time, but it's usually more efficient to rebuild them all at once than to incur the penalty of updating them incrementally as INSERT's take place. regards, mathtalk-ga

Answer

There is no answer at this time.

Comments

Subject: Re: Oracle SQL*Loader program needed
From: veiledlady-ga on 17 Nov 2004 16:39 PST

If you want a scalable solution, the best thing is to load the data
as-is using SQL loader, and then running a SQL script to process the
data into the format you want.

Trying  to make SQL loader, pivot the data for you will prove
expensive in the long run and will not be a scalable option.

Please check the Oracle documentation for more on Oracle Utilities
like SQL Loader. Not sure if you are using  Oracle 9i, if so you can
use external tables.


Hope this helps.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.

Search Google Answers for

Google Home - Answers FAQ - Terms of Service - Privacy Policy