Google Answers Logo
View Question
 
Q: MySql Query Optimization ( No Answer,   0 Comments )
Question  
Subject: MySql Query Optimization
Category: Computers > Software
Asked by: jeffiel-ga
List Price: $20.00
Posted: 02 Jan 2006 13:29 PST
Expires: 02 Jan 2006 17:33 PST
Question ID: 428113
Hello,

I have a MySql query which is stumping my abilitites to optimize it. 
Given a set of rows marked "dirty", and a column containing
heirarchial data in LDAP DN format, I want to mark all "peers" of the
dirty rows for updating, which I'll do in another query.  From all I
can tell, this query has proper indexes, and is not doing something
stupid like an accidntal cross join.  So assistance in getting this to
run in reasonable time over a decent sized data set is worth money to
me.

Here is my query:

UPDATE derived d1, derived d2 SET
d1.runid='01964652a821d326741c67cbbe943684' WHERE
d2.axis_ReceiptDate_parent = 'year=2005' AND
d2.parent_dirty_ReceiptDate = 1 AND d2.type=1 AND d1.type=1 AND
d1.hash != '' AND d1.runid IS NULL AND d1.axis_ReceiptDate_parent =
d2.axis_ReceiptDate_parent AND d1.axis_SalesPersonGeography =
d2.axis_SalesPersonGeography AND d1.axis_Departments =
d2.axis_Departments

NOTE: my original version of this query used a subquery, but MySql
doesn't allow a select and update on the same table in the same query,
so I had to rewrite using this JOIN.

Here is the table definition:

+-----------------------------------+--------------+------+-----+-------------------+-------+
| Field                             | Type         | Null | Key |
Default           | Extra |
+-----------------------------------+--------------+------+-----+-------------------+-------+
| alter_date                        | timestamp    | YES  |     |
CURRENT_TIMESTAMP |       |
| type                              | int(1)       |      | PRI | 0   
             |       |
| hash                              | varchar(32)  | YES  | MUL | NULL
             |       |
| user_data_Subtotal                | double(18,8) | YES  |     |
0.00000000        |       |
| goal_Subtotal                     | double(18,8) | YES  |     |
0.00000000        |       |
| avg_Subtotal                      | double(18,8) | YES  |     | NULL
             |       |
| user_data_TotalCogs               | double(18,8) | YES  |     |
0.00000000        |       |
| goal_TotalCogs                    | double(18,8) | YES  |     |
0.00000000        |       |
| avg_TotalCogs                     | double(18,8) | YES  |     | NULL
             |       |
| user_data_TotalNumSold            | double(18,8) | YES  |     |
0.00000000        |       |
| goal_TotalNumSold                 | double(18,8) | YES  |     |
0.00000000        |       |
| avg_TotalNumSold                  | double(18,8) | YES  |     | NULL
             |       |
| user_data_AvgTicket               | double(18,8) | YES  |     |
0.00000000        |       |
| goal_AvgTicket                    | double(18,8) | YES  |     |
0.00000000        |       |
| avg_AvgTicket                     | double(18,8) | YES  |     | NULL
             |       |
| user_data_AvgTicket_sum           | double(18,8) | YES  |     |
0.00000000        |       |
| user_data_AvgTicket_count         | double(18,8) | YES  |     |
0.00000000        |       |
| user_data_AvgMargin               | double(18,8) | YES  |     |
0.00000000        |       |
| goal_AvgMargin                    | double(18,8) | YES  |     |
0.00000000        |       |
| avg_AvgMargin                     | double(18,8) | YES  |     | NULL
             |       |
| user_data_AvgMargin_sum           | double(18,8) | YES  |     |
0.00000000        |       |
| user_data_AvgMargin_count         | double(18,8) | YES  |     |
0.00000000        |       |
| axis_ReceiptDate                  | varchar(200) |      | PRI |     
             |       |
| axis_ReceiptDate_parent           | varchar(200) | YES  | MUL | NULL
             |       |
| axis_ReceiptDate_sort             | varchar(200) | YES  |     | NULL
             |       |
| parent_dirty_ReceiptDate          | int(1)       | YES  |     | NULL
             |       |
| axis_SalesPersonGeography         | varchar(200) |      | PRI |     
             |       |
| axis_SalesPersonGeography_parent  | varchar(200) | YES  | MUL | NULL
             |       |
| axis_SalesPersonGeography_sort    | varchar(200) | YES  |     | NULL
             |       |
| parent_dirty_SalesPersonGeography | int(1)       | YES  |     | NULL
             |       |
| axis_Departments                  | varchar(200) |      | PRI |     
             |       |
| axis_Departments_parent           | varchar(200) | YES  | MUL | NULL
             |       |
| axis_Departments_sort             | varchar(200) | YES  |     | NULL
             |       |
| parent_dirty_Departments          | int(1)       | YES  |     | NULL
             |       |
| runid                             | varchar(32)  | YES  | MUL | NULL
             |       |
| axis_receiptdate_depth            | int(2)       | YES  |     | NULL
             |       |
| axis_departments_depth            | int(2)       | YES  |     | NULL
             |       |
| axis_salespersongeography_depth   | int(2)       | YES  |     | NULL
             |       |
+-----------------------------------+--------------+------+-----+-------------------+-------+


Here are my indexes:

mysql> show indexes from derived;
+---------+------------+--------------------------------+--------------+-----------------------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name                       | Seq_in_index
| Column_name                       | Collation | Cardinality |
Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+--------------------------------+--------------+-----------------------------------+-----------+-------------+----------+--------+------+------------+---------+
| derived |          0 | PRIMARY                        |            1
| type                              | A         |           1 |    
NULL | NULL   |      | BTREE      |         |
| derived |          0 | PRIMARY                        |            2
| axis_ReceiptDate                  | A         |         808 |    
NULL | NULL   |      | BTREE      |         |
| derived |          0 | PRIMARY                        |            3
| axis_SalesPersonGeography         | A         |       10277 |    
NULL | NULL   |      | BTREE      |         |
| derived |          0 | PRIMARY                        |            4
| axis_Departments                  | A         |       71945 |    
NULL | NULL   |      | BTREE      |         |
| derived |          1 | index_d_ReceiptDate            |            1
| axis_ReceiptDate                  | A         |         808 |    
NULL | NULL   |      | BTREE      |         |
| derived |          1 | index_d_ReceiptDate            |            2
| parent_dirty_ReceiptDate          | A         |         808 |    
NULL | NULL   | YES  | BTREE      |         |
| derived |          1 | index_d_ReceiptDate_p          |            1
| axis_ReceiptDate_parent           | A         |          31 |    
NULL | NULL   | YES  | BTREE      |         |
| derived |          1 | index_d_SalesPersonGeography   |            1
| axis_SalesPersonGeography         | A         |          59 |    
NULL | NULL   |      | BTREE      |         |
| derived |          1 | index_d_SalesPersonGeography   |            2
| parent_dirty_SalesPersonGeography | A         |          59 |    
NULL | NULL   | YES  | BTREE      |         |
| derived |          1 | index_d_SalesPersonGeography_p |            1
| axis_SalesPersonGeography_parent  | A         |           3 |    
NULL | NULL   | YES  | BTREE      |         |
| derived |          1 | index_d_Departments            |            1
| axis_Departments                  | A         |          19 |    
NULL | NULL   |      | BTREE      |         |
| derived |          1 | index_d_Departments            |            2
| parent_dirty_Departments          | A         |          19 |    
NULL | NULL   | YES  | BTREE      |         |
| derived |          1 | index_d_Departments_p          |            1
| axis_Departments_parent           | A         |           2 |    
NULL | NULL   | YES  | BTREE      |         |
| derived |          1 | index_d_hash                   |            1
| hash                              | A         |       71945 |    
NULL | NULL   | YES  | BTREE      |         |
| derived |          1 | index_d_runid                  |            1
| runid                             | A         |           3 |    
NULL | NULL   | YES  | BTREE      |         |
+---------+------------+--------------------------------+--------------+-----------------------------------+-----------+-------------+----------+--------+------+------------+---------+


MySql won't allow an "explain" on an update, so here's an explain on a
similar select:


mysql> explain select d1.runid from derived d1, derived d2 WHERE
d2.axis_ReceiptDate_parent = 'year=2005' AND
d2.parent_dirty_ReceiptDate = 1 AND d2.type=1 AND d1.type=1 AND
d1.hash != '' AND d1.runid IS NULL AND d1.axis_ReceiptDate_parent =
d2.axis_ReceiptDate_parent AND d1.axis_SalesPersonGeography =
d2.axis_SalesPersonGeography AND d1.axis_Departments =
d2.axis_Departments;

+----+-------------+-------+------+-----------------------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys                     
                                                                      
| key                          | key_len | ref                        
                     | rows | Extra       |
+----+-------------+-------+------+-----------------------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------------+------+-------------+
|  1 | SIMPLE      | d1    | ref  |
PRIMARY,index_d_ReceiptDate_p,index_d_SalesPersonGeography,index_d_Departments,index_d_hash,index_d_runid
| index_d_ReceiptDate_p        |     201 | const                      
                     | 6117 | Using where |
|  1 | SIMPLE      | d2    | ref  |
PRIMARY,index_d_ReceiptDate_p,index_d_SalesPersonGeography,index_d_Departments
                           | index_d_SalesPersonGeography |     200 |
warehouse_deptsales.d1.axis_SalesPersonGeography | 1219 | Using where
|
+----+-------------+-------+------+-----------------------------------------------------------------------------------------------------------+------------------------------+---------+--------------------------------------------------+------+-------------+

The derived table has roughly 72,000 rows:

mysql> select count(*) from derived;
+----------+
| count(*) |
+----------+
|    71945 |
+----------+

and the update query above generally takes between 50 and 250 seconds
to complete.  Given the indexes, it seems like it should take less
time, like an order of magnitude.
Answer  
There is no answer at this time.

Comments  
There are no comments at this time.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy