Q: Optimize MySQL query with JOINS and GROUP (Answered, 0 Comments)
Question  
Subject: Optimize MySQL query with JOINS and GROUP
Category: Computers > Programming
Asked by: skyhigh-ga
List Price: $35.00
Posted: 23 Feb 2006 22:00 PST
Expires: 25 Mar 2006 22:00 PST
Question ID: 700251
I have been trying to fix this query for weeks and have gotten
nowhere, and it's dragging my website to a crawl.  Part of me thinks
I went a little overboard on the JOINs...

We have four tables: bookinventory (85,000 records, 5 MB), bookitem
(90,000 records, 200 MB), bookusers, and shipto (the last two are
small tables).

CREATE TABLE `bookinventory` (
  `ISBN` varchar(32) NOT NULL default '',
  `userid` int(11) NOT NULL default '0',
  `itemnum` int(11) NOT NULL auto_increment,
  `listtime` datetime NOT NULL default '0000-00-00 00:00:00',
  `status` int(16) default NULL,
  `sentto` int(11) NOT NULL default '0',
  `comment` text NOT NULL,
  `trackback` int(11) default NULL,
  `condition` varchar(16) NOT NULL default '',
  PRIMARY KEY  (`itemnum`),
  KEY `time_idx` (`listtime`)
) TYPE=MyISAM AUTO_INCREMENT=122082 ;

CREATE TABLE `bookitem` (
  `id` int(11) NOT NULL auto_increment,
  `isbn` varchar(32) NOT NULL default '',
  `title` varchar(128) NOT NULL default '',
  `author` varchar(255) NOT NULL default '',
  `reviews` text NOT NULL,
  `smallimg` varchar(128) NOT NULL default '',
  `largeimg` varchar(128) NOT NULL default '',
  `category` varchar(255) NOT NULL default '',
  `subject` varchar(255) NOT NULL default '',
  `releasedate` varchar(32) NOT NULL default '0000-00-00',
  `publisher` varchar(64) NOT NULL default '',
  `updatetime` date NOT NULL default '0000-00-00',
  `media` varchar(32) NOT NULL default '',
  `description` text NOT NULL,
  `rating` float NOT NULL default '0',
  PRIMARY KEY  (`id`),
  KEY `media_idx` (`media`),
  KEY `isbn_idx` (`isbn`)
) TYPE=MyISAM AUTO_INCREMENT=91640 ;

CREATE TABLE `bookusers` (
  `userid` int(11) NOT NULL auto_increment,
  `username` varchar(64) NOT NULL default '',
  `password` varchar(64) NOT NULL default '',
  `realname` varchar(128) NOT NULL default '',
  `street` varchar(128) NOT NULL default '',
  `city` varchar(64) NOT NULL default '',
  `state` char(2) NOT NULL default '',
  `zip` varchar(16) NOT NULL default '0',
  `points` int(16) NOT NULL default '0',
  `email` varchar(128) NOT NULL default '',
  `joindate` date NOT NULL default '0000-00-00',
  `session` varchar(32) NOT NULL default '',
  `referer` varchar(64) NOT NULL default '',
  `newsmail` char(3) NOT NULL default 'ON',
  `justmailed` char(3) NOT NULL default 'NO',
  `ip` varchar(32) default NULL,
  `country` varchar(32) NOT NULL default '',
  `status` int(11) NOT NULL default '0',
  `premiumends` date NOT NULL default '0000-00-00',
  `bookmailer` char(3) NOT NULL default 'OFF',
  `dvdmailer` char(3) NOT NULL default 'OFF',
  `cdmailer` char(3) NOT NULL default 'OFF',
  `vhsmailer` char(3) NOT NULL default 'OFF',
  `videogamemailer` char(3) NOT NULL default 'OFF',
  `magazinemailer` char(3) NOT NULL default 'OFF',
  `paid` int(11) default NULL,
  `shipoutsidecountry` char(3) NOT NULL default '',
  `countryfilter` char(3) NOT NULL default 'ON',
  PRIMARY KEY  (`userid`),
  UNIQUE KEY `uniqueadd` (`street`,`zip`),
  UNIQUE KEY `userid` (`userid`),
  UNIQUE KEY `username` (`username`)
) TYPE=MyISAM AUTO_INCREMENT=4218 ;

CREATE TABLE `shipto` (
  `id` int(11) NOT NULL auto_increment,
  `userid` int(11) NOT NULL default '0',
  `country` varchar(32) NOT NULL default '',
  PRIMARY KEY  (`id`),
  UNIQUE KEY `cat` (`userid`,`country`)
) TYPE=MyISAM AUTO_INCREMENT=91499 ;

bookinventory: Holds information about items that someone has, people
can have the same item
bookitem: Holds common info about an item, like its title
bookusers: Holds user info
shipto: Holds lists of countries that a user ships to

If I run this statement, it can take 10 seconds to run:
SELECT bookinventory.ISBN,
       bookinventory.trackback,
       bookinventory.userid,
       bookusers.userid,
       bookusers.country,
       bookusers.shipoutsidecountry,
       DATE_FORMAT(bookinventory.listtime, '%M %D, %Y') AS listtime,
       bookitem.*,
       shipto.*,
       COUNT(*) AS count
FROM bookinventory, bookitem, bookusers, shipto
WHERE (bookitem.media = 'Hardcover' OR bookitem.media = 'Paperback')
  AND bookinventory.ISBN = bookitem.isbn
  AND bookinventory.status IS NULL
  AND bookinventory.userid != '1'
  AND bookinventory.userid = bookusers.userid
  AND shipto.userid = bookusers.userid
  AND shipto.country = 'United States of America'
GROUP BY bookinventory.ISBN
ORDER BY bookitem.title ASC
LIMIT 0, 30

I am grouping by ISBN because multiple people might have the same
item, but we want it listed once.
Explain of query:

table          type    possible_keys          key      key_len  ref                     rows   Extra
bookitem       range   PRIMARY,media,isbntbl  media    32       NULL                    51151  Using where; Using temporary; Using filesort
bookinventory  ref     ISBN,browse            ISBN     32       bookitem.isbn           1      Using where
bookusers      eq_ref  PRIMARY,userid         PRIMARY  4        bookinventory.userid    1
shipto         eq_ref  cat                    cat      36       bookusers.userid,const  1      Using where

Is there any way to make this query work faster?

Clarification of Question by skyhigh-ga on 23 Feb 2006 22:12 PST
What changes would help this query run more efficiently, in
sub-second times rather than 10 seconds?
Answer  
Subject: Re: Optimize MySQL query with JOINS and GROUP
Answered By: answerguru-ga on 23 Feb 2006 22:38 PST
 
Hello skyhigh-ga,

The problem with your query is that you are listing multiple large
tables in the FROM clause, with the join conditions buried in the
WHERE clause. Conceptually, what that syntax expresses is this:

1. Tables bookinventory and bookitem are joined with no join condition
2. The resulting intermediate result contains 90,000 * 85,000 records
3. That is multiplied by the number of rows in bookusers and shipto
4. All of the filters in your WHERE clause are applied to this massive table

I can certainly understand such a query taking 10 seconds or longer!

Instead, I suggest you explicitly join your tables so that you can
specify a condition when the join actually occurs. The optimized query
is below:

SELECT bookinventory.ISBN,
       bookinventory.trackback,
       bookinventory.userid,
       bookusers.userid,
       bookusers.country,
       bookusers.shipoutsidecountry,
       DATE_FORMAT(bookinventory.listtime, '%M %D, %Y') AS listtime,
       bookitem.*,
       shipto.*,
       COUNT(*) AS count
FROM bookinventory
JOIN bookitem ON bookitem.isbn = bookinventory.ISBN
JOIN bookusers ON bookusers.userid = bookinventory.userid
JOIN shipto ON shipto.userid = bookusers.userid
WHERE (bookitem.media = 'Hardcover' OR bookitem.media = 'Paperback')
  AND bookinventory.status IS NULL
  AND bookinventory.userid != '1'
  AND shipto.country = 'United States of America'
GROUP BY bookinventory.ISBN
ORDER BY bookitem.title ASC
LIMIT 0, 30


As you can see above, the three join conditions that were previously
in the WHERE clause have been moved into ON clauses in the three new
JOINs. We also now start from a single table and specify explicitly
how each additional table required by the final result set is joined
in.
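
If the rewrite alone doesn't bring the time down, one further thing
worth experimenting with (a hypothetical index, not tested against
your data) is a composite index on bookitem, so the media filter and
the title sort can potentially be served from the same index:

```sql
-- Hypothetical composite index: it covers the media filter and stores
-- entries in (media, title) order, which may let MySQL avoid the
-- "Using filesort" step reported in your EXPLAIN output.
ALTER TABLE bookitem ADD INDEX media_title_idx (media, title);
```

Note that because you filter on two media values ('Hardcover' OR
'Paperback'), MySQL may still not be able to use the index for
ordering; check the new EXPLAIN output after adding it.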

Hopefully this resolves your performance headaches - if you have any
concerns about the information above, please post a clarification and
I will respond to it promptly.

Cheers!

answerguru-ga

Request for Answer Clarification by skyhigh-ga on 23 Feb 2006 22:50 PST
answerguru,
  Thanks so much for the quick reply!  I ran the query exactly as you
had it, but am not seeing any better performance.  The EXPLAIN does
look quite a bit better though which I find encouraging.  Might my
indexes not be right?  I have changed them over and over again so they
could be out of whack.  Do you have any further thoughts?

Thanks again for the help!
Skyhigh

Request for Answer Clarification by skyhigh-ga on 23 Feb 2006 22:52 PST
Oops, I meant to include the EXPLAIN if it might help... sorry it
doesn't paste very well.

table          type    possible_keys          key      key_len  ref                     rows   Extra
bookitem       range   PRIMARY,media,isbntbl  media    32       NULL                    51151  Using where; Using temporary; Using filesort
bookinventory  ref     ISBN,browse            ISBN     32       bookitem.isbn           1      Using where
bookusers      eq_ref  PRIMARY,userid         PRIMARY  4        bookinventory.userid    1
shipto         eq_ref  cat                    cat      36       bookusers.userid,const  1      Using where

Request for Answer Clarification by skyhigh-ga on 23 Feb 2006 22:57 PST
I removed the GROUP BY clause and the query runs in 0.7 seconds.
Does that tell you anything, other than that the GROUP BY is causing
the trouble now?

Clarification of Answer by answerguru-ga on 23 Feb 2006 23:14 PST
It does appear to be the GROUP BY that is causing the remainder of
your problem. Although I don't like to use it unless I absolutely have
to, it appears to be necessary for what you are trying to achieve.

GROUP BY is an expensive operation by nature and I could certainly see
it taking up the bulk of your total querying time.

Your indexes appear to be fine, although there may be something
specific to your data which says otherwise.

If you really can't afford to wait, I would consider altering the
interface so that such a heavy query can be avoided. For example, you
can avoid retrieving the aggregate columns until the user has selected
a specific record in one of the larger tables.
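
As a rough sketch of that idea (assuming MySQL 4.1 or later for the
derived table, and omitting the status/userid/shipto filters for
brevity; they would go back into the inner query):

```sql
-- Step 1 (inner query): find just the 30 ISBNs for the current page,
-- grouping and sorting only a narrow column instead of full rows.
-- Step 2 (outer query): compute the copy counts for those 30 ISBNs only.
SELECT bi.ISBN, COUNT(*) AS copies
FROM bookinventory AS bi
JOIN (
    SELECT bookinventory.ISBN
    FROM bookinventory
    JOIN bookitem ON bookitem.isbn = bookinventory.ISBN
    WHERE bookitem.media IN ('Hardcover', 'Paperback')
    GROUP BY bookinventory.ISBN
    ORDER BY bookitem.title ASC
    LIMIT 0, 30
) AS page ON page.ISBN = bi.ISBN
GROUP BY bi.ISBN;
```

The point is that the expensive GROUP BY then runs over a single
narrow column rather than dragging bookitem's large text columns
(reviews, description) through the temporary table.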

Sometimes you simply have to accept trade-offs when dealing with
expensive queries like this.

answerguru-ga
Comments  
There are no comments at this time.
