Oracle query design question:
If I have 3 tables: PEOPLE (big), CITIES (medium), STATES (small) and
I want to find out the PEOPLE in the STATE of NY, which is the best
way to write it?
SELECT * FROM PEOPLE p WHERE p.CITY IN (SELECT CITY FROM CITIES c
WHERE c.STATE IN (SELECT STATE FROM STATES s WHERE STATE_CODE='NY'));
or
SELECT * FROM PEOPLE p WHERE p.CITY IN (SELECT CITY FROM CITIES c,
STATES s WHERE c.STATE=s.STATE AND s.STATE_CODE='NY');
or
SELECT * FROM PEOPLE p WHERE EXISTS (SELECT 'x' FROM CITIES c WHERE
c.CITY=p.CITY AND EXISTS (SELECT 'x' FROM STATES s WHERE
s.STATE=c.STATE AND STATE_CODE='NY'));
or finally
SELECT * FROM PEOPLE p WHERE EXISTS (SELECT 'x' FROM CITIES c, STATES
s WHERE c.CITY=p.CITY AND c.STATE=s.STATE AND STATE_CODE='NY');
A description of why would be helpful too (like if IN is faster than
EXISTS or if using multiple subqueries is better than one big one).
Also, when I use EXISTS, should I set up the joins as I did above, or
should the outside field (like p.CITY go first)? |
Request for Question Clarification by
rhansenne-ga
on
17 Jul 2004 01:21 PDT
Hi ny_insomniac-ga,
I'm not sure what the fastest statement would be on Oracle, but you
might consider totally eliminating subselects, by using a 3-table
join. Something like:
SELECT * FROM (PEOPLE p INNER JOIN CITIES c ON p.CITY = c.CITY) INNER
JOIN STATES s ON c.STATE = s.STATE WHERE STATE_CODE='NY';
Usings joins is often faster than subselects, especially when you have
indexes on both primary and foreign keys.
Sincerely,
rhansenne-ga
|
Request for Question Clarification by
rhansenne-ga
on
17 Jul 2004 01:23 PDT
PS: in the above statement you could replace
SELECT * ...
by
SELECT p.* ...
to further enhance performance.
|
Clarification of Question by
ny_insomniac-ga
on
19 Jul 2004 07:32 PDT
So you're saying that having a simple single query would be better
than using the subqueries. Intuitively I was trying to pull the extra
tables out of the FROM clause since I was only using them for
filtering. Are you sure this is the best approach?
|
Request for Question Clarification by
rhansenne-ga
on
20 Jul 2004 09:25 PDT
Hi ny_insomniac-ga,
It is a common best practice to 'flatten' subqueries into (inner)
joins. The database's query optimizer usually is better at optimizing
joins than nested queries. On SQL Server the suggested query would
definitely be the faster one.
I don't have any query optimization experience on Oracle however, so
I'm only offering it as a suggestion and not an answer to your
question.
Sincerely,
rhansenne-ga.
|
Clarification of Question by
ny_insomniac-ga
on
23 Jul 2004 11:10 PDT
ok, thanks
|