Google Answers: SQL DISTINCT COUNT

View Question

Q: SQL DISTINCT COUNT ( Answered 5 out of 5 stars

, 0 Comments )

Question

Subject: SQL DISTINCT COUNT
Category: Computers > Programming
Asked by: mdeeps-ga
List Price: $5.00

Posted: 25 Jul 2003 13:27 PDT
Expires: 24 Aug 2003 13:27 PDT
Question ID: 235114

Ok this question should be relatively easily for someone who knows
their SQL.  It has to do with COUNT and DISTINCT I'm using Microsoft
SQL Server.

All right I have a table that looks like so:

SalesTable

SalesPerson       Department      Item     Qty    Price    SaleDate
Eric Chun           VideoEq        01001    2       15.00    7/24/03
Eric Chun           AudioEq        15646    1       45.00    7/23/03
Fred Chin           Books          87894    2       15.00    7/23/03
Eric Chu            Books          15489    3       55.00    7/23/03
Ricky Lal           Tools          87965    1       5.00    7/23/03
...........................

If I run the following SQL Query

SELECT DISTINCT SalesPerson From SalesTable I get

SalesPerson
Eric Chun
Fred Chin
Ricky Lal
Louis Marc
Mandy Brewer


If I run the following SQL Query

SELECT COUNT(DISTINCT SalesPerson) AS MyCount From SalesTable I get

MyCount
5

If I run the following SQL Query

SELECT DISTINCT SalesPerson, Department From SalesTable I get

SalesPerson       Department
Eric Chun          AudioEq
Eric Chun          VideoEq
Eric Chun          Books
Fred Chin          Books
Fred Chin          Computers
Ricky Lal          Tools
Ricky Lal          VideoEq
Louis Marc         VideoEq
Louis Marc         AudioEq
Mandy Brewer       Computers


What I want is a count of the previous QUERY, however

SELECT COUNT (DISTINCT SalesPerson, Department) AS MyCount From
SalesTable

Does not work.  I can not just run the Query and then count my results
because eventually I will be using a GROUP BY to get several counts
i.e.

SELECT COUNT( DISTINCT Department ) As DepCount From SalesTable Group
By SalesPerson

SalesPerson       DepCount
Eric Chu           3
Fred Chin          2
Ricky Lal          2
Louis Marc         2
Mandy Brewer       1

So the eventual goal would be to run a query like: 

SELECT COUNT( DISTINCT SalesPerson, Department ) As MyCount From
SalesTable Group By Item

To give the different number of sales people who have sold an item,
but it should count a person selling the same item in multiple
departments as multiple people.

Sooooooooo Can I even do this?  I beleive i should be able to, perhaps
its some special syntax that is needed.

Thanks,

Mandeep

Answer

Subject: Re: SQL DISTINCT COUNT
Answered By: mathtalk-ga on 25 Jul 2003 19:52 PDT
Rated: 5 out of 5 stars

Hi, Mandeep:

I don't think there is any simple syntax that you are overlooking.

The COUNT( DISTINCT ... ) syntax requires a single expression to be
substituted for the ellipsis.  If there were a "functor" in SQL that
could form a unique ordered pair, say (SalesPerson,Department), by
associating the two respective field values, then we could use that to
create the required expression.  Unfortunately SQL does not support
this level of abstraction.

A "cheesy" approximation to this (which I have stooped to using on
occasion) would be to exploit the concatenation of the two strings. 
Thus:

SELECT COUNT( DISTINCT SalesPerson + Department )
FROM SalesTable

or the more detailed query:

SELECT Item, COUNT( DISTINCT SalesPerson + Department )
FROM SalesTable
GROUP BY Item

would work imperfectly.  It would be possible (if unlikely) for the
end of one SalesPerson value to agree with the beginning of a
Department value in such a way that the same concatenated string could
be formed by what ought to be two distinct pairings.

For example, if we had two salespersons:

Jan Kruek
Jan Krueks

and two departments:

Tools
Stools

then (unless SQL Server is set for case sensitivity, which is not the
default), there would be two potential ways to get the same names
combination:

Jan KruekStools
Jan KrueksTools

Of course this is somewhat unlikely.  It can also be made essentially
impossible.  If we pick a character that doesn't occur in any
department name, for example, if "#" doesn't appear in any value for
Department, then the revised queries:

SELECT COUNT( DISTINCT SalesPerson + '#' + Department )
FROM SalesTable

and:

SELECT Item, COUNT( DISTINCT SalesPerson + '#' + Department )
FROM SalesTable
GROUP BY Item

can be relied upon without exception.

This can be a useful "trick" if one is interested in COUNT's more than
in an actual list of the distinct pairings of SalesPerson and
Department, and your phrasing of your question gives me every reason
to think that this is the case.  To some extent the trick depends on
the datatypes of SalesPerson and Department being character strings,
but even when this is not so, as perhaps when a datetime is involved,
it can be made to work by converting to strings.

Of course it's also wise to have a more "SQL standard" approach to
this task in mind, if only for the sake of considering an alternative.
 One such approach would be to use a nested subquery as a "derived
table".  Not all SQL implementations allow this, but MS SQL Server
2000 does:

SELECT COUNT(*)
FROM (SELECT DISTINCT SalesPerson, Department
    FROM SalesTable) AS DerivedTableA

or (to the more detailed counts):

SELECT Item, Count(*)
FROM (SELECT DISTINCT SalesPerson, Department, Item
    FROM SalesTable) AS DerivedTableB
GROUP BY Item

Finally, having treated this with some generality, notice that in the
important special case that the fields (SalesPerson, Department, Item)
formed a unique compound key on SalesTable, i.e. if their combinations
uniquely identify the records in that table, then it is unnecessary to
use a derived table.  One can then write simply:

SELECT Count(*) FROM SalesTable

and:

SELECT Item, Count(*) FROM SalesTable
GROUP BY Item

While I feel confident this uniqueness would _not_ apply in your
original case (because of the inclusion of the SalesDate field), it
could be made to apply by forming a VIEW on the SalesTable, esp. one
which returns unique combinations of SalesPerson, Department, and
Item.  For example:

CREATE VIEW DistinctSalesTable AS
    SELECT SalesPerson, Department, Item, Sum(Qty)
    FROM   SalesTable
    GROUP BY SalesPerson, Department, Item

Once the appropriate VIEW is created, the simpler syntax can be
invoked.

regards, mathtalk-ga


Search Strategy

Mostly personal experience, however I used MS SQL Server Books Online
to confirm the syntax (as well as SQL Query Analyzer to test out
suggestions)

Further Resources

Here's a short SQL script I developed in testing my answer:

CREATE TABLE SalesTable (
    SalesPerson   varchar(25) not Null,
    Department    varchar(12) not Null,
    Item          char(5)     not Null,
    Qty           integer     not Null,
    SalesDate     datetime    not Null,
    CONSTRAINT PK_Sales PRIMARY KEY (
        SalesPerson, 
        Department,
        Item,
        SalesDate
    )
) 


SELECT COUNT(*)
FROM (SELECT DISTINCT SalesPerson, Department
    FROM SalesTable) AS DerivedTableA

SELECT Item, Count(*)
FROM (SELECT DISTINCT SalesPerson, Department, Item
    FROM SalesTable) AS DerivedTableB
GROUP BY Item

Request for Answer Clarification by mdeeps-ga on 26 Jul 2003 10:54 PDT

awesome answer, just one quick easy question.  The table i gave was
just a made up sample.  In actuality one of the fields i'm using is a
Number.  To Concatenate them i need to convert it to a string.

How do i do that?

thanks so much,

mandeep

Clarification of Answer by mathtalk-ga on 26 Jul 2003 11:48 PDT

Hi, Mandeep:

Thanks for the kind words.  It is encouraging to be able to share tips
and techniques with one who has an immediate use for them.

If one of the values being concatenated is not a string or character
type, then it is a good idea to use an explicit conversion.  See here
for details and examples:

[MS Transact-SQL Syntax: CAST and CONVERT]
http://msdn.microsoft.com/library/en-us/tsqlref/ts_ca-co_2f3o.asp

One might do something like this:

SELECT COUNT( DISTINCT SalesPerson + '#' + CAST(Qty AS varchar(6)) )
FROM   SalesTable

regards, mathtalk-ga

mdeeps-ga rated this answer: 5 out of 5 stars

and gave an additional tip of: $5.00

Thamls mathtalk.  The answer was complete and thorough.  This is the
first time i've used Google Answers and i'm amazed with the results.

Comments

There are no comments at this time.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.

Search Google Answers for

Google Home - Answers FAQ - Terms of Service - Privacy Policy