Select a sample of records distributed over several distinct values

Question 1

I have a table like:

Name	Product Code	Payment Type	Member
Bob	Code A	Type 2	Yes
Kat	Code B	Type 2	Yes
Greg	Code C	Type 1	No
David	Code A	Type 1	Yes
Simon	Code A	Type 2	Yes

etc. Let’s say there’s 10k rows in this table.

Across this table, Name is the only unique column. I would like to select, for example, 120 rows, but in such a way that the 120 is split as evenly across the possible distinct combinations of Product Code, Payment Type, and Member as possible.

I.e. assuming there are 3 different Product Codes, 2 different Payment Types, and 2 different Member values (resulting in 12 distinct combinations), the result would be:

10 with Code A, Type 1, Yes
10 with Code A, Type 1, No
10 with Code A, Type 2, Yes
10 with Code A, Type 2, No
10 with Code B, Type 1, Yes
10 with Code B, Type 1, No
10 with Code B, Type 2, Yes
10 with Code B, Type 2, No
10 with Code C, Type 1, Yes
10 with Code C, Type 1, No
10 with Code C, Type 2, Yes
10 with Code C, Type 2, No

I need to do this without using WITH or any @variables or stored procedures etc.

Question 2

I am going to assume the latest version of SQL Server for now.

You can get what you want by using the ROW_NUMBER() window function to assign sequence numbers withing each group of rows having the same [Product Code], [Payment Type], and [Member], sorting by that sequence number, and then selecting the TOP 120 rows.

This will first select all of the “1” rows across all distinct [Product Code], [Payment Type], and [Member] combinations, then all of the “2” rows, etc. This should yield a roughly equal distribution. If one set runs out, the others will fill the difference.

Your query would look something like:

SELECT TOP 120 *
FROM (
    SELECT
        *,
        ROW_NUMBER() OVER(
            PARTITION BY [Product Code], [Payment Type], Member
            ORDER BY (SELECT NULL) -- Arbitrary order withing the partition
            ) AS RowNum 
    FROM Data
) D
ORDER BY D.RowNum

See this db<>fiddle.

Leave a Comment Cancel reply