Deterministic sorting on a non-unique column

I’m using SQL Server to pull data from a database – I am sorting on my date column but this column is not unique as a date can appear multiple times in the column. I need the data to remain in the same order every time I run the query and I notice that if there are two rows that have the same date, the order in which they appear can change every time I run the query.

Brief example of what my query looks like:

SELECT
    ConstituentID,
    ConstituentName,
    ContactType,
    ContactDate
FROM 
    table
ORDER BY 
    ConstituentID, ContactDate

A single constituent can have multiple contacts on the same day and I need them to remain in the same order every time I run the code, it’s a date column and not a datetime so I’m not able to sort by time as well. Is there a way to force the query to output the data in the same order every time? Or alternatively, if a constituent has two contacts on the same day, is there a way to set which contact type appears last?

The only similar thing that comes to mind for me is the set.seed() command in R but not sure if there is something similar to that in SQL that can give me the results I need.

  • 1

    Add time, or a sequence value.

    – 

  • 3

    if you add the other fields to the sort, the result will always be appear as stable, it’s magic

    – 




Is there a way to force the query to output the data in the same order every time?

Yes. Every table has a key (or should), and keys are unique. So add all the columns of some key to the end ORDER BY clause, and the query will have a deterministic ordering. EG:

SELECT
    ConstituentID,
    ConstituentName,
    ContactType,
    ContactDate
FROM 
    table
ORDER BY 
    ConstituentID, ContactDate, ID

David Browne has the gist of it (as usual), but in the event you are unable to identify unique columns one common work-around is changing the table to store DateTime values instead of just Date, even if you otherwise only need the date portion.

Then queries can set the SELECT list to only return the date portion of the column (or let client code truncate the value), and sort on the full the DateTime.

This isn’t strictly-guaranteed to be unique, but as a matter of practice it is extremely uncommon to end up with duplicate DateTime values in real-world data by accident.

Add everything in the output that defines a unique row to the ORDER BY clause.

SELECT
    ConstituentID,
    ConstituentName,
    ContactType,
    ContactDate
FROM 
    table
ORDER BY 
    ConstituentID, 
    ContactDate, 
    ContactType

should work. Assuming a ConstituentID has only one ConstituentName, adding the name won’t help.

Leave a Comment