I have a structure table that has more than 5000 structures along with over 20 columns. I will only show 3 structures and 4 columns for simplicity. I will call this Table S.
Structure ID | Structure Name | Structure Status | Structure Length |
---|---|---|---|
5 | Fraser | Open | 120 |
7 | Roger Cr | Open | 176 |
8 | Toby | Open | 105 |
The other table is about the inspections of the structures shown above. This table is called Table I. Few of its columns will be shown.
Structure ID | Inspection Date | Structure Index | Urgency |
---|---|---|---|
5 | 27-May-2003 | 2.16 | 3 |
5 | 28-Apr-2004 | 1.65 | 5 |
5 | 16-Aug-1999 | 3.15 | 5 |
5 | 15-May-2013 | 2.54 | 2 |
5 | 12-Sep-1995 | 1.99 | 2 |
5 | 08-Dec-2012 | 2.13 | 3 |
7 | 06-Oct-2016 | 3.11 | 4 |
7 | 05-May-2022 | 1.77 | 2 |
7 | 14-Jun-2020 | 2.65 | 2 |
7 | 18-May-2013 | 1.89 | 1 |
7 | 21-Aug-2023 | 2.02 | 1 |
8 | 03-Jun-2010 | 1.76 | 3 |
8 | 28-Aug-2022 | 3.32 | 5 |
8 | 21-Oct-2021 | 1.98 | 1 |
What I want is that each structure only shows its latest inspection date. The code that I have used does get me the result.
Both the code and the result are shown below;
SELECT
a.structure_id "Structure ID",
func.description "Function",
a.NAME,
b.description "Structure Status",
a.structure_length "Structure Length",
i.inspection_date "Inspection Date",
FROM
STRUCTURE a
LEFT JOIN
STRUCTURE_STATUS b ON a.STRUCTURE_STATUS = b.STRUCTURE_STATUS
LEFT JOIN
structure_elec_dist ed ON a.structure_id = ed.structure_id
LEFT JOIN
hwy_elec_dis hwy ON ed.elect_dis_no = hwy.elect_dist_no
LEFT JOIN
structure_function func ON a.structure_function = func.structure_functio
LEFT JOIN
(SELECT structure_id, MAX(inspection_date) AS "latest date"
FROM structure_inspection
GROUP BY structure_id) i ON a.structure_id = i.structure_id
The result is:
Structure ID | Function | Name | Structure Status | Structure Length | Latest Date |
---|---|---|---|---|---|
5 | Bridge | Fraser | Open | 120 | 08-Dec-2012 |
7 | Bridge | Roger Cr | Open | 176 | 21-Aug-2023 |
8 | Bridge | Toby | Open | 105 | 28-Aug-2022 |
Unfortunately, its when I intend to include other columns from Table I such as ‘Structure Index’ and ‘Urgency’, that’s when the rows begin to increase as I have to include them in GROUP BY
clause as well which I do not want.
The result I want is:
Struct ID | Func | Name | Struc Status | Structure Len. | Latest Date | Struc. Index | Urgency |
---|---|---|---|---|---|---|---|
5 | Bridge | Fraser | Open | 120 | 08-Dec-2012 | 2.13 | 3 |
7 | Bridge | Roger Cr | Open | 176 | 21-Aug-2023 | 2.02 | 1 |
8 | Bridge | Toby | Open | 105 | 28-Aug-2022 | 3.32 | 5 |
I have been trying for the last two days to solve this. I looked at similar questions in Stackoverflow and somewhat tried incorporating the solutions given, but it didn’t work out.
My apologies if I couldn’t ask the question properly. Please do ask me to correct if there are any mistakes in it.
Your assistance would be highly appreciated.
Please try the following solution based on the window function ROW_NUMBER()
.
It is a two step process:
- Our CTE ranks inspections for each structure via
PARTITION BY Structure_ID
clause based on their date order viaORDER BY Inspection_Date DESC
. SELECT
after CTE appliesWHERE seq = 1
clause to filter out rows
with seq value is greater than 1.
SQL
-- DDL and sample data population, start
DECLARE @structure TABLE (Structure_ID INT PRIMARY KEY, Structure_Name VARCHAR(20), Structure_Status VARCHAR(20), Structure_Length INT);
INSERT @structure (Structure_ID, Structure_Name, Structure_Status, Structure_Length) VALUES
(5, 'Fraser', 'Open', 120),
(7, 'Roger Cr', 'Open', 176),
(8, 'Toby', 'Open', 105);
DECLARE @inspection TABLE (Structure_ID INT, Inspection_Date DATE, Structure_Index DECIMAL(4,2), Urgency INT);
INSERT @inspection (Structure_ID, Inspection_Date, Structure_Index, Urgency) VALUES
(5, '2003-05-27', 2.1, 3),
(5, '2004-04-28', 1.6, 5),
(5, '1999-08-16', 3.1, 5),
(5, '2013-05-15', 2.5, 2),
(5, '1995-09-12', 1.9, 2),
(5, '2012-12-08', 2.1, 3),
(7, '2016-10-06', 3.1, 4),
(7, '2022-05-05', 1.7, 2),
(7, '2020-06-14', 2.6, 2),
(7, '2013-05-18', 1.8, 1),
(7, '2023-08-21', 2.0, 1),
(8, '2010-06-03', 1.7, 3),
(8, '2022-08-28', 3.3, 5),
(8, '2021-10-21', 1.9, 1);
-- DDL and sample data population, end
WITH rs AS
(
SELECT *
, seq = ROW_NUMBER() OVER (PARTITION BY Structure_ID ORDER BY Inspection_Date DESC)
FROM @inspection
)
SELECT s.*
, rs.inspection_Date, rs.Structure_Index, rs.Urgency
FROM @structure AS s
INNER JOIN rs ON rs.Structure_ID = s.Structure_ID
WHERE seq = 1;
Output
Structure_ID | Structure_Name | Structure_Status | Structure_Length | inspection_Date | Structure_Index | Urgency |
---|---|---|---|---|---|---|
5 | Fraser | Open | 120 | 2013-05-15 | 2.50 | 2 |
7 | Roger Cr | Open | 176 | 2023-08-21 | 2.00 | 1 |
8 | Toby | Open | 105 | 2022-08-28 | 3.30 | 5 |
While asking a question, you need to provide a minimal reproducible example: (1) DDL and sample data population, i.e. CREATE table(s) plus INSERT T-SQL statements. (2) What you need to do, i.e. logic and your code attempt implementation of it in T-SQL. (3) Desired output, based on the sample data in the #1 above. (4) Your SQL Server version (SELECT @@version;).