SQL count results in groups, with repeating groups

I have this data:

ARRIVAL,ITEM_TYPE,ITEM
1,0,Cat
2,0,Dog
3,1,Horse
4,1,Cow
5,0,Fish
6,0,Barn
7,0,Potato

I would like to query for this result:

0,2
1,2
0,3

… showing that, in order of ARRIVAL, there was a group of 2 items with ITEM_TYPE = 0, then 2 with ITEM_TYPE = 1, then another group of 3 with ITEM_TYPE = 0. The goal is to report how groups of similar records arrived in a system; Cat and Dog have the same ITEM_TYPE as Fish, Barn, and Potato, but they didn’t arrive in the same time-series group. I can’t figure out how to use COUNT, DISTINCT, and GROUP BY in a way that doesn’t collapse the two groups of zeroes into a single line like this:

0,5

I will appreciate any help.

— Bob

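This is a gaps-and-islands problem. One way to solve it is the difference between two row numbers: within a run of consecutive rows that share an ITEM_TYPE, both row numbers increase by one per row, so their difference stays constant and can be used as a group key together with ITEM_TYPE: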

SELECT ITEM_TYPE, COUNT(*) AS CNT
FROM (
  SELECT *,
          ROW_NUMBER() OVER (ORDER BY ARRIVAL)
          - ROW_NUMBER() OVER (PARTITION BY ITEM_TYPE ORDER BY ARRIVAL) AS grp
  FROM mytable
) t
GROUP BY grp, ITEM_TYPE
ORDER BY MIN(ARRIVAL);

Results:

ITEM_TYPE   CNT
0           2
1           2
0           3


You can use window functions, as below:

CREATE TABLE mytable (ARRIVAL INT, ITEM_TYPE INT, ITEM VARCHAR(20));

INSERT INTO  mytable(ARRIVAL, ITEM_TYPE,ITEM)
VALUES
  (1,0,'Cat'),
  (2,0,'Dog'),
  (3,1,'Horse'), 
  (4,1,'Cow' ),  
  (5,0,'Fish'),
  (6,0,'Barn'),
  (7,0,'Potato');


WITH cte AS (
    -- check_lag = 1 when the row has the same ITEM_TYPE as the previous row (or is the first row)
    SELECT ARRIVAL, ITEM_TYPE, ITEM,
           CASE WHEN ITEM_TYPE = LAG(ITEM_TYPE) OVER (ORDER BY ARRIVAL)
                  OR LAG(ITEM_TYPE) OVER (ORDER BY ARRIVAL) IS NULL
                THEN 1 ELSE 0 END AS check_lag
    FROM mytable
)
SELECT MIN(ITEM_TYPE) AS ITEM_TYPE, COUNT(1) AS CNT
FROM (
    -- grp increases by one each time ITEM_TYPE changes
    SELECT ARRIVAL, ITEM_TYPE, ITEM,
           ROW_NUMBER() OVER (ORDER BY ARRIVAL)
           - SUM(check_lag) OVER (ORDER BY ARRIVAL ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
    FROM cte
) sub
GROUP BY grp
ORDER BY grp;

OUTPUT:

0           2
1           2
0           3
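
As a rough illustration of how the LAG-based grouping behaves, the following Python sketch walks the same sample rows and computes check_lag, its running sum, and grp row by row. The function name lag_groups is hypothetical, and the sketch assumes ITEM_TYPE is never NULL.

```python
# Sample rows from the question: (ARRIVAL, ITEM_TYPE, ITEM)
rows = [
    (1, 0, "Cat"), (2, 0, "Dog"), (3, 1, "Horse"), (4, 1, "Cow"),
    (5, 0, "Fish"), (6, 0, "Barn"), (7, 0, "Potato"),
]

def lag_groups(rows):
    """Return (arrival, item_type, check_lag, grp) per row, following the CTE logic."""
    prev_type = None  # stands in for LAG(ITEM_TYPE) being NULL on the first row
    running = 0       # SUM(check_lag) from the first row up to the current row
    result = []
    for rn, (arrival, item_type, _item) in enumerate(sorted(rows), start=1):
        # 1 when the row continues the previous row's type (or is the first row)
        check_lag = 1 if prev_type is None or item_type == prev_type else 0
        running += check_lag
        result.append((arrival, item_type, check_lag, rn - running))
        prev_type = item_type
    return result

lagged = lag_groups(rows)
for arrival, item_type, check_lag, grp in lagged:
    print(arrival, item_type, check_lag, grp)
```

Here grp only moves when ITEM_TYPE changes, so each island gets its own value and grouping by grp alone is enough.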
