Base table is wrong, Need to keep few columns as is while moving other columns around

TL;DR
Base table is incorrect, need to refine data using query. Final value is incorrect (slightly less) as nesting on itself leading to some records omission as it’s a partial join.

Detail-
This is a query that I am trying to write. My original table has a wrong logic and not sure how to get it rectified, hence decided to pivot the correct table using SQL.
Each out_id can have multiple distinct ord_type on any day. Let’s assume there are three ord_type-
Alpha, Beta and Gamma (and so on), not necessarily each exists on all days. Some days might have one, some neither, some all three.

Actual data in table:

dt out_id ord_type identifier1 Non_Recon_Amnt act_returns
16/12 01 Alpha True 1 3
16/12 01 Beta False 2 4
16/12 01 Gamma False 3 5
17/12 01 Beta False 4 6
17/12 01 Gamma False 5 7
18/12 01 Alpha True 6 8
18/12 01 Gamma False 7 9

Data I want from Query:

dt out_id ord_type identifier1 Non_Recon_Amnt act_returns
16/12 01 Alpha True 1 0
16/12 01 Beta False 2 7
16/12 01 Gamma False 3 5
17/12 01 Beta False 4 6
17/12 01 Gamma False 5 7
18/12 01 Alpha True 6 0
18/12 01 Gamma False 7 9
18/12 01 Beta False 0 8

This is because the very definition of Alpha means act_returns will be 0 and all values under this should ideally be attributed to Beta.

Data my current query gives-

dt out_id ord_type identifier1 Non_Recon_Amnt act_returns
16/12 01 Alpha True 1 0
16/12 01 Beta False 2 7
16/12 01 Gamma False 3 5
17/12 01 Beta False 4 6
17/12 01 Gamma False 5 7
18/12 01 Alpha True 6 0
18/12 01 Gamma False 7 9

You may notice it doesn’t replace Alpha’s act_returns value with Beta on 18th Dec


Full Query:

SELECT
    dt AS date_,
    out_id,
    out_name,
    ct,
    ord_type,
    identifier1,
    identifier2,
    SUM(Non_Recon_Amnt),
    SUM(ret_loss),
    SUM(act_returns)

FROM
    (
        SELECT
            dt,
            out_id,
            out_name,
            ct,
            ord_type,
            identifier1,
            identifier2,
            SUM(Non_Recon_Amnt) as Non_Recon_Amnt,
            CASE WHEN ord_type="Alpha" AND identifier1='true' THEN SUM(ret_loss) ELSE 0 END as ret_loss,
            CASE
                WHEN ord_type="Alpha" AND identifier1 = 'true' THEN 0
                WHEN ord_type="Beta" THEN
                    (
                        SELECT SUM(act_returns)
                        FROM generic_table
                        WHERE dt=G.dt AND out_id=G.out_id
                        AND ord_type="Alpha" AND identifier1 = 'true'
                        OR dt=G.dt AND out_id=G.out_id
                        AND ord_type="Beta"
                    )
                ELSE SUM(act_returns)
            END AS act_returns,
            FROM
            generic_table G
        WHERE
            dt >= current_date - 30 AND dt < current_date
        GROUP BY
            1, 2, 3, 4, 5, 6, 7
    ) AS subquery
GROUP BY 1, 2, 3, 4, 5, 6, 7

In my subquery,

CASE
                WHEN ord_type="Alpha" AND identifier1 = 'true' THEN 0
                WHEN ord_type="Beta" THEN
                    (
                        SELECT SUM(act_returns)
                        FROM generic_table
                        WHERE dt=G.dt AND out_id=G.out_id
                        AND ord_type="Alpha" AND identifier1 = 'true'
                        OR dt=G.dt AND out_id=G.out_id
                        AND ord_type="Beta"
                    )
                ELSE SUM(act_returns)
            END AS act_returns,`

since I have used ord_type Beta, if generic table doesn’t have Beta on that particular date, it skips them. How to make the solution better?

  • Please tag your DBMS (SQL Server, Oracle, etc…) Does this query do what you want it to, and you are just seeing if it can be simplified? If not, please describe the problems with it’s result.

    – 

  • Please format your source data as a table – so that columns line up and it is readable. Also, update your question with the result you are trying to achieve

    – 




  • Edited requirement. Apologies for column formatting, still cannot figure out how to do it, will look around and sort it.

    – 

  • Apologies again Matthew, must have been formatting parallelly.

    – 




  • 1

    @SweeneyTodd “since I have used ord_type Beta, if generic table doesn’t have Beta on that particular date, it skips them. How to make the solution better?” It’d be helpful if you update your sample such that it captures the problem you are having, dates and all. That way people can tell whether a solution fixes the problem.

    – 

Leave a Comment