sql - In a select with a group by, get the last value of a column not in the group by

Given a table of data like this:

a	b	c	d	e
1	test	9	h	2024-10-22 08:00:00.000
1	test	9	l	2024-10-23 08:00:00.000
1	test	9	q	2024-10-22 08:00:00.000

I want to group my data by columns a, b, c and show the value from column d that has the newest date in column e.

So I would expect to get one row of data back like this:

a	b	c	d
1	test	9	l

I would have liked something as simple as a "last()" like below but as far as I can find there isn't anything so simple?

SELECT 
    a, b, c,
    last(d)
FROM
    dbo.items 
GROUP BY 
    a, b, c

The only example I can find that is remotely close to what I want is LAST_VALUE with OVER (PARTITION BY ...), but it doesn't work in a GROUP BY:

LAST_VALUE(d) OVER (PARTITION BY d ORDER BY e) AS d

And I know similar things are possible for accessing columns that are not in the GROUP BY. For example, if b wasn't in the GROUP BY, I would still be able to STRING_AGG all the values like so:

STRING_AGG(b, ',') AS b

and get "test,test,test" as the value

Asked Nov 18, 2024 at 13:55 by Aurelius; edited Nov 18, 2024 at 19:08 by Dale K.

Please tag your RDBMS, see here why that's important: meta.stackoverflow/questions/388759/… – Bart McEndree Commented Nov 18, 2024 at 14:15

2 Answers

Using ROW_NUMBER might work if you are using SQL Server: number the rows in each (a, b, c) group from newest to oldest e and keep only the first one.

SELECT a, b, c, d FROM
(SELECT *,
      ROW_NUMBER() OVER (PARTITION BY a,b,c ORDER BY e desc) as rn
FROM dbo.items 
)t
WHERE rn=1

Result:

a	b	c	d
1	test	9	l
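
If several rows in a group can share the newest date, it may help to add a tie-breaker to the ORDER BY so the result is deterministic. A minimal sketch, assuming SQL Server; the sample rows are altered from the question so that two rows share the newest date, and letting the alphabetically largest d win is an arbitrary assumption:

```sql
-- Hypothetical sample data: 'h' and 'l' share the newest date.
SELECT a, b, c, d
FROM (
    SELECT a, b, c, d,
           -- e DESC puts the newest row first; d DESC breaks ties on e
           ROW_NUMBER() OVER (PARTITION BY a, b, c
                              ORDER BY e DESC, d DESC) AS rn
    FROM (
        VALUES (1, N'test', 9, N'h', CAST('2024-10-23T08:00:00' AS datetime2(3))),
               (1, N'test', 9, N'l', CAST('2024-10-23T08:00:00' AS datetime2(3))),
               (1, N'test', 9, N'q', CAST('2024-10-22T08:00:00' AS datetime2(3)))
    ) v (a, b, c, d, e)
) t
WHERE rn = 1;
```

Without the second sort key, either of the two tied rows could be numbered 1, so repeated runs are not guaranteed to agree.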

There are some hacky and some more standard solutions.

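The standard way is to compute the newest value with a window function in a subquery and to aggregate it afterwards. Because the window is ordered by e descending, the newest row comes first, so FIRST_VALUE is the function to use here (LAST_VALUE with the default frame only looks as far as the current row and its peers, so it can return a different d for the older rows). Something like:

select a, STRING_AGG(b, ',') WITHIN GROUP(ORDER BY a,c) as b, c, max(lastD) as d
from (
    -- lastD is the same on every row of an (a,b,c) group: the d of the row with the newest e
    select a, b,c,d, e, first_value(d) over(partition by a,b,c order by e desc) as lastD
    from (
        VALUES  (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000')
        ,   (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000')
        ,   (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000')
    ) t (a,b,c,d,e)
    ) x
GROUP BY a,b,c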
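
LAST_VALUE can also be used directly if the window is ordered ascending and the frame is spelled out. When an ORDER BY is present, the default frame stops at the current row (and its peers), which is why a bare LAST_VALUE often appears not to work. A sketch, assuming SQL Server 2012 or later, and using DISTINCT instead of GROUP BY to collapse the identical rows (that substitution is a choice made for this example, not part of the answer above):

```sql
SELECT DISTINCT
    a, b, c,
    LAST_VALUE(d) OVER (
        PARTITION BY a, b, c
        ORDER BY e ASC
        -- extend the frame to the whole partition so every row sees the newest d
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS d
FROM (
    -- e is kept as a string here; this date format also sorts chronologically as text
    VALUES (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000'),
           (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000'),
           (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000')
) t (a, b, c, d, e);
```

Every row in the group gets the same d, so DISTINCT leaves one row per (a, b, c).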

The hacky way is something I'd call reconstruction: combine the ordering value and the value you want into a single string, take the maximum, and then split the result apart again, something like:

SELECT  a, STRING_AGG(b, ',') WITHIN GROUP(ORDER BY a,c) AS b, c
,   STUFF(MAX(CONCAT(CONVERT(VARCHAR(30), cast(e AS datetime), 121), d)), 1,23, '') AS d
FROM
(
    VALUES  (1, N'test', 9, N'h', N'2024-10-22 08:00:00.000')
    ,   (1, N'test', 9, N'l', N'2024-10-23 08:00:00.000')
    ,   (1, N'test', 9, N'q', N'2024-10-22 08:00:00.000')
    ) t (a,b,c,d,e)
GROUP BY a,b,c

Here, prefixing the d value with a fixed-width varchar representation of the date (style 121, which is 23 characters long for a datetime) gives a string that sorts by date, so one can use MAX. After getting the highest value, the STUFF function removes the 23-character date part and leaves the d value.

This has some caveats, especially if you concatenate non-string columns. Also, you cannot choose a tie-breaker when several rows share the same date: the row with the largest d wins. The upside is that it avoids the extra window aggregation step.
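
To illustrate the non-string caveat, here is a sketch of the same trick with an integer d. The integer sample values and the datetime2(3) cast are assumptions made for this example; the value has to be cast back after STUFF, because the concatenation turns it into text:

```sql
SELECT a, b, c,
       -- strip the 23-character date prefix, then turn the remainder back into an int
       CAST(STUFF(MAX(CONCAT(CONVERT(CHAR(23), CAST(e AS datetime2(3)), 121), d)),
                  1, 23, '') AS int) AS d
FROM (
    -- hypothetical data: d is an int instead of a string
    VALUES (1, N'test', 9, 5,   '2024-10-22 08:00:00.000'),
           (1, N'test', 9, 12,  '2024-10-23 08:00:00.000'),
           (1, N'test', 9, 700, '2024-10-22 08:00:00.000')
) t (a, b, c, d, e)
GROUP BY a, b, c;
```

With these rows the newest date belongs to d = 12. Note that rows tied on the date would be compared as text, so '5' would beat '12'; padding d to a fixed width would be needed if numeric tie order matters.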
