I have a data set in which a customer may have had multiple assessments over a period of time. The timing varies by customer: C1 may have had theirs on day 1, day 90, day 200, etc., while C2 may have had theirs on day 1, day 14, day 21, day 36, and so on.
My goal is to convert these individual rows into columns. Has anyone had a similar requirement before? I can use Excel, Power Query, or SQL to solve this.
CustomerId Date Score
C1 1/1/2020 9
C1 1/7/2020 14
C1 1/14/2020 26
C2 1/9/2020 34
C2 3/9/2020 30
C2 6/9/2020 24
The output should be in the format below:
Customer | Initial Score | Avg_3_months | Avg_6_months | Avg_9_months | Avg_12_months, and so on.
asked Nov 18, 2024 at 20:54 by Data123; edited Nov 19, 2024 at 6:56 by samhita
Comments:
- Please edit your question to provide more background details on your problem. Telling readers about your research, what you have already tried, and why it didn't meet your needs will allow them to understand it better. – Michal, Nov 18, 2024 at 21:43
- Provide a sample of input and output. If one customer shows day 14 and day 36, and another customer shows day 90 and day 200, then how do you want to show +3 months and +6 months? What's the explicit calculation? – horseyride, Nov 18, 2024 at 21:48
- Use a pivot to convert your lines into columns. – Julio Gadioli Soares, Nov 18, 2024 at 22:57
- Could you please also provide your expected output as a markdown table, like the sample data? – Ryan, Nov 19, 2024 at 0:33
2 Answers
The query below uses SQL Server syntax. (https://dbfiddle.uk/8WMfYXO4)
-- First assessment date per customer
WITH initial AS (
    SELECT CustomerId, MIN(Date) AS initial_date
    FROM data
    GROUP BY CustomerId
)
SELECT
    a.CustomerId,
    -- Score recorded on the first assessment date
    MAX(CASE WHEN a.Date = i.initial_date THEN a.Score END) AS Initial_Score,
    -- Each bucket covers (start, end] in months counted from the first assessment.
    -- Rows outside a bucket yield NULL, which AVG ignores.
    AVG(CASE WHEN a.Date > i.initial_date AND a.Date <= DATEADD(month, 3, i.initial_date)
             THEN a.Score END) AS Avg_3m,
    AVG(CASE WHEN a.Date > DATEADD(month, 3, i.initial_date) AND a.Date <= DATEADD(month, 6, i.initial_date)
             THEN a.Score END) AS Avg_6m,
    AVG(CASE WHEN a.Date > DATEADD(month, 6, i.initial_date) AND a.Date <= DATEADD(month, 9, i.initial_date)
             THEN a.Score END) AS Avg_9m,
    AVG(CASE WHEN a.Date > DATEADD(month, 9, i.initial_date) AND a.Date <= DATEADD(month, 12, i.initial_date)
             THEN a.Score END) AS Avg_12m
FROM data a
JOIN initial i ON a.CustomerId = i.CustomerId
GROUP BY a.CustomerId
ORDER BY a.CustomerId;
Output:
CustomerId | Initial_Score | Avg_3m | Avg_6m | Avg_9m | Avg_12m |
---|---|---|---|---|---|
C1 | 9 | 20 | null | null | null |
C2 | 34 | 30 | 24 | null | null |
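To sanity-check the bucket boundaries outside the database, here is a minimal sketch of the same calculation in Python. It is not part of the answer: the helper names `add_months` and `bucket_averages` are made up for illustration, and `add_months` assumes that shifting by months clamps to the last day of the target month.

```python
from calendar import monthrange
from datetime import date

def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, monthrange(year, month)[1])
    return date(year, month, day)

def bucket_averages(rows, bucket_months=3, buckets=4):
    """rows: iterable of (customer_id, date, score).

    Returns {customer_id: [initial_score, avg_bucket_1, avg_bucket_2, ...]},
    where bucket k covers (start + (k-1)*bucket_months, start + k*bucket_months].
    """
    by_customer = {}
    for customer, when, score in rows:
        by_customer.setdefault(customer, []).append((when, score))

    result = {}
    for customer, items in sorted(by_customer.items()):
        start = min(when for when, _ in items)
        # Mirrors MAX(CASE WHEN Date = initial_date ...) in the query.
        initial = max(score for when, score in items if when == start)
        summary = [initial]
        for k in range(buckets):
            lower = add_months(start, k * bucket_months)
            upper = add_months(start, (k + 1) * bucket_months)
            scores = [s for when, s in items if lower < when <= upper]
            summary.append(sum(scores) / len(scores) if scores else None)
        result[customer] = summary
    return result

rows = [
    ("C1", date(2020, 1, 1), 9),
    ("C1", date(2020, 1, 7), 14),
    ("C1", date(2020, 1, 14), 26),
    ("C2", date(2020, 1, 9), 34),
    ("C2", date(2020, 3, 9), 30),
    ("C2", date(2020, 6, 9), 24),
]

result = bucket_averages(rows)
for customer, summary in result.items():
    print(customer, summary)
# C1 [9, 20.0, None, None, None]
# C2 [34, 30.0, 24.0, None, None]
```

The sketch returns floating-point averages. In SQL Server, AVG over an integer column returns an integer, so if Score is stored as an integer the query's averages are truncated; casting Score to a decimal type inside the CASE expressions would avoid that.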
How to correctly parse 1/14/2020 dates on all machines
That date uses the format string M/d/yyyy.
Sometimes setting the culture alone is enough for the import to detect the format. When you're choosing transform column types, include the optional culture.
To be safe
You should set Format and Culture explicitly. That makes parsing of dates and numbers deterministic.
Date format strings are listed here: powerquery.how/Date.FromText
Fun tip: There are combinations where even a "safe" ISO-style format such as yyyy/MM/dd isn't actually distinct.
= Date.FromText( "1/14/2020", [ Format = "M/d/yyyy" , Culture = "en-us"] )
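Why the explicit format matters can be shown with a small sketch in Python (not Power Query; `parse_date` is a hypothetical helper): the same text yields different dates under month-first and day-first formats, and some values are only valid under one of them.

```python
from datetime import date, datetime

def parse_date(text: str, fmt: str = "%m/%d/%Y") -> date:
    """Parse text with an explicit format instead of relying on machine locale."""
    return datetime.strptime(text, fmt).date()

# Month-first, as in the sample data.
print(parse_date("1/14/2020"))              # 2020-01-14
print(parse_date("3/9/2020"))               # 2020-03-09

# The same text read day-first silently becomes a different date.
print(parse_date("3/9/2020", "%d/%m/%Y"))   # 2020-09-03

# A day-first reading of 1/14/2020 fails, because there is no month 14.
try:
    parse_date("1/14/2020", "%d/%m/%Y")
except ValueError as error:
    print("rejected:", error)
```

Values like 3/9/2020 are the dangerous ones: they parse without error under either convention, so a wrong locale guess goes unnoticed.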
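Standalone example
let
    // Parse one raw line such as "C1 1/14/2020 9" into a typed record.
    ConvertRecord = ( source as text ) as record => [
        // Split the function argument (not the outer sample value).
        segments = Text.Split( source, " " ),
        parsedDate = Date.FromText( segments{1}, [ Format = "M/d/yyyy", Culture = "en-us" ] ),
        return = [
            Customer = segments{0},
            Date = parsedDate,
            // Convert the score to a number so it can be averaged later.
            Score = Number.FromText( segments{2}, "en-us" )
        ]
    ][return],
    // Quick check of the function on a single line.
    sample = "C1 1/14/2020 9",
    test = ConvertRecord( sample ),
    // Sample input: just the raw text lines.
    source = #table(
        type table [ RawText = text ],
        { {"C1 1/1/2020 9"},
          {"C1 1/7/2020 14"},
          {"C1 1/14/2020 26"}
        } ),
    // Convert each raw line to a record, then expand the record into columns.
    convertRecords = Table.TransformColumns( source, {
        {"RawText", ConvertRecord, type record }} ),
    expandRecords = Table.ExpandRecordColumn( convertRecords, "RawText", {"Customer", "Date", "Score"}, {"Customer", "Date", "Score"})
in
    expandRecords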
Next step: Grouping
- Use Group By on the column CustomerId
- Add another column and choose "All Rows"
Now you have a nested table partitioned per customer. (That means your calculation doesn't have to filter by customer.)
Or that part might make more sense as a DAX measure. Then you can filter pre-aggregates.
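The idea behind the "All Rows" grouping can be sketched in Python (not Power Query; `group_all_rows` and `summarize` are hypothetical names): partition the rows once, then run the per-customer calculation on each nested list without any customer filter.

```python
from collections import defaultdict
from datetime import date

def group_all_rows(rows):
    """Partition (customer_id, date, score) rows into one nested list per customer."""
    groups = defaultdict(list)
    for customer, when, score in rows:
        groups[customer].append((when, score))
    return dict(groups)

def summarize(nested):
    """Per-customer step: it only sees one customer's rows, so no filter is needed."""
    first_date = min(when for when, _ in nested)
    return {
        "first_date": first_date,
        "initial_score": next(score for when, score in nested if when == first_date),
        "assessments": len(nested),
    }

rows = [
    ("C1", date(2020, 1, 1), 9),
    ("C1", date(2020, 1, 7), 14),
    ("C1", date(2020, 1, 14), 26),
    ("C2", date(2020, 1, 9), 34),
    ("C2", date(2020, 3, 9), 30),
    ("C2", date(2020, 6, 9), 24),
]

groups = group_all_rows(rows)
summaries = {customer: summarize(nested) for customer, nested in groups.items()}
print(summaries["C2"]["initial_score"], summaries["C2"]["assessments"])  # 34 3
```

Any per-customer measure, such as the 3-month bucket averages requested in the question, can replace `summarize` and be applied to each nested list the same way.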