I have an AWS Glue job that processes a CSV file from S3 and inserts the data into a database. I observed a lag of 20-22 seconds between the driver being added and the first job being submitted. Any idea why this takes so long, or how we can improve the startup time?
asked Nov 19, 2024 at 13:33 by saiyantan
- You can check a few things, such as the number and type of workers. If the data volume is small, try reducing the number of DPUs, as they can take time to start up. – samhita Commented Nov 19, 2024 at 23:28
- @samhita Thanks for the reply. I switched from G.1X to G.2X, and it improves the overall job time. One more observation: reducing DPUs does not help much, and startup still takes around 12 sec. It seems AWS Glue has a cold start time of more than 10 sec. – saiyantan Commented Nov 20, 2024 at 11:34
- OK, so you mean reducing the workers helped improve the start time from 22 sec to around 12 sec? And yes, Glue does have some cold start time, as you mentioned. – samhita Commented Nov 20, 2024 at 11:43
- Changing the worker type helped. Reducing DPUs does not help much. – saiyantan Commented Nov 20, 2024 at 15:05
1 Answer
A few options can be tried to reduce the startup time of Glue jobs.
Glue Job Configuration:
- Worker Type and Number: Experiment with different worker types and numbers to find the optimal configuration for your specific workload.
[As confirmed in the comments, changing the worker type improved the startup time.]
- Try reducing the number of DPUs, as they can take time to start up.
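One way to experiment with worker type and count without editing the job definition each time is to override them per run. The sketch below is a minimal illustration, not a tested recipe for this job: the job name `csv-to-db` and the helper names are hypothetical, and the list of worker types is an assumption that may not be exhaustive. It relies on the `WorkerType` and `NumberOfWorkers` parameters of boto3's `start_job_run`.

```python
from typing import Any, Dict, Iterable, List

# Worker types assumed to be valid for Glue Spark jobs; this list may be incomplete.
KNOWN_WORKER_TYPES = ("G.1X", "G.2X", "G.4X", "G.8X")


def build_run_args(job_name: str, worker_type: str, number_of_workers: int) -> Dict[str, Any]:
    """Build keyword arguments for boto3's glue.start_job_run (hypothetical helper)."""
    if worker_type not in KNOWN_WORKER_TYPES:
        raise ValueError(f"unexpected worker type: {worker_type!r}")
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be a positive integer")
    return {
        "JobName": job_name,
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }


def candidate_configs(
    job_name: str, worker_types: Iterable[str], worker_counts: Iterable[int]
) -> List[Dict[str, Any]]:
    """Return one set of run arguments per (worker type, worker count) pair."""
    counts = list(worker_counts)
    return [build_run_args(job_name, wt, n) for wt in worker_types for n in counts]


def start_run(run_args: Dict[str, Any]) -> str:
    """Start one job run with the given overrides and return its run ID."""
    import boto3  # imported lazily so the helpers above work without AWS access

    glue = boto3.client("glue")
    return glue.start_job_run(**run_args)["JobRunId"]


if __name__ == "__main__":
    # "csv-to-db" is a placeholder job name; replace it with your own.
    for args in candidate_configs("csv-to-db", ["G.1X", "G.2X"], [2, 4]):
        print(args)  # pass each dict to start_run(args) to launch a run
```

After each run, compare the gap between "driver added" and "first job submitted" in the driver logs for each configuration. As noted in the comments, some cold start time is likely to remain regardless of the settings.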
Tags: amazon web services, AWS Glue startup time, Stack Overflow