I have an AWS Glue job that processes a CSV file from S3 and inserts the data into a database. I observed a time lag of 20-22 sec between the driver being added and the first job being submitted. Any idea why this takes so long, or how we can improve the startup time?

asked Nov 19, 2024 at 13:33 by saiyantan
  • You can check a few things, such as the number and type of workers. If the data volume is small, try reducing the number of DPUs, as they can take time to start up. – samhita Commented Nov 19, 2024 at 23:28
  • @samhita Thanks for the reply. I switched from G.1X to G.2X, which improved the overall job time. One other observation: reducing DPUs does not help much, and startup still takes around 12 sec. It seems AWS Glue has a cold start time of more than 10 sec. – saiyantan Commented Nov 20, 2024 at 11:34
  • OK, so you mean reducing the workers helped improve the start time from 22 sec to around 12 sec? And yes, Glue does have some cold start time, as you mentioned. – samhita Commented Nov 20, 2024 at 11:43
  • Changing the worker type helped. Reducing DPUs does not help much. – saiyantan Commented Nov 20, 2024 at 15:05

1 Answer

A few options to try for reducing the startup time of Glue jobs:

Glue Job Configuration:

  • Worker Type and Number: Experiment with different worker types and worker counts to find the best configuration for your workload. As confirmed in the comments, changing the worker type improved the startup time.

  • If the data volume is small, try reducing the number of DPUs, as they can take time to start up. Note that the asker reported this did not help much in their case.
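
One way to run that experiment is to start the same job with several worker type and worker count combinations, then compare the startup lag in each run's logs. Below is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The job name `csv-to-db` and the helper names are hypothetical. `WorkerType` and `NumberOfWorkers` are per-run overrides accepted by the Glue `StartJobRun` API.

```python
from itertools import product


def build_run_args(job_name, worker_type, number_of_workers):
    """Build keyword arguments for the Glue StartJobRun call."""
    return {
        "JobName": job_name,
        "WorkerType": worker_type,          # e.g. "G.1X" or "G.2X"
        "NumberOfWorkers": number_of_workers,
    }


def candidate_runs(job_name, worker_types, worker_counts):
    """Return one argument set per (worker type, worker count) combination."""
    return [
        build_run_args(job_name, worker_type, count)
        for worker_type, count in product(worker_types, worker_counts)
    ]


def start_run(run_args):
    """Start one job run and return its run ID (calls AWS, incurs cost)."""
    import boto3  # imported lazily so the helpers above work without boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(**run_args)
    return response["JobRunId"]


# "csv-to-db" is a hypothetical job name; replace it with your own.
runs = candidate_runs("csv-to-db", ["G.1X", "G.2X"], [2, 5])
```

Calling `start_run(args)` for each entry in `runs` starts one job run per configuration. The lag between "driver added" and "first job submitted" still has to be read from each run's driver logs, since this sketch does not measure it.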


Tags: amazon web services, AWS Glue startup time, Stack Overflow