Data Pipeline integrates really well with EMR, and it's easy to deploy pipelines via CloudFormation, which makes automation possible. We use it to manage complex MapReduce workflows, and it usually works pretty smoothly.
Data Pipeline can be a black box at times. Error messages are not good, and it is difficult to understand what exactly failed since it is an Amazon service. The scheduler doesn't always give timely notifications, so it is hard to determine the true state...
I like that this is serverless, so we need not worry about infrastructure, and we can write and run code to process huge amounts of data in no time.
AWS Glue is not user friendly. The transformation components it provides are not useful in many scenarios, and we need to use custom transformations for everything, including even very basic operations.