After presenting ClearML to my team, I got the question "we're already on AWS, why not use SageMaker?"
TBH, I've never gone through the ML workflow with SageMaker. The only advantage I could think of is that we can use our on-prem machines for training,
Hi @BattyCrocodile47 and @ObedientDolphin41
"we're already on AWS, why not use SageMaker?"
TBH, I've never gone through the ML workflow with SageMaker.
LOL I'm assuming this is why you are asking 🙂
- First, you can use SageMaker and still log everything to ClearML (it's a two-line integration). At least you will have visibility into everything that is running/failing 🙂
- A SageMaker job is a container, which means that for every job (which in a lot of cases is a one-time test) users need to build a container, push it to the registry, and then of course forget to remove it. This makes it hard to move from writing code to launching it, and the management costs are high (tons of containers no one is using and everyone is afraid to delete)
- As mentioned, SageMaker does not support on-prem/hybrid resources
- SageMaker costs extra on top of the compute
- There is no good dashboard for monitoring and launching jobs in SageMaker. Basically it was designed for DevOps monitoring long-lasting servers, not ephemeral, constantly changing jobs, and it shows...
- Multi-step pipelines are not supported in SageMaker (I mean you can hack it, but go figure out later what really happened)
- SageMaker does not have caching mechanisms (i.e. rerunning the same job with the same data/args should reuse the previous result instead of running again)
- SageMaker outputs are by default just more files in an S3 bucket, which is a mess to manage
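For reference, the two-line integration from the first bullet looks roughly like the sketch below. It assumes the `clearml` package is installed and credentials are already configured; the `start_tracking` wrapper and the project/task names are placeholders for illustration only, not something from this thread.

```python
def start_tracking(project_name, task_name):
    """Register the current run with ClearML so it shows up in the UI.

    Hypothetical helper: the import is kept inside the function so the
    training script still loads on machines without `clearml` installed.
    """
    # Line 1 of the integration: import the Task class.
    from clearml import Task

    # Line 2: create (or reuse) a task; ClearML then starts logging the run.
    return Task.init(project_name=project_name, task_name=task_name)


# Usage at the top of a training script, including one launched by SageMaker
# (names are placeholders):
# task = start_tracking("my-project", "sagemaker-training-run")
```

Without the wrapper, it really is just the `import` and the `Task.init(...)` call placed at the top of the script.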
I probably forgot a few, but you get the gist: SageMaker was built to launch containers on EC2, not to manage ML workflows. So other than launching containers (which it does very nicely), everything else is missing.
(just my 2 cents, but I might be a bit biased after having to work with it for a while 😉 )
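To make the caching point concrete, here is a minimal sketch of the general idea: derive a key from the code version, arguments, and data version, and skip the run when that key has been seen before. This is only an illustration of the concept, not how ClearML or SageMaker implement it; `job_cache_key`, `run_cached`, and the in-memory `_cache` are all hypothetical names.

```python
import hashlib
import json

# In-memory store for illustration; a real system would persist results.
_cache = {}


def job_cache_key(code_version, args, data_version):
    """Build a deterministic key from everything that defines a job."""
    payload = json.dumps(
        {"code": code_version, "args": args, "data": data_version},
        sort_keys=True,  # argument order must not change the key
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(job, code_version, args, data_version):
    """Run `job(**args)` only if an identical job has not run before."""
    key = job_cache_key(code_version, args, data_version)
    if key not in _cache:
        _cache[key] = job(**args)
    return _cache[key]
```

Calling `run_cached` twice with the same code version, arguments, and data version executes the job once and returns the stored result the second time; changing any of the three triggers a new run.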
one year ago