ok, hours of debugging later, I realized that the auto_scaler example initializes a https://github.com/allegroai/clearml/blob/721569bb77d89d89e5b4f32a0ed98311c4574650/examples/services/aws-autoscaler/aws_autoscaler.py#L68 the task is initialized on the remote side.
Apparently, https://github.com/allegroai/clearml/blob/721569bb77d89d89e5b4f32a0ed98311c4574650/examples/services/aws-autoscaler/aws_autoscaler.py#L103 , doesn’t populate that dict with any keys that don’t already exist in it .
Since all this is happening around task initialization on the border between local and remote, it was a terrible issue to debug.
Bad waste of time - first to realize that the instance profile is not getting passed, then finding where, between 15 places where it could have been dropped.