AgitatedDove14 just a reminder if you missed this question 😄
I have a single IAM, my question is what kind of permissions I should associate with the IAM so that the autoscaler task will work
Does that mean that the AWS autoscaler in trains manages EC2 auto scaling directly, without using the AWS built-in EC2 auto scaler?
I mean, I barely have 20 experiments
Version 1.1.1
Snippet of which part exactly?
Yeah, logs saying "file not found", here is an example
I am noticing that the files are saved locally, is there any chance that the files are over-written during the run or get deleted at some point and then replaced?
Yes they are local - I don't think there is a possibility they are getting overwritten... But that depends on how clearml names them. I showed you the code that saves the artifacts, but this code runs multiple times from a given template with different values - essentially it creates like 10 times the same task with different param...
now I get this error in my Auto Scaler task:
Warning! exception occurred: An error occurred (AuthFailure) when calling the RunInstances operation: AWS was not able to validate the provided access credentials
Retry in 15 seconds
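For anyone hitting the same AuthFailure: one way to rule out a bad key/secret pair before blaming the autoscaler is to ask STS who the credentials belong to. This is only a rough sketch, not something taken from the trains/ClearML code. The helper names `make_sts_client` and `check_aws_credentials` are made up for illustration, and it assumes `boto3` is installed where you run it.

```python
def make_sts_client(access_key, secret_key, region="us-east-1"):
    """Build an STS client from explicit credentials (hypothetical helper).

    boto3 is imported lazily so the rest of this sketch has no hard dependency on it.
    """
    import boto3

    return boto3.client(
        "sts",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name=region,
    )


def check_aws_credentials(sts_client):
    """Return the ARN the credentials resolve to, or None if the call fails."""
    try:
        identity = sts_client.get_caller_identity()
    except Exception as exc:
        # botocore raises ClientError for rejected credentials; caught broadly
        # here so the sketch does not need botocore at import time.
        print(f"Credential check failed: {exc}")
        return None
    return identity["Arn"]
```

If this returns an ARN, the key and secret are accepted by AWS, so the next things to look at would be the region and the permissions attached to that IAM identity.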
I was trying out the pipeline controller for the first time and I felt a bit of a burden that just for the sake of trying I had to launch an agent
yeah but I see it gets enqueued to the default
which I don't know what it is connected to
If I execute this task using python .....py
will it execute on the machine I executed it on?
no need to do it again, I have all the settings in place, I'm sure it's not a settings thing
So just to correct myself and sum up, the credentials for AWS are only in the cloud_credentials_*
but nowhere in the docs does it say anything about the permissions for the IAM
so putting the docs aside, what permissions should I give to the IAM associated with trains' autoscaler?
This is what I meant should be documented - the permissions...
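To make the question concrete, here is the kind of policy document I have in mind, built in Python. The action list is my own guess at what an autoscaler that launches and terminates EC2 instances would need (the log above only confirms `RunInstances`); it is not taken from the trains docs, so treat it as a starting point to trim or extend. `AUTOSCALER_ACTIONS` and `build_policy` are names invented for this sketch.

```python
import json

# ASSUMPTION: a guessed set of EC2 actions, not an official trains/ClearML list.
AUTOSCALER_ACTIONS = [
    "ec2:RunInstances",
    "ec2:TerminateInstances",
    "ec2:DescribeInstances",
    "ec2:CreateTags",
    "ec2:RequestSpotInstances",
    "ec2:DescribeSpotInstanceRequests",
    "ec2:CancelSpotInstanceRequests",
]


def build_policy(actions):
    """Wrap a list of actions in a single-statement IAM policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # "*" keeps the sketch short; scope this down in a real account.
                "Resource": "*",
            }
        ],
    }


# Print the JSON so it can be pasted into the IAM console.
print(json.dumps(build_policy(AUTOSCALER_ACTIONS), indent=2))
```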
I guess not many tensorflowers running agents around here if this wasn't brought up already
How do I get from the node to the task object?
ClearML results page:
`
Launching step: 2019-09-03_2021-01-25_choose_best
Parameters:
{***}
Configurations:
None
Overrides:
None
Launching step: 2019-10-23_2021-01-15_choose_best
Parameters:
{********}
Configurations:
None
Overrides:
None
Launching step: 2019-05-26_2020-12-26_choose_best
Parameters:
{******}
Configurations:
None
Overrides:
None
Launching step: 2019-07-15_2021-01-05_choose_best
Parameters:
{************}
Configurations:
None
Overrides:
None
Launching step...