Worth mentioning: nothing had changed before we executed this. It worked before, and now, after the update, it breaks.
We try to break everything up into independent tasks and group them using a pipeline. The dependency on an agent caused unnecessary overhead, since we just want to execute locally. It became a burden once new data scientists joined the project: instead of just telling them "yeah, just execute this script", you now have to teach them about ClearML, the role of agents, how to launch them, how they behave, how to remove them, and so on... things you want to avoid with data scientists.
Now I get this error in my Auto Scaler task: Warning! exception occurred: An error occurred (AuthFailure) when calling the RunInstances operation: AWS was not able to validate the provided access credentials. Retry in 15 seconds
AgitatedDove14 sorry for the late reply,
It's right after executing all the steps. We have the following block, which determines whether we run locally or remotely:
if not arguments.enqueue:
    pipe.start_locally(run_pipeline_steps_locally=True)
else:
    pipe.start(queue=arguments.enqueue)
And right after we have a method that calls Task.current_task() which returns None
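For reference, this is roughly how I could guard against that; the helper name is my own, not part of the ClearML SDK, and it just wraps the lookup so the rest of the script can handle a missing task explicitly:

```python
def get_current_task_or_none():
    """Return the current ClearML task, or None when there is no task
    registered in this process (e.g. after a purely local pipeline run)
    or when clearml is not importable in this environment."""
    try:
        from clearml import Task
    except ImportError:
        return None
    return Task.current_task()
```

With something like that, the code running after pipe.start_locally() can check for None instead of crashing.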
I don't even know where trains is coming from... While using the same environment I can't even import trains, see
We are running the agent on the same machine, AgitatedDove14; it worked before upgrading ClearML... we never set these credentials.
Thanks a lot, that clarifies things
The output above is what the agent has, it seems... obviously on my machine I have it installed.
yeah I guessed so
I have them in two different places, once under Hyperparameters -> General
There are many other packages in my environment which are not listed.
btw, my "site packages" setting is false - should it be true? In the config you pasted it is false, but you are asking about true, so I'm not sure which it should be.
When I said "not the expected behavior", I meant that following the instructions in the docs should lead to downloading the latest version.
Okay, but I still want to take only one row of each artifact.
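A sketch of what I mean, with plain lists of rows standing in for the artifact dataframes (the names are illustrative, not from the actual pipeline):

```python
def first_row_of_each(artifacts):
    """Keep only the first row of each named artifact.

    `artifacts` is a mapping of artifact name -> list of rows;
    empty artifacts are skipped.
    """
    return {name: rows[0] for name, rows in artifacts.items() if rows}


# e.g. first_row_of_each({"train": [[1, 2], [3, 4]], "test": [[5, 6]]})
# returns {"train": [1, 2], "test": [5, 6]}
```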
pgrep -af trains shows that nothing is running with that name.
okay, that's acceptable
I assume it has nothing to do with my client version
AgitatedDove14
I never installed trains on this environment
If the credentials don't have access to the autoscaler service, obviously it won't work.
Cool - so that means the fileserver which comes with the host will stay empty? Or is there anything else being stored there?
I'm using pipe.start_locally so I imagine I don't have to .wait() right?
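My understanding, sketched as a toy helper (maybe_wait is my own name, not a ClearML API): start_locally() blocks until the pipeline finishes, so wait() only matters after a remote start(queue=...), where the script otherwise returns as soon as the pipeline is enqueued.

```python
def maybe_wait(pipe, ran_locally):
    """Block on the pipeline only when it was started remotely.

    start_locally() already blocks until completion, so wait() would be
    redundant there; after a remote start() the script continues as soon
    as the pipeline is enqueued, so wait() keeps it alive to the end.
    `pipe` is assumed to expose a wait() method.
    """
    if not ran_locally:
        pipe.wait()
```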
