Likely a network issue. Can you run curl against the ClearML API server from the Jenkins stage and see if it gets through?
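If curl isn't available in the Jenkins stage, the same check can be sketched in Python with only the standard library. This is a rough sketch, not an official ClearML tool: `check_reachable` is a hypothetical helper, and the host, port, and `debug.ping` path in the usage comment are assumptions to replace with your own API server address.

```python
import urllib.error
import urllib.request
from typing import Tuple


def check_reachable(url: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Return (reachable, detail) for a plain HTTP GET against url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path works even if the
        # status code is an error (e.g. 401 or 404).
        return True, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, connection refused, timeout, proxy problems, etc.
        return False, str(exc)


# Usage from the Jenkins stage (hypothetical address, adjust to your setup):
# ok, detail = check_reachable("http://clearml-api.example:8008/debug.ping")
# print("reachable" if ok else "NOT reachable", "-", detail)
```

A `False` result with a timeout or "connection refused" detail points at the network or firewall rather than at ClearML itself.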
Does the glue write any error logs anywhere? I only see CLEARML_AGENT_UPDATE_VERSION =
and nothing else.
OK. That brings me back to the spawned pod. At this point, clearml-agent and its config would be a contributing factor. Is the absence of /tmp/.clearml_agent.xxxxxx.cfg an issue?
Hi SuccessfulKoala55, I was referring to Task.init() or any other SDK API that we use in our training code.
I'm using this feature. In this case I would create 2 agents, one serving a CPU-only queue and the other a GPU queue, and then decide at the code level which queue to send the task to.
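Roughly, the code-level decision could look like the sketch below. The queue names `cpu` and `gpu` and the helpers `pick_queue` and `run_on_queue` are made up for illustration; use whatever queues your two agents are actually listening on. `Task.init` and `Task.execute_remotely` are the regular SDK calls.

```python
def pick_queue(needs_gpu: bool, cpu_queue: str = "cpu", gpu_queue: str = "gpu") -> str:
    """Choose the queue name for a task (queue names are assumptions)."""
    return gpu_queue if needs_gpu else cpu_queue


def run_on_queue(project: str, name: str, needs_gpu: bool):
    """Register the task, then hand it over to the agent serving the chosen queue."""
    # Imported lazily so pick_queue can be used without clearml installed.
    from clearml import Task

    task = Task.init(project_name=project, task_name=name)
    # Stops local execution and enqueues the task; the agent listening on
    # the selected queue picks it up and runs the script from the start.
    task.execute_remotely(queue_name=pick_queue(needs_gpu), exit_process=True)
    return task


# Usage (hypothetical project/task names):
# run_on_queue("examples", "train-resnet", needs_gpu=True)
```

Keeping the queue choice in a small pure function makes it easy to drive from a command-line flag or a config value.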
Hi, building a container with vscode is not possible. If I have an alternative location for vscode, where should I indicate it in the configuration?
That didn't work either...
What's the difference between template-yaml and --overrides-yaml? I used the latter to ensure the GPU is passed in.
Hi SuccessfulKoala55 , would they need the fileserver to route to minio then? E.g.
This will ensure that any actions by clearml-data and models are saved into the S3 object store.
api {
    files_server: s3://ecs.ai:80/clearml-data/default
}
aws {
    s3 {
        credentials {
            host: http://ecs.ai:80
            ## Insert the iam credentials provided by your SAs here.
        }
    }
}
But if a user forgets to do the above, they will be saved on the ClearML server. If I switch off f...
Hi, for both of them, args.lastiter is the exact same value. But when plotted out, they are actually 2 iterations apart.