TroubledCamel37 No, I didn't add "task.close()" in the code. This link is what I followed.
Even after completing one experiment, the console and UI don't seem to terminate the task.
DangerousStarfish38 Yep, you are right: according to the docs, optimizer.stop() should be used, not task.close(). Sorry for the confusion.
I guess the issue is a connectivity/auth problem between ClearML components: there are many timeout messages in the log. I see similar messages for the fileserver container, not yet resolved.
TroubledCamel37 Thanks! I'll look into the connectivity issue you mentioned.
TroubledCamel37 But I guess task.close() would terminate the optimization task, not the single experiment. Am I misunderstanding something? 😭
Plus, the first experiment terminated with early stopping.
Yeah, the problem was about fileserver connection like you said!
I was running the experiment on a remote server, and solved the issue by opening the port for the fileserver! Thanks!
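For anyone hitting the same timeouts, here is a minimal sketch of a reachability check for the fileserver port, run from the machine that executes the experiment. It assumes the fileserver listens on ClearML's default port 8081; the hostname in the usage note is a placeholder, not something from this thread.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Usage would look like `is_port_open("my-clearml-server", 8081)` (hypothetical host). A `False` result only says the TCP connection failed; it does not distinguish a closed firewall port from a stopped container.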
FYI, I set the options for HyperParameterOptimizer() like this:
- compute_time_limit=None,
- total_max_jobs=100,
- min_iteration_per_job=None,
- max_iteration_per_job=None,
- max_number_of_concurrent_tasks=1
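Put together, the options above and the optimizer.stop() advice could look roughly like the sketch below. Only the five listed options come from this thread; the base task ID, the hyperparameter name and range, the metric title/series, the queue name, and the choice of RandomSearch are all placeholders, and the sketch has not been run against a live server.

```python
def build_optimizer_options() -> dict:
    """Options as listed in this thread."""
    return {
        "compute_time_limit": None,
        "total_max_jobs": 100,
        "min_iteration_per_job": None,
        "max_iteration_per_job": None,
        "max_number_of_concurrent_tasks": 1,
    }


def run_optimization(base_task_id: str) -> None:
    # Imported here so the options helper can be used without clearml installed.
    from clearml import Task
    from clearml.automation import (
        HyperParameterOptimizer,
        RandomSearch,
        UniformParameterRange,
    )

    # The controller task that owns the whole optimization run.
    Task.init(
        project_name="HPO example",  # placeholder
        task_name="optimizer controller",  # placeholder
        task_type=Task.TaskTypes.optimizer,
    )

    optimizer = HyperParameterOptimizer(
        base_task_id=base_task_id,
        hyper_parameters=[
            # placeholder parameter name and range
            UniformParameterRange("General/learning_rate", 1e-4, 1e-1),
        ],
        objective_metric_title="validation",  # placeholder
        objective_metric_series="loss",  # placeholder
        objective_metric_sign="min",
        optimizer_class=RandomSearch,  # assumption: any supported strategy works here
        execution_queue="default",  # placeholder
        **build_optimizer_options(),
    )

    optimizer.start()  # launch experiments through the queue
    optimizer.wait()  # block until the job budget is used up
    optimizer.stop()  # stop the optimizer, as the docs recommend
```

Note that optimizer.stop() ends the optimization controller; individual experiments are stopped by their own logic (for example early stopping) or by the optimizer's limits.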
Was "task.close()" called for the early-stopped task?
What is the experiment status in Web UI?
DangerousStarfish38 , can you provide logs please?