Yeah, the problem was the fileserver connection, like you said!
I was running the experiment on a remote server, and I solved the issue by opening the port for the fileserver. Thanks!
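For anyone hitting the same symptom, here is a minimal sketch of a reachability check that can be run from the machine executing the experiment. It only tests whether a TCP connection can be opened. The port numbers are the ClearML server defaults (8080 web, 8008 API, 8081 fileserver) and are an assumption about the deployment, so adjust them if your server is configured differently. The helper names are illustrative and not part of ClearML.

```python
import socket

# Default ClearML server ports; an assumption, your deployment may differ.
DEFAULT_CLEARML_PORTS = {"webserver": 8080, "apiserver": 8008, "fileserver": 8081}


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_ports(host: str, ports: dict = DEFAULT_CLEARML_PORTS) -> dict:
    """Map each service name to whether its port is reachable on host."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling `check_ports("<your-server-host>")` returns a dict such as `{"webserver": ..., "apiserver": ..., "fileserver": ...}`. A `False` for the fileserver while the other two are `True` would be consistent with a closed fileserver port like the one described above.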
@<1722061354531033088:profile|TroubledCamel37> Thanks! I'll look into the connectivity issue you mentioned.
@<1664079296102141952:profile|DangerousStarfish38> Yep, you are right. According to the docs, optimizer.stop() should be used, not task.close(). Sorry for the confusion.
I guess the issue is a connectivity/auth problem between ClearML components, since there are many timeout messages in the log. I see similar messages for the fileserver container, which I have not resolved yet.
@<1722061354531033088:profile|TroubledCamel37> But I guess task.close() would terminate the optimization task, not the single experiment. Am I misunderstanding something?
@<1722061354531033088:profile|TroubledCamel37> No, I didn't add "task.close()" in the code. This link is what I followed.
Even after completing one experiment, the console and UI don't seem to terminate the task.
@<1523701070390366208:profile|CostlyOstrich36> here it is!
@<1664079296102141952:profile|DangerousStarfish38>, can you provide logs please?
Was "task.close()" called for the early-stopped task?
What is the experiment status in Web UI?
Plus, the first experiment terminated with early stopping.
fyi,
I set the options for HyperParameterOptimizer() like,
- compute_time_limit=None,
- total_max_jobs=100,
- min_iteration_per_job=None,
- max_iteration_per_job=None,
- max_number_of_concurrent_tasks=1
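
For reference, the options listed above could be passed to the optimizer roughly as in the sketch below. Only the five option values come from this thread. The project and task names, the hyperparameter range, the metric title and series, and the choice of `RandomSearch` are placeholders assumed for illustration, and `base_task_id` must be the ID of your own template experiment. The sketch ends with `optimizer.stop()`, the call mentioned in this thread for shutting down the optimization.

```python
def build_optimizer_kwargs() -> dict:
    """Options as listed in this thread."""
    return {
        "compute_time_limit": None,
        "total_max_jobs": 100,
        "min_iteration_per_job": None,
        "max_iteration_per_job": None,
        "max_number_of_concurrent_tasks": 1,
    }


def run_optimization(base_task_id: str) -> None:
    """Sketch of a controller script; requires clearml and a reachable server."""
    # Imported here so the options above can be inspected without clearml.
    from clearml import Task
    from clearml.automation import (
        HyperParameterOptimizer,
        RandomSearch,
        UniformIntegerParameterRange,
    )

    # Placeholder project and task names (assumptions).
    Task.init(
        project_name="HPO example",
        task_name="optimizer controller",
        task_type=Task.TaskTypes.optimizer,
    )

    optimizer = HyperParameterOptimizer(
        base_task_id=base_task_id,
        # Placeholder search space (assumption).
        hyper_parameters=[
            UniformIntegerParameterRange(
                "General/batch_size", min_value=16, max_value=64, step_size=16
            )
        ],
        # Placeholder objective metric (assumption).
        objective_metric_title="validation",
        objective_metric_series="loss",
        objective_metric_sign="min",
        optimizer_class=RandomSearch,  # assumption; swap in your strategy
        **build_optimizer_kwargs(),
    )

    optimizer.start()
    optimizer.wait()  # block until the optimization finishes
    optimizer.stop()  # stop the optimizer, as discussed in this thread
```

Keeping the options in a separate function makes it easy to print them and spot typos such as `NOne` before the controller is launched.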