Answered
Hello, I'm trying to use ClearML on a local server, and for 2 days, whenever I try to close the ClearML task with task.close(), it hangs forever and never stops. Do you have any idea why?

Hello,
I'm trying to use ClearML on a local server. For the past 2 days, whenever I try to close the ClearML task with task.close(), the call hangs forever and never returns. Do you have any idea why?

  
  
Posted one day ago

Answers 6


These are the logs of my ClearML server.

  
  
Posted one day ago

Sometimes I get a "connection refused" error when logging my task, but I've never been able to understand exactly why.
I followed the tutorial to set up my server, except that I didn't set any of the exported parameters (the ClearML agent key, CLEARML_HOST_IP, CLEARML_AGENT_GIT_PASS, ...).
In the end I just ran this command:
docker compose -f opt/clearml/docker-compose.yml
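
A "connection refused" error usually means nothing is listening on the address and port the client is trying to reach. Below is a minimal sketch for checking this from the machine that runs the training script. It assumes the server uses the default ClearML ports (8080 for the web UI, 8008 for the API, 8081 for the file server) and that the host is `localhost`; both are assumptions, and the helper names `is_port_open` and `check_server` are made up for illustration.

```python
import socket

# Default ClearML server ports. This is an assumption: adjust the values
# if your docker-compose.yml maps the services to different ports.
DEFAULT_PORTS = {"web": 8080, "api": 8008, "files": 8081}


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_server(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in DEFAULT_PORTS.items()}
```

Calling `check_server("localhost")` returns a dictionary such as `{"web": ..., "api": ..., "files": ...}`. A `False` entry would suggest that the corresponding container is not running or that its port is not published on that host.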

  
  
Posted one day ago

Have you looked into why this comes up?

clearml-fileserver  |     raise ValueError('Connection Error: it seems *api_server* is misconfigured. '
clearml-fileserver  | ValueError: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server ?
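
One possible cause of this message, though not the only one, is an `api_server` address that points at something other than the API service, for example the web UI port. The sketch below checks the three server URLs against the default ports. It assumes the stock ports (8080, 8008, 8081) and the `web_server`, `api_server` and `files_server` key names; the function `find_url_problems` is a hypothetical helper, not part of ClearML.

```python
from urllib.parse import urlparse

# Ports of a default ClearML server deployment (assumption: unchanged
# from the stock docker-compose.yml).
EXPECTED_PORTS = {"web_server": 8080, "api_server": 8008, "files_server": 8081}


def find_url_problems(urls: dict) -> list:
    """Return human-readable warnings for URLs that look misconfigured."""
    problems = []
    for key, expected in EXPECTED_PORTS.items():
        url = urls.get(key)
        if not url:
            problems.append(f"{key} is not set")
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            problems.append(f"{key} has no http/https scheme: {url}")
        elif parsed.port != expected:
            problems.append(
                f"{key} uses port {parsed.port}, expected {expected}: {url}"
            )
    return problems
```

An empty list means the URLs match the defaults. A non-empty list is only a hint, since a deployment may legitimately use other ports.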
  
  
Posted one day ago

@MistakenTurtle88 - Can you also share your docker-compose.yml file? Thanks!

  
  
Posted one day ago

Hello,
This is my train.py
model = ModelParams(cfg.get("model", None))
opt = OptimizationParams(cfg.get("optimization", None))
cmlparams = ClearmlParams(cfg.get("clearml", None))
pipeline: "PipelineParams" = PipelineParams(cfg=cfg.get("pipeline", None))
test_iterations_default = (
    list(range(0, 100)) + list(range(100, 1000, 10)) + list(range(1000, 10000, 50))
)
GS_loger: "loggingGS" = cfg.get("gs_logger", None)
test_iterations_default = (
    list(range(0, 100, 10)) + list(range(0, 100000, 100)) + [opt.iterations - 1]
)

test_iterations_default = sorted(list(set(test_iterations_default)))

if CLEARML_FOUND and not pipeline.debug:
    from utils.clearml_utils import safe_init_clearml, connect_whole

    assert (
        cmlparams.task_name != ""
    ), "Please provide a task name for ClearML,got {}".format(cmlparams.task_name)

    task = Task.init(
        project_name=cmlparams.project_name,
        task_name=cmlparams.task_name,
        tags=cmlparams.tags,
    )
    connect_whole(
        cfg=cfg,
        task=task,
        name_hyperparams_summary="train config",
        name_connect_cfg="whole train cfg",
    )
    # task.connect(cfg,name="test_train")
else:
    print(
        " We didn't find clearml or you are in debug mode, we don't log to Clearml"
    )
print("Optimizing " + cfg.model.model_path)

# Initialize system state (RNG)
safe_state(cfg.quiet, seed=cfg.seed)

# Start GUI server, configure and run training
torch.autograd.set_detect_anomaly(cfg.detect_anomaly)

training(
    sceneparams=model,
    opt=opt,
    pipe=pipeline,
    GS_loger=GS_loger,
    testing_iterations=test_iterations_default,
    saving_iterations=cfg.save_iterations,
    checkpoint_iterations=cfg.checkpoint_iterations,
    start_checkpoint=cfg.start_checkpoint,
    debug_from=cfg.debug_from,
)
# All done
print("\nTraining complete.")
if CLEARML_FOUND and not pipeline.debug:
    print("Attempting to close clearml task")
    # print("task url",task.get_web_a)

    task.close()
    print("ClearML task closed")

The code stops at task.close().
My clearml.conf is:

  
  
Posted one day ago

@MistakenTurtle88 I'm not sure I understand what gets stuck. Are you running Python code with the ClearML SDK and calling task.close()? Can you share the code you're running and how your clearml.conf file is configured?
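
To see where a hanging call is actually blocked, one option is Python's standard `faulthandler` module, which can dump the stack traces of all threads after a timeout. The following is a minimal sketch, not a ClearML feature: the wrapper name `run_with_hang_dump` and the 120-second default are assumptions chosen for illustration.

```python
import faulthandler
import sys


def run_with_hang_dump(fn, timeout_s: float = 120.0, file=None):
    """Call fn(); if it is still running after timeout_s seconds,
    dump the stack traces of all threads to `file` (default: stderr)."""
    out = file if file is not None else sys.stderr
    # Schedule the dump; exit=False keeps the process alive afterwards.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=out)
    try:
        return fn()
    finally:
        # Cancel the pending dump once fn() returns or raises.
        faulthandler.cancel_dump_traceback_later()
```

In the training script this could replace the direct call, for example `run_with_hang_dump(task.close, timeout_s=120)`. If the call hangs, the dumped traceback should show which thread is waiting and on what, which is useful to share alongside the server logs.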

  
  
Posted one day ago