
Hi all, I have a problem that I couldn't figure out on my own, and I would be happy to get help with it!

I run on Linux with a "father" script that creates a subprocess, calls Task.init to create a task, then saves some metrics and then closes the task (a few times, with different tasks). After this the "father" script creates another subprocess, and on the first Task.init it crashes and prints the message
clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###

  
  
Posted 2 years ago

Answers 15


Hi ThankfulHedgehong21 ,

What versions of ClearML & ClearML-Agent are you using?
Also, can you provide a small code snippet to play with?

  
  
Posted 2 years ago

clearml version 1.0.5, Server 1.1.0
Code to reproduce:

```python
import multiprocessing

from machine_learning.clearml_client import Task

def init_clearml_task(patch_set_name, model_name, is_ensemble):
    task_name = f'{patch_set_name} {model_name}'
    task = Task.init(
        project_name=f"bla CV",
        task_name=task_name,
        tags=[model_name, patch_set_name],
        reuse_last_task_id=False
    )
    task.connect({"bla": "bla"}, 'IbexConfig')
    return task

def execute_1():
    print("proc1")
    task = init_clearml_task("alg1", "train1_debug_cml", is_ensemble=False)
    task.close()
    print("done_proc2")  # printed by proc1; label kept so it matches the log below

def execute_2():
    print("proc2")
    task = init_clearml_task("alg2", "train2_debug_cml", is_ensemble=False)
    task.close()
    print("done_proc2")

proc = multiprocessing.Process(target=execute_1)
proc.start()
proc.join(35000)
print("father_script_done_proc1")
task = init_clearml_task('summary', 'alg1_debug_cml', is_ensemble=False)
task.close()

proc2 = multiprocessing.Process(target=execute_2)

proc2.start()
proc2.join(35000)
print("done???????????????")
```

Logs:

```
-u /home/tomer/.pycharm_helpers/pydev/pydevd.py --multiprocess --qt-support=auto --client 127.0.0.1 --port 47969 --file /home/tomer/ibex-ai-train/patch_level_classification/reproduce_cml_error.py
Connected to pydev debugger (build 221.5921.27)
/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/jwt/utils.py:7: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
  from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurve
proc1
ClearML Task: created new task id=56993ab96cb64b3089011b6f4d2c7e58
ClearML results page:
/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/cryptography/hazmat/backends/openssl/x509.py:17: CryptographyDeprecationWarning: This version of cryptography contains a temporary pyOpenSSL fallback path. Upgrade pyOpenSSL now.
  utils.DeprecatedIn35,
2022-07-21 06:42:08,534 - clearml.Task - INFO - Waiting for repository detection and full package requirement analysis
2022-07-21 06:42:45,338 - clearml.Task - INFO - Finished repository detection and package analysis
done_proc2
father_script_done_proc1
ClearML Task: created new task id=50bff5797e664d699d6a58b57d54cdb4
ClearML results page:
/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/cryptography/hazmat/backends/openssl/x509.py:17: CryptographyDeprecationWarning: This version of cryptography contains a temporary pyOpenSSL fallback path. Upgrade pyOpenSSL now.
  utils.DeprecatedIn35,
2022-07-21 06:42:49,662 - clearml.Task - INFO - Waiting for repository detection and full package requirement analysis
2022-07-21 06:43:27,494 - clearml.Task - INFO - Finished repository detection and package analysis
proc2
2022-07-21 06:43:29,955 - clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###
Process Process-3:
Traceback (most recent call last):
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/tomer/ibex-ai-train/patch_level_classification/reproduce_cml_error.py", line 24, in execute_2
    task = init_clearml_task("alg2", "train2_debug_cml", is_ensemble=False)
  File "/home/tomer/ibex-ai-train/patch_level_classification/reproduce_cml_error.py", line 13, in init_clearml_task
    task.connect({"bla": "bla"}, 'IbexConfig')
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/task.py", line 1119, in connect
    return method(mutable, name=name)
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/task.py", line 2747, in _connect_dictionary
    self._arguments.copy_from_dict(flatten_dictionary(dictionary), prefix=name)
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/args.py", line 446, in copy_from_dict
    __parameters_types=param_types,
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/task.py", line 1126, in update_parameters
    self._set_parameters(*args, __update=True, **kwargs)
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/task.py", line 1048, in _set_parameters
    self._edit(hyperparams=hyperparams)
  File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/task.py", line 1853, in _edit
    raise ValueError('Task object can only be updated if created or in_progress')
ValueError: Task object can only be updated if created or in_progress
done???????????????

Process finished with exit code 0
```

There is this line: `WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###`, and after it the connect call fails (because the task never opened correctly).
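
A side note on the log: the child process (Process-3) dies with a ValueError, yet the father script still prints its last line and finishes with exit code 0, so the failure is easy to miss. Here is a minimal sketch, independent of ClearML, of how a father script could surface such a failure by checking `Process.exitcode` after `join`. The names `run_and_check`, `child_ok` and `child_fails` are hypothetical and only for illustration, and the "fork" start method is an assumption that matches the Linux setup described in the question. It does not address the underlying Task.init problem.

```python
import multiprocessing


def run_and_check(target, timeout=None):
    """Run `target` in a child process; raise if it hangs or exits non-zero.

    Hypothetical helper, not part of ClearML or of the original script.
    """
    # Assumption: Linux, so the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    proc = ctx.Process(target=target)
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        # join() returned because of the timeout, not because the child ended.
        proc.terminate()
        proc.join()
        raise TimeoutError(f"{target.__name__} did not finish within {timeout} seconds")
    if proc.exitcode != 0:
        # An uncaught exception in the child results in a non-zero exit code.
        raise RuntimeError(f"{target.__name__} exited with code {proc.exitcode}")
    return proc.exitcode


def child_ok():
    """Stand-in for a child that finishes normally."""
    pass


def child_fails():
    """Stand-in for a child that crashes, like execute_2 in the log."""
    raise ValueError("Task object can only be updated if created or in_progress")


if __name__ == "__main__":
    run_and_check(child_ok, timeout=60)
    try:
        run_and_check(child_fails, timeout=60)
    except RuntimeError as err:
        print(f"father script noticed the failure: {err}")
```

With a check like this, the father script would stop (or at least report) when a child crashes, instead of continuing to the final print.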

  
  
Posted 2 years ago

For some reason, in the example I gave it happened only when running in debug, maybe a matter of timing, but in my "real" script it also happens when not debugging.

  
  
Posted 2 years ago

I updated the versions to clearml 1.6.2 and Server 1.5.0, and it is still happening: when calling
`init_clearml_task('summary', 'alg1_debug_cml', is_ensemble=False)`, clearml doesn't create a new task, but now the process doesn't crash.

  
  
Posted 2 years ago

ThankfulHedgehong21 , server 1.6.0 is available. Can you try with it as well?

  
  
Posted 2 years ago

I can't, because we have some experiments running (I didn't update the server before either, I just used another, newer server).

  
  
Posted 2 years ago

I'll try and see if it reproduces on my side, thanks! 🙂

  
  
Posted 2 years ago

thanks! let me know how it goes! :)

  
  
Posted 2 years ago

The sample script you posted runs fine on server 1.6.0. I did however comment out `from machine_learning.clearml_client import Task` and used `from clearml import Task`.

Can you please try with the regular import?

  
  
Posted 2 years ago

Yes, it was left in by mistake (it calls `from clearml import Task`); it doesn't change the behavior.

  
  
Posted 2 years ago

Try this one (even when running without debug):

```python
import multiprocessing
import time

from clearml import Task

def init_clearml_task(patch_set_name, model_name, is_ensemble):
    task_name = f'{patch_set_name} {model_name}'
    task = Task.init(
        project_name=f"bla CV",
        task_name=task_name,
        tags=[model_name, patch_set_name],
        reuse_last_task_id=False
    )
    task.connect({"bla": "bla"}, 'IbexConfig')
    return task

def execute_1():
    print("proc1")
    task = init_clearml_task("alg1", "train1_debug_cml", is_ensemble=False)
    time.sleep(5)
    task.close()
    print("done_proc2")

def execute_2():
    print("proc2")
    task = init_clearml_task("alg2", "train2_debug_cml", is_ensemble=False)
    time.sleep(5)
    task.close()
    print("done_proc2")

proc = multiprocessing.Process(target=execute_1)
proc.start()
proc.join(35000)
time.sleep(5)
print("father_script_done_proc1")
task = init_clearml_task('summary', 'alg1_debug_cml', is_ensemble=False)
time.sleep(5)
task.close()
time.sleep(5)
proc2 = multiprocessing.Process(target=execute_2)

proc2.start()
time.sleep(5)
proc2.join(35000)
time.sleep(5)
print("done???????????????")
```

  
  
Posted 2 years ago

Try spinning a 1.6.0 server to see if it will work there. BTW what python version are you using?

  
  
Posted 2 years ago

Are you saying that when running on 1.6.0 you see 3 tasks (where I see 2)?

  
  
Posted 2 years ago

python 3.6.9

  
  
Posted 2 years ago

I will update to 1.6 after the weekend and check

  
  
Posted 2 years ago