clearml version 1.0.5, Server 1.1.0
Code to reproduce:
` import multiprocessing

from machine_learning.clearml_client import Task


def init_clearml_task(patch_set_name, model_name, is_ensemble):
    task_name = f'{patch_set_name} {model_name}'
    task = Task.init(
        project_name=f"bla CV",
        task_name=task_name,
        tags=[model_name, patch_set_name],
        reuse_last_task_id=False
    )
    task.connect({"bla": "bla"}, 'IbexConfig')
    return task


def execute_1():
    print("proc1")
    task = init_clearml_task("alg1", "train1_debug_cml", is_ensemble=False)
    task.close()
    print("done_proc2")


def execute_2():
    print("proc2")
    task = init_clearml_task("alg2", "train2_debug_cml", is_ensemble=False)
    task.close()
    print("done_proc2")


# First child process: creates and closes its own task
proc = multiprocessing.Process(target=execute_1)
proc.start()
proc.join(35000)
print("father_script_done_proc1")

# Parent process: creates and closes a 'summary' task
task = init_clearml_task('summary', 'alg1_debug_cml', is_ensemble=False)
task.close()

# Second child process: started after the parent's task was closed
proc2 = multiprocessing.Process(target=execute_2)
proc2.start()
proc2.join(35000)
print("done???????????????")
logs
-u /home/tomer/.pycharm_helpers/pydev/pydevd.py --multiprocess --qt-support=auto --client 127.0.0.1 --port 47969 --file /home/tomer/ibex-ai-train/patch_level_classification/reproduce_cml_error.py
Connected to pydev debugger (build 221.5921.27)
/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/jwt/utils.py:7: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
from cryptography.hazmat.primitives.asymmetric.ec import EllipticCurve
proc1
ClearML Task: created new task id=56993ab96cb64b3089011b6f4d2c7e58
ClearML results page:
/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/cryptography/hazmat/backends/openssl/x509.py:17: CryptographyDeprecationWarning: This version of cryptography contains a temporary pyOpenSSL fallback path. Upgrade pyOpenSSL now.
utils.DeprecatedIn35,
2022-07-21 06:42:08,534 - clearml.Task - INFO - Waiting for repository detection and full package requirement analysis
2022-07-21 06:42:45,338 - clearml.Task - INFO - Finished repository detection and package analysis
done_proc2
father_script_done_proc1
ClearML Task: created new task id=50bff5797e664d699d6a58b57d54cdb4
ClearML results page:
/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/cryptography/hazmat/backends/openssl/x509.py:17: CryptographyDeprecationWarning: This version of cryptography contains a temporary pyOpenSSL fallback path. Upgrade pyOpenSSL now.
utils.DeprecatedIn35,
2022-07-21 06:42:49,662 - clearml.Task - INFO - Waiting for repository detection and full package requirement analysis
2022-07-21 06:43:27,494 - clearml.Task - INFO - Finished repository detection and package analysis
proc2
2022-07-21 06:43:29,955 - clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###
Process Process-3:
Traceback (most recent call last):
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/tomer/ibex-ai-train/patch_level_classification/reproduce_cml_error.py", line 24, in execute_2
task = init_clearml_task("alg2", "train2_debug_cml", is_ensemble=False)
File "/home/tomer/ibex-ai-train/patch_level_classification/reproduce_cml_error.py", line 13, in init_clearml_task
task.connect({"bla": "bla"}, 'IbexConfig')
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/task.py", line 1119, in connect
return method(mutable, name=name)
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/task.py", line 2747, in _connect_dictionary
self._arguments.copy_from_dict(flatten_dictionary(dictionary), prefix=name)
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/args.py", line 446, in copy_from_dict
__parameters_types=param_types,
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/task.py", line 1126, in update_parameters
self._set_parameters(*args, __update=True, **kwargs)
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/task.py", line 1048, in _set_parameters
self._edit(hyperparams=hyperparams)
File "/home/tomer/miniconda3/envs/ibex/lib/python3.6/site-packages/clearml/backend_interface/task/task.py", line 1853, in _edit
raise ValueError('Task object can only be updated if created or in_progress')
ValueError: Task object can only be updated if created or in_progress
done???????????????
Process finished with exit code 0
there is this line:
WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ### `
and after this the connect fails (because the task was never opened correctly)
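To make the failure in the traceback concrete, here is a minimal toy sketch of an update guard that only accepts edits while a task is `created` or `in_progress`. `ToyTask` and `try_connect` are hypothetical stand-ins written for illustration; this is not ClearML's implementation, and the `"stopped"` status is an assumption based on the warning line in the log.

```python
class ToyTask:
    """Hypothetical stand-in for a task whose edits are gated on its status."""

    EDITABLE = ("created", "in_progress")

    def __init__(self):
        self.status = "in_progress"
        self.params = {}

    def connect(self, params, name):
        # Same message as in the traceback: edits are refused unless the
        # task is still created or in_progress.
        if self.status not in self.EDITABLE:
            raise ValueError(
                "Task object can only be updated if created or in_progress"
            )
        self.params[name] = dict(params)
        return params


def try_connect(task):
    """Return "ok" if the connect succeeds, else the error message."""
    try:
        task.connect({"bla": "bla"}, "IbexConfig")
        return "ok"
    except ValueError as exc:
        return str(exc)


task = ToyTask()
first = try_connect(task)

# Assumption: the "USER ABORTED - STATUS CHANGED" warning means the status
# moved out of the editable set before connect() ran.
task.status = "stopped"
second = try_connect(task)

print(first)
print(second)
```

In this toy model the second call fails purely because of the status change, which matches the order of events in the log: the warning appears first, then `connect` raises.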
Hi ThankfulHedgehong21,
What versions of ClearML & ClearML-Agent are you using?
Also, can you provide a small code snippet to play with?
Try this one (even when running without the debugger):
` import multiprocessing
import time

from clearml import Task


def init_clearml_task(patch_set_name, model_name, is_ensemble):
    task_name = f'{patch_set_name} {model_name}'
    task = Task.init(
        project_name=f"bla CV",
        task_name=task_name,
        tags=[model_name, patch_set_name],
        reuse_last_task_id=False
    )
    task.connect({"bla": "bla"}, 'IbexConfig')
    return task


def execute_1():
    print("proc1")
    task = init_clearml_task("alg1", "train1_debug_cml", is_ensemble=False)
    time.sleep(5)
    task.close()
    print("done_proc2")


def execute_2():
    print("proc2")
    task = init_clearml_task("alg2", "train2_debug_cml", is_ensemble=False)
    time.sleep(5)
    task.close()
    print("done_proc2")


proc = multiprocessing.Process(target=execute_1)
proc.start()
proc.join(35000)
time.sleep(5)
print("father_script_done_proc1")

task = init_clearml_task('summary', 'alg1_debug_cml', is_ensemble=False)
time.sleep(5)
task.close()
time.sleep(5)

proc2 = multiprocessing.Process(target=execute_2)
proc2.start()
time.sleep(5)
proc2.join(35000)
time.sleep(5)
print("done???????????????") `
I updated the versions to clearml 1.6.2, Server 1.5.0, and it is still happening: when calling `init_clearml_task('summary', 'alg1_debug_cml', is_ensemble=False)`, clearml doesn't create a new task, but now the process doesn't crash.
For some reason, in the example I gave it happened only when running in debug, maybe a matter of timing, but in my "real" script it also happens when not debugging.
I can't, because we have some experiments running (I didn't update before, I just used another, newer server).
ThankfulHedgehong21, server 1.6.0 is available. Can you try with it as well?
The sample script you posted runs fine on server 1.6.0. I did, however, comment out `from machine_learning.clearml_client import Task` and used `from clearml import Task` instead.
Can you please try with the regular import?
Try spinning a 1.6.0 server to see if it will work there. BTW what python version are you using?
Are you saying that when running on 1.6.0 you see 3 tasks? (I see 2.)
I'll try and see if it reproduces on my side, thanks! 🙂
I will update to 1.6 after the weekend and check
Yes, it was left by mistake (it just calls `from clearml import Task`); it doesn't change the behavior.
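One difference between the two child processes in the script is that the second one is started after the parent has already created and closed a task of its own. With the `fork` start method (the default on Linux for the Python 3.6 used here), a child starts with a copy of the parent's in-memory state at the moment of the fork. Whether state carried over this way explains the behavior in this thread is only a guess and is not confirmed anywhere above. The stdlib-only sketch below just shows the general effect; `_current`, `report` and `child_view` are hypothetical names and do not correspond to anything inside ClearML.

```python
import multiprocessing

# Hypothetical module-level state, standing in for anything a library
# might cache per process (NOT ClearML's real internal state).
_current = {"task": None}


def report(queue):
    # Runs in the child: send back what the child sees right after the fork.
    queue.put(_current["task"])


def child_view(ctx):
    """Start a child with the given context and return the state it saw."""
    queue = ctx.Queue()
    proc = ctx.Process(target=report, args=(queue,))
    proc.start()
    seen = queue.get(timeout=30)
    proc.join(30)
    return seen


# "fork" is only available on Unix-like systems.
ctx = multiprocessing.get_context("fork")

before = child_view(ctx)               # forked before the parent sets state
_current["task"] = "summary (closed)"  # parent changes its own state
after = child_view(ctx)                # forked after the parent set state

print(before)
print(after)
```

The first child sees the initial value and the second sees what the parent set before forking, so the order in which the parent and the children create tasks can matter when state is inherited this way.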