FierceHamster54
"initing the task before the execution of the file like in my snippet is not sufficient?"
It is not, because os.system spawns a whole different process than the one you initialized your task in, so no patching is done on the frameworks you are using. Child processes need to call Task.init because of this, unless they were forked, in which case the patching is already done.
"But the training.py already has a ClearML task created under the hood since its integration with ClearML"
Does training.py call functions from the clearml library? If so, which functions and at which stages of the training? Having a task should be enough to save the models appropriately, so something could be bugged in our logging 🫤
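A minimal sketch of the pattern under discussion (project/task names and the train.py command-line flags are illustrative, not the actual pipeline code): the parent process creates a task, but the os.system child is a separate Python interpreter, so the patching done in the parent does not apply there.

```python
import os
from clearml import Task

# Parent process (e.g. the pipeline step): Task.init patches the ML
# frameworks imported in *this* interpreter only.
task = Task.init(project_name="yolo-demo", task_name="train-step")  # illustrative names

# os.system starts a brand-new Python interpreter, so the patching done
# above does not carry over into train.py -- that child process has to
# call Task.init itself.
os.system("python train.py --img 640 --epochs 10")  # illustrative command/flags
```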
The train.py is the default YOLOv5 training file; I initiated the task outside the call. Should I go edit their command-line training file?
FierceHamster54 As long as you are not forking, you need to use Task.init so that the libraries you are using get patched in the child process. You don't need to specify the project_name, task_name or output_uri. You could also try locally with a minimal example to check that everything works after calling Task.init.
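For example, a sketch assuming the child script is YOLOv5's train.py (or any script launched via os.system); the exact insertion point is a judgment call:

```python
# Near the top of the spawned script (e.g. train.py), before the training
# frameworks are used, so clearml can patch them in this process.
from clearml import Task

# No project_name / task_name / output_uri needed here: per the advice
# above, inside the spawned process this should reuse the existing task
# context rather than creating an unrelated one.
task = Task.init()
```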
But the task appeared with the correct name and outputs in the pipeline and the experiment manager
The worker Docker image was running on Python 3.8 and we are running on a PRO tier SaaS deployment. This failed run is from a few weeks ago and we have not run any pipeline since then,
effectively making us lose 24 hours of GPU compute
Oof, sorry about that, man 😞
Hi FierceHamster54! Did you call Task.init() in train.py?
I'm referring to https://clearml.slack.com/archives/CTK20V944/p1668070109678489?thread_ts=1667555788.111289&cid=CTK20V944 mapping the project to a ClearML project, and to https://github.com/ultralytics/yolov5/tree/master/utils/loggers/clearml, which, when calling training.py from my machine, successfully logged the training on ClearML and uploaded the artifact correctly.
One more question, FierceHamster54: what Python/OS/clearml versions are you using?
SmugDolphin23 But training.py already has a ClearML task created under the hood thanks to its ClearML integration; besides, isn't initing the task before the execution of the file, like in my snippet, sufficient?
The image OS and the runner OS were both Ubuntu 22, if I remember correctly.
FierceHamster54 I understand. I'm not sure why this happens then 😕. We will need to investigate this properly. Thank you for reporting this, and sorry for the time wasted training your model.