Unanswered
Hi, I Am Running A Script Very Similar To The One In
Hi @CostlyOstrich36, here's sample code:
from ultralytics import YOLO
from clearml import Task, Dataset
from jsonargparse import CLI


def train_yolo(ds_name: str = None):
    # Fetch a local (cached) copy of the ClearML dataset
    dataset_path = Dataset.get(dataset_name=ds_name).get_local_copy()
    # Reuse the current task if one already exists; otherwise create one
    task = Task.current_task()
    if task is None:
        task = Task.init(project_name="YOLO", task_name=ds_name)
    model = YOLO("yolov8n")
    model.train(data=dataset_path)


if __name__ == "__main__":
    CLI(train_yolo)
I enqueued a job using this code (with clearml-task). It ran on machine1 and crashed at some point. I reset the job and re-enqueued it, and this time it ran on machine2. Training started fine on the ClearML dataset, but on the second access to the data (during model.val) it looked for the dataset in /home/machine1/.clearml/cache/storage_manager/datasets/... and the job crashed.
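To narrow this down, here is a minimal sketch for checking whether the machine1 path is stored in a file inside the dataset copy. This is only a guess at the cause, not something I have confirmed. The helper name find_files_mentioning and the default suffix list are hypothetical, and the suffixes are just the file types I would check first.

```python
from pathlib import Path


def find_files_mentioning(root, needle,
                          suffixes=(".yaml", ".yml", ".txt", ".json", ".cache")):
    """Return files under root whose raw bytes contain needle.

    Hypothetical helper: the suffix list is a guess at where an absolute
    path might have been written, not a documented list.
    """
    needle_bytes = needle.encode("utf-8")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            data = path.read_bytes()  # byte search also works on binary files
        except OSError:
            continue  # unreadable file, skip it
        if needle_bytes in data:
            hits.append(path)
    return hits
```

Calling find_files_mentioning(dataset_path, "/home/machine1") on machine2 right after get_local_copy() would list any file in the local copy that still mentions the other machine's home directory. An empty list would suggest the stale path is coming from somewhere else.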
6 months ago