If the same code is run locally, it works:
Yes, this part is correct. If I point directly to the data.yaml, the training starts without any problem,
but then the error occurs after the training and the validation were successfully completed.
Hi Martin
Thank you very much for your answer. I have already uploaded the dataset, and it is visible under Datasets. The dataset is also downloaded and stored under .clearml. If I try to access the data with the following code, I get a Permission denied error.
......
  File "C:\Users\junke\AppData\Local\Programs\Python\Python310\lib\gzip.py", line 174, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
PermissionError: [Errno 13] Permission denied: 'C:/Users/junke/.c...
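One possible cause, offered only as an assumption because the traceback above is truncated: on Windows, calling open() on a folder raises PermissionError (Errno 13). A minimal sketch of a pre-check that could be run on the path before it is handed to the training call; the helper name `require_file` is hypothetical and not part of ClearML or Ultralytics:

```python
from pathlib import Path


def require_file(path):
    """Return `path` as a Path if it is an existing regular file.

    Hypothetical helper: it only replaces the generic Errno 13 with a
    more specific message before any library tries to open the path.
    """
    p = Path(path)
    if p.is_dir():
        # A folder was passed where a file (e.g. data.yaml) was expected.
        raise IsADirectoryError(f"{p} is a folder; expected a file such as data.yaml")
    if not p.is_file():
        raise FileNotFoundError(f"{p} does not exist")
    return p
```

If the check raises IsADirectoryError, the path points at the dataset folder itself rather than at a file inside it.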
and it does 😀.
Thank you very much
Now it seems to work. I had to add the 0013_Datenset folder to the path as well:
dataset_path = Dataset.get(
    dataset_name=dataset_name,
    dataset_project=dataset_project,
    alias="0013_Dataset"
).get_local_copy()
dataset_path = os.path.join(dataset_path, "0013_Datenset", "data.yaml")
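Instead of hard-coding the subfolder name, the data.yaml could also be located by searching the local copy. The following is only a sketch under the assumption that the dataset contains exactly one data.yaml; `find_data_yaml` is a hypothetical helper, not a ClearML function:

```python
from pathlib import Path


def find_data_yaml(local_copy, filename="data.yaml"):
    """Return the single file named `filename` below `local_copy`.

    Raises if no match or more than one match is found, so an
    ambiguous dataset layout is reported instead of guessed.
    """
    matches = sorted(Path(local_copy).rglob(filename))
    if not matches:
        raise FileNotFoundError(f"no {filename} found below {local_copy}")
    if len(matches) > 1:
        raise ValueError(f"several {filename} files found: {matches}")
    return matches[0]


# Usage (dataset_path is the folder returned by get_local_copy()):
# data_yaml = find_data_yaml(dataset_path)
# model.train(data=str(data_yaml))
```

This keeps the script independent of the folder name inside the dataset, which may help when the same code runs on another machine.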
Hi Martin
Thank you very much for your answer, and sorry for the late reply. I have tested a few things. The training works fine:
If I access the dataset on the same location directly it works fine:
#data = r"C:\Users\junke\.clearml\cache\storage_manager\datasets\ds_30892c41582b4537bb9508f3c09ae9ed\0013_Datenset\data.yaml"
Now I am wondering if this works on a Google Colab worker as well.
It worked until it had to validate the training. Here the same error occurs as well.
2 epochs completed in 0.174 hours.
Optimizer stripped from runs/detect/train/weights/last.pt, 136.7MB
Optimizer stripped from runs/detect/train/weights/best.pt, 136.7MB
Validating runs/detect/train/weights/best.pt...
Ultralytics YOLOv8.0.231 🚀 Python-3.10.12 torch-2.1.2+cu121 CUDA:0 (NVIDIA A100-SXM4-40GB, 40514MiB)
Model summary (fused): 268 layers, 68129346 parameters, 0 gradients, 257.4 GFLOPs
Traceback (...
My complete code is:
import pandas as pd
from ultralytics import YOLO
from clearml import Task, Dataset
# Creating a ClearML Task
task = Task.init(
    project_name="Training_MASAM_Modell_N",
    task_name="Datensatz_0013_Freeze_15",
    output_uri=True
)
model = YOLO("yolov8n.pt")
dataset_name = "0013_Dataset"
dataset_project = "Vehicle_Dataset"
dataset_path = Dataset.get(
    dataset_name=dataset_name,
    dataset_project=dataset_project,
    alias="0013_Dataset"
).get_local_copy()...
The only difference I see is that there is no labels.cache file in the test folder.
I really have trouble understanding this. I started the process from a different computer. I deleted the complete .clearml folder.
However the training starts and the validating process fails.
Yes, if I understand the documentation correctly, the dataset is downloaded in full the first time, and later on only its incremental changes:
2023-12-29 20:24:11
2023-12-29 19:24:06,083 - clearml.storage - INFO - Downloading: 255.00MB / 387.75MB @ 48.47MBs from None
2023-12-29 19:24:06,182 - clearml.sto...
The problem was the / and the . Now the process ended without any error:
🙂
Hi Martin
I just deleted the complete folder, still the same:
Ultralytics YOLOv8.0.225 🚀 Python-3.10.11 torch-2.2.0.dev20231207+cu118 CUDA:0 (NVIDIA GeForce GTX 1650 Ti with Max-Q Design, 4096MiB)
engine\trainer: task=detect, mode=train, model=yolov8n.pt, data=C:/Users/junke/.clearml/cache/storage_manager/datasets/ds_30892c41582b4537bb9508f3c09ae9ed, epochs=80, patience=50, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=0, workers=8, project=None, name=train17, exi...
and the data.yaml file as well:
train: ../train/images
val: ../valid/images
test: ../test/images
nc: 6
names: ['S_60_aktiv', 'S_Verboten_aktiv', 'bus', 'car', 'motorcycle', 'truck']
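The train, val and test entries in this data.yaml are relative (../train/images), so where they point depends on the folder they are resolved against. As a diagnostic sketch only, not a description of how Ultralytics resolves these paths, the following checks which split folders actually exist relative to the folder holding data.yaml. The helper name `resolve_split_dirs` and the two candidate locations it tries are assumptions made for illustration:

```python
from pathlib import Path


def resolve_split_dirs(yaml_dir, splits):
    """Map each split name to the first existing candidate folder, or None.

    Candidates tried, in order (an assumption, for diagnosis only):
      1. the entry resolved relative to the folder holding data.yaml
      2. the same entry with a leading "../" removed
    """
    base = Path(yaml_dir)
    found = {}
    for name, rel in splits.items():
        candidates = [base / rel]
        if rel.startswith("../"):
            candidates.append(base / rel[3:])
        # Keep the first candidate that is an existing folder.
        found[name] = next((c.resolve() for c in candidates if c.is_dir()), None)
    return found


# Usage with the entries from the data.yaml above:
# resolve_split_dirs(r"...\0013_Datenset", {
#     "train": "../train/images",
#     "val": "../valid/images",
#     "test": "../test/images",
# })
```

A split that maps to None has no matching folder in either location, which would be worth checking on the machine where validation fails.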
The same issue occurs when I try to run a clone of the same program on a Google Colab worker:
2023-12-29 19:24:12,028 - clearml - INFO - Dataset.get() did not specify alias. Dataset information will not be automatically logged in ClearML Server.
New [...] available 😃 Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.0.225 🚀 Python-3.10.12 torch-2.1.2+cu121 CUDA:0 (Tesla T4, 15102MiB)
engine/trainer: task=detect, mode=train, model=yolov8n.pt, data=/root/.clearml/cache/storage_m...
with #data = r"C:\Users\junke\.clearml\cache\storage_manager\datasets\ds_30892c41582b4537bb9508f3c09ae9ed\0013_Datenset\data.yaml"