Hey @<1639074542859063296:profile|StunningSwallow12> what exactly do you mean by "training in production"? Could you also elaborate on what kind of models you are training?
In general, ClearML assigns a unique Model ID to each model. If you need some other way of versioning, we support custom tags, and you can apply those programmatically on the model.
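As a rough sketch of what applying tags programmatically could look like: the helper names `version_tags` and `tag_registered_model` are made up for illustration, and the example assumes the `clearml` package's `Model` class exposes a settable `tags` property and that you already know the Model ID. Treat it as a starting point rather than the definitive way to do it.

```python
from typing import Iterable, List


def version_tags(version: str, extra: Iterable[str] = ()) -> List[str]:
    """Build an ordered, de-duplicated tag list, e.g. ['v1.2.0', 'yolov5'].

    Hypothetical helper: the 'v<version>' naming scheme is only a convention
    chosen for this example, not something ClearML requires.
    """
    tags = [f"v{version.lstrip('v')}", *extra]
    # dict.fromkeys keeps the first occurrence of each tag and preserves order
    return list(dict.fromkeys(tags))


def tag_registered_model(model_id: str, tags: List[str]) -> None:
    """Merge `tags` into the tags of an already registered model.

    Assumption: clearml's Model object has a settable `tags` property.
    The import is done lazily so the helper above works without clearml.
    """
    from clearml import Model

    model = Model(model_id=model_id)
    existing = list(model.tags or [])
    model.tags = list(dict.fromkeys(existing + list(tags)))
```

Usage would be something like `tag_registered_model("<your-model-id>", version_tags("1.2.0", ["yolov5"]))`, where the Model ID is the one shown in the ClearML UI.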
I started an Ubuntu 20.04 container and cloned YOLOv5 inside it. Within the container, I configured ClearML (self-hosted server) with access keys and credentials.
I am launching YOLOv5 training with project and name tags. However, the experiment results are not being logged to the ClearML server; instead, they are saved inside the container's root directory under the <project/name> folder.
Interestingly, when I tried running the process directly on the host machine, the experiment results were successfully logged to the ClearML server. It's worth noting that I am able to send data from the container to the ClearML server, but the training results are not being logged.
@<1537605940121964544:profile|EnthusiasticShrimp49>
This sounds like you don't have clearml installed in the Ubuntu container. Either that, or the clearml.conf in the container is not pointing to the server, and as a result all the information is missing.
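One way to check both possibilities from inside the container is a small diagnostic script like the one below. It is only a sketch: the helper names are hypothetical, it assumes the config lives at `~/clearml.conf` unless the `CLEARML_CONFIG_FILE` environment variable points elsewhere, and it simply looks for `api_server` lines instead of properly parsing the file.

```python
import importlib.util
import os
from pathlib import Path
from typing import List, Mapping, Optional


def clearml_installed() -> bool:
    """True if the clearml package is importable by this Python interpreter."""
    return importlib.util.find_spec("clearml") is not None


def config_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the config file location this sketch assumes clearml will use.

    Assumption: CLEARML_CONFIG_FILE overrides the default ~/clearml.conf.
    """
    env = os.environ if env is None else env
    override = env.get("CLEARML_CONFIG_FILE")
    return Path(override).expanduser() if override else Path.home() / "clearml.conf"


def api_server_lines(conf_text: str) -> List[str]:
    """Return the non-commented lines of the config that mention api_server."""
    return [
        line.strip()
        for line in conf_text.splitlines()
        if "api_server" in line and not line.strip().startswith("#")
    ]


if __name__ == "__main__":
    print("clearml importable:", clearml_installed())
    path = config_path()
    print("config file:", path, "(exists)" if path.exists() else "(missing)")
    if path.exists():
        for line in api_server_lines(path.read_text()):
            print("  ", line)
    # Environment variables, if set, take precedence over the file
    print("CLEARML_API_HOST:", os.environ.get("CLEARML_API_HOST", "<not set>"))
```

Run it with the same Python interpreter that launches the YOLOv5 training; if that interpreter cannot import clearml, or the printed `api_server` is not your self-hosted server, that would explain the missing results.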
I'd rather suggest you change the approach: run a clearml-agent setup with Docker, and when you want to run YOLOv5 training, execute it remotely on the queue that the agent is listening to.
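To give an idea of what remote execution can look like from the script side, here is a minimal sketch. It assumes an agent is already listening on a queue (for example one started with `clearml-agent daemon --queue default --docker`); the queue name `default`, the project and task names, and the function name `run_on_agent` are placeholders for illustration, not values from your setup.

```python
def run_on_agent(project: str, name: str, queue: str = "default"):
    """Register the current script as a task and hand it to an agent queue.

    Sketch only: `queue` must match a queue that a clearml-agent is
    listening to; "default" is just an assumed name.
    """
    # Imported lazily so that defining this helper does not require clearml
    from clearml import Task

    task = Task.init(project_name=project, task_name=name)
    # Stops the local run and enqueues the task; the agent picks it up and
    # re-runs the script inside its Docker container.
    task.execute_remotely(queue_name=queue, exit_process=True)
    return task
```

Calling `run_on_agent("YOLOv5", "container-test")` near the top of a training script would stop the local process at that point, and the rest of the script would then run on the agent.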
@<1537605940121964544:profile|EnthusiasticShrimp49>
I have configured it perfectly.
I am able to send data from the container to the ClearML server.
If clearml-agent is the only way, can you provide any documentation?
Hi @<1639074542859063296:profile|StunningSwallow12> , here are the docs for the agent - None