Is there a guide on how to deploy a services agent on a k8s setup? Specifically, with the CLI we use flags such as --services-mode --cpu-only, and I can’t s...
2 years ago
I'm seeing very high CPU usage by idle clearml-agents. Any idea why?
2 years ago
Hi all! I am currently using a self-hosted ClearML server and was looking to integrate the ClearML Agent to make better use of our HPC resources with GPU a...
2 years ago
Hey everybody - I am using the PipelineController with add_function_step to add different steps to the pipeline. Is there a way to specify a callback upon an ...
2 years ago
Hi, did anyone encounter the "GPU monitoring failed getting GPU reading, switching off GPU monitoring" error on a remote task pod? I am on the latest SDK and clearml serv...
2 years ago
General infrastructure question: my company isn't using AWS for training; we have all our GPUs in-house on our own servers. We have a problem where we want o...
2 years ago
Hey, I have one query. model = Model.query_models(project_name=global_config.PROJECT_NAME, model_name="model training", max_results=1) Is there any way it c...
2 years ago
Hello everybody, I have a quick question: I am trying to use clearml-serving but I can't get it to work. I have a ClearML server set up and now I am following...
2 years ago
Hey all, I cannot use ClearML with Accelerate for uploading checkpoints. - Accelerate handles the folder structure, so checkpoints are usually like /iteratio...
2 years ago
Hello, is there any way to download a subset of a Dataset? I've tried get_local_copy() with part set to a specific number. It generates only state.json and not d...
2 years ago
Hello everyone, I'm trying to set up MinIO for self-hosted ClearML, but there is an error like the one in the picture when uploading. Does anyone recognize this error?
2 years ago
Hi everyone, I want to retrieve some scalars at specific iterations from ClearML. I found Task.get_all_reported_scalars and Task.get_reported_scalars. Howev...
2 years ago
I spun up a box and noticed this appeared as an error in the logs:
2 years ago
Good day! Please tell me how to fix this. On which chart or value does such an error appear? 2023-09-08 02:13:16,164 - clearml.Metrics - ERROR - Action fai...
2 years ago
Hello! Does anyone know how to do HPO when your parameters are in a Hydra configuration file? What is the correct way to do this (e.g. declare the params for ...
2 years ago
How can I add my requirements.txt file to the pipeline instead of to each task?
2 years ago
I've been trying to use the Task.get_debug_samples() method to download all debug sample images for a given task id. Is there a way to get a list of variants...
2 years ago
Hi, in the webserver, when I want to see the task list or project list, I get a FailedToParse error on allowDiskUse. Can anyone help me?
2 years ago
Hi all, we recently went from a Docker-hosted ClearML to a k8s-hosted ClearML. We migrated our experiments by tarballing the data folder of mongodb and getti...
2 years ago
Hi, I am trying to save my trained model weights in an S3 bucket instead of using ClearML storage when using clearml-task for ML training remotely. I tried to u...
2 years ago
Hello everyone, I'm currently working on comparing plots from experiments within the ClearML web UI. Each of these plots displays a series of true values and...
2 years ago
Hello, I am trying to run a task in an EC2 worker with the respective agent in Docker mode. I am using a custom docker image. The agent is configured to use ...
2 years ago
Hi everyone, how can one import additional requirements into the pipeline during remote execution?
2 years ago
Hi everyone, how can I make what’s captured in logging.info() show up in the Console tab of the experiment details in the WebUI? Currently it seems to capt...
2 years ago
Hi everyone! I am using clearml-serving. When I try to add a new endpoint like this: clearml-serving --id <> model add --engine triton --endpoint conformer...
2 years ago
Hi all, is it possible to change a parameter of a task in the middle of a pipeline and re-run from that point only?
2 years ago
Hi all, is there any way we can skip cloning the repository again and again?
2 years ago
My GCP autoscaler spins up the box, but the task does not get executed; the status remains pending. I did use the same scheduler long ago and it was working...
2 years ago
Hi! How can I delete a dataset from the UI and from the S3 bucket? I tried deleting it from the UI and then checked, but I still have it ... API client doesn't have methods ...
2 years ago
When trying to run the server from the docker image (docker-compose -f /opt/clearml/docker-compose.yml up -d, as instructed), I am getting an error ...
2 years ago