These are the errors I get if I use file_servers without a bucket ( s3://my_minio_instance:9000 ):
```
2022-11-16 17:13:28,852 - clearml.storage - ERROR - Failed creating storage object
Reason: Missing key and secret for S3 storage access ( )
2022-11-16 17:13:28,853 - clearml.metrics - WARNING - Failed uploading to ('NoneType' object has no attribute 'upload_from_stream')
2022-11-16 17:13:28,854 - clearml.storage - ERROR - Failed creating storage object
Reason: Missing key...
```
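For reference, the kind of clearml.conf setup this needs — a sketch with a placeholder bucket and placeholder MinIO keys:
```
api {
    # the files server needs a bucket path, not just the host
    files_server: "s3://my_minio_instance:9000/clearml-bucket"
}
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # placeholder MinIO endpoint and keys
                    host: "my_minio_instance:9000"
                    key: "minio-access-key"
                    secret: "minio-secret-key"
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}
```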
There is no way to create an artifact/model/dataset without a task, right? Just always inherit from the parent task. And if cloned, change the user to the user who did the clone.
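A minimal sketch of the pattern I mean (project/task names are placeholders) — artifacts are always registered through a task:
```
from clearml import Task

# every artifact hangs off a task; there is no task-less upload
task = Task.init(project_name="examples", task_name="artifact-demo")
task.upload_artifact(name="stats", artifact_object={"accuracy": 0.9})
task.close()
```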
(Just for my own interest: how much does the enterprise version diverge from the open-source version? Is it just extended, or are there core changes to the enterprise version?)
It is weird though. The task is submitted by the original user and then run on the agent. The task, however, is still registered to the original user, since it was created by them.
Wouldn't it make more sense to inherit the user from the task than from the agent?
Okay, this seems to work fine.
SweetBadger76 I am using the Cleanup Service
Nvm. I forgot to start my agent with --docker. So here comes my follow-up question: it seems like there is no way to define that a Task requires Docker support from an agent, right?
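For reference, how I start the agent now (queue name and default image are placeholders):
```
# run the agent in Docker mode; a default image can optionally be appended
clearml-agent daemon --queue default --docker python:3.9
```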
With clearml==1.4.1 it works, but with the current version it aborts. Here is a log with the latest clearml.
Well, after restarting the agent (to set it into --detached mode) it set cleanup_task.py into service mode, but my monitoring tasks are just executed on the agent itself (no new service clearml-agent is started) and then they are aborted right after starting.
Here is what my start_carla.py task currently looks like:
```
import os
import subprocess
from time import sleep

from clearml import Task
from clearml.config import running_remotely


def create_task(node):
    task = Task.create(
        project_name="examples",
        task_name="start-carla",
        repo="myrepo",
        branch="carla-clearml-integration",
        script="src/start_carla_task.py",
        working_directory="src",
        packages=["clearml"],
        add_task_init_call=...
    )
    return task
```
@AgitatedDove14 Thank you very much for your guidance. Setting these manually works for me!
```
docker-compose ps
          Name                       Command              State                      Ports
clearml-agent-services   /usr/agent/entrypoint.sh        Restarting
clearml-apiserver        /opt/clearml/wrapper.sh ap ...  Up           0.0.0.0:8008->8008/tcp, 8080/tcp, 8081/tcp ...
```
Mhhm, good hint! Unfortunately I can see nowhere in the logs when the server creates a delete request.
I usually also experience no problems with restarting the clearml-server. It seems like it has to do with the OOM (or whatever issue I have).
It didn't revert. Just one of my colleagues that I wanted to introduce to clearml put his clearml.conf in the wrong directory and pushed his experiments to the public server.
So I do not blame clearml for this mistake, but generally, designing the system to be fail-safe is better than hoping that everything is used the way it was designed 🙂
Wouldn't it be enough to just require a call to clearml-init and throw an error when running without clearml.conf that tells the user to run clearml-init first?
Okay, I found something out: when I use the docker image ubuntu:22.04, it does not spin up a service agent and aborts the task. When I use python:latest, everything works fine!
@ManiacalLizard2 Yeah, that makes sense. However, my problem is that I do not want to set it on the remote clearml-agent, since every user may have a different storage. E.g. one user pushes to Azure, while another one pushes to S3.
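A sketch of what I mean — each user sets the destination per experiment instead of it being fixed on the agent (the URL is a placeholder):
```
from clearml import Task

# per-user/per-experiment storage destination instead of an agent-wide setting
task = Task.init(
    project_name="examples",
    task_name="my-experiment",
    output_uri="s3://my_minio_instance:9000/my-user-bucket",
)
```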
Artifact Size: 74.62 MB
Okay, great! I just want to run the cleanup service; however, I am running into ssh issues, so I wanted to restart it to try to debug.
Thank you very much. I am going to try that.
Is there a clearml.conf for this agent somewhere?
Afaik, clearml-agent will use existing installed packages if they fit the requirements.txt. E.g. pytorch >= 1.7 will only install PyTorch if the environment does not already provide some version of PyTorch greater than or equal to 1.7.
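For example, with a requirements.txt like the following (the pip package name is torch), the agent should skip the install when the base image already ships a matching version:
```
torch >= 1.7
```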
Okay, but are your logs still stored on MinIO when only using sdk.development.default_output_uri?
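I.e. with only this in clearml.conf (endpoint and bucket are placeholders):
```
sdk {
    development {
        # default destination for task outputs (artifacts/models)
        default_output_uri: "s3://my_minio_instance:9000/clearml-artifacts"
    }
}
```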