I guess it started with the usage of the cleanup_service.
To answer my own question: In the WebUI where one inputs the credentials, use https for the host instead of the auto-added http.
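For anyone who wants to do the same fix in a script rather than by hand, here is a minimal sketch of forcing the scheme to https. `force_https` is a hypothetical helper written for illustration, not something from clearml, and `myhost:9000` is only a placeholder host.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(host: str) -> str:
    """Return `host` with an explicit https scheme.

    Hypothetical helper for illustration; not part of clearml.
    """
    host = host.strip()
    # A bare "myhost:9000" has no scheme, so add one before parsing.
    if "://" not in host:
        host = "https://" + host
    parts = urlsplit(host)
    # Replace whatever scheme was given (e.g. an auto-added http) with https.
    return urlunsplit(("https",) + tuple(parts[1:]))


print(force_https("http://myhost:9000"))
print(force_https("myhost:9000"))
```

Whether the server actually accepts https on that port still has to be checked separately; the helper only rewrites the string.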
I usually also experience no problems with restarting the clearml-server. It seems like it has to do with the OOM (or whatever issue I have).
Thanks, I will look into it. For me the weird thing is that saving works and only deletion fails somehow.
The one I posted at the top: 22.03-py3 😄
By preexisting task I meant that I have existing code that already uses Task.init. I would like to use this code as the main task in my pipeline, i.e. after carla has started.
I see a python 3 fileserver.py running on a single thread with 100% load.
Unfortunately, I do not know that. Must be before October 2021 at least. I know I asked here how to use the preinstalled version and AgitatedDove14 helped me to get it to work. But I cannot find the old thread 😕
Also I can see that clearml correctly loads the config: STORAGE S3BucketConfig(bucket='clearml', host='myhost:9000', key='mykey' secret='mysecret', token='', multipart=False, acl='', secure=True, region=None, verify=True, use_credentials_chain=False)
Is this working in the latest version? clearml-agent falls back to /usr/bin/python3.8 no matter how I configure clearml.conf. Just want to make sure, so I can investigate what's wrong with my machine if it is working for you.
Good idea. No, clearml-agent does not crash and works fine afterwards. Then it is probably some other problem with my machine. Thank you!
I created a GitHub issue because the problem with the slow deletion still exists. https://github.com/allegroai/clearml/issues/586#issue-1142916619
When I change the owner and the group of the files to root it works.
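To see which files are affected before changing ownership, something like the following could help. It is only a sketch: the post does not say which directory the files live in, so the path in the usage comment is a placeholder, and `files_not_owned_by` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List


def files_not_owned_by(root: str, uid: int, gid: int) -> List[str]:
    """List paths under `root` whose owner or group differs from uid/gid."""
    mismatched = []
    for path in Path(root).rglob("*"):
        # lstat() inspects the entry itself and does not follow symlinks.
        st = path.lstat()
        if st.st_uid != uid or st.st_gid != gid:
            mismatched.append(str(path))
    return sorted(mismatched)


# Example call (placeholder path); uid 0 and gid 0 are root on Linux:
# print(files_not_owned_by("/path/to/data", 0, 0))
```

An empty list means everything under the directory already belongs to the given user and group.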
Here is what my start_carla.py task currently looks like:
` import os
import subprocess
from time import sleep

from clearml import Task
from clearml.config import running_remotely


def create_task(node):
    task = Task.create(
        project_name="examples",
        task_name="start-carla",
        repo="myrepo",
        branch="carla-clearml-integration",
        script="src/start_carla_task.py",
        working_directory="src",
        packages=["clearml"],
        add_task_init_call=...
Nvm. I think I understood. When the file has never been added to the repository, it is not tracked.
The agent is run with pip. However, the docker image uses conda (most probably because NVIDIA uses conda to build PyTorch). My theory is that when the task is run the first time on an agent, Task.init will update the requirements. Then, when run a second time, the task will contain the requirements of the (conda) environment from the first run.
Thanks for your help again. I will just use detect_with_conda_freeze: true. Seems like a perfect solution for me!
I think such an option can work, but actually if I had free wishes I would say that the clearml.Task code would need some refactoring (but I am not an experienced software engineer, so I could be totally wrong). It is not clear what Task.init does or how it does it, and the very long method declaration is confusing. I think there should be two ways to initialize tasks:
Specify a lot manually, e.g. ` task = Task.create()
task.add_requirements(from_requirements_files(..))
task.add_entr...
When the task is aborted, the logs will show up, but the scalar logs will never appear. The scalar logs only appear when the task finishes.
No problem in my case at least.
I think doing all that work is not worth it right now. I am just trying to understand why clearml does not seem to be designed something like this:
` task_name = args.task_name
task = Task()
task = task.load_statedict(await Task.load_or_create(task_name))
task.requirements.add(...)
await task.synchronize()
task.execute_remotely(queue_name, exit=True) `
Maybe deletion happens "async" and is not reflected in parts of clearml? It seems that if I try to delete often enough, at some point it is successful.
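If deletion really does only succeed after a few tries, a generic retry loop is one possible workaround. The sketch below is plain Python and does not call any ClearML API: `retry_until_true` and `flaky_delete` are hypothetical names, and `flaky_delete` just simulates a delete that succeeds on the third attempt.

```python
import time
from typing import Callable


def retry_until_true(action: Callable[[], bool], attempts: int = 5, delay: float = 0.0) -> int:
    """Call `action` until it returns True; return the number of calls made.

    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        # Wait between attempts, but not after the final one.
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(f"action failed after {attempts} attempts")


# Stand-in for a delete call that only succeeds on the third try.
calls = {"n": 0}


def flaky_delete() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(retry_until_true(flaky_delete))
```

This only hides the symptom; it says nothing about why the first attempts fail.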
Thank you! I agree with CostlyOstrich36, that is why I meant a false sense of security 🙂
Thank you SuccessfulKoala55, so actually only the file-server needs to be secured.
test_clearml, so directly from top-level.
Ah, it actually is also a string with remote_execution, but still not what it should be.

