No problem in my case at least.
I think doing all that work is not worth it right now; I am just trying to understand why ClearML does not seem to be designed for something like this:
```python
task_name = args.task_name
task = Task()
task = task.load_statedict(await Task.load_or_create(task_name))
task.requirements.add(...)
await task.synchronize()
task.execute_remotely(queue_name, exit=True)
```
Maybe deletion happens "async" and is not reflected in all parts of ClearML? It seems that if I try to delete often enough, at some point it is successful.
Thank you! I agree with CostlyOstrich36; that is what I meant by a false sense of security.
Thank you SuccessfulKoala55, so actually only the file server needs to be secured.
test_clearml, so directly from the top level.
Ah, it actually is also a string with remote_execution, but still not what it should be.
Is there a way to specify this on a per task basis? I am running clearml-agent in docker mode btw.
If you think the explanation takes too much time, no worries! I do not want to waste your time on my confusion.
Is sdk.development.default_output_uri used with s3://ip:9000/clearml or with ip:9000/clearml?
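For what it's worth, my understanding of the documented clearml.conf format is that the full scheme-prefixed URI is used; a hedged sketch (the IP, port, and bucket name are placeholders for your MinIO setup):

```
sdk {
    development {
        # Full URI including the s3:// scheme, host:port, and bucket
        default_output_uri: "s3://ip:9000/clearml"
    }
}
```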
Thank you very much for the fast work!
One last question: Is it possible to set the pip_version task-dependent?
And how do I specify this in the output_uri? The default file server is selected by passing True. How would I specify the second one?
I only added

```
# Python 3.8.2 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
--extra-index-url
clearml
torch == 1.14.0.dev20221205+cu117
torchvision == 0.15.0.dev20221205+cpu
```

and I used an amd64/ubuntu:20.04 docker image with python3.8. Same error. If it is not too much to ask, could you try to run it with this docker image?
I have a carla.egg file on my local machine and on the worker, which I include with sys.path.append before I can do import carla. It is the same procedure on my local machine and on the clearml-agent worker.
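For context, the pattern is just extending sys.path before the import; a runnable sketch with a throwaway module standing in for the carla egg (fake_carla and the temp directory are stand-ins, not the real setup):

```python
import os
import sys
import tempfile

# Create a stand-in for the carla .egg: any directory (or .egg/.zip) on sys.path works.
stub_dir = tempfile.mkdtemp()
with open(os.path.join(stub_dir, "fake_carla.py"), "w") as f:
    f.write("VERSION = '0.9'\n")

# Same pattern as sys.path.append('/path/to/carla.egg') before `import carla`.
sys.path.append(stub_dir)
import fake_carla

print(fake_carla.VERSION)  # → 0.9
```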
For example, in our case we do reinforcement learning, and we would call a script like this: python run_openai_gym.py some_package.my_agent
Good to know!
I think the current solutions are fine. I will try it first and probably will have some more questions/problems π
The default behavior mimics Python's assert statement: validation is on by default, but is disabled if Python is run in optimized mode (via python -O). Validation may be expensive, so you may want to disable it once a model is working.
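That mirrors how assert itself behaves; a minimal illustration (the validate function is just a made-up example):

```python
# Under a normal `python` run, __debug__ is True and asserts fire;
# under `python -O`, __debug__ is False and assert statements are compiled away.
def validate(batch_size):
    assert batch_size > 0, "batch_size must be positive"
    return batch_size

print(__debug__)     # True unless running with -O
print(validate(32))  # passes validation and returns 32
```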
Maybe the difference is that I am using pip now and I used to use conda! The NVIDIA PyTorch container uses conda. Could that be the reason?
Hi KindChimpanzee37, I was more asking about the general idea of making these settings task-specific, but thank you for the suggestion anyway, I will definitely apply it.
Is there a way to see the contents of /tmp/conda_envaz1ne897.yml? It seems to be deleted after the task is finished.
And in the WebUI I can see arguments similar to the second print statement's.
These are the errors I get if I use file_servers without a bucket (s3://my_minio_instance:9000):

```
2022-11-16 17:13:28,852 - clearml.storage - ERROR - Failed creating storage object Reason: Missing key and secret for S3 storage access ( )
2022-11-16 17:13:28,853 - clearml.metrics - WARNING - Failed uploading to ('NoneType' object has no attribute 'upload_from_stream')
2022-11-16 17:13:28,854 - clearml.storage - ERROR - Failed creating storage object Reason: Missing key...
```
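The "Missing key and secret for S3 storage access" message suggests clearml.conf has no credentials entry for that host; a hedged sketch of the documented MinIO-style credentials section (host, key, and secret are placeholders, not verified against this exact setup):

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # host includes the port; secure: false for plain-http MinIO
                    host: "my_minio_instance:9000"
                    key: "minio_access_key"
                    secret: "minio_secret_key"
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}
```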
Is there a simple way to get the response of the MinIO instance? Then I could verify whether the problem is the MinIO instance or my client.

