Also, I can see that clearml correctly loads the config: `STORAGE S3BucketConfig(bucket='clearml', host='myhost:9000', key='mykey', secret='mysecret', token='', multipart=False, acl='', secure=True, region=None, verify=True, use_credentials_chain=False)`
But does this mean the logger will use the default fileserver or not?
Is `sdk.development.default_output_uri` used with `s3://ip:9000/clearml` or `ip:9000/clearml`?
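To make the difference between the two spellings concrete, here is a minimal sketch. It assumes (from my reading of the ClearML docs, not confirmed in this thread) that non-AWS S3 endpoints such as MinIO are addressed as `s3://host:port/bucket`. The helper name `minio_output_uri` is made up for illustration, and only the standard library is used.

```python
from urllib.parse import urlsplit


def minio_output_uri(host: str, port: int, bucket: str) -> str:
    """Build an s3:// style URI for a self-hosted S3 endpoint (hypothetical helper)."""
    return f"s3://{host}:{port}/{bucket}"


uri = minio_output_uri("myhost", 9000, "clearml")
parts = urlsplit(uri)

# With the explicit scheme, the host, port, and bucket path are unambiguous.
print(parts.scheme, parts.hostname, parts.port, parts.path)
```

Without the `s3://` prefix there is no scheme to say which storage backend is meant, so my guess is that the prefixed form is the one to use.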
This is the error I get from setting the logger upload destination: `botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.`
` apiserver:
    command:
    - apiserver
    container_name: clearml-apiserver
    image: allegroai/clearml:latest
    restart: unless-stopped
    volumes:
    - /opt/clearml/logs:/var/log/clearml
    - /opt/clearml/config:/opt/clearml/config
    - /opt/clearml/data/fileserver:/mnt/fileserver
    depends_on:
    - redis
    - mongo
    - elasticsearch
    - fileserver
    - fileserver_datasets
    environment:
      CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
      CLEARML_...
Exactly. I don't want people to circumvent the queue 🙂
Thanks! I am fascinated by what you guys offer with clearml 🙂
For example, I get the following error if I simply clone and rerun: `ERROR: Could not find a version that satisfies the requirement ruamel_yaml_conda>=0.11.14 (from conda==4.10.1->-r /tmp/cached-reqs6wtc73be.txt (line 28)) (from versions: none) ERROR: No matching distribution found for ruamel_yaml_conda>=0.11.14 (from conda==4.10.1->-r /tmp/cached-reqs6wtc73be.txt (line 28))`
I see, so it is actually not related to clearml 🎉
In the first run the package only existed because it is preinstalled in the docker image. Afaik, in the second run it is also preinstalled, but pip will first try to resolve it and only then check whether it already exists. But I am not too sure about this.
No no, I was just wondering how much effort it is to create something like ClearML. And your answer gives me a rough estimate 🙂
Haha, fortunately I have a good job already. Just wanted to know how many people are actively working on clearml.
First one is the original, second one the clone
Could you elaborate on that:
"So the agent failed to actually restore it from the git (files that are not added are not considered part of the git diff, this is usually git behavior)."
Hey AgitatedDove14 is there any update on this?
Yea, that I knew 😄 But somehow I didn't think about the clearml.conf
Thank you very much for the quick answer. Still so confusing to me that so many things are configured client side 😄
I don't think so. It is related to the issue with the clearml-server I posted in the other thread. Essentially the clearml-server hangs, then I restart it with `docker-compose down && docker-compose up -d` and the experiments sometimes show as running, but on the clearml-agents I see that actually nothing is running or they show as aborted.
I know that usually clearml-agents do not abort on server restart and just continue.
I have a `carla.egg` file on my local machine and on the worker that I include with `sys.path.append` before I can do `import carla`. It is the same procedure on my local machine and on the clearml-agent worker.
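To illustrate the procedure, here is a minimal, self-contained sketch. It does not use the real carla package: it builds a throwaway egg (an egg is just a zip archive) containing a stand-in module named `fake_carla`, which is an assumption made purely for the demo.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway .egg (a zip archive) holding a single module.
# "fake_carla" is a stand-in name, not the real carla package.
egg_dir = tempfile.mkdtemp()
egg_path = os.path.join(egg_dir, "fake_carla.egg")
with zipfile.ZipFile(egg_path, "w") as egg:
    egg.writestr("fake_carla.py", "VERSION = '0.0.0'\n")

# Same order as described: extend sys.path first, import afterwards.
sys.path.append(egg_path)
fake_carla = importlib.import_module("fake_carla")
print(fake_carla.VERSION)
```

The import only succeeds if the path entry is added before the import statement runs and the egg actually exists at that path on the machine executing the code.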
Obviously in my examples there is a lot of stuff missing. I just want to show that the user should be able to replicate `Task.init` easily so it can be configured in every way, but can still make use of the magic that clearml has for stuff that does not differ from the comfort way.
I think such an option can work, but actually if I had free wishes I would say that the clearml.Task code would need some refactoring (but I am not an experienced software engineer, so I could be totally wrong). It is not clear what `Task.init` does and how it does it, and the very long method declaration is confusing. I think there should be two ways to initialize tasks:
Specify a lot manually, e.g. ` task = Task.create()
task.add_requirements(from_requirements_files(..))
task.add_entr...
AgitatedDove14 Yes, you understood correctly. But `Task.create` is used by `Task.init`, something like this, right?
` def init(project_name, task_name):
    if not Task.exists_already(project_name, task_name):
        task = Task.create(...)
    else:
        task = load_existing_task()
    return task `
Btw: I think `Task.init` is more confusing than `Task.create` and I would rather rename the former.
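The get-or-create idea from the snippet above can be sketched as runnable code. This is a toy stand-in under my own assumptions, not the real `clearml.Task`: the in-memory `_registry` replaces whatever the server stores, and the method bodies are invented for illustration.

```python
from dataclasses import dataclass
from typing import ClassVar, Dict, Tuple


@dataclass
class Task:
    """Toy stand-in for a task object; NOT the real clearml.Task."""

    project_name: str
    task_name: str

    # In-memory registry standing in for the server-side task store.
    _registry: ClassVar[Dict[Tuple[str, str], "Task"]] = {}

    @classmethod
    def create(cls, project_name: str, task_name: str) -> "Task":
        """Always build a new task and record it."""
        task = cls(project_name, task_name)
        cls._registry[(project_name, task_name)] = task
        return task

    @classmethod
    def init(cls, project_name: str, task_name: str) -> "Task":
        """Return the existing task if there is one, otherwise create it."""
        existing = cls._registry.get((project_name, task_name))
        if existing is not None:
            return existing
        return cls.create(project_name, task_name)


first = Task.init("demo", "train")
second = Task.init("demo", "train")
print(first is second)
```

Here `init` is only a thin wrapper that decides between reusing and creating, which is the split between the two methods that I have in mind.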