This is odd, can you send the full log of the failed Task and, if possible, the code?
Hi @<1566959357147484160:profile|LazyCat94>
So it seems the arg parser is detecting the configuration YAML
The first thing I would suggest is changing it to a relative path (so that when launched on remote machines it will find the YAML file)
Regardless, how are you launching the HPO? Are you spinning up a new agent?
(As background, argparse arguments are injected in real time by the agent or the HPO when running as subprocesses.)
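To make that concrete, a minimal sketch (script, argument names and defaults are placeholders, not your actual code) of an argparse-based script that picks up those injected values:

```python
import argparse

from clearml import Task

# Minimal sketch (argument names and defaults are placeholders).
# ClearML auto-connects argparse, so when this script runs under an agent or
# the HPO, the values parsed below can be overridden at runtime.
parser = argparse.ArgumentParser()
parser.add_argument("--config", default="configs/train.yaml")  # relative path, resolvable on the remote machine as well
parser.add_argument("--epochs", type=int, default=10)

task = Task.init(project_name="examples", task_name="hpo step")
args = parser.parse_args()

print(args.config, args.epochs)
```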
For artifacts already registered, it simply returns the entry; for artifacts that are not registered locally, it contacts the server to retrieve them.
This is the current state.
Downloading the artifacts is done only when actually calling get()/get_local_copy()
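To make it concrete, a minimal sketch (the task ID and artifact name are placeholders):

```python
from clearml import Task

# Minimal sketch (task ID and artifact name are placeholders).
task = Task.get_task(task_id="<task_id>")

artifact = task.artifacts["my_artifact"]   # registered entry only, nothing is downloaded yet
obj = artifact.get()                       # downloads (if needed) and deserializes the object
local_path = artifact.get_local_copy()     # downloads and returns a local file path
```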
By default the remote link (i.e. the Task you are creating with Task.create) will have all the auto logging turned on.
For finer control we kind of assume you have Task.init inside your remote script, and then just pass add_task_init_call=False
does that make sense ?
Do you think we should have a way to configure those auto_connect args when creating the Task?
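A minimal sketch of that, assuming the remote script already calls Task.init itself (repo/script/project names are placeholders):

```python
from clearml import Task

# Minimal sketch (repo/script/project names are placeholders).
# Because the remote script already calls Task.init, we skip the injected call.
task = Task.create(
    project_name="examples",
    task_name="remote run",
    repo="https://github.com/your-org/your-repo.git",
    script="train.py",
    add_task_init_call=False,
)
```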
I think you cannot change it for a running process. Do you want me to check whether this can be done?
But I believe it should have worked with 0.14.1 as well.
Correct
Yes it does. I'm assuming each job is launched using a multiprocessing.Pool (which translates into a subprocess). Let me see if I can reproduce this behavior.
We already redesigned the implementation, so it should be quite easy to extend to GCP and Azure. What are you planning?
Hmm, can you try:
--args overrides="['log.clearml=True','train.epochs=200','clearml.save=True']"
Let me check, which helm chart are you referring to ?
this results at the end of an experiment in an object to be saved under a given name regardless if it was dynamic or not?
Yes, at the end the name of the artifact is what it will be stored under (obviously if you reuse the name you basically overwrite the previous artifact).
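For example, a small sketch (project/artifact names and the object are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="artifact demo")

# The artifact is stored under the name passed here; uploading again with the
# same name overwrites the previous entry.
task.upload_artifact(name="results", artifact_object={"accuracy": 0.9})
```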
JitteryCoyote63 I found it 🙂
Are you working in docker mode or venv mode ?
Is this like a local minio?
What do you have under the sdk/aws/s3 section?
Would it also be possible to query based on multiple user properties?
Multiple key/value properties are, I think, currently not that easy to query,
but multiple tags are quite easy to do:
tags=["__$all", "tag1", "tag2"],
But that should not mean you cannot write to them, no?!
Yes please 🙂
BTW: I originally thought the double quotes (in your PR) were also a bug, this is why I was asking, wdyt?
(Not sure it actually has that information)
Do I set the CLEARML_FILES_HOST to the endpoint instead of an s3 bucket?
Yes, you are right, this is not straightforward:
CLEARML_FILES_HOST="s3://minio_ip:9001"
Notice you must specify the "port", this is how it knows this is not AWS. I would avoid using an IP and register the minio as a host on your local DNS / firewall. This way if you change the IP the links will not get broken 🙂
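As a rough sketch of what this usually pairs with in clearml.conf (host, port and keys below are placeholders, not your actual values), the sdk.aws.s3 credentials entry for a minio endpoint typically looks something like:

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # non-AWS endpoint (e.g. local minio); the host:port form is
                    # what tells clearml this is not AWS (placeholder values)
                    host: "minio-host:9001"
                    key: "minio_access_key"
                    secret: "minio_secret_key"
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}
```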
To auto upload the model you have to tell clearml to upload it somewhere, usually by passing output_uri to Task.init or setting the default_output_uri in the clearml.conf
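A minimal sketch of the Task.init variant (the bucket URI is a placeholder):

```python
from clearml import Task

# Minimal sketch (bucket URI is a placeholder): models saved by the training
# framework are auto-uploaded to output_uri instead of staying on local disk.
task = Task.init(
    project_name="examples",
    task_name="model upload demo",
    output_uri="s3://my-bucket/models",
)
```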
Was wondering how it can handle 10s, 100s of models.
Yes, it supports dynamically loading/unloading models based on requests
(load balancing multiple nodes is disconnected from it, but assuming they are under diff endpoints, the load balancer can be configured to route accordingly)
Hi @<1688721797135994880:profile|ThoughtfulPeacock83>
the configuration vault parameters of a pipeline step with the add_function_step method?
The configuration vaults are set per user/project/company and applied at execution time.
What would be the value you need to override ? and what is the use case?
OddShrimp85
The Task ID is a UUID generated by the backend server; there is no real way to force it to have a specific value 🙂
Hi JitteryCoyote63
cleanup_service task in the DevOps project: Does it assume that the agent in services mode is in the trains-server machine?
It assumes you have an agent connected to the "services" queue 🙂
That said, it also tries to delete the tasks artifacts/models etc, you can see it here:
https://github.com/allegroai/trains/blob/c234837ce2f0f815d3251cde7917ab733b79d223/examples/services/cleanup/cleanup_service.py#L89
The default configuration will assume you are running i...
Hi @<1523701304709353472:profile|OddShrimp85>
the venv setup is totally based on my requirements.txt instead of adding on to what the image has before. Why?
Are you using the agent in docker mode? If this is the case, it creates a venv inside the docker, inheriting from the preinstalled docker system packages.
If it helps, you can override it on the clients with the OS environment variable CLEARML_FILES_HOST.
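For example, a sketch of overriding it for a single process (the endpoint below is a placeholder); the variable just needs to be set before clearml reads its configuration:

```python
import os

# Minimal sketch (endpoint is a placeholder): set the variable before clearml
# loads its configuration, i.e. before importing clearml / calling Task.init.
os.environ["CLEARML_FILES_HOST"] = "s3://minio-host:9001"

from clearml import Task

task = Task.init(project_name="examples", task_name="files host override")
```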