Yes, it was set to nvidia/cuda:10.1-runtime-ubuntu18.04... OK, I'll try again and see if that was the problem, thank you
Yes, I think it's only related to the UI. Do you think it can be fixed somehow? It would be the easiest way to launch new experiments with a different configuration.
Also, if I want to modify another parameter, e.g. ui.height I have this problem:
` from clearml import Task
from dataclasses import dataclass

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import OmegaConf


@dataclass
class MySQLConfig:
    host: str = "localhost"
    port: int = 3306


@dataclass
class UserInterface:
    title: str = "My app"
    width: int = 1024
    height: int = 768


@dataclass
class MyConfig:
    db: MySQLConfig = MySQLConfig()
    ui: UserInterface = UserInterface()


cs = ConfigStore.instance()
cs.store(name="config", n...
Hi AgitatedDove14, I noticed that in the Hydra parameters section it is not possible to add parameter keys that contain dots: ".(dot) $(dollar) and space are not allowed in parameter key."
However, it's very useful to add parameters with a dot in order to change something in a sub-configuration, for example training.max_epochs=10. Do you think it's possible to allow this?
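To make it concrete what a dotted key like training.max_epochs=10 is expected to do, here is a minimal stdlib-only sketch. It is not how Hydra or ClearML implement overrides; `apply_override` and the `base` config are made-up names for illustration, and it only handles the simple `key=value` form.

```python
import ast
from copy import deepcopy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one `a.b.c=value` override applied."""
    dotted_key, _, raw_value = override.partition("=")
    try:
        # Parse numbers and other Python literals, e.g. "10" -> 10.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        value = raw_value  # keep plain strings such as "imagenet"

    result = deepcopy(config)
    node = result
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Walk down into the sub-configuration, creating it if missing.
        node = node.setdefault(name, {})
    node[leaf] = value
    return result


# Hypothetical base configuration, only for illustration.
base = {"training": {"max_epochs": 5, "lr": 0.1}, "ui": {"width": 1024, "height": 768}}
updated = apply_override(base, "training.max_epochs=10")
print(updated["training"])  # {'max_epochs': 10, 'lr': 0.1}
```

The dot is what selects the sub-configuration, so a key without it can only address top-level values.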
Hi AgitatedDove14, you can try with this toy example. If I run the task with python example.py ui.width=2048 the task will run correctly and print Title=My app, size=2048x768 pixels. However, in the UI I'm not allowed to change ui.width in the Hydra parameters section: the 'Save' button is frozen.
Hi AgitatedDove14, sorry for the late reply. Btw, I tried with the latest RC and the issue is still there. If I clone an experiment and modify an override parameter, e.g. ['training.max_epochs=10'], my experiment still runs the old configuration. So it seems that the override doesn't change the OmegaConf configuration.
OK, now I noticed that if I change the value of the port inside the Hydra parameters section (not the overrides), it does actually change in the experiment as well. The overrides don't seem to be working.
Actually I had the same issue even with that value set to False
` # ClearML - Hydra Example
from clearml import Task
from dataclasses import dataclass

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import OmegaConf


@dataclass
class MySQLConfig:
    host: str = "localhost"
    port: int = 3306


cs = ConfigStore.instance()
# Registering the Config class with the name 'config'.
cs.store(name="config", node=MySQLConfig)


@hydra.main(config_name="config")
def my_app(cfg: MySQLConfig) -> None:
    # type (DictConfig) -> None
    ...
I created this toy example so you don't need any external conf files. Btw, if I first launch the task as python example.py port=80 then the task will print the message "Is this a webserver" correctly. If I then clone the same task in the UI, override the port with ['port=43'], for example, and run the experiment, I will still get the message "Is this a webserver", so the port didn't change.
However, if I edit the OmegaConf directly in the UI then the port changes correctly. I'd still prefer to override the Args so I can change an entire sub-configuration, e.g. ['dataset=cifar'] to ['dataset=imagenet'], instead of having to change all the parameters inside the OmegaConf.
Hi AgitatedDove14
I implemented the pipeline manually as you suggested. I also used task.wait_for_status() after each task.enqueue() so I was able to implement a full pipeline in one script. It seems to be working correctly. Thank you!
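For anyone reading later, the manual pipeline described above can be sketched roughly like this. It assumes the stages already exist as template tasks that can be cloned; `run_stages_in_order` and the placeholder task IDs are made up for illustration, and the task class is passed in as an argument so the control flow can be read (and tried) without a server.

```python
from typing import Any, Iterable, List


def run_stages_in_order(task_cls: Any, template_task_ids: Iterable[str], queue_name: str) -> List[Any]:
    """Clone each template task, enqueue the clone, and wait before moving on."""
    finished = []
    for template_id in template_task_ids:
        stage = task_cls.clone(source_task=template_id)
        task_cls.enqueue(stage, queue_name=queue_name)
        # Block here so the next stage starts only after this one is done.
        stage.wait_for_status()
        finished.append(stage)
    return finished


# Intended use (needs a configured ClearML server and an agent serving the queue):
#   from clearml import Task
#   run_stages_in_order(Task, ["<template-task-id-1>", "<template-task-id-2>"], "default")
```

Because each stage blocks on wait_for_status(), the whole pipeline lives in one script, at the cost of that script having to stay alive until the last stage ends.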
Hi AgitatedDove14 , I'm interested in this feature to run the agent and force it to install packages from requirements.txt. Is it available?
Hi AgitatedDove14, do you mean the k8s glue autoscaler here https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py ? If yes, I understood that this service deploys pods on the nodes in the cluster, but I'd prefer to have a new instance deployed for each new experiment, and to have it terminate when no new experiments are queued.
AgitatedDove14 that seems like the best option. Once the aws autoscaler is inside a docker container I can deploy it inside a kube pod or a job. This, however, requires that I slightly modify the clearml helm chart with the aws-autoscaler deployment, right?
Hi TimelyPenguin76, I used api_client.tasks.create and it works, thank you!
No, OK, now I think I got how to use it: "detect_with_pip_freeze" assumes that the instance remotely launching the clearml task already has all the packages installed with pip, and stores them in the "installed packages". After this, all the remote clearml-agents will install the packages listed in "installed packages". Correct?
After the agent finished installing the "requirements.txt" it will put back the entire "pip freeze" into the "installed packages", this means that later we will be able to fully reproduce the working environment, even if packages change (which will eventually happen as we cannot expect everyone to constantly freeze versions)
This would be perfect
As an example, in Task.create() there is the possibility to install packages using a requirements.txt, and if it is not specified, it uses the requirements.txt of the repository. I'd like something like that for Task.init() if possible.
If the "installed packages" contain all the packages installed from the requirements.txt, then I guess I can clone it and use the "installed packages".
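To illustrate what a "pip freeze"-style snapshot of an environment contains, here is a small stdlib sketch. It is not what ClearML runs internally; `freeze_snapshot` is a made-up helper that just lists the installed distributions as name==version lines, which is the kind of pinned list the "installed packages" section would hold.

```python
from importlib import metadata


def freeze_snapshot() -> list:
    """Return sorted `name==version` lines for the current environment."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that have no name recorded
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


snapshot = freeze_snapshot()
print(f"{len(snapshot)} packages pinned")
```

A loose requirements.txt can resolve to different versions over time, while a snapshot like this pins exactly what was installed when the task ran.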
I also removed 'sudo' from all the commands as is suggested in https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html but that wasn't the cause of the problem
Nice, I'll try also with the extra_bash_script, thank you!
Hi Sapir, no, that didn't solve the problem unfortunately. I SSHed into the machine (after removing shutdown so that it doesn't terminate) and in the log I saw the error: "clearml_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the ClearML API server http://apiserver:8008 ?"
So it is a credential problem
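A quick check that can be run on the instance itself is whether the host in the reported API server URL resolves at all from that machine. This is only a sketch with made-up helper names (`api_host_and_port`, `host_resolves`); it says nothing about whether the credentials are valid.

```python
import socket
from urllib.parse import urlsplit


def api_host_and_port(api_server_url: str):
    """Split an API server URL into (hostname, port)."""
    parts = urlsplit(api_server_url)
    return parts.hostname, parts.port


def host_resolves(api_server_url: str) -> bool:
    """Return True if the URL's hostname can be resolved from this machine."""
    host, _ = api_host_and_port(api_server_url)
    if not host:
        return False
    try:
        socket.getaddrinfo(host, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


print(api_host_and_port("http://apiserver:8008"))  # ('apiserver', 8008)
```

If host_resolves() returns False for the URL in the agent's log, the agent on that instance is likely reading a configuration that points at a hostname it cannot reach.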
Yes it does 👍 Btw, at the moment I added import(s3fs) in my entry point and it's working, thank you!
Does it work if I launch the clearml-agent on a docker and pip doesn't know the packages to install?
Please let me know if my explanation is not really clear