not sure if related, but clearml 1.14 tends not to "show" the gpu_type
Nice ! That is handy !!
thanks !
that format is correct as I can run pip install -r requirements.txt
using the exact same file
Sorry I missed your message: no, I don't know what happens when ES reaches its RAM limit. We self-host in Azure and use ES SaaS. Our cloud engineer manages that part.
My only experience was when I tried to spin up my local server from docker compose to test something, and it took my PC down because ES ate all my RAM !!
I don't have it, so I don't know how things are set up and how to pass on credentials in this case
how does it work if I create my pipeline from code ? Will the task get the git repo state when first run and use the commit hash and uncommitted changes as a "signature" ?
or which worker is in a queue ...
I understand that, from the agent's point of view, I just need to update the conf file to use the new credentials and the new server address.
so I guess it needs to be set inside the container
We need to focus first on why it is taking minutes to reach Using env.
In our case, we have a container that has all packages installed straight in the system, no venv in the container. Thus we don't use CLEARML_AGENT_SKIP_PIP_VENV_INSTALL
But then when a task is pulled, I can see all the steps like git clone, a bunch of Requirement already satisfied .... There may be some odd package that needs to be installed because one of our DS is experimenting ... But all that we can see what is...
we are not using docker compose. We are deploying in Azure with each database as a standalone service
Found a trick to get an empty "Installed packages" list: clearml.Task.force_requirements_env_freeze(force=True, requirements_file="/dev/null")
Not sure if this is the right way or not ...
will send the nginx -T results once the container is deployed
Actually, I can set agent.package_manager.pip_version="" in the clearml.conf
And after reading the doc 4 times, I can use the env var: CLEARML_AGENT__AGENT__PACKAGE_MANAGER__PIP_VERSION
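A minimal sketch of how the conf key above lines up with the env var name. The rule used here (uppercase the key, join the parts with double underscores, prefix with CLEARML_AGENT) is inferred only from the single example in this message, and the helper name conf_key_to_env_var is hypothetical, not part of clearml-agent:

```python
def conf_key_to_env_var(key: str, prefix: str = "CLEARML_AGENT") -> str:
    """Turn a dotted conf key into the double-underscore env var form.

    Assumption: the naming rule is inferred from one example only.
    """
    parts = [prefix] + key.upper().split(".")
    return "__".join(parts)


# Reproduces the env var name quoted in the message above.
print(conf_key_to_env_var("agent.package_manager.pip_version"))
```

This only builds the name; the value (an empty string in the case above) still has to be exported in the environment the agent runs in.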
I also use this: None
Which can give more control
Just a +1 here. When we use the same name for 3 different images, the thumbnails show 3 different images, but when clicking on any of them, only one is displayed. There is no way to display the others
what is the difference between vscode via clearml-session and vscode via remote ssh extension ?
in that case yes. What happens in docker mode is:
you run a clearml agent, which then receives a task
it creates a container
it installs another agent inside that container
it then runs that second agent inside the container
that second agent then pulls the task and does the usual build/install
CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=true needs to be set on that second agent somehow ...
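One possible way to get the variable to the second agent is to pass it into the container with the standard docker run flag -e NAME=value. The sketch below only builds that argument list. Where the list is handed to the outer agent (for example an extra docker arguments setting in clearml.conf) is an assumption to check against your agent version, and docker_env_args is a hypothetical helper:

```python
from typing import Dict, List


def docker_env_args(env: Dict[str, str]) -> List[str]:
    """Build `docker run` arguments that set env vars inside the container."""
    args: List[str] = []
    for name, value in env.items():
        # `-e NAME=value` is the standard docker run syntax for one env var.
        args += ["-e", f"{name}={value}"]
    return args


# The variable the second (in-container) agent needs to see.
print(docker_env_args({"CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL": "true"}))
```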
I also have the same issue. Default arguments are fine, but all arguments supplied on the command line become duplicated !
I really like how you make all this decoupled !! 🎉
I understand for clearml-agent
What I mean is that I have 2 self-deployed servers. I want to switch between the 2 configs when running the code locally, not inside the agent
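A sketch of one way to do that switch locally, assuming the SDK reads the CLEARML_CONFIG_FILE environment variable to locate its conf file. The file paths and the use_server helper are hypothetical; the variable has to be set before clearml is imported:

```python
import os

# Hypothetical conf files, one per self-deployed server.
SERVER_CONFIGS = {
    "server1": "~/clearml.conf.server1",
    "server2": "~/clearml.conf.server2",
}


def use_server(name: str) -> str:
    """Point the SDK at the conf file for the chosen server."""
    path = os.path.expanduser(SERVER_CONFIGS[name])
    # Assumption: clearml picks its config path from this variable.
    os.environ["CLEARML_CONFIG_FILE"] = path
    return path


use_server("server1")
# Import clearml and call Task.init(...) only after this point.
```

The same effect can be had without code by exporting the variable in the shell before launching each script.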
Found it: None
And credentials are set with:
sdk {
    azure.storage {
        containers: [
            {
                account_name: "account"
                account_key: "xxxx"
                container_name: "clearml"
            }
        ]
    }
}
@CostlyOstrich36 I would like to point to Azure blob storage; what kind of URL scheme should I use ? And also, where do you configure the credentials for the ClearML server to access Azure blob as file_server ? I couldn't find any documentation on this topic 😞
TIA
can you make train1.py use clearml.conf.server1 and train2.py use clearml.conf2 ?? In which case I would be interested @SuccessfulKoala55
Just keep in mind that your bottleneck will be the transfer rate. So mounting will not save you anything, as you still need to transfer the whole dataset to your GPU instance sooner or later.
One solution is as Jake suggests. The other is to pre-download the data to your instance using a cheap CPU-only instance type, then restart the instance with a GPU.
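To put rough numbers on the transfer-rate point, here is a back-of-the-envelope sketch. The dataset size, transfer rate and hourly prices are made-up placeholders, not measurements from any real setup:

```python
def transfer_seconds(dataset_gb: float, rate_mb_per_s: float) -> float:
    """Time to move the dataset at a sustained rate (1 GB taken as 1000 MB)."""
    return dataset_gb * 1000 / rate_mb_per_s


# Hypothetical values: 500 GB dataset, 100 MB/s sustained download.
hours = transfer_seconds(500, 100) / 3600

# Hypothetical hourly prices for the two instance types.
gpu_price, cpu_price = 3.0, 0.2
print(f"download time: {hours:.2f} h")
print(f"cost if downloaded on the GPU instance: {hours * gpu_price:.2f}")
print(f"cost if pre-downloaded on the CPU instance: {hours * cpu_price:.2f}")
```

The download takes the same time either way; the pre-download only changes which instance type is being paid for while it happens.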