Reputation
Badges 1
25 × Eureka!Hi FlutteringWorm14
Is there some way to limit that?
What do you mean by that? are you referring to the Free tier ?
Hi CheerfulGorilla72
see
Notice all posts on that channel are @ channel 🙂
I just set the git credentials in the
clearml.conf
and it works out of the box
git has issues with passing the user/token from the main repo to the submodules, hence my surprise that it is working out-of-the-box.
Do notice that if you are ussing ssh-key this is a none issue.
Nope, no
.netrc
defined anywhere, ...
If this is the case can you try to add the following to your "extra_vm_bash_script"
` echo machine example.com > ~/.netrc && echo log...
Yes, just set system_site_packages: true
in your clearml.conf
https://github.com/allegroai/clearml-agent/blob/d9b9b4984bb8a83914d0ec6d53c86c68bb847ef8/docs/clearml.conf#L57
restart the notebook kernel ?
JitteryCoyote63 I think this only holds for the conda distribution.
(Actually quite interesting, I wonder what happens if you already installed cudatoolkit...)
Now I'm curious what's the workaround ?
ok, but this happens in my local machine, not in the agent
resource monitoring is always running in the background, even on local machines. (of course you can turn it off)
that is odd..
So if you have 3 agents, how many concurrent experiment are they running ? (actually running, not registered as running)
What's the matplotlib version ? and python version?
JitteryCoyote63
Picks a new experiment on top of the long one running
This is very very strange. Is the long running experiment being logged (i.e. do you still see console output in the UI)?
Yup, I just wanted to mark it completed, honestly. But then when I run it, Colab crashes.
task.close()
will do that
BTW what's the exception you are getting ?
Hi PleasantGiraffe85
Did you set git_host
to only point to your host ? do you expect all the git clones to use SSH? how does the requirements.txt git link looks like ?
https://github.com/allegroai/clearml-agent/blob/bf07b7f76d3236c1118b81730c6d9718705a795a/docs/clearml.conf#L22
NastyOtter17 can you provide some more info ?
I have a question regarding running the code on the remote machine, each time I run the code I see the console in the ClearML server start downloading all the libraries I used in the code and when I run another code the same thing happens so why it has to download all the libraries again and many times?
I'm assuming you are referring to the installation, the downloaded python packages are cached.
You can turn on full caching by uncommenting the following line:
https://github.com/alleg...
how to put or handle this configuration and where?
In your clearml.conf on the machine with the agent just add at the bottom of the file agent.venvs_cache.path=~/.clearml/venvs-cache
okay, let me check it, but I suspect the issue is running over SSH, to overcome these issues with pycharm we have specific plugin to pass the git info to the remote machine. Let me check what we can do here.
FiercePenguin76 BTW, you can do the following to add / update packages on the remote sessionclearml-session --packages "newpackge>x.y" "jupyterlab>6"
Yeah the ultimate goal I'm trying to achieve is to flexibly running tasks for example before running, could have a claim saying how many resources I can and the agent will run as soon as it find there are enough resources
Checkout Task.execute_remotely()
you can push it anywhere in your code, when execution get to it, If you are running without an agent it will stop the process and re-enqueue it to be executed remotely, on the remote machine the call itself becomes a noop,
I...
EnviousPanda91 notice that when passing these arguments to clearml-agent you are actually passing default args, if you want an additional argument to Always be used, set the extra_docker_arguments
here:
https://github.com/allegroai/clearml-agent/blob/9eee213683252cd0bd19aae3f9b2c65939d75ac3/docs/clearml.conf#L170
I think we should open a GitHub Issue and get some more feedback, maybe we should just add support in the backend side ?
Hi IrritableGiraffe81
I have a package called
feast[redis]
in my requirements.txt file.
This means feast is installing additional packages, once the agent is done installing everything, it basically calls pipe freeze and stores back All the packages including versions
Now the question is, how come redis is not installed.
Notice that the Task already has the autodetected packages (it basically ignores requirem,ents.txt as it is often not full missing or just wrong)
...
Hi PompousBeetle71
Try this one, let me know if it helpedlogging.getLogger('trains.frameworks').setLevel(ERROR)
Hi @<1715175986749771776:profile|FuzzySeaanemone21>
and then run "clearml-agent daemon --gpus 0 --queue gcp-l4" to start the worker.
I'm assuming the docker service cannot spin a container with GPU access, usually this means you are missing the nvidia docker runtime component
Or can I enable agent in this kind of local mode?
You just built a local agent
NastyFox63 ask SuccessfulKoala55 tomorrow, I think there is a way to change the default settings even with the current version.
(I.e. increase the default 100 entries limit)
What are you seeing in the Task that was cloned (i.e. the one the HPO created not the original training task)?
by that I mean, configuration section, do you have the Args there ? (seems like the pic you attached, but I just want to make sure)
Also in the train.py file, do you also have Task.init ?