okay, let me know if it works
Hi ShallowArcticwolf27
Does the clearml-task CLI command currently support remote repositories that are intended to be used with SSH?
It does 🙂
but with the git@ prefix used for GitLab's SSH, it seems to default to looking for the repository locally
git@ is always the prefix for SSH repositories (it does not actually mean SSH is used; it's what git returns when asked for the origin of the repository). The agent knows (if SSH credentials ...
Seems like settings on the clearml-server disappeared (specifically default queue tag?!)
Do you have Python 3.7 in the Docker image?
Hi @<1571308003204796416:profile|HollowPeacock58>
I'm assuming this is the ARM support (i.e., you are running on a new Mac) fix we released in one of the last clearml-agent versions. Could you update to the latest clearml-agent?
pip3 install clearml-agent==1.6.0rc2
What's the OS / Python version?
RoundMosquito25 are you using clearml-agent daemon --stop, or are you killing them?
Killing them basically means you lose them in the UI when they time out; the backend does not see them for 10 min, so it assumes they died. When you call clearml-agent daemon --stop they will unregister themselves and disappear immediately.
Hi StickyWhale51
I think this issue is due to some internal race condition. Anyhow, I think we have an RC out solving it, can you try with: pip install clearml==1.2.0rc2
Hi CooperativeFox72
But my docker image has all my code and all the packages it needs, I don't understand why the agent needs to install all of those again?
So based on the Dockerfile you previously posted, I think all your Python packages are actually installed for the "appuser" and not as system packages.
Basically remove the "add user" part and the --user flag from the pip install.
For example:
```
FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
# install everything as system packages: no "add user" step, no --user in pip install
RUN ...
```
Hi SucculentBeetle7
The parameters passed to add_step need to contain the section name (maybe we should warn if it is not there, I'll see if we can add it).
So maybe something like: {'Args/param1': 1}
or: {'General/param1': 1}
Can you verify it solves the issue?
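To make the shape of the call concrete, here is a minimal sketch (the project/task names and the parameter are placeholders, not from this thread):
```python
from clearml.automation import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0")
pipe.add_step(
    name="train",
    base_task_project="examples",    # placeholder project
    base_task_name="training task",  # placeholder task name
    # every key needs its section prefix, e.g. "Args/" or "General/"
    parameter_override={"Args/param1": 1},
)
pipe.start()
```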
Yes I do have a GOOGLE_APPLICATION_CREDENTIALS environment variable set, but nowhere do we save anything to GCS. The only usage is in the code which reads from BigQuery
Are you certain you have no artifacts on GS?
Are you saying that if GOOGLE_APPLICATION_CREDENTIALS is set and clearml.conf contains no "project" section, it crashes when starting?
Task.force_requirements_env_freeze()
This might be very brittle if users are running on a different OS or Python version...
I would actually go with:
- if you like poetry, update your lock file in git
- if you do not use poetry, work on your own branch and delete the poetry lock file
wdyt?
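For reference, the freeze call discussed above looks roughly like this (project/task names are placeholders; as far as I know it must be called before Task.init()):
```python
from clearml import Task

# capture the exact environment via pip freeze instead of analyzing imports;
# must be called before Task.init()
Task.force_requirements_env_freeze()
task = Task.init(project_name="examples", task_name="frozen-env-run")
```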
I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.
UnevenDolphin73 you mean, as in getting the Task object from it?
(This might be doable, the main issue would be the metrics / logs loading)
What would be the use case for the testing?
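For reference, a rough sketch of the current offline flow as I understand it (names and the zip path are illustrative):
```python
from clearml import Task

# record everything locally; nothing is sent to the backend
Task.set_offline(offline_mode=True)
task = Task.init(project_name="examples", task_name="offline-run")
# ... run the experiment ...
task.close()

# loading the session currently means importing it into the backend:
# Task.import_offline_session("~/.clearml/cache/offline/<session-id>.zip")
```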
Hi @<1523702786867335168:profile|AdventurousButterfly15>
I am running cross_validation, training a bunch of models in a loop like this:
Use the wildcard or disable it altogether:
task = Task.init(..., auto_connect_frameworks={"joblib": False})
You can also do
task = Task.init(..., auto_connect_frameworks={"joblib": ["realmodelonly.pkl", ]})
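Putting it together for the cross-validation loop, a hedged sketch (the filenames and dummy models are placeholders):
```python
import joblib
from clearml import Task

# only joblib.dump() calls whose filename matches the wildcard list get logged
task = Task.init(
    project_name="examples",
    task_name="cross-validation",
    auto_connect_frameworks={"joblib": ["realmodelonly.pkl"]},
)

for fold in range(5):
    model = {"fold": fold}                  # placeholder for a trained model
    joblib.dump(model, f"fold_{fold}.pkl")  # skipped: no wildcard match
joblib.dump({"best": True}, "realmodelonly.pkl")  # captured by ClearML
```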
WackyRabbit7 I guess we are discussing this one on a diff thread 🙂 but yes, should totally work, that's the idea
from task pick-up to "git clone" is now ~30s, much better.
This is "spent" calling apt update && update install && pip install clearml-agent
if you have those preinstalled it should be quick
though as far as I understand, the recommendation is still to not run workers-in-docker like this:
if you do not want it to install anything and just use the existing venv (leaving the venv as is), and if something is missing then so be it, then yes, sure, that's the way to go
Hi CleanWhale17 let me see if I can address them all
Email Alert for finished Job (I'm not sure if it's already there).
Slack integration will be public by the end of the weekend 🙂
It is fully customizable / extendable, I'll be happy to help.
DVC
Full dataset tracking is supported using the artifacts and the ability to integrate to any central storage (shared folders/ S3 / GS / Azure etc.)
From my experience, it is easier to work with artifacts from Data-Processing Tasks...
... grab the model artifacts for each, put them into the parent HPO model as its artifacts, and then go through and archive everything.
Nice. Wouldn't it make more sense to "store" a link to the "winning" experiment? So you know how to reproduce it, and the set of HPs that were chosen?
Not that the model is bad, but how would I know how to reproduce it, or retrain when I have more data, etc.?
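A rough sketch of what "storing a link" could look like from the HPO parent task (best_task_id and the artifact name are hypothetical; this is not an official pattern):
```python
from clearml import Task

best_task_id = "..."  # the task id your HPO loop selected as the winner

hpo_task = Task.current_task()
best = Task.get_task(task_id=best_task_id)

# keep a pointer to the winning experiment and the hyperparameters it used
hpo_task.set_parameter("best/task_id", best.id)
hpo_task.upload_artifact(
    "winning_experiment", {"id": best.id, "params": best.get_parameters()}
)
```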
Hi ConvolutedSealion94
You can archive / delete the SERVING-CONTROL-PLANE Task from the DevOps project in the UI.
Do notice you will need to make sure the clearml-serving is updated with a new session ID, or remove it (i.e. take down the pods / docker-compose)
Make sense?
Were you able to interact with the service that was spun up? (how was it spun up?)
What's the trains-server version?
You can see it if you go to the profile page
and since the update the docs seem to be a bit off, but I think I got it
Working on a whole new site 🙂
sorry, the point where you select the interpreter for PyCharm
Oh I see...
CheerfulGorilla72 sounds like a great idea, I'll pass along to documentation ppl 🙂
Hi SoreHorse95
I am exploring hiding our clearml server behind
Do you mean add additional reverse proxy to authenticate clearml-server from outside ?
Hi @<1637624975324090368:profile|ElatedBat21>
I think that what you want is:
Task.add_requirements("unsloth", "@ git+...")
task = Task.init(...)
after you do that, what are you seeing in the Task's "Installed Packages"?
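For clarity, a minimal sketch of the intended ordering (the git URL is a placeholder for the one elided above):
```python
from clearml import Task

# add_requirements() must be called before Task.init()
Task.add_requirements("unsloth", "@ git+https://github.com/...")  # placeholder URL
task = Task.init(project_name="examples", task_name="unsloth-requirements")
# then check the task's "Installed Packages" in the UI
```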
Thanks for checking @<1545216070686609408:profile|EnthusiasticCow4> stable release will be out soon
. I was just wondering if instead of using local subprocesses, several agents could serve the same purpose (running several pipelines concurrently)
wouldn't --service-mode (read as multiple simultaneous Tasks on the same agent) solve the issue?
(BTW: if you set the pipeline component target queue to "services" , this is exactly what will happen)
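As a hedged sketch of that last point (the step function and queue name are placeholders, assuming a "services" queue exists on your server):
```python
from clearml.automation import PipelineController

def double(x: int) -> int:
    # trivial placeholder step
    return x * 2

pipe = PipelineController(name="concurrent-pipe", project="examples", version="1.0")
pipe.add_function_step(
    name="double",
    function=double,
    function_kwargs={"x": 1},
    execution_queue="services",  # components run as multiple tasks on one agent
)
pipe.start(queue="services")
```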
This should have worked with the latest clearml RC.
And you verified it is not working?