Could it be someone deleted the file? This is inside the temp venv folder, but it should not get there
I would like to force the usage of those requirements when running any script
How would you force it? Will you just ignore the "Installed Packages" section ?
RoundMosquito25 how is that possible? Could it be they are connected to a different server?
Can you see that the environment is actually being passed ?
A definite maybe, they may or may not be used, but we'd like to keep that option
The precursor to the question is the idea of storing local files as "input artifacts" on the Task, which means that if the Task is cloned the links go with it. Let's assume for a second this is the case, how would you upload these artifacts in the first place?
Hmm, maybe the right way to do so is to abuse "models", which have an entity: you can specify a system_tag on them, they can store a folder (and extract it if you need), they live in projects, and they are cloned and can be changed.
wdyt?
For now we've monkey-patched it to our use case:
LOL, that's a cool hack
That gives us the benefit of creating "local datasets" (confined to the scope of the project; they do not appear in the Datasets tab, but appear as normal tasks within the project)
So what would be a "perfect" solution here?
I think I'm missing the point on why it became an issue in the first place.
Notice that in new versions Dataset will be registered on the Tasks that use them (they are already...
Why does ClearML hide the dataset task from the main WebUI?
Basically you have the details from the Dataset page, why should it be mixed with the others ?
If I specified a project for the dataset, I specifically want it there, in that project, not hidden away in some
.datasets
hidden sub-project.
This may be a request for a "Dataset" tab under the project, but why you would need the Dataset Task itself is the main question.
Not all dataset objects are equal, and perhap...
Can you see all the agents in the UI? (That basically means they are configured correctly and can connect to the server.)
It looks like the tag being used is hardcoded to 1.24-18. Was this issue identified and fixed in later versions?
BoredHedgehog47 what do you mean by "hardcoded 1.24-18"? A tag of what? I think I lost context here
JuicyFox94 maybe you can help here?
It may have been killed or evicted or something after a day or 2.
Actually the ideal setup is to have a "services" pod running all these services in a single pod, with clearml-agent --services-mode. This Pod should always be on and pull jobs from a dedicated queue.
Maybe a nice way to do that is to have the single Task serialize itself, then have a Pod run the Task every X hours and spin it down
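The "run every X hours" part could be sketched as a plain scheduling loop (a hypothetical sketch; in practice a Kubernetes CronJob or the Pod's restart policy would normally drive the timing, and `run_serialized_task` is a placeholder, not a clearml API):

```python
import time

def run_serialized_task() -> str:
    # placeholder for: load the serialized Task and execute it
    return "completed"

def periodic_runner(interval_hours: float, max_runs: int, sleep=time.sleep):
    """Run the task every `interval_hours`, for `max_runs` iterations."""
    results = []
    for i in range(max_runs):
        results.append(run_serialized_task())
        if i < max_runs - 1:  # no need to sleep after the last run
            sleep(interval_hours * 3600)
    return results
```

Injecting `sleep` as a parameter keeps the loop testable without actually waiting.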
So I would like to know what it sends to the server to create the task/pipeline, ...
TenseOstrich47
I noticed that with one agent, only one task gets executed at one time
Yes you can 🙂
Also, you are correct: a single agent will run a single Task at a time. That said, you can have multiple agents running on the same machine, and when you launch them you specify which GPUs they use (in theory they can share the same GPU, but your code might not like it 🙂)
You can see a few examples here:
https://github.com/allegroai/clearml-agent#running-the-clearml-agent
Yes! Thanks so much for the quick turnaround
My pleasure 🙂
BTW: did you see this (it seems like the same bug?!)
https://github.com/allegroai/clearml-helm-charts/blob/0871e7383130411694482468c228c987b0f47753/charts/clearml-agent/templates/agentk8sglue-configmap.yaml#L14
Then, when run a second time, the task will contain the requirements of the (conda) environment from the first run.
What you see in the log under "Summary - installed python packages:" will be exactly what is updated on the Task. But it does not contain the "ruamel_yaml_conda" package; this is what I cannot get...
But I did find this part: ERROR: conda 4.10.1 requires ruamel_yaml_conda>=0.11.14, which is not installed.
Which points to conda needing this package and then failing to i...
Hi UnevenDolphin73
If you "remove" the lock file the agent will default to pip.
You can hack it with the uncommitted changes section?
Hi TrickyRaccoon92
If you are reporting to tensor-board, then "iteration" equals step. Is this the case?
ReassuredTiger98 I guess this is a plotly feature; nonetheless I think you can shift the Y axis manually (click and drag)
it does appear on the task in the UI, just somehow not repopulated in the remote run if it's not a part of the default empty dict…
Hmm that is the odd thing... what's the missing field ? Could it be that it is failing to Cast to a specific type because the default value is missing?
(also, is issue present in the latest clearml RC? It seems like a task.connect issue)
RoughTiger69
Apparently, it doesn't populate that dict with any keys that don't already exist in it.
Are you saying new entries are not added to the dict even if they are on the Task (i.e. only entries that already exist in the dict are populated)?
But you already have all the entries defined here:
https://github.com/allegroai/clearml/blob/721569bb77d89d89e5b4f32a0ed98311c4574650/examples/services/aws-autoscaler/aws_autoscaler.py#L22
Since all this is ha...
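The reported behavior can be illustrated with a plain-Python sketch of a merge that only populates pre-existing keys (hypothetical illustration of the symptom; this is not clearml's actual `connect` implementation):

```python
def connect_like_merge(local: dict, remote: dict) -> dict:
    # Reported behavior: keys that already exist locally are overwritten
    # from the remote Task; keys that exist only remotely are dropped.
    return {k: remote.get(k, v) for k, v in local.items()}

local = {"lr": 0.1}
remote = {"lr": 0.01, "batch_size": 32}
print(connect_like_merge(local, remote))  # {'lr': 0.01} -- batch_size is lost
```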
load_model will get a link to a previously registered URL (i.e. it searches for a model pointing to the specific URL; if it finds one, it will return the Model object)
BTW: I think it was fixed in the latest trains package as well as the clearml package
Hi PanickyMoth78
So do not tell anyone, but the next version will have reports built into clearml, as well as the ability to embed graphs in 3rd party tools (think Notion, GitHub, markdown etc.)
Until then (ETA mid Dec), the easiest is to download an image or just use the url (it encodes the full view, so when someone clicks on it they get the exact view you are seeing)
but it fails during env setup due to trying to install an obscure version of pytorch. Been trying to solve this for three days!
AdventurousButterfly15 it tries to resolve the correct pytorch version based on the CUDA version inside the container
ERROR: torch-1.12.1+cu116-cp310-cp310-linux_x86_64.whl is not a supported wheel on this platform.
seems like it is trying to install pytorch for Python 3.10 with CUDA 11.6 support; this seems reasonable, no?
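The "not a supported wheel on this platform" error means the wheel's compatibility tags (cp310 / linux_x86_64 here) don't match the installing interpreter. A minimal check of just the python tag (an illustrative sketch of the idea, not what pip actually does internally):

```python
import sys

def wheel_python_tag_matches(wheel_py_tag: str) -> bool:
    """Check whether a wheel's python tag (e.g. 'cp310') matches this interpreter."""
    mine = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return wheel_py_tag == mine

# e.g. under Python 3.9 this returns False for a cp310 wheel --
# exactly the kind of mismatch pip reports with this error.
```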
AdventurousButterfly15
Despite having manually installed this torch version, during task execution agent still tries to install it somehow and fails:
Are you running the agent in venv mode? or docker mode?
Notice that in docker mode it inherits the python packages from the container and adds/reinstalls missing packages. In venv mode it creates a new clean venv (there is no way to inherit a venv; a venv can only inherit from system-wide installed packages)
The idea is that you cannot e...
Nice!
script, and kwcoco is not imported directly (but from within another package).
fyi: usually the assumption is that clearml will only list the directly imported packages, as these will pull in the respective required packages when the agent installs them ... (meaning that if in the repository you never actually directly import kwcoco, it will not be listed; the package that you do import directly, the one you mentioned that imports kwcoco, will be listed). I hope this ...
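The idea of listing only directly imported packages can be sketched with the stdlib `ast` module (a simplified illustration of the concept, not clearml's actual detection code):

```python
import ast

def direct_imports(source: str) -> set:
    """Return the top-level module names a script imports directly."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found

script = "import numpy as np\nfrom torch import nn\n"
# kwcoco would not appear here unless the script imported it directly
print(direct_imports(script))
```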
okay, just so I understand, this is what you have on your client that can connect with the server:
api {
    api_server:
    web_server:
    files_server:
    credentials {"access_key": "KEY", "secret_key": "SECRET"}
}
This is odd, I was running the example code from:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
It is stored inside a repo, but the steps that are created (i.e. checking the Task that is created) do not have any repo linked to them.
What's the difference ?
Hi GloriousPenguin2
Had to do some linux updates and redeploy the clearml server; now I can access the web UI & the service only if I do port-forwarding to that remote machine
So you are saying that before you were able to browse directly to the server, but now you need a "jump box"?