Hmm, I still wonder what the "correct" answer is for most people. Is an empty string in argparse redundant anyway? Will anyone ever use it?
Hmm, it might be the sub-sampling on large scalar plots (so that we do not "kill" the UI), but I remember that it only happens above 50k samples. (When you zoom in, do you still get the 0.5 values?)
SmarmySeaurchin8 regarding (2)
I'm not sure the current visualization supports it. I mean we could put "{}", but that would imply you can edit it, which we would then have to support. Possible, but weird, and this is why: task.connect({'a': {}, 'b': {'nested': 'value'}})
will become
'a' = '{}'
'b/nested' = 'value'
But then if you edit it to: 'a' = "{'nested': 'value'}"
'b/nested' = 'value'
you have two different ways of presenting the same type of structure...
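For reference, a minimal sketch of the call being discussed (project/task names below are placeholders):
```python
from clearml import Task

# Minimal sketch of the flattening described above; names are placeholders.
task = Task.init(project_name="examples", task_name="nested-dict-connect")
config = {'a': {}, 'b': {'nested': 'value'}}
task.connect(config)
# In the UI the configuration shows up flattened, roughly:
#   a        = '{}'
#   b/nested = 'value'
```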
When you install using pip <filename> you should end up with something like: minerva @ file://... or minerva @ https://...
Hi UnevenOstrich23
if --docker is enabled, does that mean every new experiment will be executed in a dedicated agent worker container?
Correct
I think the missing part is how to specify the docker for the experiment?
If this is the case, in the web UI, clone your experiment (which will create a draft copy, that you can edit), then in the Execution tab, scroll down to the "base docker image" and specify the docker image to use.
Notice that you can also add flags after the docker im...
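(Side note, a hedged sketch assuming a recent clearml SDK: the base docker image can also be set from code before cloning/enqueueing, via Task.set_base_docker; the image name and arguments below are placeholders.)
```python
from clearml import Task

# Sketch: set the base docker image (and optional docker arguments) from code.
# Image name and arguments are placeholders, not a recommendation.
task = Task.init(project_name="examples", task_name="docker-experiment")
task.set_base_docker(
    docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
    docker_arguments="--ipc=host",
)
```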
Ohh then you do docker sibling:
Basically you map the docker socket into the agent's docker container; that lets the agent launch another docker (a sibling container) on the host machine.
You can see an example here:
https://github.com/allegroai/clearml-server/blob/6434f1028e6e7fd2479b22fe553f7bca3f8a716f/docker/docker-compose.yml#L144
what if the preexisting venv is just the system python? my base image is python:3.10.10 and i just pip install all requirements in that image. Does that not avoid venv still?
it will basically create a new venv inside the container, forking the existing preinstalled stuff (i.e. the new venv already has everything the system python has preinstalled)
then it will call "pip install" on all the "installed packages" of the Task.
Which should just check everything is there and install nothing...
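If it helps, a rough illustration of the idea (not the agent's actual code), using the stdlib venv module to create an environment that inherits the system site-packages:
```python
import venv

# Create a venv that inherits the packages already installed in the container's
# system python, so a later "pip install" on the Task's requirements mostly finds
# everything already satisfied and installs nothing.
builder = venv.EnvBuilder(system_site_packages=True, with_pip=True)
builder.create("/tmp/task_venv")
```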
Actually we just added venv support as well. The reasoning is/was that inside a docker it is easier to separate the running processes; with venv we had to support multiple venvs running at the same time and reuse of those venvs (just a bit more logic). Anyhow, this is now supported :)
Q. Would someone mind outlining what the steps are to configuring the default storage locations, such that any artefacts or data which are pushed to the server are stored by default on the Azure Blob Store?
Hi VivaciousPenguin66
See my reply here on configuring the default output uri on the agent: https://clearml.slack.com/archives/CTK20V944/p1621603564139700?thread_ts=1621600028.135500&cid=CTK20V944
Regarding permission setup:
You need to make sure you have the Azure blob credenti...
I suppose the same would need to be done for any client PC running clearml such that you are submitting dataset upload jobs?
Correct
That is, the dataset is perhaps local to my laptop, or on a development VM that is not in the clearml system, but from there I want to submit a copy of a dataset, so I would need to configure the storage section in the same way as well?
Correct
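A minimal client-side sketch, assuming the Azure credentials are already set in clearml.conf (the account/container names below are placeholders):
```python
from clearml import Task

# Sketch: point this task's artifacts/models at an Azure blob container by default.
# Credentials come from clearml.conf (sdk.azure.storage.containers).
task = Task.init(
    project_name="examples",
    task_name="azure-default-storage",
    output_uri="azure://<account>.blob.core.windows.net/<container>/artifacts",
)
```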
I am just about to move house, which is stressful enough without a global pandemic(!), so until that's completed I won't commit to anything.
Sure man 🙂 no rush, I appreciate the gesture regardless of the outcome
Many thanks!
Sounds good.
BTW, when the clearml-agent is set to use "conda" as package manager it will automatically install the correct cudatoolkit in any new venv it creates. The cudatoolkit version is picked directly when "developing" the code, assuming you have conda installed as the development environment (basically you can transparently do end-to-end conda, and not worry about CUDA at all)
Yes, you are too quick for the resource monitoring 🙂
Hi @<1716987933514272768:profile|SuccessfulPuppy43>
How to make remote ClearML agent do
pip install -e .
In theory there is no need to do that; clearml-agent adds the repo root folder to the python path.
If you insist on actually installing it, try adding to your "installed packages" section a "requirements.txt"-compatible line:
-e .
Is it vanilla PyTorch?
@<1689446563463565312:profile|SmallTurkey79> could you attach the full log of the Task?
also I would recommend "export CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1" (not "true"; usually binary env vars are 0/1)
(I can see that the docs here: None never mention it, I'll ask them to add that)
of what task? i'm running lots of them and benchmarking
If you are skipping every installation it should be the same
because if you set CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
it will not install anything at all
This is why it's odd to me...
wdyt?
HighOtter69
Could you test with the latest RC? I think this fixed it:
https://github.com/allegroai/clearml/issues/306
From code? Or the CLI?
In both cases the dataset needs to upload the parent version somewhere; Azure blob is supported.
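A minimal sketch of the from-code path, assuming Azure credentials are configured in clearml.conf (dataset names and the azure:// path are placeholders):
```python
from clearml import Dataset

# Sketch: create a dataset version and upload its contents to Azure blob storage.
ds = Dataset.create(dataset_name="my-dataset", dataset_project="examples")
ds.add_files("/path/to/local/data")
ds.upload(output_url="azure://<account>.blob.core.windows.net/<container>/datasets")
ds.finalize()
```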
I find it quite difficult to explain these ideas succinctly, did I make any sense to you?
Yep, I think we are totally on the same wavelength 🙂
However, it also seems to be not too prescriptive,
One last question, what do you mean by that?
(also im a bit newer to this world, whats wrong with openshift?)
It's the most difficult Kubernetes flavor to work with 🙂
we've already tried that but it didn't really change ...
Can you provide the full log, as well as how you created the pods?
No worries, I'll see what I can do 🙂
Oh that makes sense. This depends on how you set up the clearml k8s glue (because the resource allocation is done by k8s). A good hack to limit the number of containers per GPU is to set a RAM limitation per pod; then k8s will know to limit the number of pods on the same GPU machine.
wdyt?
SarcasticSquirrel56
if I configure manually the pods for the different nodes, how do I make clearml server aware that those agents exist?
Basically the agents register themselves on your clearml-server, and they register which Queue(s) they listen to. In other words, the interface for choosing the different types of machines/gpus is enqueuing the Task to different queues.
For example: Queue(1): "CUDA11_GPUx1" , Queue(2): "CUDA10_GPUx1"
Make sense?
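A minimal sketch of what enqueuing to a specific queue looks like from code (the queue name here is just the hypothetical example from above):
```python
from clearml import Task

# Sketch: send this task to the queue that the CUDA 11, single-GPU agents listen to.
task = Task.init(project_name="examples", task_name="train-model")
task.execute_remotely(queue_name="CUDA11_GPUx1", exit_process=True)
```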
EDIT:
I guess to achieve what I w...
We actually plan to create different queues for different types of workloads, we are a bit seeing what the actual usage is to define what type of workloads make sense for us.
That sounds like a great path to take; it will make it very clear for users what they will be getting and why they should use a specific queue.
As for the memory, yes the reasoning is clear, the main thing we'll have to see is how to define the limits, because we have nodes with quite different resources availab...
That is odd ...
Could you open a GitHub issue?
Is this on any upload? How do I reproduce it?
Hi ThickDove42 ,
Yes, but by the time you can access it, it will be in display form (plotly), which is not very convenient.
If this is something you need to re-use, I would argue that it is an artifact and should be stored as an artifact (then accessing it is transparent). Obviously you can both report it as a table and upload it as an artifact, no harm in that.
what do you think?
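A minimal sketch of doing both, with placeholder names/values:
```python
import pandas as pd
from clearml import Task

# Sketch: report the same DataFrame as a table (for display in the UI) and also
# upload it as an artifact (for easy programmatic re-use later).
task = Task.init(project_name="examples", task_name="table-and-artifact")
df = pd.DataFrame({"epoch": [1, 2, 3], "accuracy": [0.71, 0.83, 0.88]})

task.get_logger().report_table(title="results", series="accuracy", iteration=0, table_plot=df)
task.upload_artifact(name="results_table", artifact_object=df)
```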