Okay, this is a bit hacky, but it will work:
@PipelineDecorator.component(...)
def step(...):
    import sys
    import os
    # make the local package importable inside the component
    sys.path.append(os.path.join(os.path.abspath(os.path.dirname(__file__)), "projects", "main"))
    from file import something
- In a notebook, create a method and decorate it with fastai.script's @call_parse, for example:
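A minimal sketch, assuming fastai v1's fastai.script module (newer fastai versions expose the same decorator from fastcore.script); the function and parameter names are just placeholders:

from fastai.script import call_parse, Param

@call_parse
def main(
    epochs: Param("number of training epochs", int) = 5,
    lr: Param("learning rate", float) = 1e-3,
):
    # the command-line parser is generated from the signature,
    # so the same function also works when run as a script
    print(epochs, lr)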
Any chance you have a very simple code/notebook to reference (this will really help in fixing the issue)?
Oh I see, yes, the "metrics" include both scalars / plots & console outputs.
I also think they are updated only once a day (or maybe twice a day?), so even if you delete them it will take some time for the numbers to update.
(Archiving is not deleting; you then need to go to the archived view and delete it from there.)
Docker cmd is basically the docker image name, but you can add parameters as well.
For example "nvidia/cuda" or "nvidia/cuda -v /mnt/data:/mnt/data"
Maybe permissions?!
you can test it manually by installing pynvml
and running:
from pynvml.smi import nvidia_smi
nvsmi = nvidia_smi.getInstance()
nvsmi.DeviceQuery('memory.free, memory.total')
Hmm, I think the issue is here (the docker command mount):
'-v', '/tmp/.clearml_agent.de0n48pm.cfg:/root/clearml.conf'
@<1523701868901961728:profile|ReassuredTiger98> what are you getting with:
nvidia-smi
And here:
ls -la /usr/local/
Hi SubstantialElk6
No need for that, you can use the helm chart (or spin them up once with kubectl), then they take care of scheduling by themselves.
You can also use the k8s glue (basically spinning kubernetes pods automatically for you, based on the Tasks that you push into the ClearML queue)
https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py
In short, two possible deployments:
Static k8s pod running the agent (then the agent runs all the experiments inside t...
GreasyPenguin14 what's the clearml version you are using, OS & Python?
Notice this happens on the "connect_configuration" that seems to be called after the Task was closed, could that be the case?
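For reference, a minimal sketch of the ordering I would expect (project and configuration names here are placeholders), assuming the warning comes from connecting the configuration after the Task is closed:

from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")

# connect the configuration while the Task is still open
config = task.connect_configuration({"batch_size": 32, "lr": 0.001}, name="my_config")

# ... do the actual work ...

task.close()
# calling task.connect_configuration() after close() is the pattern that would trigger this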
Meaning if I create a sleep endpoint that is async
Hmm, are you calling "sleep" or "asyncio.sleep"?
Also, are you running the serving service with Gunicorn or Uvicorn?
see here:
None
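To illustrate the difference, a minimal FastAPI-style sketch (endpoint names are made up): time.sleep() blocks the event loop even inside an async endpoint, while await asyncio.sleep() yields control so other requests keep being served:

import asyncio
import time

from fastapi import FastAPI

app = FastAPI()

@app.get("/sleep_blocking")
async def sleep_blocking():
    # blocks the whole event loop for 5 seconds
    time.sleep(5)
    return {"done": True}

@app.get("/sleep_async")
async def sleep_async():
    # yields control while sleeping, other requests are handled
    await asyncio.sleep(5)
    return {"done": True}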
Hi @<1523704207914307584:profile|ObedientToad56>
What would be the right way to extend this with, let's say, a custom engine that is currently not supported?
as you said, 'custom' 🙂
None
This is actually a custom engine (see (3) in the readme, and the preprocessing.py implementing it). I think we should actually add a specific example for custom engines so this is more visible. Any thoughts on what would...
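If it helps, a rough skeleton of what such a preprocessing.py might look like; this is only my sketch of the custom example's structure, so the method names and signatures should be double-checked against the clearml-serving repo:

from typing import Any

# hypothetical skeleton of a custom-engine preprocessing.py
class Preprocess(object):
    def load(self, local_file_name: str) -> Any:
        # load and return the model object from the downloaded file
        ...

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn) -> Any:
        # convert the request body into model input
        return body

    def process(self, data: Any, state: dict, collect_custom_statistics_fn) -> Any:
        # custom engine: run the actual inference here
        return data

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn) -> dict:
        # convert model output into the response payload
        return {"output": data}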
https://www.geeksforgeeks.org/invalid-decimal-literal-in-python/
This is the warning, hence my question
Sure, set the OS environment variable CLEARML_NO_DEFAULT_SERVER=1
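For example, a minimal sketch of setting it from Python (setting it in the shell works just as well):

import os

# set before clearml is used, so it does not fall back to the default/demo server
os.environ["CLEARML_NO_DEFAULT_SERVER"] = "1"

import clearml  # noqa: E402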
I double-checked the code, it's always being passed 🙂
you can also get it flattened with: task.get_parameters()
Type in both cases is string
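A small sketch of the difference (the task id is a placeholder); note that the flattened form uses "Section/name" keys and the values are strings:

from clearml import Task

task = Task.get_task(task_id="aabbccdd")  # placeholder id

# nested view, e.g. {"General": {"lr": "0.001"}}
nested = task.get_parameters_as_dict()

# flattened view, e.g. {"General/lr": "0.001"}
flat = task.get_parameters()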
What do you have in "server_info['url']"?
Hi @<1545216070686609408:profile|EnthusiasticCow4>
Hmm, this seems odd and definitely looks like a bug, please report it on GH 🙂
So if I pass a function that pulls the most recent version of a Task, it'll grab the most recent version every time it's scheduled?
Basically your function will be called, that's it.
What I'm assuming is that you would want that function to find the latest Task (i.e. query & filter based on project/name/tags etc.), clone the selected Task and enqueue it,
is that correct?
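If that is the intent, a rough sketch of such a function (project, task and queue names are placeholders):

from clearml import Task

def clone_and_enqueue_latest():
    # find the most recent matching Task (the filters here are illustrative)
    tasks = Task.get_tasks(
        project_name="my_project",
        task_name="my_task",
        task_filter={"order_by": ["-last_update"], "status": ["completed"]},
    )
    if not tasks:
        return
    latest = tasks[0]
    # clone the selected Task and push the clone into the execution queue
    cloned = Task.clone(source_task=latest, name=latest.name + " (scheduled)")
    Task.enqueue(cloned, queue_name="default")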
Thanks MinuteGiraffe30, fix will be pushed later today
Hi AverageBee39
What's the clearml-server and clearml package versions you are using?
(It looks like some capability is missing from the server, i.e. it needs an upgrade?!)
Hi ContemplativePuppy11
This is really interesting point.
Maybe you can provide a pseudo-class abstract of your current pipeline design; this will help in understanding what you are trying to achieve and how to make it easier to get there.
Are you using tensorboard or do you want to log directly to trains?
MoodyCentipede68 is diagram 2 a batch processing workflow?