actually no, it is not. Alpine is not a good baseline, it is very slim and missing a ton of stuff.
I would use bullseye or slim (depending on how many aux things you need in the container)
https://hub.docker.com/_/python/tags?page=1&name=bullseye
https://hub.docker.com/_/python/tags?page=1&name=slim-bullseye
Decorators are good 🙂
Something along the lines of
` @PipelineDecorator.pipeline(...)
def pipeline(skip_a=False):
    if not skip_a:
        a = step_a()
    else:
        # somehow get a previous A?
        # let's call it cached A
        a = "replace with real"
    step_b(a)
    ... `Is this the gist?
If it is, this looks like, "how can I control whether A is cached or not", is that correct?
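If so, a minimal sketch of letting the pipeline handle the caching for you (this assumes the decorator syntax above; the `cache=True` flag on the component should reuse a previous execution of step A when its code and inputs have not changed):
` from clearml import PipelineDecorator

@PipelineDecorator.component(cache=True)  # reuse a previous run of A if code + inputs are unchanged
def step_a():
    # expensive work here
    return 42

@PipelineDecorator.component()
def step_b(a):
    print("got", a)

@PipelineDecorator.pipeline(name="pipe", project="examples", version="0.1")
def pipeline():
    a = step_a()  # returns the cached result when a matching run exists
    step_b(a) `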
unless the domain is different?
Imagine that you are working with both GitHub and Bitbucket, for example. If you are using git-ssh then git will know which of the domains to send the key to. Currently there is a single user/pass entry, so all domains will get the same credentials. But I think this is a rare use case.
Hmm, I think you should use --template-yaml
SoggyBeetle95 the question is, where does clearml store these arguments, and the answer is on the Task object (from there the agent will take them and apply them to the docker execution). Now since all users see all the tasks, they also see these arguments. Wdyt?
Hi @<1547390415320125440:profile|SilkySparrow85>
because it is trying to send a debug-sample to fileserver!
Yes, you should always configure the "files server" to point to your minio S3, basically:
files_server: "
"
But do not forget to also configure the credentials here:
https://github.com/allegroai/clearml/blob/40c6db9d95016382c721546d42...
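Putting the two together, a clearml.conf sketch for minio could look something like this (every value here is a placeholder, assuming a plain-HTTP minio on port 9000; adjust host, bucket and keys to your setup):
` api {
    # send debug samples / artifacts to minio instead of the default fileserver
    files_server: "s3://my-minio-host:9000/clearml-bucket"
}
sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "my-minio-host:9000"   # minio endpoint, not AWS
                    bucket: "clearml-bucket"
                    key: "minio-access-key"
                    secret: "minio-secret-key"
                    multipart: false
                    secure: false                # true if minio is behind TLS
                }
            ]
        }
    }
} `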
https://hub.docker.com/layers/nvidia/cuda/10.1-cudnn7-runtime-ubuntu18.04/images/sha256-963696628c9a0d27e9e5c11c5a588698ea22eeaf138cc9bff5368c189ff79968?context=explore
the docker image is missing cudnn, which is a must for TF to work 🙂
From the docs I think what's going on is that the https://opennmt.net/OpenNMT-tf/package/opennmt.Runner.html#opennmt.Runner.train is spinning a new subprocess, and the training itself happens on the subprocess.
If this is the case this will explain the lack of automagic, as the subprocess is lacking the "Task.init" call
wdyt, could that be the case ?
Too late for what?
To update the task.requirements before it actually creates it (the requirements are created in a background thread)
Actually scikit implies joblib 🙂 (so you should use scikit, anyhow I'll make sure we add joblib as it is more explicit)
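And if you need to force a package into the requirements yourself before the Task is created, a minimal sketch (using `Task.add_requirements`, which must be called before `Task.init`; `joblib` here is just the example package):
` from clearml import Task

# must run before Task.init, because the requirements are collected in a background thread
Task.add_requirements("joblib")

task = Task.init(project_name="examples", task_name="explicit requirements") `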
Would that go under `arguments`?
yes 🙂
Also what is the base path where the git repo is cloned? So if my repo is called myProject.git, what would the full path be?
For example https://github.com/<user>/myProject.git
btw: how come you do not have this field auto populated from running the code locally or using the clearml-task CLI?
SubstantialElk6 I just realized 3 weeks passed, wow!
So the good news we have some new examples:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_functions.py
The bad news: the documentation was postponed a bit, as we are still massaging the interface (the community is constantly pushing great ideas and use cases, and they are just too good to miss out on 🙂 )...
And your ~/clearml.conf ?
I think it fails because it tries to install trains twice. Could you remove the trains package, and test? I'm also curious how do you have both installed?!
might it be related to the docker socket not being mounted to the agent daemon running inside a docker container?
Oh yes, if the daemon is running inside a docker container then you need both --privileged and mounting of the docker socket to get it to work
WickedGoat98 Same for me, let me ask the UI guys, I think this is a UI bug.
Also maybe before you post the article we could release a fix to both, what do you think?
EDIT:
Never mind 🙂 I just saw the medium link, very cool!!!
Is there a way to do this all elegantly?
Oh yes there is, this is how the TaskB code will look:
` task = Task.init(..., 'task b')
param = {'TaskA': 'TaskA ID HERE'}
task.connect(param)
taska_model = Task.get_task(param['TaskA']).models['output'][-1]
model = torch.load(taska_model.get_local_copy())
# ... train ...
torch.save(model, 'modelb.pt') `I might have missed something there, but generally speaking this will let you:
Select TaskA as a parameter of the TaskB training process; will automagically register Task A's...
JitteryCoyote63 while it's running, could you give me a few details on the setup, maybe I can reproduce it.
Is it using pytorch distributed ?
Are all models uploaded to S3 ?
etc.
but the debug samples and monitored performance metric show a different count
Hmm, could you expand on what you are getting, and what you are expecting to get?
Hi ReassuredOwl55
How would I find Tasks that have the same code with different inputs/parameters?
Assuming you have the git repo
you can do: `Task.query_tasks(..., task_filter={'_all_': dict(fields=['script.repository'], pattern='github.com/user/repo')})`
wdyt?
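A slightly fuller sketch of that query (the repository pattern and the extra returned field are just examples; `additional_return_fields` is optional):
` from clearml import Task

# find all tasks whose code came from the same repository
tasks = Task.query_tasks(
    task_filter={'_all_': dict(fields=['script.repository'], pattern='github.com/user/repo')},
    additional_return_fields=['hyperparams'],  # also return the parameters, to compare inputs
)
print(tasks) `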
Thanks @doru! BTW if you are running code from outside the trains repo, do you still get the double package?
Hi AstonishingSwan80 , what do you mean by "ec2 API"?
Is gpu_0_utilization also in % then?
Correct 🙂
I was trying to find what the min and max values are for the above metrics.
Oh that makes sense, notice that you can get the values over time, so you can track the usage over the experiment lifetime (you can of course see it in the Scalar tab of the experiment)
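If you also want those values programmatically, a sketch along these lines should do it (assuming the machine monitoring scalars live under the default ':monitor:gpu' title; the task id is a placeholder):
` from clearml import Task

task = Task.get_task(task_id='<your task id>')  # placeholder id
scalars = task.get_reported_scalars()  # {title: {series: {'x': [...], 'y': [...]}}}
gpu_series = scalars.get(':monitor:gpu', {})  # e.g. gpu_0_utilization, values in %
print(gpu_series) `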
. However, despite having imported the required types from the `typing` library in the script where the function decorated with `PipelineDecorator.component` is defined, later in the generated script the `typing` library is not imported outside the scope of the function
Actually the typing part is not passed to the "created step", because there are no global imports, for example:
` def step(a: pd.DataFrame):
    import pandas as pd
    ... `
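A possible workaround sketch, assuming you can keep the signature to builtin types and reference pandas only inside the function body (imports inside the body are shipped with the generated step):
` from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=['df'])
def step(csv_path: str):  # builtin annotation, resolves without global imports
    import pandas as pd  # packaged into the generated step
    df = pd.read_csv(csv_path)
    return df `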
Hi ShinyWhale52
Luigi's approach is basically an extension of a functional DAG, where each node is a single function. Let's think of Kedro as an extension of this approach.
With both the assumption is that a node is a single function (sometimes it really is) and we just want to create a meta execution path (i.e. the execution dag, quite similar to TF v1).
ClearML pipelines are a different story (in a way).
The main difference is that with ClearML each node is a Task, not a function. That means...
Okay we have something 🙂
To your clearml.conf add:
agent.docker_preprocess_bash_script = [ "su root", "cp -f /root/*.conf ~/", ]
Let's see if that works
the use case I have is to allow people from my team to run their workloads on a set of servers without stepping over each other.
So does that mean CPU only workloads?
Also, should we be worried about fairness? (i.e. someone "taking" all the CPU for themselves)
Hi VivaciousWalrus99
Could you attach the log of the run ?
By default it will use the python it is running with.
Any chance the original experiment was executed with python2 ?
You mean to spin a pod with the agent inside it (daemon in services mode)?
Or connect the services queue to the k8s cluster (i.e. define the pod template that uses cpu with not a lot of ram)?