GiddyPeacock64 and you see the Kale (KF) jobs in the Kubeflow UI?
and there is no trains-agent on Kubeflow. At least not that I'm aware of.
Hmm... for a quick-and-dirty integration that would probably do the trick: you could issue clearml-task commands on each Kubeflow vertex (is that what they are called?)
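A rough sketch of what such a command might look like (the project, repo, and script names below are placeholders, and the exact flags depend on your clearml version):

```shell
# Hypothetical: launch one Kubeflow vertex as a ClearML task from the CLI.
# All names below are placeholders for your own project/repo/queue.
clearml-task --project kubeflow-demo \
             --name vertex-train \
             --repo https://github.com/example/repo.git \
             --script train.py \
             --queue default
```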
What do you think AgitatedDove14 ?
And voila full trace including Git and uncommitted changes, python packages, and the ability to change arguments from the UI 🙂
The first pipeline step is calling init.
GiddyPeacock64 Is this enough to track all the steps?
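For reference, a minimal sketch of what calling Task.init inside a pipeline step could look like (the project and task names are made up, and this assumes the `clearml` package is available inside the step's container):

```python
def train_step(learning_rate: float = 0.01) -> None:
    # Import inside the step so the dependency lives in the step's container.
    from clearml import Task

    # A step that calls Task.init shows up as its own Task in the ClearML UI,
    # with git info, uncommitted changes, and installed packages captured.
    task = Task.init(project_name="kubeflow-demo", task_name="train-step")

    # Register the step's parameters so they are editable from the UI.
    task.connect({"learning_rate": learning_rate})
    # ... actual training code goes here ...
```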
I guess my main question is: is every step in the pipeline an actual Task/Job, or is it a single small function?
Kubeflow is great for simple DAGs but when you need to build more complex logic it is usually a bit limited
(for example, visibility into what's going on inside each step is missing, so you cannot make a decision based on that).
WDYT?
Kale just translates the notebooks into Kubeflow pipeline DSL. So in Kubeflow, when the pipeline is running, we can see the pipeline elements.
Well, we are using it from our code and the trains dashboard works just fine. But this is without the agent. Moreover, Kubeflow uses Docker images for the training pipelines, therefore trains cannot capture the diff.
So the main question is: is there a best practice for connecting ClearML and Kubeflow?
AgitatedDove14, ideally yeah, but we manually add a line to the script generated by Kale before running it.
Hi GiddyPeacock64
If you already have K8s setup, and are already using ClearML.
In your Kubeflow YAML:
trains-agent execute --id <task_id> --full-monitoring
This will install everything your Task needs inside the Docker container. Just make sure that you pass the env variables configuring the ClearML server, see here:
https://github.com/allegroai/clearml-server/blob/6434f1028e6e7fd2479b22fe553f7bca3f8a716f/docker/docker-compose.yml#L127
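To illustrate, a hedged sketch of what that step spec might look like (the image name, host, and credential values are placeholders; the env variable names follow the trains-agent convention and may differ for your setup):

```yaml
# Hypothetical Kubeflow container step running trains-agent.
- name: train-step
  container:
    image: allegroai/trains-agent:latest   # placeholder image
    command: ["trains-agent", "execute", "--id", "<task_id>", "--full-monitoring"]
    env:
      - name: TRAINS_API_HOST
        value: "http://trains-server:8008"   # placeholder server address
      - name: TRAINS_API_ACCESS_KEY
        value: "<access_key>"
      - name: TRAINS_API_SECRET_KEY
        value: "<secret_key>"
```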
Um... it depends; it can be a small function but can also be a task. We are using Kale so it is easier to group/create tasks and move data from one task to another. Each step is independent, but the next one is created from the data of the previous one.
Not only the diff, but also the git repo itself, because it makes no sense to push the git repo into the Docker image.
Hey AgitatedDove14 ! Thanks!
For now we are using Kale to build the Kubeflow pipeline. Then we run this pipeline, so I'm not sure where the agent fits inside the Kubeflow ecosystem; can you elaborate?
From this movie, it's clear that we are not using all of the trains (ClearML) functionality, because Kubeflow is managing our pipeline. So, for example, trains cannot record the git changes.
GiddyPeacock64 Are you sending the jobs from the JupyterLab Kale extension?
EDIT:
Is the pipeline step itself calling Task.init?